
Manipulation of Email Data Using Machine Learning and Data Visualization

Vishal Verma, Anurag Sinha

Section: Research Paper, Product Type: Journal Paper
Vol. 8, Issue 5, pp. 54-63, Oct 2020


Online published on Oct 31, 2020


Copyright © Vishal Verma, Anurag Sinha. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
 


How to Cite this Paper


IEEE Style Citation: Vishal Verma, Anurag Sinha, “Manipulation of Email Data Using Machine Learning and Data Visualization,” International Journal of Scientific Research in Computer Science and Engineering, Vol.8, Issue.5, pp.54-63, 2020.

MLA Style Citation: Vishal Verma, Anurag Sinha "Manipulation of Email Data Using Machine Learning and Data Visualization." International Journal of Scientific Research in Computer Science and Engineering 8.5 (2020): 54-63.

APA Style Citation: Vishal Verma, Anurag Sinha, (2020). Manipulation of Email Data Using Machine Learning and Data Visualization. International Journal of Scientific Research in Computer Science and Engineering, 8(5), 54-63.

BibTex Style Citation:
@article{Verma_2020,
author = {Vishal Verma and Anurag Sinha},
title = {Manipulation of Email Data Using Machine Learning and Data Visualization},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
volume = {8},
number = {5},
month = {10},
year = {2020},
issn = {2347-2693},
pages = {54--63},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2104},
publisher = {IJCSE, Indore, INDIA}
}

RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2104
TI - Manipulation of Email Data Using Machine Learning and Data Visualization
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Vishal Verma
AU - Anurag Sinha
PY - 2020
DA - 2020/10/31
PB - IJCSE, Indore, INDIA
SP - 54
EP - 63
IS - 5
VL - 8
SN - 2347-2693
ER -


Abstract:
E-mail has become an essential medium for communication in everyday life, and it is widely preferred for business and personal purposes across education, corporate and commercial settings. The growth in the number of e-mail users has sharply increased the volume of e-mail data available online. This growing volume forms a corpus that can be categorized by subject matter, and it offers features that support opinion mining, sentiment analysis and spam/ham detection. In this paper we propose a machine learning based algorithm that classifies e-mail by its subject, using several classifiers such as a support vector machine (SVM) classifier and a neural network classifier. We apply supervised machine learning: the unlabeled, unstructured e-mail data is first converted into labeled, structured datasets, from which features are then extracted. We also examine the public datasets, feature sets, classification techniques and performance measures used in each identified application area. We use several e-mail datasets for subject-based classification, and we propose an algorithm for spam detection that employs several machine learning algorithms.
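
As an illustration of the kind of supervised pipeline the abstract describes (labeled subject lines, feature extraction, then a linear SVM classifier), the following is a minimal sketch only. It is not the authors' algorithm: the toy subject lines, the bag-of-words features, the function names (`tokenize`, `build_vocab`, `vectorize`, `train_linear_svm`, `predict`) and the hyperparameters are all assumptions made for this example, and the SVM is a simple bias-free hinge-loss model trained by subgradient descent.

```python
from collections import Counter

# Toy, hand-written subject lines (illustrative only, not the paper's data).
# Label +1 = spam, -1 = ham.
TRAIN = [
    ("win a free prize now", 1),
    ("free money offer", 1),
    ("claim your free prize", 1),
    ("meeting agenda for monday", -1),
    ("project status report", -1),
    ("lunch on monday", -1),
]

def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()

def build_vocab(texts):
    """Map each distinct training word to a feature index."""
    vocab = {}
    for text in texts:
        for word in tokenize(text):
            vocab.setdefault(word, len(vocab))
    return vocab

def vectorize(text, vocab):
    """Sparse bag-of-words vector {feature index: count}; unknown words are ignored."""
    counts = Counter(w for w in tokenize(text) if w in vocab)
    return {vocab[w]: float(c) for w, c in counts.items()}

def train_linear_svm(samples, labels, dim, epochs=20, lr=0.1, lam=0.01):
    """Linear SVM without bias: subgradient descent on L2-regularized hinge loss."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * sum(weights[i] * v for i, v in x.items())
            # Gradient step on the L2 penalty (shrinks every weight slightly).
            weights = [wi * (1.0 - lr * lam) for wi in weights]
            # Hinge-loss subgradient step, only when the margin is violated.
            if margin < 1.0:
                for i, v in x.items():
                    weights[i] += lr * y * v
    return weights

def predict(subject, weights, vocab):
    """Return 'spam' for a positive decision score, otherwise 'ham'."""
    x = vectorize(subject, vocab)
    score = sum(weights[i] * v for i, v in x.items())
    return "spam" if score > 0 else "ham"

vocab = build_vocab(s for s, _ in TRAIN)
samples = [vectorize(s, vocab) for s, _ in TRAIN]
labels = [y for _, y in TRAIN]
weights = train_linear_svm(samples, labels, len(vocab))

print(predict("free prize offer", weights, vocab))        # spam
print(predict("monday project meeting", weights, vocab))  # ham
```

On real e-mail data one would likely replace the raw counts with weighted features and evaluate on held-out messages; the sketch only shows how labeled subjects, extracted features and a classifier fit together.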

Keywords / Index Terms:
Email classification, spam detection, opinion mining, machine learning

