Comparative Analysis of Selected Text Mining Classifiers

Omokanye S.O., Abikoye O.C., Aro T.O., Akande H.B., Aregbesola K.M.

Section: Research Paper, Product Type: Journal-Paper
Vol. 9, Issue 1, pp. 37-42, Feb-2021

Online published on Feb 28, 2021

Copyright © Omokanye S.O., Abikoye O.C., Aro T.O., Akande H.B., Aregbesola K.M. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


How to Cite this Paper

IEEE Style Citation: Omokanye S.O., Abikoye O.C., Aro T.O., Akande H.B., Aregbesola K.M., “Comparative Analysis of Selected Text Mining Classifiers,” International Journal of Scientific Research in Computer Science and Engineering, Vol.9, Issue.1, pp.37-42, 2021.

MLA Style Citation: Omokanye S.O., Abikoye O.C., Aro T.O., Akande H.B., Aregbesola K.M. "Comparative Analysis of Selected Text Mining Classifiers." International Journal of Scientific Research in Computer Science and Engineering 9.1 (2021): 37-42.

APA Style Citation: Omokanye S.O., Abikoye O.C., Aro T.O., Akande H.B., Aregbesola K.M. (2021). Comparative Analysis of Selected Text Mining Classifiers. International Journal of Scientific Research in Computer Science and Engineering, 9(1), 37-42.

BibTex Style Citation:
@article{S.O._2021,
author = {Omokanye, S.O. and Abikoye, O.C. and Aro, T.O. and Akande, H.B. and Aregbesola, K.M.},
title = {Comparative Analysis of Selected Text Mining Classifiers},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {2 2021},
volume = {9},
number = {1},
month = {2},
year = {2021},
issn = {2347-2693},
pages = {37--42},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2272},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY  - JOUR
UR  - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2272
TI  - Comparative Analysis of Selected Text Mining Classifiers
T2  - International Journal of Scientific Research in Computer Science and Engineering
AU  - Omokanye, S.O.
AU  - Abikoye, O.C.
AU  - Aro, T.O.
AU  - Akande, H.B.
AU  - Aregbesola, K.M.
PY  - 2021
DA  - 2021/02/28
PB  - IJCSE, Indore, INDIA
SP  - 37
EP  - 42
IS  - 1
VL  - 9
SN  - 2347-2693
ER  -

Abstract:
Text classification is a knowledge-engineering task in which expert-level knowledge is applied to classify documents so that similar documents are arranged in their respective categories. The field has been a growing area of research as the availability of textual information has increased. In this paper, three classifiers, k-Nearest Neighbour (KNN), Multinomial Naïve Bayes (MNB), and Support Vector Machine (SVM), were compared on accuracy and on the time taken to build a model, using five datasets. Experimental results showed that classification accuracy increased as the number of training documents increased, and that the algorithms performed better on binary classification tasks than on multiclass classification tasks. The best overall accuracy, 98.3495%, was recorded by SVM on the SMS Spam Collection dataset, which was also the best result for binary classification; on the Reuters 50-50 dataset, the best accuracy, 88.28%, was likewise obtained with SVM. The shortest model-building time, 0.01 s, was recorded by the MNB classifier.
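The comparison protocol described in the abstract (train each classifier, record the time taken to build the model, then measure accuracy on held-out documents) can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny MNB and KNN classes, the `evaluate` helper, and the toy spam/ham messages are all assumptions made up for illustration, and SVM is left out because it needs a numerical optimiser that would not fit in a short standard-library example.

```python
import math
import time
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative simplification)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(doc):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class KNN:
    """k-nearest neighbours using cosine similarity over raw term counts."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, docs, labels):
        # "Building" a KNN model only stores the training vectors.
        self.vectors = [Counter(tokenize(d)) for d in docs]
        self.labels = list(labels)
        return self

    @staticmethod
    def _cosine(a, b):
        dot = sum(a[w] * b[w] for w in a if w in b)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def predict(self, doc):
        query = Counter(tokenize(doc))
        ranked = sorted(zip(self.vectors, self.labels),
                        key=lambda pair: self._cosine(query, pair[0]),
                        reverse=True)
        votes = Counter(label for _, label in ranked[:self.k])
        return votes.most_common(1)[0][0]


def evaluate(model, train, test):
    """Return (accuracy, model build time in seconds) for one classifier."""
    train_docs, train_labels = zip(*train)
    start = time.perf_counter()
    model.fit(train_docs, train_labels)
    build_time = time.perf_counter() - start
    correct = sum(model.predict(doc) == label for doc, label in test)
    return correct / len(test), build_time


# Made-up toy messages, not taken from any dataset used in the paper.
TRAIN = [
    ("win a free prize now", "spam"),
    ("free cash offer claim now", "spam"),
    ("urgent claim your free prize", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("see you at the meeting tomorrow", "ham"),
    ("call me when you get home", "ham"),
]
TEST = [("claim your free cash", "spam"), ("lunch meeting tomorrow", "ham")]

if __name__ == "__main__":
    for name, model in [("MNB", MultinomialNB()), ("KNN", KNN(k=3))]:
        accuracy, seconds = evaluate(model, TRAIN, TEST)
        print(f"{name}: accuracy={accuracy:.2%}, build time={seconds:.6f}s")
```

Timing only the `fit` call mirrors the "time taken to build a model" criterion; on data this small the numbers say nothing about the relative speed of the real classifiers, and a study at the scale reported would more plausibly rely on an established machine learning library than on hand-written classes like these.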

Key-Words / Index Terms:
Naïve Bayes; Nearest Neighbor; Support Vector Machine; Text mining

