Improving Clustering Accuracy using Feature Extraction Method
T. SenthilSelvi¹, R. Parimala²
¹ Department of Computer Science, Periyar E.V.R College, Trichy-23, India.
² Department of Computer Science, Periyar E.V.R College, Trichy-23, India.
Correspondence should be addressed to: senthilselvikumar@yahoo.co.in.
Section: Research Paper, Product Type: Isroset-Journal
Vol. 6, Issue 2, pp. 15-19, Apr 2018
CrossRef-DOI: https://doi.org/10.26438/ijsrcse/v6i2.1519
Online published on Apr 30, 2018
Copyright © T. SenthilSelvi, R. Parimala. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: T. SenthilSelvi, R. Parimala, “Improving Clustering Accuracy using Feature Extraction Method,” International Journal of Scientific Research in Computer Science and Engineering, Vol.6, Issue.2, pp.15-19, 2018.
MLA Style Citation: T. SenthilSelvi, R. Parimala. "Improving Clustering Accuracy using Feature Extraction Method." International Journal of Scientific Research in Computer Science and Engineering 6.2 (2018): 15-19.
APA Style Citation: T. SenthilSelvi, R. Parimala (2018). Improving Clustering Accuracy using Feature Extraction Method. International Journal of Scientific Research in Computer Science and Engineering, 6(2), 15-19.
BibTex Style Citation:
@article{SenthilSelvi_2018,
author = {T. SenthilSelvi and R. Parimala},
title = {Improving Clustering Accuracy using Feature Extraction Method},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {4 2018},
volume = {6},
number = {2},
month = {4},
year = {2018},
issn = {2347-2693},
pages = {15--19},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=600},
doi = {10.26438/ijsrcse/v6i2.1519},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY  - JOUR
DO  - 10.26438/ijsrcse/v6i2.1519
UR  - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=600
TI  - Improving Clustering Accuracy using Feature Extraction Method
T2  - International Journal of Scientific Research in Computer Science and Engineering
AU  - T. SenthilSelvi
AU  - R. Parimala
PY  - 2018
DA  - 2018/04/30
PB  - IJCSE, Indore, INDIA
SP  - 15
EP  - 19
IS  - 2
VL  - 6
SN  - 2347-2693
ER  -
Abstract:
Clustering groups documents that contain related information, which makes relevant information easier to locate. Clustering performance depends largely on the features used to represent the text documents. The first challenge is identifying significant term features that represent the original content while accounting for hidden knowledge. The second challenge is reducing data dimensionality without losing essential information. This paper proposes applying the feature extraction methods Principal Component Analysis (PCA) and Kernel Principal Component Analysis (KPCA) before clustering to improve clustering efficiency and quality. Documents are pre-processed, converted to a vector space model, and then clustered using the proposed algorithm. The goal of this work is to design a model for clustering text documents that improves clustering performance. Both challenges are discussed with empirical evidence, and experimental results show that the proposed method is effective for the text clustering task.
Keywords / Index Terms:
Clustering; Euclidean distance; Document frequency; Dimensionality reduction; Principal components
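
The pipeline outlined in the abstract (pre-process, build a vector space model, extract features, cluster) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the stopword list, the TF-IDF weighting, the `min_df` threshold, the power-iteration PCA, and the use of k-means with Euclidean distance are all assumptions made for the example, as are the toy documents and every function name. KPCA is not shown.

```python
import math
import random
from collections import Counter

STOPWORDS = {"the", "a", "and", "on", "as"}  # tiny illustrative list (assumption)

def tokenize(text):
    # Pre-processing: lowercase, replace non-letters with spaces, drop stopwords.
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return [w for w in cleaned.split() if w not in STOPWORDS]

def tfidf_matrix(docs, min_df=2):
    # Vector space model; terms with document frequency below min_df are dropped.
    tokens = [tokenize(d) for d in docs]
    df = Counter(t for doc in tokens for t in set(doc))
    vocab = sorted(t for t, n in df.items() if n >= min_df)
    rows = []
    for doc in tokens:
        tf = Counter(doc)
        rows.append([tf[t] * (math.log((1 + len(docs)) / (1 + df[t])) + 1)
                     for t in vocab])
    return rows, vocab

def pca(rows, n_components=2, n_iter=200, seed=0):
    # Centre the data, then approximate the leading eigenvectors of X^T X
    # by power iteration, deflating against components already found.
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    X = [[r[j] - means[j] for j in range(d)] for r in rows]
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(n_iter):
            Xv = [sum(x[j] * v[j] for j in range(d)) for x in X]
            w = [sum(X[i][j] * Xv[i] for i in range(n)) for j in range(d)]
            for c in components:
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            norm = math.sqrt(sum(a * a for a in w)) or 1.0
            v = [a / norm for a in w]
        components.append(v)
    # Project each centred document onto the extracted components.
    return [[sum(x[j] * c[j] for j in range(d)) for c in components] for x in X]

def kmeans(points, k=2, n_iter=50):
    # Lloyd's algorithm with Euclidean distance and farthest-point initialisation.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy corpus (made up for this example): three pet documents, three finance documents.
docs = [
    "The cat and the kitten sat on the mat.",
    "A kitten and a cat play on the mat.",
    "The cat chased the kitten.",
    "Stock markets fell as investors sold shares.",
    "Investors bought shares as stock markets rose.",
    "Shares and stock prices moved the markets.",
]

X, vocab = tfidf_matrix(docs)   # documents as TF-IDF vectors
Z = pca(X, n_components=2)      # reduced representation
labels = kmeans(Z, k=2)         # cluster assignment per document
print(vocab)
print(labels)
```

The sketch reduces the term vectors to two components before clustering, so the Euclidean distances used by k-means are computed in the low-dimensional space rather than over the full vocabulary.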