
Semantics Based Document Clustering

Apurva Dube¹, Pradnya Gotmare²

  1. Dept. of Computer Engineering, K. J. Somaiya College of Engineering, Mumbai, India.
  2. Dept. of Computer Engineering, K. J. Somaiya College of Engineering, Mumbai, India.

Correspondence should be addressed to: apurva.dube@somaiya.edu.


Section: Research Paper, Product Type: Isroset-Journal
Vol. 5, Issue 4, pp. 26-31, Aug 2017


Published online on Aug 30, 2017


Copyright © Apurva Dube, Pradnya Gotmare. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
 


How to Cite this Paper


IEEE Style Citation: Apurva Dube, Pradnya Gotmare, “Semantics Based Document Clustering,” International Journal of Scientific Research in Computer Science and Engineering, Vol.5, Issue.4, pp.26-31, 2017.

MLA Style Citation: Apurva Dube, Pradnya Gotmare "Semantics Based Document Clustering." International Journal of Scientific Research in Computer Science and Engineering 5.4 (2017): 26-31.

APA Style Citation: Apurva Dube, Pradnya Gotmare, (2017). Semantics Based Document Clustering. International Journal of Scientific Research in Computer Science and Engineering, 5(4), 26-31.

BibTex Style Citation:
@article{Dube_2017,
author = {Apurva Dube and Pradnya Gotmare},
title = {Semantics Based Document Clustering},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {8 2017},
volume = {5},
number = {4},
month = {8},
year = {2017},
issn = {2347-2693},
pages = {26-31},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=433},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=433
TI - Semantics Based Document Clustering
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Apurva Dube
AU - Pradnya Gotmare
PY - 2017
DA - 2017/08/30
PB - IJCSE, Indore, INDIA
SP - 26
EP - 31
IS - 4
VL - 5
SN - 2347-2693
ER -


Abstract:
Document clustering organizes large document collections into meaningful groups, with each group described by relevant words that serve as cluster labels. The traditional approach uses a bag-of-words representation, which ignores the semantic relations between words. Ontology-based document clustering is therefore proposed. In e-learning, appropriate ontologies are one way to support the reuse and remixing of learning objects: the better suited the ontology, the better the annotation of the learning material. Coupling document clustering with an ontology helps produce better clusters because the semantic relations between words are taken into account. The proposed system applies ontology-based document clustering through a two-step clustering algorithm that combines a partitioning algorithm (K-means) with a hierarchical one (hierarchical agglomerative clustering). The ontology enters through a weighting scheme that combines the traditional co-occurrence of words with the weights of the relations between those words in the ontology. The paper concludes that clustering with semantics-based term weighting produces better results than clustering without semantics.
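The two ideas in the abstract, ontology-aware term weighting and two-step clustering, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the toy `ONTOLOGY` table, its relation weights, the weighting formula in `semantic_vector`, the centroid-linkage merge rule, and all function names are hypothetical choices made for the example.

```python
import math
import random
from collections import Counter

# Hypothetical toy ontology: assumed relation weights between term pairs.
ONTOLOGY = {("cluster", "kmeans"): 0.8, ("ontology", "semantic"): 0.9}

def relation(a, b):
    """Symmetric relation weight: 1.0 for identical terms, 0.0 if unrelated."""
    if a == b:
        return 1.0
    return ONTOLOGY.get((a, b), ONTOLOGY.get((b, a), 0.0))

def semantic_vector(tokens, vocab):
    """Assumed scheme: weight(t) = sum over document terms u of count(u) * relation(t, u)."""
    counts = Counter(tokens)
    return [sum(c * relation(t, u) for u, c in counts.items()) for t in vocab]

def distance(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def kmeans(vectors, k, iters=20, seed=0):
    """Step 1 (partitioning): plain K-means; returns centroids and per-document labels."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: distance(v, centroids[j]))
                  for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = mean(members)
    return centroids, labels

def agglomerate(centroids, n_final):
    """Step 2 (hierarchical): merge the closest groups of centroids until n_final remain."""
    groups = [[i] for i in range(len(centroids))]

    def group_center(g):
        return mean([centroids[i] for i in g])

    while len(groups) > n_final:
        pairs = [(a, b) for a in range(len(groups))
                 for b in range(a + 1, len(groups))]
        a, b = min(pairs, key=lambda p: distance(group_center(groups[p[0]]),
                                                 group_center(groups[p[1]])))
        merged = groups.pop(b)  # b > a, so index a is unaffected by the pop
        groups[a].extend(merged)
    return groups

def two_step(docs, k=3, n_final=2):
    """Weight terms with the ontology, run K-means, then merge its clusters."""
    vocab = sorted({t for d in docs for t in d})
    vectors = [semantic_vector(d, vocab) for d in docs]
    centroids, labels = kmeans(vectors, k)
    groups = agglomerate(centroids, n_final)
    final = {c: g for g, members in enumerate(groups) for c in members}
    return [final[lab] for lab in labels]

docs = [["cluster", "kmeans"], ["kmeans", "cluster", "cluster"],
        ["ontology", "semantic"], ["semantic", "ontology", "ontology"]]
labels = two_step(docs)
```

Because the relation weights feed into every term weight, a document that mentions only "cluster" still receives a non-zero weight for "kmeans", which is the kind of semantic link a pure bag-of-words vector would miss. In this sketch K-means deliberately over-partitions (`k` larger than `n_final`) and the agglomerative step then merges the nearest fine-grained clusters.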

Keywords / Index Terms:
Document Clustering, Ontology-based Clustering, eLearning, Ontology Generation, Semantic Relation, eLearning Concept
