Monday 25th of September 2017
 

Clustering and Classification Augmented with Semantic Similarity for Text Mining


S.Revathi and T.Nalini

Semantic similarity measures how closely the two words of a word-pair are related in meaning, with perfect synonyms at one extreme. Such a measure is needed to quantify the degree of relationship within a word-pair. To compute the semantic similarity of a word-pair, clustering and classification augmented with semantic similarity (CCASS) was developed. CCASS is a novel method that uses the page counts and text snippets returned by a search engine. Several similarity measures are defined using the page counts of word-pairs. Lexical pattern clustering is applied to the text snippets obtained from the search engine. The page-count measures and the lexical pattern clusters are fed to a support vector machine (SVM), which computes the semantic similarity of each word-pair. Based on the value obtained from the SVM, the Simple KMeans clustering algorithm is used to form clusters. A new word-pair can be classified once its semantic similarity measure has been computed. If it does not match any of the existing clusters, a new cluster may be created.
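The abstract does not name the page-count measures or the matching rule for new word-pairs, so the following is only a minimal sketch under stated assumptions. It uses a Jaccard coefficient over page counts as one plausible page-count measure, and a nearest-centroid rule with a distance threshold to decide whether a new word-pair joins an existing cluster or starts a new one. The function names, the `min_cooccurrence` noise floor, and the `max_distance` threshold are hypothetical and are not taken from the paper.

```python
def web_jaccard(count_p, count_q, count_pq, min_cooccurrence=5):
    """Jaccard coefficient computed from search-engine page counts.

    count_p, count_q: page counts for each word queried alone.
    count_pq: page count for the conjunctive query "P AND Q".
    min_cooccurrence: hypothetical noise floor; pairs that co-occur
    this rarely or less are scored 0.
    """
    if count_pq <= min_cooccurrence:
        return 0.0
    return count_pq / (count_p + count_q - count_pq)


def assign_cluster(score, centroids, max_distance=0.1):
    """Return the index of the cluster a similarity score belongs to.

    centroids: list of one-dimensional cluster centres (similarity values).
    max_distance: hypothetical threshold; if no centroid is this close,
    the score becomes the centre of a new cluster appended to the list.
    """
    if centroids:
        nearest = min(range(len(centroids)),
                      key=lambda i: abs(centroids[i] - score))
        if abs(centroids[nearest] - score) <= max_distance:
            return nearest
    centroids.append(score)  # no match: create a new cluster
    return len(centroids) - 1


if __name__ == "__main__":
    centroids = [0.2, 0.8]                  # assumed existing cluster centres
    score = web_jaccard(1000, 500, 300)     # illustrative page counts, not real data
    print(score, assign_cluster(score, centroids))
```

In the method described above, the value being clustered would come from the SVM rather than from a single page-count measure; the Jaccard score stands in for it here only to keep the sketch self-contained.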

Keywords: Semantic Similarity, Similarity measure, Clustering, Classification, Text mining.



ABOUT THE AUTHORS

S.Revathi
PG Scholar pursuing Masters in Computer Science. Has 3 years of teaching experience.

T.Nalini
Pursued Doctorate in Computer Science. Works as Professor in the Department of Computer Science, Bharath University.


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
