Enhanced Hierarchical Clustering for Genome Databases



Clustering techniques find interesting and previously unknown patterns in large-scale data embedded in a large multidimensional space, and they are applied to a wide variety of problems such as customer segmentation, biology, data mining, machine learning, and geographical information systems. Clustering algorithms need to scale efficiently with both the dimensionality of the data sets and the database size. Hierarchical clustering methods in particular are widely used to find patterns in multidimensional data. In this paper, we design an enhanced hierarchical clustering algorithm that scans the dataset and calculates the distance matrix only once. Our main contribution is to reduce running time, even when a large database is analyzed. In addition, the results of hierarchical clustering are represented as a binary tree, which makes the grouping clear and makes clustered objects easy to locate. The algorithm determines the number of clusters from a cut distance and measures clustering quality with a validation index in order to select the best result; it does not require an initial parameter such as the number of clusters.

Keywords: Microarray, Hierarchical clustering, Gene expression data, Binary tree
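
The ideas summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not name a linkage rule or a validation index, so single linkage and the Dunn index are assumptions made here for illustration, and every name (`Node`, `pairwise_distances`, `build_tree`, `cut`, `dunn_index`) is hypothetical. The sketch computes the pairwise distance matrix once, derives all later cluster distances from it, stores the merges as a binary tree, and extracts clusters with a cut distance.

```python
import math
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Optional, Tuple

@dataclass
class Node:
    """Binary tree node; a leaf has no children and height 0."""
    members: List[int]
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: float = 0.0  # distance at which the two children were merged

def pairwise_distances(points) -> Dict[Tuple[int, int], float]:
    """Single pass over the data: Euclidean distance for every pair i < j."""
    return {(i, j): math.dist(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}

def build_tree(n: int, base: Dict[Tuple[int, int], float]) -> Node:
    """Agglomerative clustering (single linkage is an assumption here).
    Works on a copy of the precomputed matrix; the raw data is never rescanned."""
    dist = dict(base)
    nodes = {i: Node(members=[i]) for i in range(n)}
    next_id = n
    while len(nodes) > 1:
        (a, b), height = min(dist.items(), key=lambda kv: kv[1])
        merged = Node(nodes[a].members + nodes[b].members,
                      nodes[a], nodes[b], height)
        del nodes[a], nodes[b], dist[(a, b)]
        for c in nodes:
            # Distance to the merged cluster is the smaller of the two old ones.
            d_ac = dist.pop((min(a, c), max(a, c)))
            d_bc = dist.pop((min(b, c), max(b, c)))
            dist[(c, next_id)] = min(d_ac, d_bc)
        nodes[next_id] = merged
        next_id += 1
    return next(iter(nodes.values()))

def cut(node: Node, cut_distance: float) -> List[List[int]]:
    """Return the clusters obtained by cutting the tree at cut_distance."""
    if node.left is None or node.height <= cut_distance:
        return [sorted(node.members)]
    return cut(node.left, cut_distance) + cut(node.right, cut_distance)

def dunn_index(base, clusters) -> float:
    """Dunn index (assumed validation index): smallest between-cluster
    distance divided by the largest within-cluster distance."""
    if len(clusters) < 2:
        return 0.0
    diameter = max((base[(i, j)] for c in clusters
                    for i, j in combinations(c, 2)), default=0.0)
    separation = min(base[(min(i, j), max(i, j))]
                     for c1, c2 in combinations(clusters, 2)
                     for i in c1 for j in c2)
    return separation / diameter if diameter > 0 else float("inf")

# Toy data standing in for expression profiles (values are made up).
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
base = pairwise_distances(points)
root = build_tree(len(points), base)
clusters = sorted(cut(root, cut_distance=2.0))
quality = dunn_index(base, clusters)
```

No number of clusters is supplied anywhere: it follows from the cut distance, and comparing `quality` across several candidate cut distances is one way to pick among the resulting partitions.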

About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
