International Journal of Computer Science Issues

Employing Ontology Enrichment Algorithm in Classifying Biomedical Text Abstracts

Rozilawati Binti Dollah and Masaki Aono

Text classification systems for biomedical literature aim to select the articles relevant to a specific issue from large corpora. As the amount of online biomedical literature grows, finding relevant information becomes increasingly complicated because of the difficulty of browsing and searching for it on the Web. Ontologies are useful for organizing and navigating Web sites and for improving the accuracy of Web searches. An ontology provides a shared understanding of a domain, which helps to overcome differences in terminology such as synonyms, term variants and term ambiguity. However, one problem with ontologies is the maintenance of these concept bases. Therefore, we investigate and propose an ontology enrichment algorithm as one method for modifying an existing ontology. In this research, we present a new ontology enrichment algorithm that associates each concept in the training ontology with relevant and informative features from biomedical information sources. Experiments are conducted to extract and select meaningful features from different information sources, such as the OHSUMED dataset, Medical Subject Headings (MeSH) terms and heart disease glossaries. We then add these features to the training ontology. Finally, we evaluate the performance of the proposed ontology enrichment algorithm in classifying biomedical text abstracts. The results demonstrate that the macro-averaged precision, recall and F-measure are improved by employing the ontology enrichment algorithm.
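For readers unfamiliar with the evaluation measure named in the abstract, the following is a minimal sketch of macro-averaging: precision, recall and F-measure are computed per class and then averaged with equal weight per class. The function name `macro_prf` and the class labels are hypothetical, and the sketch assumes the macro F-measure is the mean of the per-class F1 scores; the abstract does not state which convention the authors used.

```python
from collections import Counter


def macro_prf(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) for single-label predictions.

    Assumption: macro F1 is the unweighted mean of per-class F1 scores.
    """
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        raise ValueError("at least one labelled example is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct assignment for this class
        else:
            fp[pred] += 1    # predicted class gained a false positive
            fn[truth] += 1   # true class gained a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy labels only; these are not data from the paper.
precision, recall, f1 = macro_prf(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

Because every class contributes equally to the average, a small category affects the macro score as much as a large one, which is why macro-averaging is commonly reported when category sizes are uneven.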

Keywords: MeSH, OHSUMED, Ontology Enrichment, Text Classification, Text Mining

ABOUT THE AUTHORS

Rozilawati Binti Dollah
She received her B.Sc. and M.Sc. degrees, both in Computer Science, from Universiti Teknologi Malaysia in 1998 and 2001, respectively. She is a lecturer in the Department of Information Systems, Universiti Teknologi Malaysia. She is currently a Ph.D. candidate at the Graduate School of Electronic and Information Engineering, Toyohashi University of Technology. Her research interests are data mining, text mining and the semantic web.

Masaki Aono
He received his B.Sc. and M.Sc. degrees from the University of Tokyo in 1981 and 1984, respectively. He received his Ph.D. from Rensselaer Polytechnic Institute, New York, in 1994. He worked at IBM Research, Tokyo Research Laboratory, from 1984 until 2003. Since 2003, he has been a Professor in the Information and Computer Sciences Department at Toyohashi University of Technology. His research interests include massive multimedia datasets, data mining, web mining and the semantic web, including feature extraction, classification, clustering, segmentation and information retrieval. He is a member of ACM, IEEE Computer Society, IPSJ, IEICE, JSAI and NLP.
