International Journal of Computer Science Issues

Building an Automatic Thesaurus to Enhance Information Retrieval

Essam Said Hanandeh

One of the major problems of modern Information Retrieval (IR) systems is the vocabulary problem: the discrepancy between the terms used to describe documents and the terms researchers use to describe their information needs. We implemented an automatic thesaurus built on the Vector Space Model (VSM), using the cosine measure to compute similarity. The test collection consists of 242 Arabic abstracts, all in the fields of computer science and information systems. The main goal of this paper is to design and build an automatic Arabic thesaurus based on term-term similarity that can be used in any specialized field or domain to improve query expansion and retrieve more relevant documents for the user's query. The study concluded that the similarity thesaurus improved recall and precision compared with a traditional information retrieval system.
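The approach summarized above can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the whitespace tokenizer, the raw term-frequency weights, the similarity threshold, and the English toy documents are all assumptions made for illustration. Each term is represented as a vector over documents, term-term similarity is the cosine of two such vectors, and a query is expanded with the most similar terms.

```python
import math
from collections import Counter, defaultdict


def build_term_vectors(docs):
    """Map each term to a sparse vector {doc_index: raw term frequency}."""
    vectors = defaultdict(dict)
    for i, doc in enumerate(docs):
        # Assumption: whitespace tokenization, no stemming or stop-word removal.
        for term, tf in Counter(doc.split()).items():
            vectors[term][i] = tf
    return dict(vectors)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[k] for k, w in u.items() if k in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def build_thesaurus(docs, threshold=0.5):
    """Link every pair of terms whose cosine similarity reaches the threshold.

    The threshold value is hypothetical; the abstract does not state one.
    """
    vectors = build_term_vectors(docs)
    terms = sorted(vectors)
    thesaurus = {t: [] for t in terms}
    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            sim = cosine(vectors[a], vectors[b])
            if sim >= threshold:
                thesaurus[a].append((b, sim))
                thesaurus[b].append((a, sim))
    # Most similar terms first; ties broken alphabetically.
    for t in thesaurus:
        thesaurus[t].sort(key=lambda pair: (-pair[1], pair[0]))
    return thesaurus


def expand_query(query, thesaurus, top_k=2):
    """Append up to top_k related terms for each original query term."""
    terms = query.split()
    expanded = list(terms)
    for t in terms:
        for related, _ in thesaurus.get(t, [])[:top_k]:
            if related not in expanded:
                expanded.append(related)
    return expanded


# Toy English collection standing in for the Arabic abstracts.
docs = [
    "database query index",
    "database index storage",
    "network protocol routing",
    "network routing",
]
thesaurus = build_thesaurus(docs)
expanded = expand_query("database", thesaurus, top_k=1)
```

In this toy collection "database" and "index" occur in exactly the same documents, so "index" is the term added to the query. A real system would likely replace raw frequencies with a weighting scheme such as tf-idf and apply Arabic-specific preprocessing, neither of which is specified in the abstract.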

Keywords: Information Retrieval, Similarity Thesaurus, Query Expansion


ABOUT THE AUTHOR

Essam Said Hanandeh
I am a member of the Computer Information Systems section at Zarqa University in Jordan.
