Comparative Analysis of IDF Methods to Determine Word Relevance in Web Document


Jitendra Nath Singh

Inverse document frequency (IDF) is one of the most useful and widely used concepts in information retrieval. Combined with term frequency (TF), it yields TF-IDF, an effective term weighting scheme used in information retrieval to determine the weight of terms. A term with a high TF-IDF value is strongly associated with the document in which it appears, so if that term occurs in a query, the document is likely to be of interest to the user. Term frequency is computed as the number of occurrences of a term in a document, whereas IDF can be measured in several ways; one of the best-known derivations follows from the Robertson-Spärck Jones relevance weight. Beyond this well-known method, there are various other methods for computing IDF, each of which affects the relevance assigned to a document. In this paper, we discuss and compare different derivations of inverse document frequency for measuring the weight of terms.

Keywords: Information Retrieval, Term-Frequency, IDF, Vector space model.
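
As an illustration of the quantities named in the abstract, the following is a minimal Python sketch, not the paper's own implementation. It assumes the classic form idf = log(N / n) and the form obtained from the Robertson-Spärck Jones weight when no relevance information is available, idf = log((N - n + 0.5) / (n + 0.5)), where N is the number of documents and n is the number of documents containing the term. The function names and the four-document toy corpus are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def term_frequency(term, doc_tokens):
    """Raw TF: number of occurrences of `term` in one tokenized document."""
    return Counter(doc_tokens)[term]


def document_frequency(term, corpus):
    """Number of documents in `corpus` that contain `term` at least once."""
    return sum(1 for doc in corpus if term in doc)


def idf_classic(term, corpus):
    """Classic IDF, log(N / n). Returns 0.0 for a term absent from the corpus."""
    n_docs = len(corpus)
    df = document_frequency(term, corpus)
    return math.log(n_docs / df) if df else 0.0


def idf_rsj(term, corpus):
    """IDF derived from the Robertson-Sparck Jones weight with no relevance
    information (assumed form): log((N - n + 0.5) / (n + 0.5)).
    It is negative when the term occurs in more than half of the documents."""
    n_docs = len(corpus)
    df = document_frequency(term, corpus)
    return math.log((n_docs - df + 0.5) / (df + 0.5))


def tf_idf(term, doc_tokens, corpus, idf=idf_classic):
    """Weight of `term` in a document: raw TF times the chosen IDF variant."""
    return term_frequency(term, doc_tokens) * idf(term, corpus)


# Hypothetical toy corpus, already tokenized.
corpus = [
    "web search engine ranking".split(),
    "search engine evaluation".split(),
    "web document retrieval".split(),
    "term weighting for retrieval".split(),
]

if __name__ == "__main__":
    for term in ("search", "ranking"):
        print(term,
              tf_idf(term, corpus[0], corpus, idf_classic),
              tf_idf(term, corpus[0], corpus, idf_rsj))
```

In this toy corpus the two variants disagree on a term that appears in exactly half of the documents: the classic form still assigns it a positive weight, while the Robertson-Spärck Jones form assigns it zero. Differences of this kind are what a comparison of IDF derivations has to account for.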


ABOUT THE AUTHOR

Jitendra Nath Singh
Research Scholar in the Department of Computer Science at Babasaheb Bhimrao Ambedkar University, Lucknow - 226025 (U.P.), India. His research interests are search engines and their performance evaluation, and web technology.


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
