A New Weight Function for Constructing Field Association Terms using Concurrent Words
Field Association (FA) words or phrases make it possible to
identify document fields by reading only a few specific
words. Document fields can be determined efficiently if there
are many rank 1 FA words (words that connect directly to
terminal fields) and if their frequency is high. This paper
proposes a new method for increasing the number of rank 1 FA
words using declinable words and concurrent words, which relate
to narrow association categories and eliminate FA word
ambiguity. Concurrent words become Concurrent Field
Association words (CFA words) if there is little field
overlap. Efficient CFA words are usually difficult to extract
using frequency alone, so this paper proposes weighting
concurrent words according to their degree of importance.
The new weighting method significantly increases Precision and
Recall, by 30% and 40% respectively, compared with using
frequency alone. Moreover, combining CFA words with FA
words allows our new system to automatically append
around 28% of CFA words to the existing FA word
dictionary. Furthermore, Recall is improved by 21% over
that of the traditional method.
Keywords: FA Words, Declinable Words, Concurrent Words, CFA Words, Recall, Precision