Multiple Pattern Matching Algorithm using Pair-count
Pattern matching occurs in various applications, ranging from simple
text searching in word processors to identification of common motifs
in DNA sequences in computational biology. The problem of exact
pattern matching has been well studied and a number of efficient
algorithms already exist. However, these exact pattern matching
algorithms are of little help when applied to finding patterns
in DNA sequences. Pattern matching in a DNA sequence, or pattern
searching in a large database, is a major research area in
computational biology. Extracting a pattern from a large sequence is
time-consuming. To reduce the search time, we propose an approach
that retrieves the matched pattern accurately from a given sequence,
whatever the size of the file.
Extracting patterns from large DNA or protein data is a
computationally intensive task, and performance plays a major role in
extracting patterns from a given DNA sequence or from a large
database, independent of the size of the sequence. More efficient
approaches to multiple pattern matching are
becoming more important for finding the functional as well as the
structural properties of proteins and genes. One of the major
problems in the genomic field is to perform pattern comparison on DNA
and protein sequences. In the current approach we explore a new
technique, called a multiple pattern matching algorithm using pair
count, which avoids unnecessary comparisons in the DNA
sequence and retrieves the pattern accurately. The proposed
technique performs very well in DNA sequence
analysis for querying publicly available genome sequence data. With
this method the number of comparisons decreases, and
the comparisons-per-character ratio of the proposed algorithm falls
accordingly, when compared with some of the existing popular
methods. The experimental results show a considerable
improvement in overall performance.
Keywords: Count, Index, Pair, Sequence