Tuesday 23rd of April 2024
 

Sequential Pattern Mining Using Formal Language Tools


Sunil Joshi, R. S. Jadon and R. C. Jain

Today almost every system and workflow is computerized, so information and data are stored on computers and huge collections of data are emerging. Retrieving hidden, previously unexamined and important information from such collections is tedious work. Data mining addresses this problem by extracting that information from vast databases in order to uncover noteworthy knowledge in the data warehouse. An important problem in data mining is the discovery of patterns in fields such as medical science, the World Wide Web and telecommunications. Sequential pattern mining is a data mining method that retrieves hidden patterns linked to time instants or other sequences. In sequential pattern mining we extract the sequential patterns whose support count is greater than or equal to a given minimum support threshold. Users are often interested only in specific, interesting patterns rather than in every possible sequential pattern. To limit the search space, users can apply heuristics, which can be represented as constraints. Many algorithms have been developed in constraint-based mining that generate patterns according to user expectations. In the present work we explore and extend regular expression constraints. A regular expression is one such constraint, and a number of sequential pattern mining algorithms use regular expressions as constraints. Some constraints are neither regular nor context-free, such as the cross-serial pattern a^n b^m c^n d^m found in Swiss German data. No equivalent deterministic finite automaton (DFA) or pushdown automaton (PDA) can be constructed for such patterns. We propose a new algorithm, PMFLT (Pattern Mining using Formal Language Tools), for sequential pattern mining that uses formal language tools as constraints. The proposed algorithm finds only the user-specified frequent sequences and does so more efficiently than other existing algorithms. Our experimental results show that the proposed algorithm is an improvement and generates the frequent sequences that the user expects.
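As an illustration of the ideas in the abstract, the following Python sketch counts the support of candidate patterns and keeps only those that satisfy a user constraint and meet a minimum support threshold. It is not the PMFLT algorithm, which the abstract does not describe in detail. The function names, the toy database, the use of single-character items, and the choice n, m >= 1 for the cross-serial pattern are all assumptions made for this example.

```python
import re

def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    # 'item in remaining' consumes the iterator up to the first match,
    # so later items must be found after earlier ones.
    return all(item in remaining for item in pattern)

def support(pattern, database):
    """Number of database sequences that contain pattern as a subsequence."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))

def is_regular(pattern):
    """A regular-expression constraint: a+ b+ c+ d+ (hypothetical example)."""
    return re.fullmatch(r"a+b+c+d+", pattern) is not None

def is_cross_serial(pattern):
    """Constraint a^n b^m c^n d^m with n, m >= 1 (assumed bounds).

    The language is neither regular nor context-free, so the block
    lengths are compared explicitly after splitting the string.
    """
    match = re.fullmatch(r"(a+)(b+)(c+)(d+)", pattern)
    if match is None:
        return False
    a, b, c, d = (len(g) for g in match.groups())
    return a == c and b == d

def constrained_frequent(candidates, database, min_support, constraint):
    """Keep candidates that satisfy the constraint and the support threshold."""
    result = {}
    for pattern in candidates:
        if not constraint(pattern):
            continue  # prune before the more expensive support count
        count = support(pattern, database)
        if count >= min_support:
            result[pattern] = count
    return result

# Toy data, invented for illustration only.
database = ["abcd", "aabccd", "abbcdd", "acbd"]
candidates = ["abcd", "abbcdd", "ab", "acbd"]

regular_hits = constrained_frequent(candidates, database, 2, is_regular)
cross_serial_hits = constrained_frequent(candidates, database, 1, is_cross_serial)
print(regular_hits)
print(cross_serial_hits)
```

Checking the constraint before counting support reflects the motivation stated above: constraints shrink the search space so that only patterns of interest to the user are evaluated against the database.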

Keywords: Sequential Pattern Mining, Regular Expressions, Context Free Grammars, Formal Language Tools, Deterministic Finite Automata, Push Down Automata, Turing Machine



ABOUT THE AUTHORS

Sunil Joshi
Assistant Professor

R. S. Jadon
Professor & Head

R. C. Jain
Director


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
