
Change detection in Migrating Parallel Web Crawler: A Neural Network Based Approach


Md Faizan Farooqui, Md Rizwan Beg and Md Qasim Rafiq

Search engines are tools for navigating and searching the Web. They maintain indices of web documents and provide search facilities by continuously downloading web pages for processing; this downloading process is known as web crawling. In this paper we propose a neural network based change detection method for a migrating parallel web crawler. The method uses a neural network to detect changes in both the content and the structure of web pages, which makes the crawling system more effective and efficient. The major advantage of a migrating parallel web crawler is that the analysis portion of the crawling process is done locally, where the data resides, rather than inside the Web search engine repository. This significantly reduces network load and traffic, which in turn improves the performance, effectiveness and efficiency of the crawling process. Another advantage is parallelism: as the size of the Web grows, it becomes necessary to parallelize the crawling process in order to finish downloading web pages in a comparatively shorter time. The proposed neural network based change detection method will yield high quality pages and, by detecting changes, ensure that fresh pages are always downloaded.

Keywords: Web crawling, parallel migrating web crawler, search engine, neural network
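
The abstract does not specify the network architecture or its input features, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes two hypothetical features (a content-change ratio and a structure-change ratio computed from an old and a new copy of a page) and feeds them to a single sigmoid neuron whose weights are untrained placeholders. All function names, weights and the threshold are assumptions made for illustration. In a migrating crawler, a comparison of this kind would run where the data resides, so that only the decision needs to travel back to the search engine.

```python
import difflib
import math
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects the start-tag sequence (structure) and visible words (content)."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.words.extend(data.split())


def page_features(old_html, new_html):
    """Return (content_change, structure_change), each in the range [0, 1]."""
    old, new = PageParser(), PageParser()
    old.feed(old_html)
    new.feed(new_html)
    content_sim = difflib.SequenceMatcher(None, old.words, new.words).ratio()
    structure_sim = difflib.SequenceMatcher(None, old.tags, new.tags).ratio()
    return 1.0 - content_sim, 1.0 - structure_sim


def change_score(features, weights=(4.0, 4.0), bias=-1.0):
    """Single sigmoid neuron. The weights and bias are hypothetical
    placeholders; a real system would learn them from labelled page pairs."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def has_changed(old_html, new_html, threshold=0.5):
    """Decide whether the page should be re-downloaded (assumed threshold)."""
    return change_score(page_features(old_html, new_html)) >= threshold


if __name__ == "__main__":
    old_page = "<html><body><p>hello world</p></body></html>"
    new_page = "<html><body><div><h1>breaking</h1><p>news today</p></div></body></html>"
    print(has_changed(old_page, old_page))
    print(has_changed(old_page, new_page))
```

Keeping content and structure as separate inputs mirrors the abstract's distinction between the two kinds of change; a trained network could weight them differently depending on which kind matters more for freshness.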



ABOUT THE AUTHORS

Md Faizan Farooqui
Department of Computer Application, Integral University, Lucknow

Md Rizwan Beg
Department of Computer Science and Engineering, Integral University, Lucknow

Md Qasim Rafiq
Department of Computer Engineering, Aligarh Muslim University, Aligarh


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
