Semantic Extraction from List Web Pages


Ismail Jellouli and Mohammed El Mohajir

Extracting structured information from web pages is a problem with many applications, and it has gained increased interest in recent years. We propose an approach that extracts and semantically describes the data contained in a list web page. Our approach is fully automatic and is based on a "seed" ontology that contains minimal information about the domain. It uses an instance-based classifier to characterize the attributes of the ontology. Unlike existing methods, our approach makes no assumptions about the design of web pages; it is entirely layout independent. Experimental results obtained from different web pages of different web sites from different domains show that our approach is effective.

Keywords: Web Information Extraction, list web pages, probabilistic model, ontology
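
The abstract mentions an instance-based classifier that characterizes the attributes of a seed ontology. The sketch below is only an illustration of that general idea, not the authors' method: it assumes a hypothetical seed ontology (`SEED`) holding a few example values per attribute, a hand-picked set of layout-independent string features (`features`), and a plain 1-nearest-neighbor rule (`classify`). All names, features, and example values are assumptions made for this example.

```python
import math

# Hypothetical seed ontology: each attribute comes with a few example values.
# These attributes and instances are illustrative, not taken from the paper.
SEED = {
    "price": ["$12.99", "$450.00", "$7.50"],
    "year": ["1999", "2007", "2015"],
    "title": ["The Great Gatsby", "Introduction to Algorithms", "Brave New World"],
}


def features(text):
    """Map a string to a small vector of layout-independent features."""
    n = max(len(text), 1)
    return (
        sum(c.isdigit() for c in text) / n,              # share of digits
        sum(c.isalpha() for c in text) / n,              # share of letters
        1.0 if any(c in "$€£" for c in text) else 0.0,   # currency symbol present
        min(len(text.split()), 5) / 5,                   # token count, capped at 5
    )


def classify(value, seed=SEED):
    """Return the attribute whose nearest seed instance is closest to `value`."""
    fv = features(value)
    best_attr, best_dist = None, math.inf
    for attr, examples in seed.items():
        for example in examples:
            d = math.dist(fv, features(example))  # Euclidean distance
            if d < best_dist:
                best_attr, best_dist = attr, d
    return best_attr


def annotate(record):
    """Label every field of one extracted list record with an attribute."""
    return [(classify(v), v) for v in record]


print(annotate(["Pride and Prejudice", "2003", "$19.95"]))
```

Because the features depend only on the text of each value, the same classifier can be applied to records regardless of how the page lays them out, which is the property the abstract emphasizes.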



ABOUT THE AUTHORS

Ismail Jellouli
Ismail Jellouli received a DESA degree in computer science in 2007 and is currently a PhD student in computer science. He is an IEEE student member, and his research interests include information extraction, the semantic web, and reference reconciliation.

Mohammed El Mohajir
Mohammed El Mohajir holds a European Master in Environmental System Modeling (1992) and a Doctor of Science degree (1997). He is a professor in the department of computer sciences at the Faculty of Science Dhar Mahraz. He is the vice-chair of the IEEE Morocco Section. His main research concerns conceptual modeling, the design and development of decision-support information systems, ETL processes for data warehouses and SOLAP, distributed and parallel processing systems, and the semantic web.


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
