Information Extraction from Arabic News


Hala Elsayed and Tarek Elghazaly

Information Extraction (IE) is the task of finding specific facts in large collections of unstructured text, such as web pages and long documents. Named Entity Recognition (NER) is a sub-problem of information extraction. Research interest in information extraction is growing, and with it interest in NER, which helps extract the desired information from massive texts; entity extraction is therefore an important task in Natural Language Processing (NLP). The Arabic language needs more research in the information extraction domain, which motivates this work. The experiment is concerned with extracting entities and the relations between entities from Arabic text, using news from an Egyptian Arabic newswire. The paper introduces a method for extracting previously unknown information, using named entities and entity relations, from an Arabic corpus generated from the Egyptian Arabic newswire. The corpus contained nearly 625,368 entries in 36,423 sentences, from which a sample of about 3,400 sentences representing crime news was selected. The results yield information that can serve as a tool for decision makers in analyzing text.

Keywords: Information Extraction (IE), Natural Language Processing (NLP), Named Entity Recognition (NER), Corpus, Gazetteers.
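
To make the idea of gazetteer-based entity extraction and entity relations concrete, the following is a minimal sketch and not the authors' method. It assumes entities are found by exact lookup of whitespace-separated tokens in small word lists, and that a "relation" is simply two entities appearing in the same sentence. The gazetteer contents, the labels, and the function names `tag_entities` and `cooccurrence_relations` are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy gazetteers; the paper's actual lists are not given in the abstract.
GAZETTEERS = {
    "PERSON": {"أحمد", "محمد"},
    "LOCATION": {"القاهرة", "الجيزة"},
    "ORGANIZATION": {"الشرطة"},
}


def tokenize(sentence):
    """Split a sentence into word tokens (no clitic or diacritic handling)."""
    return re.findall(r"\w+", sentence)


def tag_entities(sentence):
    """Return (token, label) pairs for tokens found in a gazetteer, in text order."""
    entities = []
    for token in tokenize(sentence):
        for label, names in GAZETTEERS.items():
            if token in names:
                entities.append((token, label))
    return entities


def cooccurrence_relations(sentences):
    """Count how often each pair of distinct entities shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        entities = sorted(set(tag_entities(sentence)))
        for first, second in combinations(entities, 2):
            counts[(first, second)] += 1
    return counts


if __name__ == "__main__":
    news = ["ألقت الشرطة القبض على محمد في القاهرة"]
    for sentence in news:
        print(tag_entities(sentence))
    print(cooccurrence_relations(news))
```

Exact token lookup misses Arabic words carrying attached prefixes or diacritics, so a realistic system would need normalization and morphological analysis before the gazetteer step.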


ABOUT THE AUTHORS

Hala Elsayed
Computer and Information Sciences Dept., ISSR, Cairo University, Cairo, Egypt

Tarek Elghazaly
Computer and Information Sciences Dept., ISSR, Cairo University, Cairo, Egypt


About IJCSI

IJCSI is a refereed open access international journal for scientific papers dealing in all areas of computer science research...
