A Rules Based System for Named Entity Recognition in Modern Standard Arabic
Authors: A Elsebai
Contributors: F Meziane (F.Meziane@salford.ac.uk), Supervisor
Abstract
The amount of textual information available electronically has made it difficult for
many users to find and access the right information within an acceptable time. Research
communities in the natural language processing (NLP) field are developing tools and
techniques to alleviate these problems and help users in exploiting these vast resources.
These techniques include Information Retrieval (IR) and Information Extraction (IE). The
work described in this thesis concerns IE and, more specifically, named entity extraction in
Arabic. The Arabic language is of significant interest to the NLP community mainly due to
its political and economic importance, but also due to its interesting characteristics.
Text usually contains all kinds of names such as person names, company names,
city and country names, sports teams, chemicals and many other names from specific
domains. These names are called Named Entities (NEs). Named Entity Recognition
(NER), one of the main tasks of IE systems, seeks to locate these names automatically and
classify them into predefined categories. NER systems are developed for different
applications and can benefit other information management technologies, as they can
be built over an IR system or used as the base module of a Data Mining application.
In this thesis we propose an efficient and effective framework for extracting Arabic NEs
from text using a rule-based approach. Our approach makes use of Arabic contextual and
morphological information to extract named entities. The context is represented by means
of words that are used as clues for each named entity type. Morphological information is
used to detect the part of speech of each word given to the morphological analyzer.
Subsequently, we developed and implemented our rules in order to recognise each position
of the named entity. Finally, our system implementation, evaluation metrics and
experimental results are presented.
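
The general idea of combining clue words with part-of-speech information can be illustrated with a minimal sketch. This is not the thesis's actual rule set or implementation: the clue words, the entity labels, the toy part-of-speech lookup (standing in for a real morphological analyzer) and the function names are all assumptions made for illustration.

```python
# Hypothetical clue words (not taken from the thesis); each maps to an entity type.
CLUES = {
    "الدكتور": "PERSON",        # "Doctor"
    "مدينة": "LOCATION",        # "city (of)"
    "شركة": "ORGANIZATION",     # "company"
}

# Toy stand-in for a morphological analyzer: a fixed part-of-speech lookup.
TOY_POS = {
    "زار": "VERB",
    "الدكتور": "NOUN",
    "أحمد": "PROPER_NOUN",
    "علي": "PROPER_NOUN",
    "مدينة": "NOUN",
    "القاهرة": "PROPER_NOUN",
    "أمس": "ADVERB",
}


def pos_tag(word):
    """Return the toy part-of-speech tag for a word, or UNKNOWN."""
    return TOY_POS.get(word, "UNKNOWN")


def extract_entities(tokens):
    """Illustrative rule: a clue word followed by one or more proper nouns
    yields a named entity of the type associated with that clue word."""
    entities = []
    i = 0
    while i < len(tokens):
        entity_type = CLUES.get(tokens[i])
        if entity_type is None:
            i += 1
            continue
        # Collect the run of proper nouns that follows the clue word.
        j = i + 1
        while j < len(tokens) and pos_tag(tokens[j]) == "PROPER_NOUN":
            j += 1
        if j > i + 1:
            entities.append((" ".join(tokens[i + 1:j]), entity_type))
        i = j
    return entities


# "Doctor Ahmed Ali visited the city of Cairo yesterday."
sentence = "زار الدكتور أحمد علي مدينة القاهرة أمس"
print(extract_entities(sentence.split()))
```

In this sketch the clue word supplies the entity type and the part-of-speech tags decide where the name ends. A real system would need a full morphological analyzer and a much larger set of rules and clue words.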
Citation
Elsebai, A. A Rules Based System for Named Entity Recognition in Modern Standard Arabic. (Thesis). University of Salford
Thesis Type | Thesis |
---|---|
Deposit Date | Jul 21, 2011 |
Publicly Available Date | Jul 21, 2011 |
Award Date | Jan 1, 2009 |
Files
521502.pdf
(15.5 Mb)
PDF