A text mining approach for Arabic question answering systems

Sadek, J

Abstract

Because most of the electronic information available on the web today is stored as text,
developing Question Answering Systems (QAS) has been the focus of many individual
researchers and organizations. Relatively few studies, however, have addressed extracting
answers to “why” and “how to” questions. One reason for this neglect is that, when going
beyond sentence boundaries, deriving text structure is a very time-consuming and complex
process. This thesis explores a new strategy for dealing with the exponentially large space
associated with the text derivation task. To our knowledge, no system to date has
attempted to address these types of questions for the Arabic language.
We propose two analytical models. The first is the Pattern Recognizer, which
employs a set of approximately 900 linguistic patterns targeting relationships that hold within
sentences. This model is enhanced with three independent algorithms that discover the
causal/explanatory role indicated by the justification particles. The second is the Text
Parser, which approaches text from a discourse perspective within the framework of Rhetorical
Structure Theory (RST) and is meant to break away from the sentence limit. The
Text Parser is built on top of the output produced by the Pattern Recognizer and
incorporates a set of heuristic scores to produce the most suitable structure representing the
whole text.
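
To make the idea of a particle-based linguistic pattern concrete, the following is a minimal sketch and not the thesis's Pattern Recognizer. The particle list, the single regular expression, and the names `PARTICLES`, `Relation`, and `find_relation` are all assumptions introduced for illustration; the abstract does not list the actual patterns or algorithms.

```python
import re
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    particle: str   # the justification particle that triggered the match
    nucleus: str    # text before the particle (the claim or effect)
    satellite: str  # text after the particle (the cause or purpose)


# Hypothetical particle list, for illustration only:
# "because", "because of", "in order to".
PARTICLES = ["لأن", "بسبب", "لكي"]

# One toy pattern: <nucleus> <particle> <satellite>, all within one sentence.
PATTERN = re.compile(
    r"^(?P<nucleus>.+?)\s+(?P<particle>"
    + "|".join(re.escape(p) for p in PARTICLES)
    + r")\s*(?P<satellite>.+)$"
)


def find_relation(sentence: str) -> Optional[Relation]:
    """Return the first particle-marked relation in the sentence, if any."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return Relation(
        particle=match.group("particle"),
        nucleus=match.group("nucleus").strip(),
        satellite=match.group("satellite").strip(),
    )


# "The train was late because of the snow."
example = find_relation("تأخر القطار بسبب الثلج")
```

A real recognizer would need far more than surface matching, since a particle can be ambiguous between a causal and a non-causal reading; that disambiguation is presumably what the three algorithms mentioned above are for.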
The two models are combined to support the development of an Arabic
QAS that handles “why” and “how to” questions. The Pattern Recognizer model achieved an
overall recall of 81% and a precision of 78%, and the question answering
system found the correct answer for 68% of the test questions. Our results reveal
that the justification particles play a key role in indicating intrasentential relations.
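
For readers unfamiliar with the two measures reported above, here is a small sketch of how recall and precision are conventionally computed from a set of predicted relations and a gold-standard set. The function name `precision_recall` and the toy identifiers are assumptions for illustration; the data below are not the thesis's results.

```python
from typing import Hashable, Set, Tuple


def precision_recall(predicted: Set[Hashable], gold: Set[Hashable]) -> Tuple[float, float]:
    """Standard set-based precision and recall.

    precision = correct predictions / all predictions
    recall    = correct predictions / all gold-standard items
    """
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data (hypothetical relation identifiers, not from the thesis).
predicted_relations = {1, 2, 3, 4}
gold_relations = {2, 3, 4, 5, 6}
precision, recall = precision_recall(predicted_relations, gold_relations)
```

In this toy case three of the four predictions are correct and three of the five gold relations are found, so precision is 0.75 and recall is 0.6.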

Citation

Sadek, J. A text mining approach for Arabic question answering systems. (Thesis). University of Salford

Thesis Type Thesis
Deposit Date Apr 9, 2015
Publicly Available Date Apr 9, 2015
