A Hybrid Approach for Optimizing Arabic Semantic Query Expansion
Abstract
Information retrieval systems struggle to return accurate results to users because the volume of available information keeps growing rapidly. Semantic query expansion techniques address this by adding semantically related terms to the original query, which reformulates it and narrows its results. However, semantic query expansion for Arabic queries remains a challenge because rich Arabic semantic sources are lacking. Most existing solutions rely on either English sources or domain-specific Arabic ontologies. Using English sources requires a translation phase, which may cause query drift and thus yield unrelated expansion terms. In this paper, we provide an overview of query expansion approaches. We also propose a comprehensive hybrid reference framework for Arabic semantic query expansion that compensates for the lack of Arabic semantic sources by using rich English ontologies to complement the limited Arabic sources currently available (Arabic WordNet). The framework performs Arabic-English translation with a customized machine learning translation model to limit query drift. It also transforms natural language into SPARQL (an ontology query language) so that English sources (e.g., DBpedia) can be queried easily. For enhanced accuracy, it provides an optimization module in which meta-heuristics can be used to select pertinent expansion terms. This work represents a step forward in combining English sources and AI to design a practical Arabic semantic query expansion system.
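As a rough illustration of the hybrid idea described above, the following Python sketch first expands query terms from a local Arabic synonym table and then builds a SPARQL query for terms the local source does not cover. It is a minimal sketch under stated assumptions, not the framework's implementation: the synonym table `ARABIC_SYNONYMS` is a toy stand-in for Arabic WordNet, `TOY_TRANSLATIONS` is a toy stand-in for the translation model, the function names are hypothetical, and the choice of the DBpedia properties `rdfs:label` and `dbo:wikiPageWikiLink` is our own assumption about how related Arabic labels might be retrieved.

```python
from typing import Dict, List, Tuple

# Toy stand-in for Arabic WordNet synsets (illustrative data only).
ARABIC_SYNONYMS: Dict[str, List[str]] = {
    "سيارة": ["عربة", "مركبة"],
}

# Toy stand-in for the Arabic-English translation model (illustrative data only).
TOY_TRANSLATIONS: Dict[str, str] = {
    "سريعة": "Speed",
}


def expand_locally(terms: List[str]) -> Tuple[List[str], List[str]]:
    """Expand terms from the local Arabic source.

    Returns the expanded term list and the terms that had no local synonyms,
    which are candidates for expansion through English sources.
    """
    expanded: List[str] = []
    uncovered: List[str] = []
    for term in terms:
        expanded.append(term)
        synonyms = ARABIC_SYNONYMS.get(term)
        if synonyms:
            expanded.extend(synonyms)
        else:
            uncovered.append(term)
    return expanded, uncovered


def build_dbpedia_query(english_label: str, limit: int = 10) -> str:
    """Build a SPARQL query string (assumed shape) that asks for Arabic labels
    of resources linked from the resource carrying the given English label."""
    # Escape backslashes and double quotes so the label is a valid SPARQL literal.
    escaped = english_label.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
        "SELECT DISTINCT ?arLabel WHERE {\n"
        f'  ?s rdfs:label "{escaped}"@en .\n'
        "  ?s dbo:wikiPageWikiLink ?related .\n"
        "  ?related rdfs:label ?arLabel .\n"
        '  FILTER (lang(?arLabel) = "ar")\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    expanded, uncovered = expand_locally(["سيارة", "سريعة"])
    print(expanded)
    for term in uncovered:
        english = TOY_TRANSLATIONS.get(term)
        if english is not None:
            # The query is only printed here; sending it to an endpoint is out of scope.
            print(build_dbpedia_query(english, limit=5))
```

The sketch stops at query construction. Executing the query against a SPARQL endpoint and filtering the returned candidates (the role the abstract assigns to the optimization module) are deliberately left out.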