Graph-based methods for significant concept selection
Abstract
It is well known in information retrieval area that one important issue is the gap between the query and document vocabularies. Concept-based representation of both the document and the query is one of the most effective approaches that lowers the effect of text mismatch and allows the selection of relevant documents that deal with the shared semantics hidden behind both. However, identifying the best representative concepts from texts is still challenging. In this paper, we propose a graph-based method to select the most significant concepts to be integrated into a conceptual indexing system. More specifically, we build the graph whose nodes represented concepts and weighted edges represent semantic distances. The importance of concepts are computed using centrality algorithms that levrage between structural and contextual importance. We experimentally evaluated our method of concept selection using the standard ImageClef2009 medical data set. Results showed that our approach significantly improves the retrieval effectiveness in comparison to state-of-the-art retrieval models.