Abstract:
Concepts in ontologies can be used in many scenarios, including annotation of online resources, automatic ontology population, and document classification to improve web search results. Collectively, tens of millions of concepts have been defined in a large number of ontologies that cover many overlapping domains. The scale, duplication, and ambiguity make concept search a challenging problem. We present a novel concept search approach that exploits structures present in ontologies and constructs contexts to effectively filter the noise in concept search results. The three key components of our approach are (1) a context for each concept, extracted from relevant properties and axioms, (2) query interpretation based on the extracted context, and (3) result ranking using learning-to-rank algorithms. We evaluate our approach on a large dataset from BioPortal, covering 2,062,080 concepts and more than 2,000 queries, using two widely employed performance metrics: normalized discounted cumulative gain (NDCG) and mean reciprocal rank (MRR). Our approach significantly outperforms BioPortal on multi-token queries, which make up a large percentage of all queries.
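
For readers unfamiliar with the two evaluation metrics named above, the following minimal sketch shows how NDCG and MRR are commonly computed for ranked search results. It is illustrative only: the function names (`dcg`, `ndcg`, `mean_reciprocal_rank`) are hypothetical, and the linear-gain form of DCG with a log2 discount is an assumption, since the abstract does not state which NDCG variant or cutoff is used.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: graded relevance discounted by log2(rank + 1).

    Assumption: linear gain (rel), not the exponential variant (2**rel - 1).
    """
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None):
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    ranked = relevances[:k] if k else relevances
    ideal = sorted(relevances, reverse=True)[:len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_reciprocal_rank(result_lists):
    """Average of 1 / (rank of first relevant result) over all queries.

    Each element of result_lists is a list of booleans for one query,
    where True marks a relevant result. Queries with no relevant result
    contribute 0.
    """
    if not result_lists:
        return 0.0
    total = 0.0
    for flags in result_lists:
        rank = next((i for i, hit in enumerate(flags, start=1) if hit), None)
        total += 1.0 / rank if rank else 0.0
    return total / len(result_lists)


if __name__ == "__main__":
    # One query whose results carry graded relevance 3, 0, 2 (toy values).
    print(ndcg([3, 0, 2]))
    # Three queries: first hit at rank 2, rank 1, and no hit at all.
    print(mean_reciprocal_rank([[False, True], [True], [False, False]]))
```

Both metrics reward placing relevant concepts near the top of the result list: NDCG accounts for graded relevance across the whole ranking, while MRR considers only the position of the first relevant result.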