Semi-automatic dictionary curation for domain-specific ontologies

Gavankar, Chetana

Please use this identifier to cite or link to this item: http://dspace.bits-pilani.ac.in:8080/jspui/handle/123456789/8645

Full metadata record

DC Field	Value	Language
dc.contributor.author	Gavankar, Chetana	-
dc.date.accessioned	2023-01-23T06:20:32Z	-
dc.date.available	2023-01-23T06:20:32Z	-
dc.date.issued	2013	-
dc.identifier.uri	https://ieeexplore.ieee.org/document/6735323	-
dc.identifier.uri	http://dspace.bits-pilani.ac.in:8080/xmlui/handle/123456789/8645	-
dc.description.abstract	Within the broad area of information extraction, we study the problem of effective dictionary curation in an enterprise setting. Equipped with an ontology representative of the domain of an enterprise, our approach populates the attributes of leaf nodes of the ontology with instances extracted from the enterprise corpus. For an attribute of interest, given a few seed examples or indicative features for the attribute, we first obtain a ranked list of 'list pages' potentially containing additional dictionary terms. Our ranking model ranks pages from the enterprise corpus based on their 'list' content using several visual and lexical features. We gather users' judgements of the result pages, and the model continuously learns from this feedback. We compare different techniques of dictionary curation using rule-based extractors and visual features of pages. Based on a rule-writing exercise, we show the benefit of dictionaries for leaf-node attributes in writing rule-based extractors for higher-level nodes in an ontology. We have implemented a dictionary curation system based on these ideas. Experimental analysis using an academic-domain ontology and university corpora reveals, in the context of enterprise analytics, (i) the merit of dictionary support in rule-based information extraction and (ii) the viability and effectiveness of an interactive approach to dictionary creation.	en_US
dc.language.iso	en	en_US
dc.publisher	IEEE	en_US
dc.subject	Computer Science	en_US
dc.subject	Information extraction	en_US
dc.subject	Dictionary curation	en_US
dc.subject	Ontology Population	en_US
dc.title	Semi-automatic dictionary curation for domain-specific ontologies	en_US
dc.type	Article	en_US
Appears in Collections:	Department of Computer Science and Information Systems

