Automatic ontology augmentation: evaluation issues

Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology

Abstract

New scientific knowledge in areas such as biology and biochemistry is being discovered at an unprecedented rate. In some cases, structured knowledge sources are available, in the form of semantic web markup or databases, but most scientific discoveries are still reported in the traditional way, as journal articles or conference proceedings. Managing this vast amount of information, whether structured or unstructured, requires mapping between disparate knowledge sources that use different nomenclature and relationships. Ontologies have played a critical role in addressing the challenge of semantic integration of such knowledge. Constructing ontologies is an extremely laborious effort. Not only must researchers agree on the concepts and relationships needed for a domain of knowledge, but they must also do so in a way that minimizes errors and is easy to update and maintain. There is therefore considerable interest in creating or augmenting ontologies automatically by analysing text. However, none of the research in this area has yet had a significant impact on the process of creating ontologies for scientific domains. In this short project, we intend to collaborate with a visiting researcher, Dr Inderjeet Mani, who is a leader in this field, to look specifically at the issue of evaluating automatically created ontologies. Research in language processing in general requires good evaluation techniques to be agreed on by the relevant community. Without such techniques, it is impossible to replicate results and build on previous work in a motivated fashion. Evaluation of automatically created ontologies is in its infancy, which is hampering research in the area. The most effective way to make progress is by intensive discussion between different groups, backed up with small-scale experimentation. Dr Mani's visit will allow us to improve on existing evaluation practice.

Publications
