Advancing computational methods for extracting, classifying and linking information from art-historical texts.

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

Much art-historical knowledge is contained in unstructured texts such as catalogues and technical reports. Historically, this information has rarely been converted to structured forms and stored in databases. This makes it difficult to find information needed for research, which may require complex querying (e.g. paintings which use a mixture of lead white and azurite bound in egg tempera; paintings which were in Paris during Rubens' visit in 1625), or to present the information to the public through innovative and engaging interfaces such as maps and timelines. The project is embedded within the National Gallery's exciting Digital Dossiers Project, a cornerstone of its 2024 Bicentenary celebrations.

The research aims to advance computational methods for extracting and linking information from art-historical texts, focusing on a corpus of National Gallery publications. While NLP techniques in this area have evolved considerably with the advent of modern deep learning methods, they are typically either too general or too narrowly specialised, and thus not well adapted to the particular vocabularies and literary conventions of this domain. The research will advance the state of the art by developing novel methods and tools to perform entity recognition, classification and linking specifically for this domain.

The work will be incorporated into the Gallery's software tools as a practical project outcome, ensuring high impact, as these tools will be available for widespread use in a public setting. The Gallery has rich documentation about its paintings, going back well over a century and a half. The research will enable the Gallery to index new aspects of the collection effectively and to present it to the public in new and engaging ways.
