Classifying Images Regardless of Depictive Style

Lead Research Organisation: University of Bath

Department Name: Computer Science

Abstract

Computers today can recognise objects in photographs. This ability has formed the basis of many familiar applications such as Facebook tagging, Google Image Search, Google Goggles, and automated passport checking at UK borders.

Yet a significant restriction remains: computers can only recognise objects in photographs. At least, their ability to recognise objects in
drawings and paintings - in artwork of any kind - is strictly limited. If this limitation can be overcome then many more applications will
become possible.

One is a new way to search the web for images in which a drawing (say) is dragged from the desktop into a search bar,
and paintings and photographs are given back to the user (at the moment a user gets the same sort of image back as was dragged into the
search bar).

Another is the automated production of catalogues for taxonomy - which is important to scientists faced with tens of thousands of
microscopic creatures; species catalogues are hand-drawn right now so automation would be a significant advance for them.

The output of the programme would also allow ordinary photographs to be converted to icons. This is not as dry as it sounds: it could help the visually impaired to gain access to photographic content. If photographs and drawings can be linked in the way this project has in mind, then objects in photographs could be turned into icons rendered by a set of raised pins. So there would be a symbol for car, say, not unlike that which might be drawn by a child - and in fact this is very close to the icons blind artists draw. This would allow the visually impaired to read photographs in newspapers, or in textbooks, and allow them to share the holiday snaps of family and friends.

This proposal is about building the basic technology that underpins these applications, and quite possibly others too. Key to it is lifting the barrier that computers of today face - allowing them to recognise objects no matter how they are depicted.

Planned Impact

The impact of this proposal will show in the near, medium, and longer terms.
It will also show in the academic sector, the industrial sector, and wider society, and the work is media friendly.
We plan to pursue impact in all of these.

The academic beneficiaries are detailed in another section of this form and so will not be detailed here,
except to say that medium- and longer-term impact requires further input from the academic communities.

Impact in the industrial sector has an obvious route via sophisticated search engines, as built by companies such as Google and Microsoft.
Internet search will benefit because the total of all images includes not just photographs but also artwork of all kinds.
The ability to pose queries as sketches and retrieve photographs is a current research topic; this proposal would not contribute directly,
but we do plan to build a prototype as our application (objective 4). We note that Google offer PhD studentships, a route we intend to take advantage of as this proposal progresses, with a view to constructing a more substantive web-search application, and so we regard it as a medium-term aim.

In terms of wider society, we find the possibility of allowing the visually impaired access to photographic content particularly appealing. In fact, because our philosophy is to not discriminate between depictive styles, our approach should allow the visually impaired access to a much wider variety of images than at present. Access is possible if we can summarise visual content so as to remove unwanted clutter; recognising objects and synthesising icons that can be felt allows people to read newspapers, share photos, and generally raises their standard of living.
This would be a medium- to long-term aim.

The proposal is media friendly: the ability to drag a photograph of the Queen onto a search bar and have all of her portraits returned is something that would be of general interest. It is to the benefit of the research community as a whole and to research councils in particular to receive favourable publicity.

Funded Value:

£241,836

Funded Period:

Jun 13 - Jun 16

Funder:

EPSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

EP/K015966/1

Principal Investigator:

Peter Hall

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Image & Vision Computing (100%)

Organisations

People	ORCID iD
Peter Hall (Principal Investigator)

Publications

Cai H (2015) Beyond Photo Domain Object Recognition: Benchmarks for the Cross Depiction Problem

Hall P (2015) Cross-depiction problem: Recognition and synthesis of photographs and artwork in Computational Visual Media

Westlake N (2016) Computer Vision - ECCV 2016 Workshops - Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part I

Wu Q (2014) Computer Vision - ECCV 2014 - 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part VII

Key Findings
Impact Summary
Research Databases and Models
Collaboration
Engagement Activities


Description	All current methods for object recognition, including ours, are able to recognise objects in photographs very well. However, no current method except ours is able to recognise objects in both photographs and art. Our key findings:
+ Released two cross-depiction datasets. No other such datasets currently exist for the cross-depiction problem.
+ Conducted extensive experiments to benchmark classification, domain adaptation, detection and deep learning methods on the cross-depiction task. This greatly helps the computer vision community to understand the performance of leading techniques in this new field and gives insights into potential solutions.
+ Developed a multi-labeled graph model with learned discriminative weights, which is able to model object classes over a broad range of depictive styles.
+ Adapted DPM with cross-depiction expansion to bridge the gap between photo and art domains, leading to a significant rise in performance.
+ Adapted the state-of-the-art deep learning method Fast R-CNN (a region-based convolutional neural network) to detect people in all kinds of art images.
+ Designed dual convolutional neural networks to simultaneously minimise the classification error and the domain discrepancy.
Exploitation Route	This is a growing area in Computer Vision. Potential routes include web search engines, commercial companies requiring advanced indexing, and even converting photos to icons for (e.g.) blind people. We also plan to extend the work to 3D objects and to video. The work is of broad value, with expressions of interest from the RNIB, British Library, car component manufacturers, CAD companies, and the creative sector. We plan to meet BL and RNIB to discuss joint ways forward - in fact several grants with them are under development. --- Update --- We have now submitted an EPSRC proposal in the related area of Style Transfer, with Art Historians who curate and research. We are developing a large scale proposal in Assistive Computing, with RNIB involved. We are supervising a CSC student in Style Transfer, which simultaneously advances our agenda for UK/China collaboration.
Sectors	Communities and Social Services/Policy, Creative Economy, Education, Healthcare, Culture, Heritage, Museums and Collections, Other
URL	http://opus.bath.ac.uk/41062/2/main.pdf


Description	Interested new industrial partners include the RNIB, the charity Designability, and art historians. The work has supported on-going projects with two companies: Ninja Theory and Disney. It has helped develop relations with Chinese, Canadian, and German academics. --- Update --- A major proposal is now under development pulling together two leading UK universities, and a true cross-discipline collaboration of Computer Science, Psychology, Electronic Engineering, and Education, as well as national bodies including RNIB. We will partner with UK software houses, and EU hardware manufacturers.
First Year Of Impact	2019
Sector	Creative Economy, Education, Culture, Heritage, Museums and Collections, Other
Impact Types	Cultural, Societal, Economic


Title	People-Art
Description	A collection of 4000+ images, each showing at least one human figure. The images come in a broad range of artistic styles.
Type Of Material	Database/Collection of data
Year Produced	2015
Provided To Others?	Yes
Impact	The database shows the failure of all contemporary computer vision methods to detect people in artwork. It has motivated our current research direction.


Title	Photo-Art-50
Description	A collection of images; artwork of all kinds. This augments the famous CalTech-256 dataset with artwork it was previously lacking.
Type Of Material	Database/Collection of data
Year Produced	2015
Provided To Others?	Yes
Impact	The data was used to show a failure mode for all contemporary methods in computer vision for recognition; to explain the failure empirically; and then to address it. A URL will be published shortly.


Description	Art History Through Recognition
Organisation	University of Tuzla
Department	Philosophy Faculty
Country	Bosnia and Herzegovina
Sector	Academic/University
PI Contribution	Plans to work on recognition in art historical databases. To co-author publications. We provide technical developments.
Collaborator Contribution	Plans to work on recognition in art historical databases. To co-author publications. Provides access to databases, to European groups, and to critical assessment.
Impact	Early stage of collaboration
Start Year	2018


Description	Assessing Automatic Art by Appreciation
Organisation	Carleton University
Country	Canada
Sector	Academic/University
PI Contribution	We provide datasets, experiments, IP etc
Collaborator Contribution	They provide datasets, experiments, IP etc
Impact	Just started
Start Year	2019


Description	Tactile Images for the Visually Impaired
Organisation	Royal National Institute for Blind People
Country	United Kingdom
Sector	Charity/Non Profit
PI Contribution	We provide technical development
Collaborator Contribution	RNIB provide access to user groups
Impact	Just started
Start Year	2019


Description	Tactile Images for the Visually Impaired
Organisation	University of Bath
Department	Designability (Bath Institute of Medical Engineering)
Country	United Kingdom
Sector	Academic/University
PI Contribution	We provide technical developments
Collaborator Contribution	Designability will conduct field tests
Impact	Just started
Start Year	2019


Description	Workshop organisation
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Other audiences
Results and Impact	We have organised and helped to organise the "Expressive" triad of workshops, and the VisArt workshop. Our work in the area of cross depiction and NPR is therefore helping bring together computer graphics, computer vision, and cultural history, and is having an impact with the British Library, the RNIB, as well as CAD companies and others.
Year(s) Of Engagement Activity	2014,2016
