Seebibyte: Visual Search for the Era of Big Data
Lead Research Organisation:
University of Oxford
Department Name: Engineering Science
Abstract
The Programme is organised into two themes.
Research theme one will develop new computer vision algorithms to enable efficient search and description of vast image and video datasets - for example, the entire video archive of the BBC. Our vision is that anything visual should be searchable, in the manner of a Google search of the web: by specifying a query, and having results returned immediately, irrespective of the size of the data. Such enabling capabilities will have widespread application both for general image/video search - consider how Google's web search has opened up new areas - and for designing customized solutions for searching.
A second aspect of theme one is to automatically extract detailed descriptions of visual content. The aim here is to achieve human-like performance and beyond, for example in recognizing configurations of parts and spatial layout, counting and delineating objects, or recognizing human actions and interactions in videos, going significantly beyond the current limitations of computer vision systems and enabling new and far-reaching applications. The new algorithms will learn automatically, building on recent breakthroughs in large-scale discriminative and deep machine learning. They will be capable of weakly-supervised learning, for example from images and videos downloaded from the internet, and will require very little human supervision.
The second theme addresses transfer and translation. This also has two aspects. The first is to apply the new computer vision methodologies to `non-natural' sensors and devices, such as ultrasound imaging and X-ray, which have different characteristics (noise, dimension, invariances) to the standard RGB channels of data captured by `natural' cameras (iPhones, TV cameras). The second aspect is to seek impact in a variety of other disciplines and industries that today greatly under-utilise the power of the latest computer vision ideas. We will target these disciplines to enable them to leapfrog the divide between what they use (or do not use) today, which is dominated by manual review and highly interactive frame-by-frame analysis, to a new era where automated, efficient sorting, detection and mensuration of very large datasets becomes the norm. In short, our goal is to ensure that the newly developed methods are used by academic researchers in other areas, and turned into products for societal and economic benefit. To this end, open source software, datasets, and demonstrators will be disseminated on the project website.
The ubiquity of digital imaging means that every UK citizen may potentially benefit from the Programme research in different ways. One example is an enhanced iPlayer that can search for where particular characters appear in a programme, or intelligently fast-forward to the next `hugging' sequence. A second is wider deployment of lower-cost imaging solutions in healthcare delivery. A third, also motivated by healthcare, is through the employment of new machine learning methods for validating targets for drug discovery based on microscopy images.
Planned Impact
The proposed programme encompasses new methodology and applied research in computer vision that will impact not only the imaging field but also other, non-imaging disciplines, and it will encourage end-user uptake of imaging technologies and commercial interest in embedding imaging technologies in products. These groups are the main beneficiaries of the Programme's research.
We have carefully chosen members of our Programme Advisory Board (PAB) and User Group to represent a comprehensive and diverse range of academic and industry interests and expect them to challenge us to ensure that the impact of the Programme is realised. We will ensure that both the PAB and the User Group are constantly refreshed with appropriate representatives.
The Programme will have Economic and Societal impact by
1. Developing new and improved computer vision technologies for commercialisation by a wide range of companies;
2. Enhancing the Big Data capabilities and knowledge base of UK industries;
3. Enhancing quality of life by improving, for instance, healthcare capabilities, surveillance, environmental monitoring of roads, and new means of enjoying digital media in the home. Other engineering advances will aim to make a large impact "behind the scenes", for instance to underpin better understanding of biological effects at the individual cell level and characterisation of advanced materials;
4. Training the next generation of computer vision researchers, who will be equipped to support the imaging needs of science, technology and wider society for the future.
Impact on Knowledge includes
1. Realisation of new approaches to essential computer vision technology, and the dissemination of research findings through publications, conference presentations, and the distribution of open source software and image databases.
2. Sharing knowledge with collaborators via Transfer and Application Projects (TAPs) and other activities, leading to the adoption of advanced computer vision methods across many disciplines of science, engineering and medicine that currently do not use them.
3. Communication of advances to a public audience through website articles and other co-ordinated public understanding activities.
Organisations
- University of Oxford, United Kingdom (Collaboration, Lead Research Organisation)
- University of Cambridge, United Kingdom (Collaboration)
- University of Leeds, United Kingdom (Collaboration)
- Continental AG (Collaboration)
- University of Sheffield, United Kingdom (Collaboration)
- University of Manchester, Manchester, United Kingdom (Collaboration)
- King's College London, United Kingdom (Collaboration)
- Intelligent Ultrasound (Project Partner)
- Oxford University Hospitals NHS Trust, United Kingdom (Project Partner)
- The Wellcome Trust Sanger Institute (Project Partner)
- Max Planck, Germany (Project Partner)
- Skolkovo Inst of Sci and Tech (Skoltech) (Project Partner)
- British Broadcasting Corporation - BBC, United Kingdom (Project Partner)
- MirriAd (Project Partner)
- Yotta Ltd (Project Partner)
- GE Global Research, Germany (Project Partner)
- Mirada Medical UK (Project Partner)
- BP British Petroleum, United Kingdom (Project Partner)
- Qualcomm Incorporated (Project Partner)
- Microsoft Research Ltd, United Kingdom (Project Partner)
Publications

Fouhey D
(2016)
3D Shape Attributes

Wiles O
(2018)
3D Surface Reconstruction by Pointillism

Maraci MA
(2017)
A framework for analysis of linear ultrasound videos to detect fetal presentation and heartbeat.
in Medical image analysis

Nketia TA
(2017)
Analysis of live cell images: Methods, tools and opportunities.
in Methods (San Diego, Calif.)

Bridge CP
(2017)
Automated annotation and quantitative description of ultrasound videos of the fetal heart.
in Medical image analysis

Schofield D
(2019)
Chimpanzee face recognition from videos in the wild using deep learning
in Science Advances

Lu E
(2018)
Class-Agnostic Counting

Bengani H
(2017)
Clinical and molecular consequences of disease-associated de novo mutations in SATB2.
in Genetics in Medicine: official journal of the American College of Medical Genetics

Zhong Y
(2018)
Compact Deep Aggregation for Set Retrieval

Chung J
(2017)
Computer Vision - ACCV 2016

Chung J
(2017)
Computer Vision - ACCV 2016 Workshops

Arteta C
(2016)
Computer Vision - ECCV 2016

Feichtenhofer C
(2016)
Convolutional Two-Stream Network Fusion for Video Action Recognition

Arteta C
(2016)
Counting in The Wild

Liotti E
(2018)
Crystal nucleation in metallic alloys using x-ray radiography and machine learning.
in Science Advances

Afouras T
(2018)
Deep Audio-visual Speech Recognition.
in IEEE transactions on pattern analysis and machine intelligence

Afouras T
(2018)
Deep Lip Reading: A Comparison of Models and an Online Application

Feichtenhofer C
(2017)
Detect to Track and Track to Detect
Description | Our two-stream approach of basic research and dissemination seems to be working. On the first, we are publishing our research at the principal conferences and winning prizes. On the second, we engage other communities through our Show-and-Tell events and Transfer and Application Projects, and we make available the software, datasets, and publications that have emerged from our research. |
First Year Of Impact | 2016 |
Sector | Healthcare; Culture, Heritage, Museums and Collections; Retail; Transport |
Impact Types | Cultural, Economic |
Description | Royal Society Privacy Enhancing Technologies Working Group |
Geographic Reach | National |
Policy Influence Type | Participation in an advisory committee |
Description | AWS Machine Learning Research Awards Program |
Amount | $225,000 (USD) |
Organisation | Amazon.com |
Sector | Private |
Country | United States |
Start | 02/2018 |
End | 01/2020 |
Description | Big Data Science in Medicine and Healthcare |
Amount | £55,000 (GBP) |
Organisation | University of Oxford |
Department | Oxford Martin School |
Sector | Academic/University |
Country | United Kingdom |
Start | 04/2017 |
End | 03/2020 |
Description | CALOPUS - Computer Assisted LOw-cost Point-of-care UltraSound |
Amount | £1,013,662 (GBP) |
Funding ID | EP/R013853/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Academic/University |
Country | United Kingdom |
Start | 02/2018 |
End | 01/2021 |
Description | ERC Advanced Grant |
Amount | € 2,500,000 (EUR) |
Organisation | European Research Council (ERC) |
Sector | Public |
Country | European Union (EU) |
Start | 11/2016 |
End | 10/2021 |
Description | ERC Starting Grant |
Amount | € 1,500,000 (EUR) |
Organisation | European Research Council (ERC) |
Sector | Public |
Country | European Union (EU) |
Start | 08/2015 |
End | 09/2020 |
Description | End to End Translation of British Sign Language |
Amount | £971,921 (GBP) |
Funding ID | EP/R03298X/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Academic/University |
Country | United Kingdom |
Start | 07/2018 |
End | 06/2021 |
Description | Fellowship for Oxford Student Arsha Nagrani |
Amount | $60,000 (USD) |
Organisation | |
Sector | Private |
Country | United States |
Start | 10/2018 |
End | 09/2020 |
Description | GCRF: Growing Research Capability Call |
Amount | £8,000,000 (GBP) |
Funding ID | MR/P027938/1 |
Organisation | Medical Research Council (MRC) |
Sector | Academic/University |
Country | United Kingdom |
Start | 10/2017 |
End | 09/2021 |
Description | IARPA BAA-16-13 |
Amount | $1,196,818 (USD) |
Organisation | Intelligence Advanced Research Projects Activity |
Sector | Public |
Country | Unknown |
Start | 09/2017 |
End | 09/2021 |
Description | Research Collaboration relating to DNN-based Face Recognition for Surveillance |
Amount | £200,000 (GBP) |
Organisation | Toshiba |
Sector | Private |
Country | Japan |
Start | 10/2017 |
End | 09/2019 |
Description | Scholarship for Andrea Vedaldi's Students |
Amount | £1,000,000 (GBP) |
Organisation | |
Sector | Private |
Country | United States |
Start | 10/2018 |
End | 09/2025 |
Description | Studentships for VGG Oxford |
Amount | £320,000 (GBP) |
Organisation | |
Sector | Private |
Country | United States |
Start | 10/2018 |
End | 09/2022 |
Description | Visual Recognition |
Amount | £308,823 (GBP) |
Organisation | Continental AG |
Sector | Private |
Country | Germany |
Start | 11/2016 |
End | 04/2019 |
Title | 3D Shape Attributes and the CMU-Oxford Sculpture Dataset |
Description | The CMU-Oxford Sculpture dataset contains 143K images depicting 2197 works of art by 242 artists. Each image comes with labels for each of the 12 3D shape attributes defined in our CVPR paper. We additionally provide sample MATLAB code that illustrates reading the data and evaluating a method. |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | "We have shown that 3D attributes can be inferred directly from images at quite high quality. These attributes open a number of possibilities of applications and extensions. One immediate application is to use this system to complement metric reconstruction: shape attributes can serve as a top-down cue for driving reconstruction that works even on unknown objects. Another area of investigation is explic¬itly formulating our problem in terms of relative attributes: many of our attributes (e.g., planarity) are better modeled in relative terms. Finally, we plan to investigate which cues (e.g., texture, edges) are being used to infer these attributes." A publication titled "3D Shape Attributes" authored by D.F. Fouhey, A.Gupta and A.Zisserman resulted from this research and was presented at IEEE CVPR 2016. |
URL | http://www.robots.ox.ac.uk/~vgg/data/sculptures/ |
Title | BBC-Oxford Lip Reading Dataset |
Description | The dataset consists of up to 1000 utterances of 500 different words, spoken by hundreds of different speakers. All videos are 29 frames (1.16 seconds) in length, and the word occurs in the middle of the video. |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | Publications have resulted from this research and an award has been won: [1] J. S. Chung, A. Zisserman, "Lip Reading in the Wild", Asian Conference on Computer Vision, 2016 (Best Student Paper Award); [2] J. S. Chung, A. Zisserman, "Out of time: automated lip sync in the wild", Workshop on Multi-view Lip-reading, ACCV, 2016. |
URL | http://www.robots.ox.ac.uk/~vgg/data/lip_reading/ |
Title | Celebrity in Places Dataset |
Description | The dataset contains over 38k images of celebrities in different types of scenes. There are 4611 celebrities and 16 places involved. The images were obtained using Google Image Search and verified by human annotation. |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | A publication based on this dataset has resulted from this research: Y. Zhong, R. Arandjelovic, A. Zisserman, "Faces in Places: Compound Query Retrieval", British Machine Vision Conference, 2016. |
URL | http://www.robots.ox.ac.uk/~vgg/data/celebrity_in_places/ |
Title | LAOFIW Dataset: Labeled Ancestral Origin Faces in the Wild |
Description | LAOFIW is a dataset of 14,000 images divided into four equally sized classes: sub-Saharan Africa, East Asia, Indian subcontinent, Western Europe. |
Type Of Material | Database/Collection of data |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | A publication titled "Turning a Blind Eye: Explicit Removal of Biases and Variation from Deep Neural Network Embeddings " authored by M. Alvi, A. Zisserman, C. Nellaker resulted from this research and was presented at the Workshop on Bias Estimation in Face Analytics, ECCV 2018. |
URL | http://www.robots.ox.ac.uk/~vgg/data/laofiw/ |
Title | Lip Reading Sentences 3 (LRS3) Dataset |
Description | The dataset consists of thousands of spoken sentences from TED and TEDx videos. There is no overlap between the videos used to create the test set and those used for the pre-train and trainval sets. |
Type Of Material | Database/Collection of data |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | A publication has resulted from this research: T. Afouras, J. S. Chung, A. Zisserman, "LRS3-TED: a large-scale dataset for visual speech recognition". |
URL | http://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs3.html |
Title | The Sherlock TV Series Dataset |
Description | We provide data for all three episodes of Season 1 of the BBC TV series "Sherlock". Each episode is almost an hour long. The DVDs can be purchased online, for example from Amazon. Face detections, tracks, shots and ground-truth annotations of character identity are provided in CSV format, together with a README and a few example synchronisation frames for each episode. |
Type Of Material | Database/Collection of data |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | In using images of actors to recognize characters, we make three contributions: (1) we demonstrate that an automated semi-supervised learning approach is able to adapt from the actor's face to the character's face, including the face context of the hair; (2) by building voice models for every character, we provide a bridge between frontal faces (for which there is plenty of actor-level supervision) and profile faces (for which there is very little or none), using a CNN model pretrained on the VoxCeleb dataset; (3) by combining face context and speaker identification, we are able to identify characters with partially occluded faces and extreme facial poses. A paper titled "From Benedict Cumberbatch to Sherlock Holmes: Character Identification in TV series without a Script", authored by Arsha Nagrani and Andrew Zisserman, resulted from this research and was presented at the British Machine Vision Conference, 2017. |
URL | http://www.robots.ox.ac.uk/~vgg/data/Sherlock/ |
Title | Text Localisation Dataset |
Description | This is a synthetically generated dataset in which word instances are placed in natural scene images while taking into account the scene layout. The dataset consists of 800 thousand images with approximately 8 million synthetic word instances. Each text instance is annotated with its text string and with word-level and character-level bounding boxes. |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | A publication has resulted from this research: A. Gupta, A. Vedaldi, A. Zisserman Synthetic Data for Text Localisation in Natural Images IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 |
URL | http://www.robots.ox.ac.uk/~vgg/data/scenetext |
Title | The 'Celebrity Together' Dataset |
Description | The 'Celebrity Together' dataset has 194k images containing 546k faces in total, covering 2622 labeled celebrities (the same identities as the VGGFace Dataset). 59% of the faces correspond to these 2622 celebrities; the remaining faces are treated as 'unknown' people. The images in this dataset were obtained using Google Image Search and verified by human annotation. Further details of the dataset collection procedure are explained in the paper. |
Type Of Material | Database/Collection of data |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | This dataset contains images that have multiple labeled celebrities per image (see example images above). It therefore can be used as an evaluation benchmark for retrieving a set of identities. Namely, given a query of a set of identities (and one or several face images are provided for each identity), the system should return a ranked list of the dataset images, such that images containing all the query identities are ranked first, followed by images containing all but one, etc. A publication titled "Compact Deep Aggregation for Set Retrieval" authored by Y.Zhong, R. Arandjelovic and A. Zisserman resulted from this research and won the Best Paper Award at ECCV 2018. |
URL | http://www.robots.ox.ac.uk/~vgg/data/celebrity_together/ |
Title | The Oxford-BBC Lip Reading Sentences 2 (LRS2) Dataset |
Description | The dataset consists of thousands of spoken sentences from BBC television. Each sentence is up to 100 characters in length. The training, validation and test sets are divided according to broadcast date. |
Type Of Material | Database/Collection of data |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | A publication has resulted from research on this dataset: T. Afouras, J. S. Chung, A. Senior, O. Vinyals, A. Zisserman, "Deep Audio-Visual Speech Recognition". |
URL | http://www.robots.ox.ac.uk/~vgg/data/lip_reading/lrs2.html |
Title | VoxCeleb 2: A large scale audio-visual dataset of human speech |
Description | VoxCeleb2 is an audio-visual dataset consisting of short clips of human speech, extracted from interview videos uploaded to YouTube: over 7,000 speakers, over 1 million utterances, and over 2,000 hours of recording. VoxCeleb contains speech from speakers spanning a wide range of ethnicities, accents, professions and ages. All speaking face-tracks are captured "in the wild", with background chatter, laughter, overlapping speech, pose variation and different lighting conditions. Each segment is at least 3 seconds long. |
Type Of Material | Database/Collection of data |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | A publication titled "VoxCeleb2: Deep Speaker Recognition " authored by J.S.Chung, A.Nagrani and A.Zisserman resulted from this research and was presented at Interspeech 2018. |
URL | http://www.robots.ox.ac.uk/~vgg/data/voxceleb/ |
Title | VoxCeleb: a large-scale speaker identification dataset |
Description | VoxCeleb contains over 100,000 utterances for 1,251 celebrities, extracted from videos uploaded to YouTube. The dataset is gender balanced, with 55% of the speakers male. The speakers span a wide range of different ethnicities, accents, professions and ages. There are no overlapping identities between development and test sets. |
Type Of Material | Database/Collection of data |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | "We provide a fully automated and scalable pipeline for audio data collection and use it to create a large-scale speaker identification dataset called VoxCeleb, with 1,251 speakers and over 100,000 utterances. In order to establish benchmark performance, we develop a novel CNN architecture with the ability to deal with variable length audio inputs, which out¬performs traditional state-of-the-art methods for both speaker identification and verification on this dataset." A publication titled "VoxCeleb: a large-scale speaker identification dataset " authored by Arsha Nagrani, Joon Son Chung and Andrew Zisserman resulted from this research and was presented at Interspeech 2017. |
URL | http://www.robots.ox.ac.uk/~vgg/data/voxceleb/ |
Description | 2017TAP1 - Dante Editions |
Organisation | University of Manchester |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | This project will use the Seebibyte image software to undertake a preliminary investigation of the design features of early printed editions of Dante's Divine Comedy, published between 1472 and 1491, held in and digitized by the John Rylands Library, University of Manchester. By focusing on a single iconic literary text in the first twenty years of its print publication, Manchester can investigate the evolution of the page design, from the first editions, which contain the text of the poem only, to later ones of increasing visual and navigational sophistication, as elements such as titles, author biographies, commentaries, rubrics, summaries, page numbers, illustrations, and devotional material are introduced into the object. The use of computer vision techniques will allow Manchester to approach these books and the study of Dante in an entirely new way and will add greatly to our knowledge of early modern book technologies and information design. |
Collaborator Contribution | Manchester will supply data for analysis. |
Impact | This collaboration is a cross disciplinary work between Visual Geometry and Humanities. |
Start Year | 2017 |
Description | 2017TAP2 - Visual Design |
Organisation | University of Leeds |
Department | School of Languages, Cultures and Societies |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | This project looks at how graphic resources are used in the wild - in specific text genres and locales (languages / cultures / regions). Rather than doing so on the basis of hand-picked examples, intended to illustrate a particular phenomenon, it allows us to ask whether a particular feature or combination of features is found in a particular document. More significantly, it allows us to ask whether the frequency of features varies across corpora of documents - i.e. whether a given feature is more or less common in a given genre or locale. |
Collaborator Contribution | Leeds will provide data for the project. |
Impact | This project is a cross disciplinary collaboration between computer vision and Arts and Humanities. |
Start Year | 2017 |
Description | 2017TAP3 - DigiPal (Text) |
Organisation | King's College London |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The project has two main objectives: to develop a tool to automatically count lines on a medieval manuscript page and to test the potential for image segmentation of phrases (and possibly even letter-forms) on a corpus of medieval Scottish charters written in Latin. |
Collaborator Contribution | KCL will supply data to be analysed. |
Impact | This project is a cross-disciplinary collaboration between computer vision and humanities. |
Start Year | 2017 |
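As a sketch of how the line-counting objective above might be approached, the following illustrates a classical horizontal projection profile method in Python. The binarisation threshold, smoothing width and function name are simplifying assumptions for illustration, not the project's actual tool.

    import numpy as np
    from PIL import Image

    def count_text_lines(path: str, smooth: int = 9) -> int:
        """Estimate the number of text lines on a manuscript page image
        by counting peaks in the horizontal projection profile of ink."""
        grey = np.asarray(Image.open(path).convert("L"), dtype=float)
        ink = grey < grey.mean()                  # crude global binarisation
        profile = ink.sum(axis=1).astype(float)   # ink pixels per row
        kernel = np.ones(smooth) / smooth         # smooth so broken strokes
        profile = np.convolve(profile, kernel, mode="same")  # do not split a line
        above = profile > profile.mean()
        # Each rising edge of the thresholded profile marks one text line.
        return int(above[0]) + int(np.sum(above[1:] & ~above[:-1]))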
Description | 2017TAP4 - DigiPal (Tiling) |
Organisation | King's College London |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | This project will develop a tool to analyse thousands of images of medieval manuscript and sort them according to agreed criteria (e.g. 'does the image contain an illustration?'). The objective is to eliminate material that is not relevant to researchers and to automatically detect the regions of images which are of interest. |
Collaborator Contribution | KCL will provide images to be analysed. |
Impact | This project is a cross-disciplinary collaboration between computer vision and humanities. |
Start Year | 2017 |
Description | 2017TAP5 - 19C Books (Matcher) |
Organisation | University of Sheffield |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Rather than matching a number of illustrations against one specific illustration, the hope is that machine learning can find clusters of matches without the software being given the visual attributes of one illustration, instead attributing different visual attributes to different clusters of illustrations. This will allow researchers to get to know their data better and has the potential to surface unexpected clusters of matches that initiate further research. Researchers with substantial datasets may not always have particular illustrations in mind that they wish to find matches for. Using machine learning in this way will allow researchers to ask more general questions about their data and will provide further lines of enquiry. |
Collaborator Contribution | Sheffield will provide the dataset for the project. |
Impact | This project is a cross-disciplinary collaboration between computer vision and humanities. |
Start Year | 2017 |
Description | 2017TAP6 - 19C Books (Classifier) |
Organisation | University of Sheffield |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | It is the main objective of this project to use machine learning in order to be able to identify the main print processes that were used to produce illustrations in the eighteenth and nineteenth centuries. Rather than focussing upon the iconographic details of the illustration, the aim is to understand the style of the illustration and whether machine learning techniques are a viable way in which to classify style and method as opposed to visual content. |
Collaborator Contribution | Sheffield will provide the dataset. |
Impact | This project is a cross-disciplinary collaboration between computer vision and humanities. |
Start Year | 2017 |
Description | 2017TAP7 - Cylinder Seals |
Organisation | University of Oxford |
Department | Faculty of Oriental Studies |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | This project will seek to answer the question: why has it proven almost impossible to find any matches between physical seals preserved in collections and seal impressions left on tablets or other clay objects? A number of hypotheses readily present themselves. Were seals continuously re-carved so that the number of possible matches is almost nil? Were those seals used to seal documents and objects deposited differently from those worn as amulets and jewellery? Or have more matches not been found simply because the data has been published in a way that does not facilitate answering this question? None of these questions can be answered without fundamentally changing the way seals and seal impressions are ordered, published, and studied. And none of them can be answered through studies of single seals or small collections, they can only be addressed through a large-scale project relying on innovative, data-driven, and, for the most part, computational analysis. |
Collaborator Contribution | The Faculty of Oriental Studies will provide the dataset. |
Impact | This project is a cross-disciplinary collaboration between computer vision and humanities. |
Start Year | 2017 |
Description | 2017TAP8 - Fleuron (Matcher) |
Organisation | University of Cambridge |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | 'Fleuron' was created by automatically extracting images of printers' ornaments and small illustrations from Eighteenth-Century Collections Online (ECCO), a database of 36 million images of pages from eighteenth-century books. Approximately 1.6 million images were extracted, consisting chiefly of printers' ornaments, arrangements of ornamental type, small illustrations, and diagrams. Some extraneous material, such as library stamps and chunks of text, was extracted, but most of this was filtered out at an early stage. The extracted images have all of the metadata associated with the original images supplied by ECCO, i.e. the author and date of the book, the place of publication, the printer(s) and/or publisher(s), and the genre and language of the book. Image matching will also help us to remove any remaining extraneous material in the database (i.e. images falsely identified as non-textual material). |
Collaborator Contribution | Cambridge will provide the dataset. |
Impact | This project is a cross-disciplinary collaboration between computer vision and humanities. |
Start Year | 2017 |
Description | 2017TAP9 - Fleuron (Classifier) |
Organisation | University of Cambridge |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | 'Fleuron' was created by automatically extracting images of printers' ornaments and small illustrations from Eighteenth-Century Collections Online (ECCO), a database of 36 million images of pages from eighteenth-century books. Approximately 1.6 million images were extracted, consisting chiefly of printers' ornaments, arrangements of ornamental type, small illustrations, and diagrams. Some extraneous material, such as library stamps and chunks of text, was extracted, but most of this was filtered out at an early stage. Currently, the keyword searches available to users of 'Fleuron' do not allow the subject matter of the images to be discovered. The keyword searches are useful for the study of ornaments owned by particular printers or used in works by particular authors, but they do not significantly advance the study of the ornaments for their own sake (other than by speeding up the process of browsing). Classification would allow users to find particular types of images within the database, and to investigate the history of certain images and themes. |
Collaborator Contribution | Cambridge will provide the dataset. |
Impact | This project is a cross-disciplinary collaboration between computer vision and humanities. |
Start Year | 2017 |
Description | Graphene Defect Detection |
Organisation | University of Oxford |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We provide the software algorithms. |
Collaborator Contribution | The partner provides the dataset and interpretation of the computer analysis. |
Impact | A project paper is in progress. |
Start Year | 2016 |
Description | Metal Crystal Counting |
Organisation | University of Oxford |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We provide the software algorithms. |
Collaborator Contribution | The partner provides the dataset and interpretation of the data. |
Impact | Software has been given to the collaborator. A project paper is in progress. |
Start Year | 2016 |
Description | Micrograph Defect Detection |
Organisation | University of Oxford |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We provide the software algorithms. |
Collaborator Contribution | The partner provides the dataset and interpretation of the computer analysis. |
Impact | Software has been given to the collaborator. |
Start Year | 2016 |
Description | Penguin Counting |
Organisation | University of Oxford |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We provide the software algorithms. |
Collaborator Contribution | The collaborator provides the dataset and specialised analysis methods. |
Impact | Paper: "Counting in the Wild", by Carlos Arteta, Victor Lempitsky and Andrew Zisserman. This collaboration is between the Information Engineering and Zoology disciplines. |
Start Year | 2016 |
Description | Video Recognition from the Dashboard |
Organisation | Continental AG |
Country | Germany |
Sector | Private |
PI Contribution | Working with research engineers to develop recognition in road scenes and human gestures. |
Collaborator Contribution | Supplying data. |
Impact | N/A |
Start Year | 2016 |
Title | Class-Agnostic Counting |
Description | Our General Matching Network (GMN), pretrained on video data, can count objects of an arbitrary class, e.g. windows or columns, specified by an exemplar patch, without additional training. Output heat maps indicate the localizations of the counted objects, including on images unseen during training. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | A publication has resulted from this research: Erika Lu, Weidi Xie and Andrew Zisserman, "Class-Agnostic Counting", Asian Conference on Computer Vision (ACCV), 2018. |
URL | http://www.robots.ox.ac.uk/~vgg/publications/ |
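A minimal sketch of the matching idea behind exemplar-based counting, assuming single-channel feature maps have already been extracted. The GMN itself learns this matching end-to-end; plain normalised cross-correlation and a fixed peak threshold are simplifying stand-ins here.

    import numpy as np
    from scipy.signal import correlate2d
    from scipy.ndimage import label

    def count_from_exemplar(image_feat, exemplar_feat, threshold=0.5):
        """Correlate an exemplar patch against an image feature map and
        count distinct peaks in the resulting heat map."""
        ex = (exemplar_feat - exemplar_feat.mean()) / (exemplar_feat.std() + 1e-8)
        im = (image_feat - image_feat.mean()) / (image_feat.std() + 1e-8)
        heat = correlate2d(im, ex, mode="same") / ex.size
        _, n_objects = label(heat > threshold)   # connected peak regions
        return n_objects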
Title | Convnet Human Action Recognition |
Description | This is a model to recognize human actions in video. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | A publication has resulted from this research: Convolutional Two-Stream Network Fusion for Video Action Recognition. C. Feichtenhofer, A. Pinz, A. Zisserman, CVPR, 2016. |
URL | http://www.robots.ox.ac.uk/~vgg/software/two_stream_action/ |
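The released model fuses the spatial (RGB) and temporal (optical-flow) streams inside the network; as a minimal illustration of the two-stream idea it builds on, this sketch performs the simpler late fusion of the two streams' class scores. The 1.5 flow weighting is a convention from follow-up two-stream work, assumed here for illustration.

    import numpy as np

    def late_fusion(spatial_scores, temporal_scores, w_flow=1.5):
        """Combine per-class scores from the RGB and optical-flow streams
        and return the index of the predicted action class."""
        fused = np.asarray(spatial_scores) + w_flow * np.asarray(temporal_scores)
        return int(np.argmax(fused))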
Title | Convnet Keypoint Detection |
Description | This is a convolutional-neural-network model that automatically detects keypoints (such as the head, elbow and ankle) in a photograph of a human body. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | A paper has resulted from this research: V. Belagiannis, A. Zisserman Recurrent Human Pose Estimation arXiv:1605.02914 |
URL | http://www.robots.ox.ac.uk/~vgg/software/keypoint_detection/ |
Title | Convnet text spotting |
Description | This is a convolutional-neural-network model that automatically detects English text in natural images. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | A publication has resulted from this research: A. Gupta, A. Vedaldi, A. Zisserman Synthetic Data for Text Localisation in Natural Images IEEE Conference on Computer Vision and Pattern Recognition, 2016 |
URL | http://www.robots.ox.ac.uk/~vgg/software/textspot/ |
Title | Lip Synchronisation |
Description | This is an audio-to-video synchronisation network that can be used for audio-visual synchronisation tasks, including: (1) removing temporal lags between the audio and visual streams in a video, and (2) determining who is speaking amongst multiple faces in a video. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | A publication has resulted from this research: J. S. Chung, A. Zisserman Out of time: automated lip sync in the wild Workshop on Multi-view Lip-reading, ACCV, 2016 |
URL | http://www.robots.ox.ac.uk/~vgg/software/lipsync/ |
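A hedged sketch of the sliding-window use of such a network for task (1): given per-frame audio and video embeddings from a pretrained two-stream model (not provided here), choose the temporal offset that maximises their average cosine similarity. The array shapes, function name and shift range are illustrative assumptions.

    import numpy as np

    def best_av_offset(video_emb, audio_emb, max_shift=15):
        """video_emb, audio_emb: (T, D) arrays of L2-normalised embeddings.
        Returns the audio shift (in frames) with the highest mean similarity."""
        T = min(len(video_emb), len(audio_emb))
        scores = {}
        for s in range(-max_shift, max_shift + 1):
            v = video_emb[max(0, -s): T - max(0, s)]
            a = audio_emb[max(0, s): T - max(0, -s)]
            n = min(len(v), len(a))
            # Mean cosine similarity between aligned frame pairs.
            scores[s] = float((v[:n] * a[:n]).sum(axis=1).mean())
        return max(scores, key=scores.get)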
Title | MatConvNet |
Description | MatConvNet is a MATLAB toolbox implementing Convolutional Neural Networks (CNNs) for computer vision applications. It is simple, efficient, and can run and learn state-of-the-art CNNs. Many pre-trained CNNs for image classification, segmentation, face recognition, and text detection are available. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | The MatConvNet toolbox is widely employed in research conducted by the Visual Geometry Group at the University of Oxford, including text spotting, penguin counting and human action recognition. Andrea Vedaldi has taught this software at the following summer schools: Medical Imaging Summer School (MISS), Favignana (Sicily), 2016 ('(Somewhat) Advanced Convolutional Neural Networks'; 'Understanding CNNs using visualisation and transformation analysis'); and the iV&L Net Training School 2016, Malta. |
URL | http://www.vlfeat.org/matconvnet/ |
Title | Seebibyte Visual Tracker (SVT) |
Description | SVT is a tool to track multiple objects in a video. |
Type Of Technology | Webtool/Application |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | This software does not require any training or fine-tuning; all the components needed to track objects in a video are included. |
URL | http://seebibyte.org/ |
Title | Self-supervised Learning from Watching Faces |
Description | FAb-Net is a self-supervised framework that learns a face embedding encoding facial attributes such as head pose, expression and facial landmarks. It is trained in a self-supervised manner by leveraging video data: given two frames from the same face track, FAb-Net learns to generate the target frame from a source frame. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | A publication resulted from this research: O. Wiles*, A. S. Koepke*, A. Zisserman, "Self-supervised learning of a facial attribute embedding from video", BMVC, 2018 (oral presentation). |
URL | http://www.robots.ox.ac.uk/~vgg/publications/ |
Title | VGG Face Finder (VFF) |
Description | VFF is a web application that serves as a web engine to perform searches for faces over a user-defined image dataset. It is based on the original application created by VGG to perform visual searches over a large dataset of images from BBC News. |
Type Of Technology | Webtool/Application |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | Features: performs queries by entering a text or an image; automatically downloads training images from Google; performs automatic training, classification and ranking of results; automatically caches query results; provides a user management interface; allows further query refinement; enables users to create curated queries using their own training images; enables users to create queries based on the metadata of their images; is capable of data ingestion, i.e. users can search their own dataset and define their own metadata; can be executed with GPU support. October 2018: the new VFF v1.1 uses a more accurate CNN for face feature extraction. |
URL | http://seebibyte.org/ |
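A minimal sketch of the ranking step such a face-search engine performs, assuming face embeddings have already been extracted by a CNN. The shapes and function name are illustrative assumptions, not VFF's actual code.

    import numpy as np

    def rank_by_face(query_emb, gallery_embs):
        """query_emb: (D,) L2-normalised embedding of the query face.
        gallery_embs: (N, D) L2-normalised embeddings, one per gallery face.
        Returns gallery indices ordered best match first."""
        sims = gallery_embs @ query_emb    # cosine similarity per gallery face
        return np.argsort(-sims)           # descending similarity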
Title | VGG Image Annotator |
Description | VGG Image Annotator is a standalone application, with which you can define regions in an image and create a textual description of those regions. Such image regions and descriptions are useful for supervised training of learning algorithms. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | The VIA tool has been employed to annotate a large volume of scanned images of 15th-century books in the Faculty of Medieval and Modern Languages at the University of Oxford, for the 15th Century Booktrade project (http://15cbooktrade.ox.ac.uk/). |
URL | http://www.robots.ox.ac.uk/~vgg/software/via/ |
Title | VGG Image Annotator (VIA) |
Description | VGG Image Annotator (VIA) is an image annotation tool that can be used to define regions in an image and create textual descriptions of those regions. |
Type Of Technology | Webtool/Application |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | Some salient features of VIA: based solely on HTML, CSS and JavaScript (no external JavaScript libraries); can be used offline (the full application is a single HTML file under 400 KB); requires nothing more than a modern web browser (tested on Firefox, Chrome and Safari); supported region shapes: rectangle, circle, ellipse, polygon, point and polyline; import/export of region data in CSV and JSON file formats; bulk update of annotations in image grid view; quick update of annotations using the on-image annotation editor; keyboard shortcuts to speed up annotation. |
URL | http://seebibyte.org/ |
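As a sketch of consuming VIA's JSON export in a downstream pipeline, the following reads rectangular regions from a project file. The exact schema differs between VIA versions; a VIA2-style export whose entries hold a "regions" list is assumed here.

    import json

    def load_via_boxes(json_path):
        """Yield (filename, [(x, y, w, h), ...]) pairs from a VIA export."""
        with open(json_path) as f:
            project = json.load(f)
        for entry in project.values():
            boxes = []
            for region in entry.get("regions", []):
                shape = region["shape_attributes"]
                if shape.get("name") == "rect":   # skip non-rectangular shapes
                    boxes.append((shape["x"], shape["y"],
                                  shape["width"], shape["height"]))
            yield entry["filename"], boxes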
Title | VGG Image Classification (VIC) Engine |
Description | VIC is a web application that serves as a web engine to perform image classification queries over a user-defined image dataset. It is based on the original application created by VGG to perform visual searches over a large dataset of images from BBC News. |
Type Of Technology | Webtool/Application |
Year Produced | 2017 |
Open Source License? | Yes |
Impact | This software performs the following functions: performs queries by entering a text or an image; automatically downloads training images from Google; performs automatic training, classification and ranking of results; automatically caches query results; provides a user management interface; allows further query refinement; enables users to create curated queries using their own training images; is capable of data ingestion, i.e. users can search their own dataset and define their own metadata. |
URL | http://www.robots.ox.ac.uk/~vgg/software/vic/ |
Title | VGG Image Search Engine (VISE) |
Description | VISE is a tool that can be used to search a large dataset for images that match any part of a given image. |
Type Of Technology | Webtool/Application |
Year Produced | 2017 |
Open Source License? | Yes |
Impact | This standalone application can be used to make a large collection of images searchable by using image regions as a query. |
URL | http://www.robots.ox.ac.uk/~vgg/software/vise/ |
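VISE itself indexes local descriptors for scalable region-based search; as a toy stand-in for the matching step, this sketch compares a query region against a single candidate image using OpenCV ORB features and Lowe's ratio test. OpenCV is an assumption here for illustration, not part of VISE.

    import cv2

    def region_matches(query_region, candidate, ratio=0.75):
        """Count ratio-test feature matches between a query image region
        and a candidate image (both greyscale numpy arrays)."""
        orb = cv2.ORB_create(nfeatures=2000)
        kq, dq = orb.detectAndCompute(query_region, None)
        kc, dc = orb.detectAndCompute(candidate, None)
        if dq is None or dc is None:
            return 0
        pairs = cv2.BFMatcher(cv2.NORM_HAMMING).knnMatch(dq, dc, k=2)
        return sum(1 for p in pairs
                   if len(p) == 2 and p[0].distance < ratio * p[1].distance)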
Title | You Said That |
Description | The software provides a method for generating a video of a talking face using deep learning. The method takes still images of the target face and an audio speech segment as inputs, and generates a video of the target face lip synched with the audio. The method runs in real time and is applicable to faces and audio not seen at training time. |
Type Of Technology | Software |
Year Produced | 2017 |
Open Source License? | Yes |
Impact | A publication has resulted from this research: J. S. Chung, A. Jamaludin, A. Zisserman You said that? British Machine Vision Conference, 2017 |
URL | http://www.robots.ox.ac.uk/~vgg/publications/ |
Description | AVinDH workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | DH is the largest Digital Humanities conference, and attracts a largely academic audience, at all levels. It's diverse, and gives a good sense of what people are up to in all fields of the humanities that involve computers. |
Year(s) Of Engagement Activity | 2017 |
URL | https://avindhsig.wordpress.com/workshop-2017-montreal/ |
Description | Bodleian Conservators |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Other audiences |
Results and Impact | Bodleian conservators work on books, prints, photographs, papyri and other media. They often take digital pictures for the purposes of recording condition or analysis. |
Year(s) Of Engagement Activity | 2017 |
Description | Oxford Digital Humanities Summer School 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | OXDHSS is the second-largest digital humanities summer school in the world and the largest in Europe. Now based in Engineering Science (through OeRC), it attracts c. 250 students to Oxford to take one of several week-long courses, together with lectures and posters. We presented on the general 'Introduction to Digital Humanities' course, which is the biggest and broadest, and is intended to introduce the field(s) to managers, librarians, IT staff and academics who are interested in knowing more or getting their institution involved. |
Year(s) Of Engagement Activity | 2017 |
URL | http://digital.humanities.ox.ac.uk/dhoxss/2017/ |
Description | Oxford Digital Humanities Summer School 2018 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The Summer School offers training to anyone with an interest in using digital technologies in the Humanities, including academics at all career stages, students, project managers, and people who work in IT, libraries, and cultural heritage. Delegates select one week-long workshop, supplementing their training with expert guest lectures and a busy social programme. |
Year(s) Of Engagement Activity | 2018 |
URL | https://digital.humanities.ox.ac.uk/dhoxss |
Description | Oxford Humanities Division Poster Showcase |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | This was a showcase of posters run by the Training Officer of the University's Humanities Division, aimed particularly at ECRs. |
Year(s) Of Engagement Activity | 2018 |
Description | UCL Digital Humanities Seminar |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Department of Information Studies research seminar |
Year(s) Of Engagement Activity | 2017 |
Description | AI@Oxford |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | A unique opportunity to see the state of the art in artificial intelligence and machine learning at one of the world's great universities, and to meet Oxford's AI experts one-to-one. The event showed the reality of AI today: what is possible and where the technology is going. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.mpls.ox.ac.uk/upcoming-events/artificial-intelligence-oxford |
Description | Automated Tagging of Image and Video Collections using Face Recognition |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | This was a presentation to British Library staff working on data collection and information/digital curation. |
Year(s) Of Engagement Activity | 2018 |
Description | Automated Tagging of Image and Video Collections using Face Recognition at the event No Time to Wait! 3 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The presenter demonstrated how to use Seebibyte-developed face recognition software to tag image and video collections. The event was attended by 80-100 international professionals working in open media, open standards and digital audiovisual preservation, and attracted considerable publicity on social media. |
Year(s) Of Engagement Activity | 2018 |
Description | Automated tagging of the BFI archive using face recognition |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | This was a presentation to BFI which was attended by 20 BFI staff working on data collections and information/digital curation. |
Year(s) Of Engagement Activity | 2018 |
Description | BL GLAM Machine Learning Meetup |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | It was a meetup for galleries, libraries, archives and museums (GLAM) IT staff, researchers and suppliers. |
Year(s) Of Engagement Activity | 2018 |
Description | Blocks, Plates Stones Conference |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Possibly the first-ever conference on printing surfaces (blocks, plates and stones) dealing with historical research, conservation issues and artistic possibilities with collections. |
Year(s) Of Engagement Activity | 2017 |
URL | https://www.ies.sas.ac.uk/events/conferences/previous-conferences/blocks-plates-stones-conference |
Description | Blocks, Plates Stones ECR training day |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Training day for ECRs in printing history |
Year(s) Of Engagement Activity | 2017 |
URL | http://www.academia.edu/33139617/CALL_FOR_APPLICATIONS_ECR_Training_Day_Using_Historical_Matrices_an... |
Description | Bodleian Digital Scholarship Research Uncovered lecture |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | Bodleian Centre for Digital Scholarship hosts a lecture series, open to all. |
Year(s) Of Engagement Activity | 2017 |
Description | British Library Digital Labs Symposium |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Showcase of the British Library's digital collections and projects, in the form of presentations and posters. |
Year(s) Of Engagement Activity | 2017 |
URL | http://blogs.bl.uk/digital-scholarship/2017/09/bl-labs-symposium-2017-mon-30-oct-book-your-place-now... |
Description | CERL Seminar - Visual Approaches to Cultural Heritage |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | CERL is the largest forum for European libraries. Our researcher engaged in Q&A and networked with potential collaborators who expressed interest in using our software in their research. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.cerl.org/services/seminars/powerpoint_presentations_zurich |
Description | European Conference on Computer Vision (ECCV) 2020 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Andrea Vedaldi is co-organising the European Conference on Computer Vision (ECCV) 2020 as programme chair. ECCV is one of the top three international conferences in the area. We project an attendance of more than 5,000 individuals. The organisation is a two-year effort, which is why this entry is listed this year. |
Year(s) Of Engagement Activity | 2019 |
URL | http://eccv2020.eu |
Description | Iberian books workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A small workshop for a digital humanities project. |
Year(s) Of Engagement Activity | 2017 |
Description | Inspirational Engineer Talk - University of Cambridge |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | An invited lecture on my career and research, given at the Department of Engineering, University of Cambridge. |
Year(s) Of Engagement Activity | 2019 |
Description | Invited talk at CVPR 2017 workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Gave an invited talk as part of a workshop at CVPR 2017 (Hawaii), which aimed to give computer vision researchers an overview of problems and state-of-the-art research in medical image analysis. There was a little follow-up, but discussion was quite passive (we may have been competing with the weather on the last day of the meeting!). |
Year(s) Of Engagement Activity | 2017 |
Description | Invited talk at CVPR 2019 workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | An invited talk given at a CVPR 2019 workshop. |
Year(s) Of Engagement Activity | 2019 |
Description | Invited talk at International Ultrasonics Symposium |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited keynote talk in the inaugural session on machine learning in ultrasonics. The session was packed, reflecting interest not only in my group's work but in machine learning more broadly. |
Year(s) Of Engagement Activity | 2017 |
Description | Keynote speaker |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Over 230 academics and industry experts attended MEIbioeng 16 to meet, share, debate and learn from their peers. The annual conference supported discussion of newly developing biomedical engineering research areas alongside established work, all contributing towards the common goal of improving human health and well-being through the development of new healthcare technologies. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.ibme.ox.ac.uk/news-events/events/meibioeng-16 |
Description | London Rare Books School |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | LRBS is a summer school aimed at academics, librarians, art historians and others interested in rare books and special collections. The presentation was part of a week-long intensive course on historical printing surfaces that included hands-on printing, metal-casting and other skills as well as lectures and library work. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.ies.sas.ac.uk/study-training/study-weeks/london-rare-books-school/blocks-and-plates-towa... |
Description | MISS Summer School |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Invited lecturer at an international summer school. |
Year(s) Of Engagement Activity | 2016 |
Description | McGill University Library |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | This was a private presentation to Special Collections librarians, library IT staff and a few academics interested in collections of early printed material and rare printers' woodblocks. |
Year(s) Of Engagement Activity | 2017 |
Description | Meeting at British Library |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Industry/Business |
Results and Impact | Demonstrated to British Library curators and other staff how to use our software. |
Year(s) Of Engagement Activity | 2018 |
Description | Microsoft Postgraduate Summer School |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Invited talk at the Microsoft Postgraduate Summer School. |
Year(s) Of Engagement Activity | 2016 |
Description | Oxford Humanities Division Poster Workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | This was a showcase of posters run by the Training Officer of the University's Humanities Division, aimed particularly at ECRs. |
Year(s) Of Engagement Activity | 2017 |
Description | Oxford Traherne Project Meeting |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | The meeting was attended by around 20 book historians and researchers. The editorial team of the Oxford Traherne project were impressed with our digital collator software and said that it has the potential to revolutionise the field of collation and scholarly editing in general. |
Year(s) Of Engagement Activity | 2018 |
Description | PRAIRIE AI summer school, France |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The PRAIRIE AI summer school comprises lectures and practical sessions conducted by renowned experts in different areas of artificial intelligence. Andrew Zisserman lectured on "Self-supervised Learning". |
Year(s) Of Engagement Activity | 2018 |
URL | https://project.inria.fr/paiss/ |
Description | Printing Revolution and Society 1450 - 1500 -- Fifty Years that Changed Europe |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Printing Revolution and Society 1450-1500 - Fifty Years that Changed Europe was an international conference attended by around 100 academics, researchers, librarians and historians. New researchers are now preparing to contribute images and annotations to the 15cILLUSTRATION website. |
Year(s) Of Engagement Activity | 2018 |
URL | http://15cbooktrade.ox.ac.uk/printing-revolution-and-society-conference-video-recordings/ |
Description | Queen Elizabeth Prize schools event at the Science Museum |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Schools |
Results and Impact | I was a panel member, along with the awardees of the 2017 Queen Elizabeth Prize (Eric Fossum, Michael Tompsett and Nobukazu Teranishi), discussing with a schools audience their inventions related to digital sensors/imaging and how the digital imaging world has changed. The event was held at the Science Museum. My invitation stemmed from involvement in the nominations panel for the QEP as well as my research interest in digital image analysis. |
Year(s) Of Engagement Activity | 2017 |
Description | Royal Society Digital Archive workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | This was a workshop organised by the MPLS Division of the University of Oxford for the Lloyds Foundation on how to digitise its archive. |
Year(s) Of Engagement Activity | 2018 |
Description | Royal Society Digital Archive workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The Royal Society recently finished a project to digitise part of its archive and journal backlist. This event, organised by Louisiane Ferlier, was aimed at encouraging researchers to use the digitised material and at giving the Society's archivists new ideas. It was attended by historians of science and archivists. |
Year(s) Of Engagement Activity | 2018 |
URL | https://blogs.royalsociety.org/publishing/digitising-the-royal-society-journals/ |
Description | SEAHA 2017 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | SEAHA is a doctoral training partnership in heritage science whose members are the University of Oxford, the University of Brighton and UCL. Its annual conference is the main plenary gathering, attended by 100+ members of the consortium (students, their supervisors, researchers and professional staff), with exhibits from companies and organisations. Attendees' backgrounds range from art conservation to materials science, or a mixture of the two. |
Year(s) Of Engagement Activity | 2017 |
URL | http://www.seaha-cdt.ac.uk/activities/events/seaha17/ |
Description | Samsung Satellite Symposium, European Congress of Radiology |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The talk was part of a lunch symposium presenting the latest research in AI applied to radiology. |
Year(s) Of Engagement Activity | 2017 |
Description | School talk |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Schools |
Results and Impact | Talk at Headington School as the keynote speaker for their Year of Science. |
Year(s) Of Engagement Activity | 2017 |
Description | Show and Tell Event - Computer Vision Software - 14 June 2016 (Oxford) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | A main aim of the Seebibyte Project is to transfer the latest computer vision methods into other disciplines and industry. We want the software developed in this project to be taken up and used widely by people working in industry and other academic disciplines, and are organising regular Show and Tell events to demonstrate new software developed by project researchers. A main outcome from these events will be new inter-disciplinary collaborations. As a first step, Transfer and Application Projects (TAPs) are developed with new collaborators. This first Show and Tell event was restricted to participants from the University of Oxford, in particular researchers from the Department of Engineering Science, the Department of Earth Sciences and the Department of Materials. Future events will also target external participants, including from industry. The June 14 event focused on four topics: 1) Counting; 2) Landmark Detection (Keypoint Detection); 3) Segmentation (Region Labelling); and 4) Text Spotting. Further information on each topic - including the event presentations and new software demos - is available on the event webpage (www.seebibyte.org/June14.html). The event received positive feedback from participants and has resulted in several new TAPs being completed. It is anticipated that some of these will lead to new collaborations. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.seebibyte.org/June14.html |
Description | Show and Tell Event - Computer Vision Software - 14 June 2018 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Postgraduate students |
Results and Impact | The purpose of the event was to demonstrate software for recognising faces and tracking objects in videos in potentially large datasets. |
Year(s) Of Engagement Activity | 2018 |
URL | http://seebibyte.org/ |
Description | Show and Tell Event - Computer Vision Software - 15 June 2017 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Other audiences |
Results and Impact | The purpose of the Show and Tell was to demonstrate software for searching, annotating and categorising images in (potentially) large datasets. The software is open source and was made available following the meeting. |
Year(s) Of Engagement Activity | 2017 |
URL | http://www.seebibyte.org |
Description | Sight and Sound Workshop at CVPR 2018 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Andrew Zisserman co-organised the Sight and Sound Workshop at CVPR 2018. The workshop description: In recent years, there have been many advances in learning from visual and auditory data. While traditionally these modalities have been studied in isolation, researchers have increasingly been creating algorithms that learn from both modalities. This has produced many exciting developments in automatic lip-reading, multi-modal representation learning, and audio-visual action recognition. Since almost every video has an audio track, the prospect of learning from paired audio-visual data - either with new forms of unsupervised learning, or by simply incorporating sound data into existing vision algorithms - is intuitively appealing, and the workshop covered recent advances in this direction. It also touched on higher-level questions, such as what information sound conveys that vision doesn't, the merits of sound versus other "supplemental" modalities such as text and depth, and the relationship between visual motion and sound. The workshop also discussed how these techniques are being used to create new audio-visual applications, such as in the fields of speech processing and video editing. |
Year(s) Of Engagement Activity | 2018 |
URL | http://sightsound.org/2018/ |
Description | Speaker - International Women in Engineering Day 2017 |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Schools |
Results and Impact | Secondary school girls from a number of local schools visited the department to see different areas of engineering and to take part in some simple engineering activities. I gave a short talk at tea on some emerging areas of engineering ('wacky engineering') and spoke a little about my own research and my field. Feedback from schools was positive for the whole event. |
Year(s) Of Engagement Activity | 2017 |
Description | Teaching in Summer School ICVSS |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | This International Computer Vision Summer School aims to provide both an objective and clear overview and an in-depth analysis of the state-of-the-art research in Computer Vision and Machine Learning. The participants benefited from direct interaction and discussions with world leaders in Computer Vision. |
Year(s) Of Engagement Activity | 2015 |
URL | http://iplab.dmi.unict.it/icvss2015/ |
Description | Teaching in Summer School MISS |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The Medical Imaging Summer School is the largest summer school in its field. Around 200 students attended the school and received training in the science and technology of medical imaging. Students expressed interest in future research in the area. |
Year(s) Of Engagement Activity | 2016 |
URL | http://iplab.dmi.unict.it/miss/index.html |
Description | Teaching in Summer School iV&L |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The iV&L Training School aims to bring together Vision and Language researchers and to provide the opportunity for cross-disciplinary teaching and learning. Over 80 students attended the summer school and received training in deep learning across two disciplines, Computer Vision and Natural Language Processing. Students expressed interest in future research in the area. |
Year(s) Of Engagement Activity | 2016 |
URL | http://ivl-net.eu/ivl-net-training-school-2016/ |
Description | The 2017 IEEE-EURASIP Summer School on Signal Processing (S3P-2017) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The 2017 IEEE-EURASIP Summer School on Signal Processing (S3P-2017) is the 5th edition of a successful series, organised by the IEEE SPS Italy Chapter and the National Telecommunications and Information Technologies Group - GTTI, with the sponsorship of IEEE (S3P program) and EURASIP (Seasonal School Co-Sponsorship agreement). S3P-2017 represents a stimulating environment where top international scientists in signal processing and related disciplines share their ideas on fundamental and ground-breaking methodologies in the field. It provides PhD students and researchers with a unique networking opportunity and the possibility of interacting with leading scientists. The theme of this 5th edition was "Signal Processing meets Deep Learning": deep machine learning is changing the rules in the signal and multimedia processing field, while signal processing methods and tools are fundamental for machine learning, so it is time for these worlds to meet. |
Year(s) Of Engagement Activity | 2017 |
URL | http://www.grip.unina.it/s3p2017/ |
Description | University of Reading Department of Typography and Graphic Communication |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | This was a one-day workshop to develop a large AHRC funding application on historical printing, to be led by Prof. Rob Banham and based in the world's leading department of typography. The event was attended by typographers, designers, design historians and University research support staff. |
Year(s) Of Engagement Activity | 2018 |
URL | http://www.reading.ac.uk/typography/typ-homepage.aspx |
Description | VGG Web Search Engines |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Industry/Business |
Results and Impact | This was a presentation to Continental AG delegates interested in AI/computer vision research. Plans were made for future related activities. |
Year(s) Of Engagement Activity | 2018 |
Description | Video: helping with hearing |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Media (as a channel to the public) |
Results and Impact | This is an online video produced as part of the OxfordAI outreach programme. Its description reads: "Can AI modelling assist people with hearing difficulties? Discover how #OxfordAI could help by isolating voices in noisy environments. We talk to DPhil student Triantafyllos Afouras from the Visual Geometry Group in Oxford's Department of Engineering Science." |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.research.ox.ac.uk/Article/2018-11-08-video-helping-with-hearing |
Description | Visual Search of BBC News at the event Artificial Intelligence @ Oxford - A One-Day Expo |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Policymakers/politicians |
Results and Impact | The event was attended by about 100 people internationally, mainly from academia, industry, commerce and government, who are interested in AI. Video: https://www.youtube.com/watch?v=9ZKGL0QDLpk |
Year(s) Of Engagement Activity | 2018 |
URL | https://ori.ox.ac.uk/artificial-intelligence-oxford-a-one-day-expo-27th-march-2018/ |
Description | Workshops at the Conference on Computer Vision and Pattern Recognition (CVPR) 2019 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Workshop chair for the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019. We selected and coordinated 90 international workshops. |
Year(s) Of Engagement Activity | 2018,2019 |
URL | http://cvpr2019.thecvf.com/program/workshops |