Visual AI: An Open World Interpretable Visual Transformer

Lead Research Organisation: University of Oxford
Department Name: Engineering Science

Abstract

With the advent of deep learning and the availability of big data, it is now possible to train machine learning algorithms for a multitude of visual tasks, such as tagging personal image collections in the cloud, recognizing faces, and 3D shape scanning with phones. However, each of these tasks currently requires training a neural network on a very large image dataset specifically collected and labelled for that task. The resulting networks are good experts for the target task, but they only understand the 'closed world' experienced during training and can 'say' nothing useful about other content, nor can they be applied to other tasks without retraining, nor do they have an ability to explain their decisions or to recognise their limitations. Furthermore, current visual algorithms are usually 'single modal': they 'close their ears' to the other modalities (audio, text) that may be readily available.

The core objective of the Programme is to develop the next generation of audio-visual algorithms that do not have these limitations. We will carry out fundamental research to develop a Visual Transformer capable of visual analysis with the flexibility and interpretability of a human visual system, and aided by the other 'senses' - audio and text. It will be able to continually learn from raw data streams without requiring the traditional 'strong supervision' of a new dataset for each new task, and deliver and distill semantic and geometric information over a multitude of data types (for example, videos with audio, very large scale image and video datasets, and medical images with text records).

The Visual Transformer will be a key component of next generation AI, able to address multiple downstream audio-visual tasks, significantly superseding the current limitations of computer vision systems, and enabling new and far reaching applications.

A second objective addresses transfer and translation. We seek impact in a variety of other academic disciplines and industry which today greatly under-utilise the power of the latest computer vision ideas. We will target these disciplines to enable them to leapfrog the divide between what they use (or do not use) today which is dominated by manual review and highly interactive analysis frame-by-frame, to a new era where automated visual analytics of very large datasets becomes the norm. In short, our goal is to ensure that the newly developed methods are used by industry and academic researchers in other areas, and turned into products for societal and economic benefit. To this end open source software, datasets, and demonstrators will be disseminated on the project website.

The ubiquity of digital images and videos means that every UK citizen may potentially benefit from the Programme research in different ways. One example is smart audio-visual glasses, that can pay attention to a person talking by using their lip movements to mask out other ambient sounds. A second is an app that can answer visual questions (or retrieve matches) for text-queries over large scale audio-visual collections, such as a person's entire personal videos. A third is AI-guided medical screening, that can aid a minimally trained healthcare professional to perform medical scans.

Planned Impact

The proposed programme encompasses new methodology and applied research in computer vision and other modalities (audio, text) that will enable analysis and search of image and video content while learning new things, with human-like flexibility and interpretability. These capabilities will encourage end user take up of computer vision technologies and commercial interest in embedding these technologies in products.

The Programme will have Economic and Societal impact by
1. Enabling UK industry to leverage AI in their activities with a key strategic advantage.
2. Developing new and improved computer vision technologies that will require substantially less training data to solve problems and are thus suitable for commercialisation by a wide range of companies.
3. Enhancing the visual and audio capabilities and knowledge base of UK industries, including small ones.
4. Enhancing quality of life by improving, for instance, healthcare capabilities, surveillance, environmental monitoring, and the means of accessing and enjoying personal digital media.
5. Reducing the cost and risk of collecting manual annotations for deploying AI technology, especially for sensitive data such as medical records.
6. Collaborating directly with companies and organizations that we have already identified, and will work with over the course of the Programme.
7. Training the next generation of computer vision researchers who will be equipped to support the imaging needs of science, technology and wider society for the future.

Impact on Knowledge includes
1. Realisation of new approaches to essential computer vision technology, and the dissemination of research findings through publications, conference presentations, summer school teaching, and the distribution of open source software and image databases.
2. Sharing knowledge with industrial collaborators via Transfer and Application Projects (TAPs) and other activities leading to adoption of advanced computer vision methods across many disciplines of science, engineering and medicine that currently do not use them.
3. Communication of advances to a public audience through website articles, Show and Tell events, social and broadcast media, and other co-ordinated public understanding activities.

Publications


 
Description Royal Society National Academies Data Reform Round Table Consultation
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
 
Description Royal Society Privacy Enhancing Technologies (PETs) Policy Working Group, Chair
Geographic Reach Multiple continents/international 
Policy Influence Type Participation in a guidance/advisory committee
Impact Quoting the aims from the report "We have three objectives for this report. Our first objective is that the use cases inspire those collecting and using data to consider the potential benefits of PETs for their own work, or in new collaborations with others. Second, for the evidence we present on barriers to adoption and standardisation to help inform policy decisions to encourage a marketplace for PETs. Finally, through our recommendations, we hope the UK will maximise the opportunity to be a global leader in PETs - both for data security and collaborative analysis - alongside emerging, coordinated efforts to implement PETs in other countries."
URL https://royalsociety.org/-/media/policy/projects/privacy-enhancing-technologies/From-Privacy-to-Part...
 
Description Royal Society Privacy Enhancing Technologies Working Group - policy report published (Chair)
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
Impact The report has contributed to wider discussion of data sharing between government departments and a number of the recommendations have been followed up. It is well cited. A follow-on project is underway with the Alan Turing Institute which will report in 2022. The important message was to show that PETs are maturing as a technology and can be considered enablers to provide trusted sharing of data, and to move the conversation away from security and accepting zero risk in sharing data. The work is relevant not only to my research area (health data science) but also to many other sectors which are data-driven.
URL https://royalsociety.org/-/media/policy/projects/privacy-enhancing-technologies/privacy-enhancing-te...
 
Description Envisioning Dante c.1472- c.1630
Amount £805,620 (GBP)
Funding ID AH/W005220/1 
Organisation Arts & Humanities Research Council (AHRC) 
Sector Public
Country United Kingdom
Start 09/2022 
End 09/2025
 
Description Royal Society Research Professorship Enhanced research Expenses
Amount £100,000 (GBP)
Funding ID RF\ERE\210331 
Organisation The Royal Society 
Sector Charity/Non Profit
Country United Kingdom
Start 10/2021 
End 03/2024
 
Description Studentship
Amount £154,725 (GBP)
Organisation Facebook 
Sector Private
Country United States
Start 10/2021 
End 09/2025
 
Description Toshiba 2021
Amount $200,000 (USD)
Organisation Toshiba 
Sector Private
Country Japan
Start 07/2021 
End 03/2023
 
Title EPIC-KITCHENS VISOR 
Description We introduce VISOR, a new dataset of pixel annotations and a benchmark suite for segmenting hands and active objects in egocentric video. VISOR annotates videos from EPIC-KITCHENS, which comes with a new set of challenges not encountered in current video segmentation datasets. Specifically, we need to ensure both short- and long-term consistency of pixel-level annotations as objects undergo transformative interactions, e.g. an onion is peeled, diced and cooked - where we aim to obtain accurate pixel-level annotations of the peel, onion pieces, chopping board, knife, pan, as well as the acting hands. VISOR introduces an annotation pipeline, AI-powered in parts, for scalability and quality. Data published under the Creative Commons Attribution-NonCommercial 4.0 International License.
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
Impact The dataset can be used for audio event detection and the baseline code will be made publicly available. 
URL https://data.bris.ac.uk/data/dataset/2v6cgv1x04ol22qp9rm9x2j6a7/
 
Title Image Change dataset 
Description Propose a scalable methodology for obtaining a large-scale change detection training dataset by leveraging existing object segmentation benchmarks. Introduce a co-attention based novel architecture that is able to implicitly determine correspondences between an image pair and find changes in the form of bounding box predictions. Contribute four evaluation datasets that cover a variety of domains and transformations, including synthetic image changes, real surveillance images of a 3D scene, and synthetic 3D scenes with camera motion. Evaluate our model on these four datasets and demonstrate zero-shot and beyond training transformation generalization. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
Impact In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023. Future impact to be determined. 
URL https://arxiv.org/pdf/2209.14341.pdf
 
Title Localizing Visual Sounds the Hard Way 
Description The objective of this work is to localize sound sources that are visible in a video without using manual annotations. Our key technical contribution is to show that, by training the network to explicitly discriminate challenging image fragments, even for images that do contain the object emitting the sound, we can significantly boost the localization performance. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact Localizing Visual Sounds the Hard Way Honglie Chen, Weidi Xie, Triantafyllos Afouras, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman CVPR, 2021 
URL https://www.robots.ox.ac.uk/~vgg/research/lvs/
 
Title PASS: An ImageNet replacement for self-supervised pretraining without humans 
Description PASS is a large-scale image dataset that does not include any humans and which can be used for high-quality pretraining while significantly reducing privacy concerns. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact YM. Asano, C. Rupprecht, A. Zisserman, A. Vedaldi PASS: An ImageNet replacement for self-supervised pretraining without humans NeurIPS Dataset Track, 2021 
URL https://www.robots.ox.ac.uk/~vgg/data/pass/
 
Title Semantic Shift Benchmark 
Description Demonstrate that the ability of a classifier to make the 'none-of-above' decision is highly correlated with its accuracy on the closed-set classes. Following the success of modern deep learning systems on closed-set visual recognition tasks, a natural next challenge is open-set recognition (OSR) (Scheirer et al., 2013). In the closed-set setting, a model is tasked with recognizing a set of categories that remain the same during both training and testing phases. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
Impact Future impact to be determined 
URL https://www.robots.ox.ac.uk/~vgg/research/osr/#ssb_suite
 
Title Video Person-Clustering Dataset A multi-modal TV-shows and movies dataset 
Description VPCD contains multi-modal annotations (face, body and voice) for all primary and secondary characters from a range of diverse TV-shows and movies. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact A. Brown, V. Kalogeiton, A. Zisserman Face, Body, Voice: Video Person-Clustering with Multiple Modalities 
URL https://www.robots.ox.ac.uk/~vgg/data/Video_Person_Clustering//
 
Title Video-text Alignment HTM-Align dataset 
Description The objective is a temporal alignment network that ingests long term video sequences, and associated text sentences, in order to: (1) determine if a sentence is alignable with the video; and (2) if it is alignable, then determine its alignment. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
Impact Future impacts to be determined 
URL https://www.robots.ox.ac.uk/~vgg/research/tan/
 
Description NLS Chapbooks 
Organisation National Library of Scotland
Country United Kingdom 
Sector Academic/University 
PI Contribution We used our software to search and analyse the illustrations of the chapbooks.
Collaborator Contribution Partner provided chapbooks in large quantities.
Impact https://www.robots.ox.ac.uk/~vgg/research/chapbooks/
Start Year 2020
 
Description National Consortium of Intelligent Medical Imaging 
Organisation National Consortium of Intelligent Medical Imaging
Sector Academic/University 
PI Contribution A VisualAI postdoc (Jianbo Jiao) is providing expertise in building image-based deep learning models to assess COVID19 deterioration for hospital-based patients.
Collaborator Contribution NCIMI is providing access to COVID19 data for a TAP project.
Impact An initial evaluation of predictive modelling was performed using available covid-19 data. However, due to the small size of the data, and the fact that covid treatments for patients have significantly improved and better pathways for patients are in place, it was deemed not worth pursuing this work beyond the preliminary study. A report was written but has not been published.
Start Year 2021
 
Description TAP VAI-02 1516 Project 
Organisation University of Copenhagen
Country Denmark 
Sector Academic/University 
PI Contribution We created a visual search engine using images and metadata supplied by Matilde Malaspina at University of Copenhagen and Barbara Tramelli from University of Venice.
Collaborator Contribution Partner provided images and metadata.
Impact A talk at Venice Centre for Digital and Public Humanities (VeDPH) on 9th Dec. 2020
Start Year 2020
 
Description TAP-VAI-03 16cIllustration Project 
Organisation Ca' Foscari University of Venice
Country Italy 
Sector Academic/University 
PI Contribution We created a visually searchable database (https://www.robots.ox.ac.uk/~vgg/research/16ci/lyon/) of 16th century illustrations printed in Lyon.
Collaborator Contribution Partner provided images and metadata.
Impact The researchers at Venice Centre for Digital and Public Humanities are using this visual search engine as a research support tool.
Start Year 2021
 
Description TAP-VAI-04 Frank-Scholten Archive 
Organisation Leiden University
Country Netherlands 
Sector Academic/University 
PI Contribution Using our VISE software, we found matches between the photographs and their corresponding negatives in the Frank-Scholten image archive.
Collaborator Contribution They provide a dataset containing photographs and negatives captured by Frank-Scholten.
Impact tbc
Start Year 2021
 
Description TAP-VAI-08 Fish Pool Trajectory 
Organisation University of Oxford
Department Department of Zoology
Country United Kingdom 
Sector Academic/University 
PI Contribution We are developing tools and a workflow to detect and track a Picasso triggerfish moving in a fish tank to find the food target.
Collaborator Contribution They provide a video dataset showing Picasso triggerfish in a fish pool.
Impact tbc
Start Year 2021
 
Description TAP-VAI-09 Fish Tank Obstacles 
Organisation University of Oxford
Department Department of Zoology
Country United Kingdom 
Sector Academic/University 
PI Contribution We are developing tools and a workflow to detect and track Picasso triggerfish navigating through obstacles to reach a food target in a fish tank.
Collaborator Contribution They provide a video dataset showing Picasso triggerfish in a fish tank containing obstacles.
Impact tbc
Start Year 2021
 
Description TAP-VAI-10 Czech National Library/ Czech Academy of Sciences 
Organisation National Library of the Czech Republic
Country Czech Republic 
Sector Public 
PI Contribution We are providing technical support to the RKD team for implementing our VISE software in their platform.
Collaborator Contribution They are using our software tool (VISE)
Impact tbc
Start Year 2021
 
Description TAP-VAI-11 RKD 
Organisation Netherlands Institute for Art History
Country Netherlands 
Sector Public 
PI Contribution We are providing technical support to the RKD for implementing a visual image search feature in the public-facing web portal and for internal research using our VGG Image Search Engine (VISE) software (https://www.robots.ox.ac.uk/~vgg/software/vise/).
Collaborator Contribution The RKD provides millions of images and is now using our VISE software for visual search functionality.
Impact Not yet.
Start Year 2021
 
Title Audio-visual synchronisation 
Description The software enables audio-visual synchronisation, which requires a model to relate changes in the visual and audio streams. Prior work focused primarily on the synchronisation of talking head videos. In contrast, open-domain videos often have only a small visual indication, i.e. one that is sparse in space.
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Paper in British Machine Vision Conference (BMVC), 2022. Future impacts to be determined. 
URL https://iashin.ai/SparseSync
 
Title Auditory Slow-Fast 
Description Recognising actions using auditory signal only 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact Paper won outstanding paper at ICASSP 2021 (3 papers selected out of 1400 papers). Well referenced, with 46 stars. A follow-up work by Deepmind [https://arxiv.org/pdf/2111.12124.pdf], which extends this work to speech and music audio, refers to it as follows: "We find the Slowfast architecture is good at learning rich representations required by different domains".
URL https://github.com/ekazakos/auditory-slow-fast
 
Title Find Identical Images (FII) 
Description Identical images have the same image dimension (i.e. image width, image height, number of colour channels) and same pixel value in all corresponding pixel locations. FII is a command line tool to find all identical images in a folder. It can also find images that are common in two folders. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact tbc 
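
The definition of identical images given above (same width, height and number of colour channels, and the same value at every pixel location) can be illustrated with a short Python sketch. This is not the FII implementation: the `Image` type, the function names and the use of a SHA-256 digest are all assumptions made for illustration, and image decoding is left out so that the example works on in-memory pixel data.

```python
import hashlib
from collections import defaultdict
from typing import NamedTuple


class Image(NamedTuple):
    """Hypothetical decoded image (a stand-in for a real decoder's output)."""
    width: int
    height: int
    channels: int
    pixels: bytes  # raw pixel values, width * height * channels bytes


def image_key(img: Image) -> tuple:
    """Key that is equal for two images when their dimensions and pixels match."""
    digest = hashlib.sha256(img.pixels).hexdigest()
    return (img.width, img.height, img.channels, digest)


def find_identical(folder: dict[str, Image]) -> list[list[str]]:
    """Group the names in one folder whose images are identical."""
    buckets = defaultdict(list)
    for name, img in folder.items():
        buckets[image_key(img)].append(name)
    # Keep only groups with at least two members.
    return sorted(sorted(names) for names in buckets.values() if len(names) > 1)


def find_common(folder_a: dict[str, Image], folder_b: dict[str, Image]) -> list[tuple]:
    """List (name_in_a, name_in_b) pairs whose images are identical."""
    index_b = defaultdict(list)
    for name, img in folder_b.items():
        index_b[image_key(img)].append(name)
    return sorted(
        (name_a, name_b)
        for name_a, img in folder_a.items()
        for name_b in index_b.get(image_key(img), [])
    )
```

Keying on the dimensions as well as the pixel digest matters: two images can hold the same bytes yet differ in shape (for example 2x1 versus 1x2), and under the definition above they are not identical.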
 
Title Generalised Visual Counting in Images 
Description Our goal is to develop a generalised visual object counting system, that augments humans' ability for recognising the number of objects in a visual scene. Specifically, generalised visual object counting refers to the problem of identifying the number of the salient objects of arbitrary semantic class in an image (i.e. open-world visual object counting) with arbitrary number of instance "exemplars" provided by the end user, to refer to the particular objects to be counted, i.e. from zero-shot to few-shot object counting. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Future impact to be determined. 
URL https://arxiv.org/pdf/2208.13721.pdf
 
Title Generalized Category Discovery 
Description We present a new setting: 'Generalized Category Discovery' and a method to tackle it. Our setting can be succinctly described as: given a dataset, a subset of which has class labels, categorize all unlabelled images in the dataset. The unlabelled images may come from labelled or novel classes. Our method leverages contrastively trained vision transformers to assign labels directly through clustering. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Future impact to be determined 
URL https://github.com/prajwalkr/vtp#readme
 
Title Image Counterfeit Spotter 
Description Counterfeit Spotter compares images of suspicious products with a reference image and confirms whether each is real or fake within seconds, right in your browser.
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact Still receiving feedback and reporting 
URL https://www.robots.ox.ac.uk/~vgg/software/image-compare/counterfeit-spotter/#usecases
 
Title ImageCompare 
Description Image Compare is a lightweight, standalone and offline application to visually compare a pair of images and highlight their differences. This application can be used on desktop computers and mobile phones without requiring installation as it runs entirely in a web browser.
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact tbc 
 
Title Lip Reading 
Description To learn strong lip reading models that can recognise speech in silent videos. 
Type Of Technology Software 
Year Produced 2022 
Impact Research has shown that the best models achieve state-of-the-art results, outperforming prior work trained on public data by a significant margin, and even industrial models trained on orders of magnitude more data. We have also designed a Visual Speech Detection model on top of our lip reading system that obtains state-of-the-art results on this task and even outperforms several audio-visual baselines.
URL https://www.robots.ox.ac.uk/~vgg/research/vtp-for-lip-reading/
 
Title List Annotator (LISA) 
Description List Annotator (LISA) is a standalone and light-weight HTML/CSS/JavaScript based application to efficiently annotate a large list of images. LISA is an open source project developed and maintained by the Visual Geometry Group (VGG) and released under a license that grants its users the freedom to use it for any purpose. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact tbc 
 
Title Motion Grouping 
Description This software implements the model as described in the paper. It includes a pre-trained model and inference code to apply to downstream images, as well as the training code to train the model from scratch. It also includes code to evaluate and benchmark the results against existing datasets (DAVIS2016, FBMS59, SegTrackv2, MoCA). 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact This code accompanies the paper: Self-supervised Video Object Segmentation by Motion Grouping Charig Yang, Hala Lamdouar, Erika Lu, Andrew Zisserman, Weidi Xie. ICCV 2021 
URL https://oxris.ox.ac.uk/viewobject.html?id=1190260&cid=1
 
Title VGG Image Annotator (VIA) 
Description VGG Image Annotator is a simple and standalone manual annotation software for image, audio and video. VIA runs in a web browser and does not require any installation or setup. The complete VIA software fits in a single self-contained HTML page of size less than 400 Kilobyte that runs as an offline application in most modern web browsers. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact tbc 
 
Title VGG Image Search Engine (VISE) 
Description VGG Image Search Engine (VISE) is a free and open source software for visual search of a large number of images using an image as a search query. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact tbc 
 
Title VGG Visual Tracker (VVT)
Description VGG Visual Tracker (VVT) is a tool for creating bounding box annotations on videos in a semi-automatic fashion, using class agnostic object trackers. VVT runs on modern web browsers (Chrome 65+, Firefox 60+, Safari 11+) and does not require any installation or setup. VVT is a variation of the VGG Image Annotator (VIA) v3 tool and uses the same data format. So, if you are already using VIA v3, the annotations are interoperable with your existing workflow. No changes required. 
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact tbc 
 
Title Visual Analysis of Chapbooks 
Description The chapbooks were produced cheaply to create everyday reading material and were the most popular reading material for the masses [1]. This dataset has been made freely available by the National Library of Scotland (NLS) 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact To reduce printing costs, these woodcuts were reused across multiple chapbooks. The software helped researchers pursue many related research questions using tools based on computer vision.
URL https://data.nls.uk/data/digitised-collections/chapbooks-printed-in-scotland/
 
Description (1) VIA: Image and Video Annotation; (2) Image Comparator; (3) Image search and retrieval 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact Show and Tell Event at the Oxford Big Data Institute. Researchers at the Big Data Institute are now aware of our computer vision tools that can significantly improve their existing research workflow.
Year(s) Of Engagement Activity 2022
URL https://www.bdi.ox.ac.uk/
 
Description ACH talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Forum for conversations on an expansive definition of digital humanities in a broad array of subject areas, methods, and communities of practice.
Year(s) Of Engagement Activity 2021
URL https://drive.google.com/file/d/1CN5CDWPf4cLTT1NY9gyP-JxxvsCdzRG-/view
 
Description AD-Manual Annotation of Radiology Images using VGG Image Annotator (VIA) online Course 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The course website lists our manual annotation tool (VIA). This provides a lot of exposure to our software, which is available to the world as an open source tool. This tool significantly speeds up annotation work for professions that need to annotate large volumes of visual data.
Year(s) Of Engagement Activity 2020
URL https://folio47.wixsite.com/rp-course/radiology-preprocessor-workflow
 
Description AD-TAP Outcome Presentation for Leiden University 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Presented the outcome of our collaboration with Leiden University.
The team at Leiden University were extremely excited to see the results from our visual search engine. They said that they were "jumping like a child" after seeing the outcome and that this collaboration will lead to many new research projects related to the Frank Scholten Archives.
Year(s) Of Engagement Activity 2021
 
Description AEOLIAN Network workshop presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Presentation describing a project undertaken within the National Librarian of Scotland's Fellowship in Digital Scholarship programme for 2020-1.
Year(s) Of Engagement Activity 2021
URL https://www.aeolian-network.net/events/workshop-1-employing-machine-learning-and-artificial-intellig...
 
Description AI4 LAM online conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Aimed at professionals in the LAM (Libraries, Archives and Museums) sector, this was an online workshop teaching the use of several Visual AI tools and giving context to their application in this sector. Issues of attribution, bias and fairness were discussed as well as technical areas.
Year(s) Of Engagement Activity 2022
URL https://sites.google.com/view/ai4lam/ai4lam-2022-virtual-event
 
Description AI4LAM workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Introduce the use of visual AI for collections research, access and management. Using the example of collaborations between Oxford's Visual Geometry Group (VGG) and researchers and curators within the GLAM sector, the speaker will provide a hands-on introduction to VGG's open-source tools for visual search, classification, comparison and annotation.
Year(s) Of Engagement Activity 2021
URL https://libereurope.eu/event/introduction-to-visual-ai-in-glams-workshop-series-on-applying-and-depl...
 
Description AIUM 2021 Special Session Invited Speaker 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited speaker in Session with Title: Deep Learning Applications for New Ultrasound Techniques. Talk was pre-recorded with live questions.
The primary audience was medical physicists rather than medical image analysis experts.
Year(s) Of Engagement Activity 2021
 
Description AV4D: Visual Learning of Sounds in Spaces 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop at the European Conference on Computer Vision (ECCV).
Year(s) Of Engagement Activity 2022
URL https://av4d.org
 
Description AWS Human-Machine Collaboratory conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact Hosted by the Amazon Web Services (AWS)-funded Human-Machine Collaboratory at Oxford, Giles Bergel gave two talks (one alongside Dan Schofield, another Visual AI ambassador) on Visual AI collaborations and research in fields ranging from primatology to cultural heritage and media studies.
Year(s) Of Engagement Activity 2022
URL https://www.mpls.ox.ac.uk/innovation-and-business-partnerships/human-machine-collaboration
 
Description Aberystwyth Bibliographical Group 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Presented work in tracing woodcut illustrations, their original woodblocks and copies throughout the surviving corpus of British ballads and chapbooks. He discussed how woodcuts in these forms of cheap print served as visual brands for particular titles, genres or producers of cheap print, and demonstrated some of the bibliographical uses of their identification. Showed how computer vision software can strongly support these researches, and may be further applied to printed images of all kinds.
Year(s) Of Engagement Activity 2021
URL https://www.hugofox.com/community/aberystwyth-bibliographical-group-19783/reports-of-recent-meetings...
 
Description Co-organiser of ASMUS2021, a MICCAI workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Advances in Simplifying Medical UltraSound (ASMUS) 2021 is an international workshop that provides a forum for research topics around ultrasound image computing and computer-assisted interventions and robotic systems that utilize ultrasound imaging. It was held in conjunction with MICCAI 2021 in virtual form.

Accepted papers were selected based on their scientific contribution, via a double-blind process involving written reviews from at least two external reviewers in addition to a member of the committee.
The published work includes reports across a wide range of methodology, research and clinical applications. Advanced deep learning approaches for anatomy recognition, segmentation, registration and skill assessment are the dominant topics, in addition to ultrasound-specific new approaches in augmented reality and remote assistance.
Three invited speakers were included in the workshop, and live demos of technologies were given. The meeting had 80+ attendees.
Year(s) Of Engagement Activity 2021
URL https://miccai-ultrasound.github.io/#/asmus21
 
Description ConCode webinar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Presentation highlighting some of the ways that cultural heritage collections are using computer vision (or visual AI) for collections management and research, focussing particularly on the work of the Oxford Visual Geometry Group and its collaborators.
Year(s) Of Engagement Activity 2021,2022
URL https://www.youtube.com/watch?v=d4XaZ4bur6Q
 
Description Deep Discoveries webinar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Discussed how computer vision excels at matching identical features within images and has made progress in broad classification tasks, though the middle ground remains challenging. Visual similarity, which is essential to human visual recognition, is challenging to conceptualise, measure and compute. Outlined some approaches to defining similarity in computational terms, drawing on the experience of the Visual Geometry Group (Oxford) in collaborating with cultural heritage researchers.
Year(s) Of Engagement Activity 2021
URL https://www.eventbrite.co.uk/e/computer-vision-and-heritage-opportunities-for-research-and-engagemen...
 
Description Digital Humanities Annual Conference - Tokyo 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact The project gave a paper and led a workshop teaching the use of Visual AI software tools for the study of printed illustrations.
Year(s) Of Engagement Activity 2022
URL https://dh2022.adho.org/
 
Description Digital Humanities Congress Sheffield 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Presentation of Visual AI collaborations on book history to a diverse audience of digital humanists to promote the sharing of knowledge, ideas and techniques within the digital humanities.
Year(s) Of Engagement Activity 2022
URL https://www.dhi.ac.uk/dhc2022/
 
Description Digital Humanities and Book History conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Project research, tools and collaborations presented to a digital humanities audience working in particular in book history, a field in which Visual AI has a high profile.
Year(s) Of Engagement Activity 2022
URL https://dcsco-op.org/past-events/dhbh/
 
Description Digital Humanities at Oxford Summer School 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact A presentation and two hands-on sessions on Visual AI tools and collaborations in Digital Humanities
Year(s) Of Engagement Activity 2022
URL https://eng.ox.ac.uk/events/dhoxss-2022/
 
Description Digitising, Cataloguing, Searching and Sharing the Medieval and Early-Modern Image: On-Going Projects & Different Methodologies 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Other audiences
Results and Impact Presentation on digitising, cataloguing, searching and sharing the medieval and early-modern image, covering ongoing projects and different methodologies.
Year(s) Of Engagement Activity 2021
 
Description Distinguished Keynote Speaker in Biomedical and Health Data Science in two joint conferences of IEEE EMBS BHI and BSN 2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Keynote talk entitled: Simplifying interpretation and acquisition of ultrasound scans, delivered virtually.
Abstract: With the increased availability of low-cost and handheld ultrasound probes, there is interest in simplifying interpretation and acquisition of ultrasound scans through deep-learning based analysis so that ultrasound can be used more widely in healthcare. However, this is not just "all about the algorithm", and successful innovation requires inter-disciplinary thinking and collaborations. In this talk I will overview progress in this area drawing on examples of my laboratory's experiences of working with partners on multi-modal ultrasound imaging, and building assistive algorithms and devices for pregnancy health assessment in high-income and low-and-middle-income country settings. Emerging topics in this area will also be discussed.
Year(s) Of Engagement Activity 2021
 
Description Edinburgh CDCS Digitised Documents Series workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Workshop to showcase the state of the art in Visual AI for cultural heritage and the digital humanities, and provide a hands-on introduction to some simple techniques for searching and classifying imagery in books, paintings, photographs and film.
Year(s) Of Engagement Activity 2022
URL https://www.cdcs.ed.ac.uk/events/visual-ai-and-humanities-introduction
 
Description Edinburgh CDCS workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact Workshop to showcase the state of the art in Visual AI for cultural heritage and the digital humanities, and provide a hands-on introduction to some simple techniques for searching and classifying imagery in books, paintings, photographs and film. Introduced participants to the study of bias within AI, in such controversial applications as facial recognition and automated image categorisation.
Year(s) Of Engagement Activity 2021
URL https://www.cdcs.ed.ac.uk/events/workshop-chapbooks-national-library-scotland
 
Description Fantastic Futures Conference 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Conference aiming to help participants discover basic concepts of artificial intelligence in the GLAM sector, concrete uses and practices of AI in the GLAM sector, and technologies and tools applicable to the GLAM sector's data and collections.
Year(s) Of Engagement Activity 2021
URL https://www.bnf.fr/en/agendaEN/workshops-tutorials-les-futurs-fantastiques-3rd-conference-about-arti...
 
Description Helping Computers See and Understand the World Around Us 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact Science Week Demonstration for Year 3 and Year 4 students at the Cutteslowe Primary School in Oxford
Year(s) Of Engagement Activity 2022
URL https://www.cutteslowe.oxon.sch.uk/
 
Description History of Printed Illustrations webinar 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Presentation, drawing on a recent collaboration with the National Library of Scotland on their chapbook collections, demonstrated how computer vision (or 'visual AI') can support the study of printed illustrations. Demonstrated free software developed for these purposes, discussed its strengths and weaknesses, and considered its overall place within the illustration researcher's toolbox.
Year(s) Of Engagement Activity 2021
URL https://www.cphc.org.uk/events/2021/7/8/hopin-webinar-ly8r3
 
Description ICDAR Hip2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Workshop to bring together researchers from various fields working on document image acquisition, restoration, analysis, indexing, and retrieval to make these documents accessible in digital libraries.
Year(s) Of Engagement Activity 2021
URL https://blog.sbb.berlin/hip2021/
 
Description IIIF Community Call 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Community Call discussing the University of Oxford Visual Geometry Group's work with IIIF and Machine Learning
Year(s) Of Engagement Activity 2021
URL https://www.youtube.com/watch?v=KXE3-LD6xxI&t=1s
 
Description International Computer Vision Summer School 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact International Computer Vision Summer School
Year(s) Of Engagement Activity 2022
URL https://iplab.dmi.unict.it/icvss2022/
 
Description Learning 3D Geometry 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Undergraduate students
Results and Impact Lecture in the computer vision course at the University of Amsterdam.
Year(s) Of Engagement Activity 2022
 
Description Learning on Screen - BoB/TRilT Academic Engagement launch 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Project tools and collaborations advertised to researchers seeking to use one of the largest research databases of UK TV programmes, leading to follow up discussions.
Year(s) Of Engagement Activity 2022
URL https://learningonscreen.ac.uk/guidance/bob-and-trilt-for-research/launch-event/
 
Description London Rare Books Summer School - the Digital Book Historian's Toolkit 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Schools
Results and Impact View of the landscape of digital research in book history, including bibliographic data and content management systems, data visualisation, systems for image sharing and annotation in libraries and archives, computer vision, and (semi-)automated collation. Instead of emphasising mastery of any particular technology, we encouraged computational thinking and digital experimentation to enhance historical research questions and information management.
Year(s) Of Engagement Activity 2021
 
Description MIUA 2021 Conference - co-organiser 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact MIUA is a UK-based international conference for the communication of image processing and analysis research and its application to medical imaging and biomedicine. This was the 25th edition of the meeting which was held virtually. 40 papers were presented (27k downloads as of 09-03-2022). MIUA is the principal UK forum for communicating research progress within the community interested in image analysis applied to medicine and related biological science. The meeting is designed for the dissemination and discussion of research in medical image understanding and analysis, and aims to encourage the growth and raise the profile of this multi-disciplinary field by bringing together the various communities including among others:
Year(s) Of Engagement Activity 2021
URL https://miua2021.com/
 
Description Max Planck BibHerz Library Seminar: Reflections on the Digital Turn in the Humanities and the Sciences 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Seminar on how digital technologies have changed approaches to the discovery, study, and presentation of images; what impact the changing dynamic between the analogue and digital manifestation of the book or manuscript has on their working practices; and how this affected their use and questions that are asked or could be asked.
Year(s) Of Engagement Activity 2021
URL https://www.biblhertz.it/3069990/seminar-series-reflections-on-the-digital-turn-in-the-humanities-an...
 
Description NLS Digital Scholarship Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact 15 attendees for an annual workshop, which sparked questions and ongoing discussions.
Year(s) Of Engagement Activity 2021
 
Description National Academies roundtable on researcher access to data 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The National Academies Data Reform Round Table was a by-invitation meeting that discussed some of the current challenges that researchers face in getting access to data for research under current data protection regulation. The Department for Digital, Culture, Media and Sport (DCMS) was consulting on reforming the UK's data protection regime, which formed part of a larger effort to implement the government's National Data Strategy, and specifically Mission 2 of that strategy: 'supporting a pro-growth and trusted data regime'. This issue affects researchers working in computer vision and medical image analysis and this was part of the discussion.

In terms of impact/outcome, the meeting output fed into a response that hopefully will have influence (how direct can not be measured/it is too early to determine but I selected this box in the next question for this reason).
Year(s) Of Engagement Activity 2021
 
Description National Academies' party conference event speaker 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Speaker on the (virtual) National Academies panel at the Liberal Democrat political party conference which focused on the theme of 'Becoming a "science superpower": will the UK be fit to tackle the next global crisis?'.

Briefing: The panel discussions will address how the UK should approach the future, building resilience to future crises and achieving 'superpower' status. The panel will include leading experts representing the National Academies, as well as representatives from the political parties and a journalist Chair.

Not aware of any direct impact (see next week) but these sessions are an important part of keeping an open and positive dialogue with MPs.
Year(s) Of Engagement Activity 2021
 
Description National Librarian of Scotland's Lecture in Digital Scholarship 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Media (as a channel to the public)
Results and Impact Introduced research on chapbooks using Visual AI and how machine vision can help others to understand printed heritage collections.
Year(s) Of Engagement Activity 2021
URL https://www.youtube.com/watch?v=5jkq0iLzMvo&t=10s
 
Description Neural Geometry and Rendering: Advances and the Common Objects in 3D Challenge 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Workshop at the European Conference on Computer Vision (ECCV).
Year(s) Of Engagement Activity 2022
URL https://ngr-co3d.github.io
 
Description Office for National Statistics, Integrated Data Programme Advisory Group, Member 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other audiences
Results and Impact The Office for National Statistics Integrated Data Programme Advisory Group offers advice to the ONS on its programme aimed at sharing data for public good with other organisations. I was invited due to my role as Chair of the Royal Society PETs science policy work together with my research interest in health data science/medical image analysis.
Year(s) Of Engagement Activity 2021,2022
 
Description OxML - Oxford Machine Learning Summer School 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Gave a lecture at the OxML summer school on Deep Learning.
Year(s) Of Engagement Activity 2021
URL https://www.oxfordml.school
 
Description Practical Applications of IIIF Seminar: Image Registration and IIIF 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Discussing the methods, challenges and possibilities of Image Registration.
Year(s) Of Engagement Activity 2021
URL https://www.iiconservation.org/content/practical-applications-iiif-seminar-1-image-registration-and-...
 
Description Renaissance Society of America Day of Digital Learning 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact RSA DAY OF DIGITAL LEARNING. Featuring a varied menu of sessions involving hands-on, participatory work with digital tools and resources.
Year(s) Of Engagement Activity 2021
URL https://rsaddl.hcommons.org/
 
Description Renaissance Society of America Day of Digital Learning 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact An introduction to computer vision - the extraction of information from images - for the purposes of book and art history. Overview of the field, with particular reference to collaborative research performed by the Visual Geometry Group (VGG) at Oxford.
Year(s) Of Engagement Activity 2022
URL https://rsa2022ddl.hcommons.org/main-page/rsa-ddl-2022-topics/
 
Description Royal Society Privacy Enhancing Technologies (PETs) Policy Working Group - Chair 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Royal Society Privacy Enhancing Technologies (PETs) Policy Working Group (policy report), Chair, 2017-19. Also Chair of follow-on to initial report, 2021-.
Year(s) Of Engagement Activity 2019,2020,2021,2022
URL https://royalsociety.org/-/media/policy/projects/privacy-enhancing-technologies
 
Description Sight and Sound Workshop at the IEEE Conference on Computer Vision and Pattern Recognition, 2021 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Andrew Zisserman co-organized the Sight and Sound Workshop at CVPR 2021. This is the description of the workshop: While traditionally visual and audio data have been studied in isolation, researchers have increasingly been creating algorithms that learn from both modalities. This has produced many exciting developments in automatic lip-reading, multi-modal representation learning, and audio-visual action recognition.

Since pretty much every internet video has an audio track, the prospect of learning from paired audio-visual data - either with new forms of unsupervised learning, or by simply incorporating sound data into existing vision algorithms - is appealing, and this workshop will cover recent advances in this direction. It will also touch on higher-level questions, such as what information sound conveys that vision doesn't, the merits of sound versus other "supplemental" modalities such as text and depth, and the relationship between visual motion and sound. We'll also discuss how these techniques are being used to create new audio-visual applications, such as in the fields of speech processing and video editing.
Year(s) Of Engagement Activity 2021
URL https://sightsound.org/2021/
 
Description Sixth Form Schools Science Talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Gave talk to lower sixth form students at Magdalen College School on my research. This was part of their lecture series related to the lower sixth form project which provides them with experience of researching a topic. Lots of interesting questions particularly about the global health angle of the research/potential impact and ethics of using AI. In fact the quality of questions was much higher than most technical audience ones! Teacher followup said there was good discussion afterwards.
Year(s) Of Engagement Activity 2022
 
Description Summer School on Artificial Intelligence, India 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Lectured at Summer School on "Recognizing Human Actions in Videos", followed by Q & A session.
Year(s) Of Engagement Activity 2021
URL https://cvit.iiit.ac.in/summerschool2021/index.php
 
Description The Sixth Annual Conference for Research Software Engineering 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Research Software Engineers from other Universities got to learn about our methods and processes of developing software tools that are used widely all over the world.
Year(s) Of Engagement Activity 2022
URL https://virtual.oxfordabstracts.com/#/event/3101/submission/70
 
Description University of Stockholm Digital Humanities Now workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Showcase new and ongoing research in the broad Digital Humanities field.
Year(s) Of Engagement Activity 2021
URL https://su.powerinit.com/Data/Event/EventTemplates/2602/?EventId=879
 
Description VGG Image Search Engine (VISE) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact Talk at the RKD Netherlands Institute for Art History. The RKD team have integrated our VISE image search engine software into their platform. At this event, all the contributors to the digital platform talked about their work and their software. Our VISE software was introduced to a wider international audience.
Year(s) Of Engagement Activity 2022
URL https://rkd.nl/en/
 
Description VisualAI Show and Tell 2021 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Presented our visual annotation and visual search software to potentially interested researchers, some of whom enquired further and later adopted the tools in their research.
Year(s) Of Engagement Activity 2021
 
Description VisualAI Show and Tell 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact The event was for the University of Edinburgh. It showcased the software developed by the VisualAI team with the aims of publicising the open source software produced in the project, and of attracting potential collaborators.
Year(s) Of Engagement Activity 2021
URL https://www.robots.ox.ac.uk/~vgg/projects/visualai/events.html#ST15621
 
Description VoxCeleb Speaker Recognition Challenge (VoxSRC) Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Andrew Zisserman co-organized the VoxCeleb Speaker Recognition Challenge (VoxSRC) and workshop. The purpose of the challenge was to "probe how well current methods can recognize speakers from speech obtained 'in the wild'." It was based on the VoxCeleb dataset obtained from YouTube videos of celebrity interviews, and consisting of audio from both professionally edited and red carpet interviews as well as more casual conversational audio in which background noise, laughter, and other artefacts are observed in a range of recording environments. The challenge consisted of both speaker verification and speaker diarisation tracks. The task of speaker verification is to determine whether two samples of speech are from the same person, while speaker diarisation involves the more general task of breaking up multi-speaker audio into homogeneous single speaker segments, effectively solving 'who spoke when'.
Year(s) Of Engagement Activity 2021
URL https://www.robots.ox.ac.uk/~vgg/data/voxceleb/interspeech2021.html
 
Description VoxCeleb Speaker Recognition Challenge (VoxSRC) Workshop 2022 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Andrew Zisserman co-organized the VoxCeleb Speaker Recognition Challenge (VoxSRC) and workshop. The purpose of the challenge was to "probe how well current methods can recognize speakers from speech obtained 'in the wild'." It was based on the VoxCeleb dataset obtained from YouTube videos of celebrity interviews, and consisting of audio from both professionally edited and red carpet interviews as well as more casual conversational audio in which background noise, laughter, and other artefacts are observed in a range of recording environments. The challenge consisted of both speaker verification and speaker diarisation tracks. The task of speaker verification is to determine whether two samples of speech are from the same person, while speaker diarisation involves the more general task of breaking up multi-speaker audio into homogeneous single speaker segments, effectively solving 'who spoke when'.
Year(s) Of Engagement Activity 2022
URL http://mm.kaist.ac.kr/datasets/voxceleb/voxsrc/interspeech2022.html
 
Description What do you learn after Developing, Maintaining and Supporting Research Software for 6 years? 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Industry/Business
Results and Impact Talk at the Vision, Graphics and Learning (VGL) research group in the Department of Computer Science, University of York. The PhD students and postdocs in the VGL group became aware of the software development methods and practices for creating research software tools used by millions all over the world.
Year(s) Of Engagement Activity 2022
URL https://www.youtube.com/watch?v=8S0HbFX4HBM