Lead Research Organisation: University of Surrey

Department Name: Surrey Space Centre Academic

Abstract

Generating virtual camera views using a Generative Query Network

This project will explore how advances in generative networks and deep-learning approaches to inpainting can be combined to create virtual camera views. A unified generative deep learning framework for viewpoint interpolation and inpainting should effectively blend the observed viewpoints and hallucinated information to produce a virtual camera feed from otherwise impractical vantage points. The developed technology is of interest as a potential extension to the capabilities of the Ed system currently being developed by BBC R&D (using AI techniques for automated content editing).

Student:

Violeta Menendez Gonzalez

Period of Study:

Sep 19 - Sep 23

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2288914

Research Topic:

Unclassified

Organisations

People	ORCID iD
Simon Hadfield (Primary Supervisor)	http://orcid.org/0000-0001-8637-5054
Violeta Menendez Gonzalez (Student)

Publications

Author Name Title Publication Date Published


Menendez Gonzalez V (2022) SVS: Adversarial refinement for sparse novel view synthesis

Menendez Gonzalez V (2022) SaiNet: Stereo aware inpainting behind objects with generative networks

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/T517616/1			30/09/2019	29/09/2025
2288914	Studentship	EP/T517616/1	30/09/2019	29/09/2023	Violeta Menendez Gonzalez

Key Findings
Impact Summary
Software and Technical Products
Engagement Activities


Description	It is not always possible to record video from an optimal point of view. For example, at live events the audience are often placed at the best vantage points. Nevertheless, it is valuable to document these events in order to bring them to a wider audience and to record them for posterity. During this project we have attempted to solve this issue by developing a multi-view consistent machine learning approach that combines information from multiple cameras. This would allow cameras to be placed in non-optimal locations, and a "virtual camera" to then be synthesised from the optimal viewing position. In particular, we have developed two machine learning models that advance this goal in two stages. The first model learns how to use stereoscopic image information to inpaint the missing areas of an image caused by object disocclusions. This gives the computer a sense of 3D space using only 2D images, and can then be applied to filling the gaps left when the camera point of view changes. The second model takes this a step further: instead of stereoscopic cameras, it is able to process input images from a sparse set of unconstrained cameras, which may lie on different camera planes, with wider baselines and different viewing angles. This makes the changes from view to view more extreme and more difficult to recover. This second model combines the input images in a deep learning pipeline to generate a new point of view from a camera position that may require significant extrapolation.
Exploitation Route	This research could be expanded to include temporal consistency across the frames of a video, in order to reproduce dynamic scenes. The model can be used by artists and producers to create new points of view of a scene without the need to re-record it with a new setup.
Sectors	Digital/Communication/Information Technologies (including Software)
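
The disocclusion problem mentioned in the description above can be illustrated with a toy example. The following is a minimal sketch, not the project's model: it assumes a single image scanline with known integer disparities, and the function name `forward_warp_row` and the toy values are hypothetical. It shifts each pixel by its disparity to imitate a sideways camera move and reports the target pixels that receive no data, which are the gaps an inpainting model would have to fill.

```python
import numpy as np

def forward_warp_row(row, disparity, fill=-1):
    """Shift each pixel of a 1-D scanline left by its integer disparity.

    Returns the warped row and a boolean mask of target pixels that
    received no source pixel (the disoccluded holes).
    """
    width = row.shape[0]
    warped = np.full(width, fill, dtype=row.dtype)
    best_disp = np.full(width, -1, dtype=int)  # largest disparity seen per target
    for x in range(width):
        d = int(disparity[x])
        tx = x - d  # target column in the new view
        # Larger disparity means a closer surface, so it occludes smaller ones.
        if 0 <= tx < width and d > best_disp[tx]:
            warped[tx] = row[x]
            best_disp[tx] = d
    holes = best_disp < 0
    return warped, holes

# Toy scene: background (value 1, disparity 0) with a foreground object
# (value 9, disparity 2) covering columns 4 to 6.
row = np.array([1, 1, 1, 1, 9, 9, 9, 1, 1, 1])
disparity = np.array([0, 0, 0, 0, 2, 2, 2, 0, 0, 0])
warped, holes = forward_warp_row(row, disparity)
print(warped)                 # columns 5 and 6 hold the fill value
print(np.flatnonzero(holes))  # indices of the disoccluded columns
```

In this toy case the foreground object moves two columns to the left, uncovering two background columns that were never observed in the source view. Real scenes need per-pixel depth estimation and learned inpainting rather than a fixed fill value.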


Description	This project advances the development of industrial tools for media production. The outputs of this project extend our industrial partner's product, providing new techniques and tools for creating media content for public consumption. The project has also continued the collaboration with our industrial partner and opened up new paths of research to improve these tools.
First Year Of Impact	2021
Sector	Digital/Communication/Information Technologies (including Software)
Impact Types	Cultural Economic


Title	SVS: Adversarial refinement for sparse novel view synthesis
Description	A deep learning model that generates an image from a new point of view, given a sparse set of input views.
Type Of Technology	Software
Year Produced	2022
Open Source License?	Yes
Impact	Other researchers have been able to use my code to compare my model with their own and to cite my paper in their publications, and could potentially build on the code in their future developments.
URL	https://github.com/violetamenendez/svs-sparse-novel-view


Description	Poster presentation at AI4CC workshop at The IEEE/CVF Conference on Computer Vision and Pattern Recognition
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	Presented a poster about my published paper at the conference workshop. Reached around 20 people working in similar fields or interested in my approach, and promoted networking.
Year(s) Of Engagement Activity	2022
URL	https://figshare.com/articles/poster/Poster_SaiNet_Stereo_aware_inpainting_behind_objects_with_gener...


Description	Poster presentation at Altitude X
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Industry/Business
Results and Impact	Presented a poster about my publication, SaiNet: Stereo aware inpainting behind objects with generative networks, at the data science conference Altitude X in Manchester, UK. Reached 15 people who showed potential interest in future work.
Year(s) Of Engagement Activity	2022
URL	https://altitudex.live/manchester/


Description	Poster presentation at The British Machine Vision Conference
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	Poster presentation of my paper published at the same conference. Reached about 30 people who showed interest in the approaches and methods I used, with potential for future collaborations.
Year(s) Of Engagement Activity	2022
URL	https://bmvc2022.mpi-inf.mpg.de/886/


Description	Presentation to BBC
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Industry/Business
Results and Impact	Presented the current developments of my project to an audience of around 40 people across the BBC Research & Development and Design & Engineering departments. The talk covered key findings, future work and approaches.
Year(s) Of Engagement Activity	2022
