Multisource audio-visual production from user-generated content

Lead Research Organisation: Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science

Abstract

The pervasiveness of amateur media recorders, either embedded in smartphones or used as stand-alone devices, is revolutionising the way events are captured and reported. The aim of this project is to devise intelligent editing and production algorithms, based on new signal processing techniques, for multi-view user-generated content.

The explosion of shared video content offers the opportunity not only to analyse but also to report stories in a timely manner, ranging from disaster scenes and protests to music concerts and sports events. However, the growing amount of available data and its varying quality make the timely selection and editing of appropriate multimedia items very difficult, strongly limiting the opportunity to harvest this data for security, cultural and entertainment applications. There is an urgent need to investigate and develop new ways to support or replace what used to be the role of a producer/director in this rapidly changing landscape. In particular, there is a need to automate production tasks and to generate new, high-quality content from multiple views.

The key aspect of the project is the integration of audio and visual inputs, each supporting the other in reaching objectives that would be impossible using only one modality. We will focus on a set of relevant event types: sports, music shows and crowd scenes. We will devise novel multisource processing techniques to improve audio-visual production and to enable the synchronisation of the captured streams. This will in turn allow the generation of novel, higher-quality audio-visual renderings of the captured events.

Planned Impact

This project will have a major impact on the use and quality of user-generated videos. It will lead to new, intelligent content production algorithms that enable user-generated content to be seamlessly and rapidly integrated into multimedia items. It will also support the development of video-based citizen journalism and other participatory media, in which members of the public play an active role in news reporting and community-based content creation.

Dissemination will be through journal and conference publications as well as an up-to-date project website, which will also provide a service for uploading and producing content. We will build on existing links and engage with new beneficiaries and user groups by publishing the research at conferences (especially those with a strong industry presence) and in high-impact journals, and by visiting other groups and institutions.

Videos related to published results will also be disseminated through the Investigators' YouTube channels (which have attracted 10,000 views to date) and websites. Our tools and accomplishments will be promoted to the multimedia signal processing community via press releases sent to mailing lists and magazines. Halfway through the project, we will host a one-day workshop on multisource signal processing, bringing together leaders in this emerging field. Near the end of the project, we will hold a workshop on systems for the intelligent production of user-generated content, in which we will discuss and promote the outcomes of our research.

To enhance engagement with the public and improve communication of project results, the Queen Mary EECS Public Engagement team will advise and mentor the research team to deliver high-impact public engagement activities, assist with writing and editing articles for the web and magazines, integrate messages from the project into events for schools and the general public, and contribute towards the distribution costs of magazines that include project results. We will participate in communication activities by exhibiting prototypes developed in the project, contributing our technologies for use in creative projects (c4dmpresents.org) and disseminating results through the school magazines Audio! and CS4Fun (Computer Science for Fun).

The Investigators will lead Impact-related activities, although all researchers will be involved. We will use the services and expertise available from the QMUL Communications Officer for drafting press releases and promotional materials, and from the QM Innovation Marketing Manager for producing materials aimed at industry and user groups with a commercial interest. Descriptions of the project and its outcomes will be included in promotional material from the Investigators' research groups, the Computer Vision Group and the Centre for Digital Music.

Publications

Bano S (2015) ViComp: composition of user-generated videos in Multimedia Tools and Applications

Chen F (2014) Resource Allocation for Personalized Video Summarization in IEEE Transactions on Multimedia

Hon T (2015) Audio Fingerprinting for Multi-Device Self-Localization in IEEE/ACM Transactions on Audio, Speech, and Language Processing

Llagostera Casanovas A (2014) Audio-visual events for multi-camera synchronization in Multimedia Tools and Applications

Wang L (2016) An Iterative Approach to Source Counting and Localization Using Two Distant Microphones in IEEE/ACM Transactions on Audio, Speech, and Language Processing

Wang L (2018) Pseudo-Determined Blind Source Separation for Ad-hoc Microphone Networks in IEEE/ACM Transactions on Audio, Speech, and Language Processing

Wang L (2021) Deep Learning Assisted Time-Frequency Processing for Speech Enhancement on Drones in IEEE Transactions on Emerging Topics in Computational Intelligence

Wang L (2016) Over-Determined Source Separation and Localization Using Distributed Microphones in IEEE/ACM Transactions on Audio, Speech, and Language Processing

 
Description The grant led to new models and tools for the synchronisation of audio-visual files captured by ad-hoc networks of sensors, such as smartphones; for the localisation of the sensors based only on the collected data; and for the automated processing and editing of the resulting multi-viewpoint audio-visual recordings (see the illustrative sketch after this entry).
Exploitation Route A prototype system is available for users to automatically edit user-generated videos.
Sectors Creative Economy; Digital/Communication/Information Technologies (including Software); Culture, Heritage, Museums and Collections; Other

URL http://www.eecs.qmul.ac.uk/~andrea/mavip
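The project's own synchronisation algorithms are described in the publications listed above. As a rough illustration of the kind of pairwise alignment such tools perform, the sketch below estimates the time offset between two recordings of the same event by cross-correlating their audio tracks with a standard GCC-PHAT weighting. This is a minimal sketch, not the project's method: the function name, the fixed sample rate and the synthetic test signal are illustrative assumptions.

import numpy as np

def estimate_offset(audio_a, audio_b, fs, max_offset_s=10.0):
    # Illustrative sketch (not the project's algorithm): generalised
    # cross-correlation with PHAT weighting between two audio tracks.
    n = len(audio_a) + len(audio_b)            # FFT length for linear correlation
    A = np.fft.rfft(audio_a, n=n)
    B = np.fft.rfft(audio_b, n=n)
    R = A * np.conj(B)
    R /= np.abs(R) + 1e-12                     # PHAT: keep phase, discard magnitude
    cc = np.fft.irfft(R, n=n)
    max_shift = min(int(fs * max_offset_s), n // 2)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))  # centre lag zero
    lag = np.argmax(np.abs(cc)) - max_shift
    return lag / float(fs)                     # positive: content appears later in audio_a

# Synthetic check: device B starts recording 1.5 s after device A.
fs = 16000
scene = np.random.randn(10 * fs)               # stand-in for the shared soundscape
audio_a = scene
audio_b = scene[int(1.5 * fs):]
print(estimate_offset(audio_a, audio_b, fs))   # approximately 1.5

In a multi-camera setting, offsets estimated pairwise in this way can be placed on a common timeline so that the streams can be trimmed or padded into alignment before any joint editing.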
 
Description A prototype is available for users to upload their user-generated videos and receive an automatically generated final cut (https://gifu.eecs.qmul.ac.uk). This webtool has been used for content creation and teaching purposes. A dedicated multimedia synchroniser, based on the technology of the previous prototype, has also been developed (http://synch.eecs.qmul.ac.uk/); it is used to temporally align videos captured by independent cameras, in support of multi-view video production.
First Year Of Impact 2015
Sector Digital/Communication/Information Technologies (including Software); Leisure Activities, including Sports, Recreation and Tourism; Culture, Heritage, Museums and Collections
Impact Types Cultural

 
Description research collaboration 
Organisation Catholic University of Louvain
Department Institute of Information and Communication Technologies, Electronics and Applied Mathematics
Country Belgium 
Sector Academic/University 
PI Contribution Contributed to the design and testing of novel methods for personalised summarisation of multi-view content
Collaborator Contribution Contributed a dataset, an implementation and methods for the automated summarisation of multi-view content
Impact Journal paper F. Chen, C. De Vleeschouwer, A. Cavallaro, "Resource allocation for personalized video summarization", IEEE Transactions on Multimedia, Vol. 16, Issue 2, February 2014, pp. 455-469
Start Year 2013
 
Description research collaboration 
Organisation Japan Advanced Institute of Science and Technology
Country Japan 
Sector Private 
PI Contribution Contributed to the design and testing of novel methods for personalised summarisation of multi-view content
Collaborator Contribution Contributed a dataset, an implementation and methods for the automated summarisation of multi-view content
Impact Journal paper F. Chen, C. De Vleeschouwer, A. Cavallaro, "Resource allocation for personalized video summarization", IEEE Transactions on Multimedia, Vol. 16, Issue 2, February 2014, pp. 455-469
Start Year 2013
 
Description research collaboration 
Organisation University of Genoa
Department Department of Informatics, Bioengineering, Robotics and Systems Engineering
Country Italy 
Sector Academic/University 
PI Contribution Contributed a new method for the synchronisation of video streams based on matching articulated objects
Collaborator Contribution Contributed a dataset and software implementation for the synchronisation of video streams based on matching articulated objects
Impact Journal paper L. Zini, F. Odone, A. Cavallaro, "Multi-view matching of articulated objects," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 24, No. 11, November 2014, pp. 1920-1934
Start Year 2013