Learning to Recognise Dynamic Visual Content from Broadcast Footage

Lead Research Organisation: University of Leeds
Department Name: Sch of Computing

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
 
Description This grant (in part) relates to sign language recognition using large corpora of video data. One major Leeds contribution (the project is a collaboration with Oxford and Surrey) is the tracking of upper-body pose. Our efforts have led to personalised models that can be generated automatically for new video data sets. This means large corpora can be annotated automatically, and deep learning can then be applied to this massive volume of data to produce a state-of-the-art pose detector.
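
The core idea, using a personalised tracker's output as automatically generated training labels for a deep pose regressor, can be sketched as below. This is a minimal illustration only, not the project's code: the frame tensors, joint count and tiny network are placeholder assumptions (written here in PyTorch), and the actual systems are described in the publications listed under the Oxford collaboration.

    import torch
    import torch.nn as nn

    # Hypothetical setup: N video frame crops auto-annotated by a
    # personalised tracker, each with J upper-body joints as (x, y) pairs.
    N, J = 1024, 7                        # e.g. head, shoulders, elbows, wrists
    frames = torch.rand(N, 3, 64, 64)     # stand-in for real video crops
    pseudo_labels = torch.rand(N, J * 2)  # tracker output used as training targets

    # A deliberately tiny CNN regressor; the project's networks were far larger.
    model = nn.Sequential(
        nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),   # 64x64 -> 32x32
        nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),  # 32x32 -> 16x16
        nn.Flatten(),
        nn.Linear(32 * 16 * 16, J * 2),   # regress all joint coordinates at once
    )

    optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    for epoch in range(5):
        # Train against the automatic annotations: no manual labelling is
        # involved, which is what makes broadcast-scale corpora usable.
        pred = model(frames)
        loss = loss_fn(pred, pseudo_labels)
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()
        print(f"epoch {epoch}: loss={loss.item():.4f}")

The point of the sketch is where the labels come from: the automatic tracker rather than human annotators, so the training set can grow with the size of the video corpus.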

Additionally, we have collated a number of large video data sets within the project, which are available to other researchers. The methods developed are also openly available.

http://www.robots.ox.ac.uk/~vgg/data/pose/
Exploitation Route Generic pose detection is useful in the chosen domain (sign language interpretation) and is being used by our project partners in Oxford. However, pose detection from a single monocular camera has wider applicability in areas such as security monitoring, entertainment and assisted living. Conventionally, special equipment (e.g. a Microsoft Kinect) is required for such applications.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://www.robots.ox.ac.uk/~vgg/research/pose_track/index.html
 
Description The project has been working with providers of sign language augmentation to the BBC in order to determine the applicability of the developed technologies to real-world applications.
Sector Digital/Communication/Information Technologies (including Software), Education
 
Description Oxford 
Organisation University of Oxford
Country United Kingdom 
Sector Academic/University 
PI Contribution Joint research
Collaborator Contribution Code and Ideas
Impact See joint publications:
1) Charles, J., Pfister, T., Magee, D., Hogg, D. and Zisserman, A. Upper Body Pose Estimation with Temporal Sequential Forests. In British Machine Vision Conference (BMVC), 2014.
2) Pfister, T., Simonyan, K., Charles, J. and Zisserman, A. Deep Convolutional Neural Networks for Efficient Pose Estimation in Gesture Videos. In Asian Conference on Computer Vision (ACCV), 2014.
3) Pfister, T., Charles, J. and Zisserman, A. Domain-adaptive Discriminative One-shot Learning of Gestures. In European Conference on Computer Vision (ECCV), 2014.
4) Charles, J., Pfister, T., Everingham, M. and Zisserman, A. Automatic and Efficient Human Pose Estimation for Sign Language Videos. International Journal of Computer Vision (IJCV).
5) Charles, J., Pfister, T., Magee, D., Hogg, D. and Zisserman, A. Domain Adaptation for Upper Body Pose Tracking in Signed TV Broadcasts. In British Machine Vision Conference (BMVC), 2013.
6) Pfister, T., Charles, J. and Zisserman, A. Large-scale Learning of Sign Language by Watching TV (Using Co-occurrences). In British Machine Vision Conference (BMVC), 2013.
7) Pfister, T., Charles, J., Everingham, M. and Zisserman, A. Automatic and Efficient Long Term Arm and Hand Tracking for Continuous Sign Language TV Broadcasts. In British Machine Vision Conference (BMVC), 2012.
Start Year 2010