Object-Centric Visual Representation And Reinforcement Learning

Lead Research Organisation: University of Oxford

Abstract

We will develop new object-centric sequence models for vision, with the aim of improving data efficiency and robustness to out-of-distribution environments in video prediction and vision-based reinforcement learning.
Aims And Objectives
The first stage of the proposed research aims to combine temporal predictive coding with object-centric learning, and to apply the resulting model, object-centric temporal predictive coding (OCTPC), to video prediction. This stage will aim to answer several questions: What requirements should an object-centric video model satisfy? How can these requirements be reflected in the design of OCTPC? How should object properties, instantaneous attribute variables, stochastic temporal evolution of attribute variables, and inter-object relationships be represented? How should object instances be represented differently to object types? How precisely should the temporal predictive coding mechanism be specified such that it is capable of learning sufficiently long-term dependencies? What are the commonalities and differences between existing approaches to object-centric learning, and how do these relate to the performance of such algorithms? How does OCTPC perform on a variety of video-prediction benchmarks? Which design choices and hyperparameters have the greatest effect on performance? How do the performance, behaviour, and learning efficiency of OCTPC compare with those of models that are not object-centric, or that do not use predictive coding?
The second stage of the proposed research aims to explore the possible benefits of OCTPC in reinforcement learning. There are two primary motivations for doing so. Firstly, reinforcement learning effectively depends on being able to predict the future, because the ultimate objective is to choose a policy which maximises expected long-term future reward. It is plausible that using a performant video prediction model as a component in model-based reinforcement learning would allow an agent to predict more accurately how changes in policy would affect future experience, and therefore how the policy should be changed in order to maximise future reward. Secondly, there is a neuroscientific principle which states that "the processing function of neocortical modules is qualitatively similar in all neocortical regions... there is nothing intrinsically motor about the motor cortex, nor sensory about the sensory cortex" [13]. Therefore, if object-centric inductive biases are found to be useful for video prediction, similar inductive biases may also be useful in policy representations. It would be interesting to compare such policy representations to existing work in hierarchical reinforcement learning [17], and to explore whether they can improve sample efficiency in reinforcement learning and robustness to out-of-distribution environments.
Novelty Of The Research Methodology
To our knowledge, combining temporal predictive coding with object-centric learning has not previously been explored. It is plausible that exploring this combination will provide valuable contributions and insights to machine learning, while also providing value to the cognitive sciences by moving closer to an understanding of human intelligence at the algorithmic level.
Alignment To EPSRC's Strategies And Research Areas
This research proposal aligns with the Artificial intelligence and robotics theme, and with the research areas Artificial intelligence technologies and Image and vision computing.

Planned Impact

AIMS's impact will be felt across domains of acute need within the UK. We expect AIMS to benefit: UK economic performance, through start-up creation; existing UK firms, both through research and by addressing skills needs; UK health, by contributing to cancer research, and quality of life, through the delivery of autonomous vehicles; and UK public understanding of, and policy related to, the transformational societal change engendered by autonomous systems.

Autonomous systems are acknowledged by essentially all stakeholders as important to the future UK economy. PwC claim that there is a £232 billion opportunity offered by AI to the UK economy by 2030 (10% of GDP). AIMS has an excellent track record of leadership in spinout creation, and will continue to foster the commercial projects of its students through the provision of training in IP, licensing and entrepreneurship. With the help of Oxford Sciences Innovation (investment fund) and Oxford University Innovation (technology transfer office), student projects will be evaluated for commercial potential.

AIMS will also concretely contribute to UK economic competitiveness by meeting the UK's needs for experts in autonomous systems. To meet this need, AIMS will train cohorts with advanced skills that span the breadth of AI, machine learning, robotics, verification and sensor systems. The relevance of the training to the needs of industry will be ensured by the industrial partnerships at the heart of AIMS. These partnerships will also ensure that AIMS will produce research that directly targets UK industrial needs. Our partners span a wide range of UK sectors, including energy, transport, infrastructure, factory automation, finance, health, space and other extreme environments.

The autonomous systems that AIMS will enable also offer the prospect of epochal change in the UK's quality of life and health. As put by former Digital Secretary Matt Hancock, "whether it's improving travel, making banking easier or helping people live longer, AI is already revolutionising our economy and our society." AIMS will help to realise this potential through its delivery of trained experts and targeted research. In particular, two of the four Grand Challenge missions in the UK Industrial Strategy highlight the positive societal impact underpinned by autonomous systems. The mission of the "Artificial Intelligence and data" challenge is to "Use data, Artificial Intelligence and innovation to transform the prevention, early diagnosis and treatment of chronic diseases by 2030". To this mission, AIMS will contribute the outputs of its research pillar on cancer research. The "Future of mobility" challenge highlights the importance that autonomous vehicles will have in making transport "safer, cleaner and better connected." To this challenge, AIMS offers the world-leading research of its robotic systems research pillar.

AIMS will further promote the positive realisation of autonomous technologies through direct influence on policy. The world-leading academics amongst AIMS's supervisory pool are well connected to policy formation; for example, Prof Osborne serves as a Commissioner on the Independent Commission on the Future of Work. Further, Dr Dan Mawson, Head of the Economy Unit, Economy and Strategic Analysis Team at BEIS, will serve as an advisor to AIMS, ensuring bidirectional influence between policy objectives and AIMS research and training.

Broad understanding of autonomous systems is crucial in making a society robust to the transformations they will engender. AIMS will foster such understanding through its provision of opportunities for AIMS students to directly engage with the public. Given the broad societal importance of getting autonomous systems right, AIMS will deliver core training on the ethical, governance, economic and societal implications of autonomous systems.

Publications


Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S024050/1                                   01/10/2019  31/03/2028
2722103            Studentship   EP/S024050/1  01/10/2022  30/09/2026  Jake Levi