Learning to move as a human: one-shot learning of human motion

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

Computational models for human motion analysis and synthesis have applications in fields as diverse as healthcare, computer graphics, and robotics. In healthcare, analysis of human movements can be used, for example, for tracking motor decline in the elderly. In computer graphics, human motion analysis can be used for human pose tracking from a single camera when measurements might be noisy or missing due to occlusion. In robotics, human motion analysis and synthesis can be used for teaching robots new skills by imitating demonstrations of a human, reducing the effort required to program an industrial robot or a service robot.

One approach to understanding how humans move is to collect examples of a particular human activity and design a machine learning model that extracts patterns from those examples. The more examples we collect, the more likely it is for the model to find common features in the data that can be exploited for solving predictive tasks. However, in many applications that require human motion analysis and synthesis, particularly robot programming by demonstration, collecting many examples is expensive and time-consuming. Ideally, a robot would learn a new skill with as few demonstrations as possible, much as a human does. Indeed, humans learn efficiently by imitation from just one or a few examples, as further shown by their ability to generate new examples or create abstract motions that were not seen in the examples they imitated.

In this project, our objective is to develop a data-efficient machine learning model for human motion using the cognitive science concept of one-shot learning.

In cognitive science, one-shot learning (OL) refers to the idea of building intelligent agents using one or a few examples. Successful illustrations of the use of this concept for building data-efficient models include OL models for generating speech concepts and handwritten characters with human-like appearance. Recent research in cognitive science suggests that humans achieve OL through the combination of three core principles applied to primitive concepts: causality, compositionality, and "learning to learn". It also claims that these ingredients could play an active role in producing machine learning models that replicate human intelligence.

We will achieve our objective through the two key novelties of this proposal: (i) a generic methodology that simultaneously combines causality, compositionality and learning to learn of motor primitives and (ii) a particular instantiation that uses physics-inspired Gaussian process (GP) representations of such motor primitives.

With respect to (i), although there are machine learning models that incorporate some of the ingredients of OL, their simultaneous combination to build data-efficient models for human motion analysis and synthesis has not been proposed yet. With respect to (ii), our GP representation of a motor primitive uses a physics-inspired covariance function with two features: the efficient use of data due to its non-parametric nature; and the inclusion of the principle of causality of OL, providing a generative mechanism for trajectory data. Compositionality of these GP motor primitives will be approached using ideas from formal language theory, in particular, hidden Markov models with explicit state durations. Learning to learn will be accomplished by providing hierarchies of such hidden Markov models.

In order to use the model in practice, we will provide a statistical inference framework for fitting the parameters of the OL model to given data, and for computing probability distributions for prediction. We will test the performance of the OL model for different tasks related to motion capture data, and for imitation learning using kinesthetic demonstrations from anthropomorphic robots. Our results will be fully reproducible and our software will be released as open source.

Planned Impact

Developing data-efficient machine learning models for human motion can potentially have a positive impact in the following areas.

-- Economy. The Fourth Industrial Revolution is taking place at the time this proposal is being written. It has been termed Industry 4.0 (or Digital Manufacturing) and its aim is to use advances in information and communication (IC) technologies to promote the computerisation of manufacturing, consequently increasing productivity. Adopting IC technologies in industry has the potential to increase UK manufacturing revenue by 12.5%, equivalent to £20bn (based on the same experience in Germany), by 2020. A key driving force of this Fourth Industrial Revolution is industrial robotics, a market that is expected to reach $41bn by 2020. These new industrial robots are expected to be more intelligent and flexible enough to perform different tasks in the factory. Data-efficient machine learning models for imitation learning of human motion can have a positive impact on the advancement of these capabilities for industrial robotics in the UK. Likewise, the UK spent £8.34 billion on social care for the elderly in 2015/2016, and it is estimated that today one in eight older people live without the proper level of care. The methods that we will develop in this project will potentially provide technologies to build assistive robots that can easily be programmed by imitation to carry out daily living tasks for an older person, alleviating the pressure on the social care system.

-- Society. Computational models for human motion have a range of applications in healthcare. For example, they can be used for measuring the progress of a rehabilitation therapy for a previously injured limb, or for early diagnosis of motor-related diseases, e.g. Parkinson's. Developing more faithful models of human locomotion can potentially be used to build more accurate biomarkers in these applications. In addition, analysis of human motion in healthcare is progressively moving from a laboratory-based analysis to daily life monitoring. Therefore, developing probabilistic models for human motion that can cope with noisy measurements obtained from wearable sensors can help to establish them as reliable tools for assessing a medical condition correlated to human locomotion.

-- Knowledge. The generative models that we will develop in this project will have an impact on machine learning, and on imitation learning for robotics. Within machine learning, the probabilistic models that we will develop are examples of semi-parametric models, where the parametric part of the model is modular and given by different types of hidden Markov models, and the non-parametric part is given by powerful representations of observed data in the form of Gaussian process distributions. Further research can exploit these architectures for designing machine learning models in different applications such as computational biology and geostatistics. Within imitation learning for robotics, we claim that using the principles of human-like intelligence for building machine learning models for human motion paves the way for data-efficient robot learning. Further research projects can be developed by increasing the complexity of the models that take these principles into account.

-- People. The project will have a positive impact on the careers of the PI and the RA, who will both gain additional experience in the formulation of novel probabilistic machine learning models and hands-on experience with real robotic systems. An EPSRC New Investigator Award is the ideal opportunity for the PI to start establishing himself as a leading researcher in human-like computing in the UK.

To achieve these potential impacts, in the shorter and longer term, we will develop Pathways to Impact as described in the corresponding Pathways to Impact section.

Description We have proposed and tested a new model, the HMMLFM, that combines the hidden Markov model (HMM) and the latent force model (LFM). The design of the model draws on research on human movement. Human movement can be considered as a composition of smaller movements, which are called movement primitives. Models for movement primitives can be used in action recognition, abnormal motion detection, and robot motion planning.

There are a variety of models for movement primitives. Dynamic movement primitives (DMPs) use a linear dynamical system driven by a controlled force to encode a movement primitive flexibly across different initial movement states. However, existing DMPs cannot provide the error or likelihood of a fit, which is a strength of probabilistic models. The LFM is a Gaussian process whose kernel encodes a second-order linear dynamical system. It can fit motions with an underlying DMP while providing the error or likelihood of the fit.
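
As a rough illustration of the idea that an LFM is a Gaussian process shaped by a second-order linear system, the sketch below builds an output covariance numerically. Everything in it is an assumption made for illustration: the ODE form m*y'' + c*y' + k*y = u(t), the underdamped parameter values, the RBF prior on the latent force u, and the rectangle-rule discretisation of the convolution. It does not use the closed-form LFM kernels available in GPmat or GPy, and it is not the project's implementation.

```python
import numpy as np

def green(t, m=1.0, c=0.5, k=4.0):
    """Impulse response of m*y'' + c*y' + k*y = u(t), underdamped case; zero for t < 0."""
    alpha = c / (2.0 * m)
    omega = np.sqrt(4.0 * m * k - c ** 2) / (2.0 * m)
    tp = np.maximum(t, 0.0)  # clamp so the exponential never sees negative lags
    return np.where(t >= 0.0, np.exp(-alpha * tp) * np.sin(omega * tp) / (m * omega), 0.0)

def rbf(t, lengthscale=0.5):
    """RBF covariance of the latent force on the time grid t (assumed prior)."""
    d = t[:, None] - t[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def lfm_covariance(t, lengthscale=0.5, **ode):
    """Output covariance G K_u G^T, with G a discretised causal convolution."""
    dt = t[1] - t[0]
    G = green(t[:, None] - t[None, :], **ode) * dt
    return G @ rbf(t, lengthscale) @ G.T

t = np.linspace(0.0, 5.0, 100)
K = lfm_covariance(t)

# Draw one latent force and push it through the same linear system.
rng = np.random.default_rng(0)
L = np.linalg.cholesky(rbf(t) + 1e-6 * np.eye(len(t)))  # jitter for stability
u = L @ rng.standard_normal(len(t))
y = (green(t[:, None] - t[None, :]) * (t[1] - t[0])) @ u
```

Because the convolution is causal and the impulse response vanishes at lag zero, the sampled output starts at position 0, which matches the zero initial condition assumed by the LFM.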

The HMMLFM extends the LFM to encode movements with changing properties more efficiently than the existing extension of the LFM, the switched latent force model (Switched LFM). The Switched LFM allows the hyperparameters to change between segments but does not reuse them. In contrast, the HMMLFM has multiple states, each of which is an LFM with its own hyperparameters, and each state can be applied to several segments. This hierarchical architecture allows states to be reused and, in experiments, successfully identified the higher-level composition of the movement primitives. The experiments were conducted (1) on toy data to examine the functionality of the model, where the results show that the model can handle data with a significant change of hyperparameters; (2) on data sampled from a Switched LFM to test the capability of the HMMLFM, where the model achieved performance comparable to the Switched LFM; (3) on CMU MoCap data to test the performance of the HMMLFM on real motion capture data, where the model fitted motions such as walking on uneven terrain, running, and dancing, revealed the states of motion in the training data, and identified similar motions in the test data (different captures); and (4) on data from a robot learning table tennis, showing the feasibility of applying the model to robot learning.
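
To make the idea of reusing states across segments concrete, the following sketch runs the standard Viterbi algorithm over a sequence of segments. It assumes that the log-likelihood of each segment under each state (here, each state standing in for one LFM) has already been computed; the numbers, the two-state setup, and the uniform transition matrix are all hypothetical, and the inference procedure actually used for the HMMLFM is not described in this record.

```python
import numpy as np

def viterbi(log_lik, log_trans, log_init):
    """Most likely state sequence given per-segment, per-state log-likelihoods."""
    n_seg, n_states = log_lik.shape
    delta = np.empty((n_seg, n_states))          # best log-score ending in each state
    back = np.zeros((n_seg, n_states), dtype=int)  # best predecessor of each state
    delta[0] = log_init + log_lik[0]
    for s in range(1, n_seg):
        scores = delta[s - 1][:, None] + log_trans  # scores[i, j]: move from i to j
        back[s] = np.argmax(scores, axis=0)
        delta[s] = scores[back[s], np.arange(n_states)] + log_lik[s]
    path = np.empty(n_seg, dtype=int)
    path[-1] = np.argmax(delta[-1])
    for s in range(n_seg - 1, 0, -1):            # trace the best path backwards
        path[s - 1] = back[s, path[s]]
    return path

# Hypothetical log-likelihoods: 4 segments scored under 2 candidate states.
log_lik = np.array([[-1.0, -5.0],
                    [-6.0, -1.0],
                    [-1.0, -4.0],
                    [-5.0, -1.0]])
log_trans = np.log(np.full((2, 2), 0.5))  # uniform transitions (assumption)
log_init = np.log(np.full(2, 0.5))

path = viterbi(log_lik, log_trans, log_init)
print(path.tolist())  # [0, 1, 0, 1]: each state is reused for two segments
```

In this toy case the decoded sequence alternates between the two states, so two sets of hyperparameters describe four segments, whereas a model without state reuse would need a separate set per segment.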

A key piece of work in combining the HMM and the LFM is adjusting the LFM for continuity between states. The LFM assumes that a motion always starts from position 0 with speed 0. This is not sufficient for connecting LFMs in the HMMLFM, because second-order smoothness at the connection points is required. Using the superposition principle for linear systems, we add an adjusting function to the LFM. The adjusting function is calculated from the position and speed at the connection points, so that each segment starts with the same position and speed as the end of the preceding segment. The adjusting function does not affect the convergence of the motion if the motion converges to a specific position.
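
One way such an adjusting function could look, sketched under stated assumptions: if the underlying system is m*y'' + c*y' + k*y = u(t) and is underdamped, the solution of the homogeneous equation with initial position y0 and initial speed v0 can be added to the zero-initial-condition LFM output by superposition. The ODE form, the parameter values, and the function name below are assumptions for illustration; the exact adjusting function used in the project is not given here.

```python
import numpy as np

def adjusting_function(t, y0, v0, m=1.0, c=0.5, k=4.0):
    """Homogeneous solution of m*y'' + c*y' + k*y = 0 with y(0)=y0, y'(0)=v0.

    Assumes the underdamped case (c**2 < 4*m*k). Adding this to an LFM
    trajectory that starts at position 0 with speed 0 gives a trajectory
    that starts at (y0, v0) instead.
    """
    alpha = c / (2.0 * m)                               # decay rate
    omega = np.sqrt(4.0 * m * k - c ** 2) / (2.0 * m)   # damped frequency
    return np.exp(-alpha * t) * (
        y0 * np.cos(omega * t) + (v0 + alpha * y0) / omega * np.sin(omega * t)
    )

# Hypothetical end state of the previous segment: position 0.3, speed -1.2.
t = np.linspace(0.0, 10.0, 200)
correction = adjusting_function(t, y0=0.3, v0=-1.2)
```

Since the correction decays exponentially whenever the damping is positive, it changes how a segment starts but not the position the motion finally settles at, which is consistent with the convergence remark above.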

A major difficulty in developing the model is optimisation. The log-likelihood function for this model is non-convex, and training can fail with a bad hyperparameter initialisation. Training the HMMLFM is even harder. We developed a pipeline to handle the training process, which initialises the parameters by segment-wise training and clustering, and explores different initialisations when training fails.
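
The restart part of such a pipeline can be sketched generically as follows. The objective here is a hypothetical one-dimensional stand-in for a non-convex negative log-likelihood, and the evenly spaced initial guesses replace the segment-wise training and clustering step, which is not reproduced. The sketch only shows the pattern of trying several initialisations and keeping the best local optimum; it assumes SciPy is available.

```python
import numpy as np
from scipy.optimize import minimize

def toy_negative_log_likelihood(theta):
    """Hypothetical non-convex stand-in for a model's negative log-likelihood."""
    x = theta[0]
    return np.sin(3.0 * x) + 0.1 * (x - 1.0) ** 2

def fit_with_restarts(objective, initial_guesses):
    """Run a local optimiser from each guess and keep the best successful result."""
    best = None
    for x0 in initial_guesses:
        result = minimize(objective, x0, method="L-BFGS-B")
        if not result.success:
            continue  # treat a failed run as a bad initialisation and move on
        if best is None or result.fun < best.fun:
            best = result
    return best

# Evenly spaced starting points (an assumption; any initialisation scheme works).
guesses = np.linspace(-4.0, 6.0, 9).reshape(-1, 1)
best = fit_with_restarts(toy_negative_log_likelihood, guesses)
```

A single run started in the wrong basin would settle in a poorer local minimum; comparing the final objective values across restarts is what lets the procedure recover from that.
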
Exploitation Route The major application of the outcomes is modelling motor primitives. The HMMLFM has been tested on offline data from a robot.

The RA of the project has been attending group meetings held by Prof. Tony Prescott about the Human Brain Project (HBP). We are seeking the application of our models to datasets from service robots.

We attended the Research Workshop in Structural Dynamics held by the Dynamics Research Group (DRG) at the University of Sheffield. We found that researchers working on vibration were interested in the models we are working on. We will therefore explore collaboration with people working on modelling the vibration of aeroplanes or buildings.

The models in this project are being implemented in Matlab and Python, using the GPmat and GPy frameworks, respectively.
Sectors Aerospace, Defence and Marine; Construction; Digital/Communication/Information Technologies (including Software); Leisure Activities, including Sports, Recreation and Tourism; Manufacturing, including Industrial Biotechnology; Other

 
Description As part of this research, we proposed a faster way of developing kernel functions for Gaussian processes with a physics intuition. Traditionally, these types of covariance functions had been expensive to develop and solve. Our key contribution in this project has opened the door to ways in which covariance functions with physics constraints can be easily implemented. Based on this methodological contribution, we have created sophisticated non-linear models that have outperformed deep learning models in cases where we can include some flavour of physical constraints in the machine learning model (see McDonald and Álvarez, 2021).
First Year Of Impact 2021