Persistent topological structures in noisy images

Lead Research Organisation: Durham University
Department Name: Mathematical Sciences


The proposed research is on the interface between topology and algorithms.The methods are topological coming from the recently developed theory of persistent topology.The key idea of persistent topology is identifying features of geometric objects that remain stable for a wide range of varying parameters. The research objectives are applied and involve designing novel learning algorithms to recognise topological graphs in noisy images.A natural application is identifying a text-based advanced CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) widely used on the web to prevent automatic posting to blogs.A desired learning algorithm will be parallel like the human brain analysing local persistent properties of pixels in a noisy image.Hence it is the next step towards reverse-engineering the human brain.Other potential applications include recognising handwritten mail addresses and designing computer glasses reading signs of directions for blind people.

Planned Impact

We plan to complete at least three research papers. Their extended versions including more explanations, examples and computer experiments will be posted in the arXiv and on the university web page. Additionally, we will implement the final learning algorithm as a Java applet and make it publicly available for testing. The input can be any noisy image as a jpg file, e.g. a shop sign or a geographical map or a sign showing directions. The output is a text produced by the algorithm. We will seek to present our results at reputable international conferences such as COLT (computational learning theory), ALT (algorithmic learning theory), ICML (machine learning) and at various university seminars. Topology is a rather abstract branch of pure mathematics studying very general geometric objects and their properties. Successful applications of topology to real-life problems such as learning algorithms, classification of patterns, computer vision will certainly stimulate more inter-disciplinary collaboration. Many pure mathematicians feel that their research is far away from commercial developments, probably because there are few examples of the opposite. Most practical algorithms including the neural network program beating human champions in backgammon are based on very elementary facts, e.g. any continuous function can be approximated by a linear combination of sufficiently many step functions. So mathematicians will be delighted in sharing their really deep knowledge with developers of learning algorithms. The proposed parallelised learning algorithm based on the prior knowledge about topological graphs is a step towards reverse-engineering the human brain. Our brains have very slow neural connections (about 200 cps, computations per second) in comparison with the latest supercomputer Cray XT5 Jaguar (1759 teraflops cps). However, the human brain is massively parallel and can identify a familiar face in about 150 milliseconds. Recognising handwritten addresses by conventional kernel methods is widely used in the American mail system. We could reduce costs of the British Royal Mail implementing more advanced algorithms. Other applications are automatic converting papers with handwritten data into electronic files and designing computer glasses that can read signs of directions for blind people.
Description A new algorithm was designed to reconstruct persistent topological structures of arbitrary graphs or networks in noisy images given as point clouds on the plane. The algorithm is based on the recent theory of persistent homology in topological data analysis. The evolution of topological features of data at different scales can be summarized in a simple combinatorial object called the persistence diagram. Persistent features have a long life span and can be separated from noisy defects in the persistence diagram. The stability of persistence can theoretically guarantee a correct reconstruction if a given sample sufficiently approximates an unknown graph. The project can be potentially extended to recognising more complicated topological structures in high-dimensional big data sets in metric spaces.
Exploitation Route Since Durham refused to support any research in Data Science and twice rejected by the prospective PhD student funded by Intel, this student has started in April 2017 a PhD at the University of Liverpool, funded through the Intel Parallel Computing Centre.
Sectors Aerospace, Defence and Marine,Chemicals,Creative Economy,Digital/Communication/Information Technologies (including Software),Electronics,Environment,Manufacturing, including Industrial Biotechology,Culture, Heritage, Museums and Collections

Description The research has led to two Knowledge Transfer Secondments at Microsoft Research Cambridge, UK. First: October 2014 - September 2015. Second: June - August 2016. Both secondments were funded by 20K from EPSRC. Unfortunately Durham insisted on 50% IP (Intellectual Property), which was unacceptable for Microsoft. As a result Durham has lost the collaboration with Microsoft (as with Rolls-Royce earlier) and, in addition, has refused to support any research in the emerging area of Data Science. That is why I have moved to the Materials Innovation Factory at the University of Liverpool.
First Year Of Impact 2014
Sector Chemicals,Creative Economy,Digital/Communication/Information Technologies (including Software),Electronics,Manufacturing, including Industrial Biotechology,Culture, Heritage, Museums and Collections
Impact Types Economic

Description Knowledge Transfer Secondment
Amount £15,000 (GBP)
Organisation Microsoft Research 
Department Microsoft Research Cambridge
Sector Private
Country United Kingdom
Start 04/2016 
End 03/2017
Description Knowledge Transfer Secondment at Microsoft Research Cambridge 
Organisation Microsoft Research
Department Microsoft Research Cambridge
Country United Kingdom 
Sector Private 
PI Contribution Joint work on applications of topological data analysis to computer vision
Collaborator Contribution Joint work on applications of topological data analysis to computer vision
Impact In progress since October 2014
Start Year 2014
Description PhD studentship for a topologcal analys of micro-cracks in art paintings 
Organisation The Bowes Museum
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution My PhD students develops automatic tools for analysing micro-cracks in art paintings for conservation purposes.
Collaborator Contribution Bowes museum provides access to their art collection and image data.
Impact Levehulme-funded PhD in Visual Culture
Start Year 2015
Title Automatic Cloud Analysis 
Description The C++ code computes a Homologically Persistent Skeleton, which is the first stable-under-noise structure of an organised point cloud. 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact The skeleton is used for developing a new superpixel algorithm jointly with Microsoft Research Cambridge, UK.