Persistent topological structures in noisy images

Lead Research Organisation: Durham University
Department Name: Mathematical Sciences

Abstract

The proposed research is on the interface between topology and algorithms.The methods are topological coming from the recently developed theory of persistent topology.The key idea of persistent topology is identifying features of geometric objects that remain stable for a wide range of varying parameters. The research objectives are applied and involve designing novel learning algorithms to recognise topological graphs in noisy images.A natural application is identifying a text-based advanced CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) widely used on the web to prevent automatic posting to blogs.A desired learning algorithm will be parallel like the human brain analysing local persistent properties of pixels in a noisy image.Hence it is the next step towards reverse-engineering the human brain.Other potential applications include recognising handwritten mail addresses and designing computer glasses reading signs of directions for blind people.

Planned Impact

We plan to complete at least three research papers. Their extended versions including more explanations, examples and computer experiments will be posted in the arXiv and on the university web page. Additionally, we will implement the final learning algorithm as a Java applet and make it publicly available for testing. The input can be any noisy image as a jpg file, e.g. a shop sign or a geographical map or a sign showing directions. The output is a text produced by the algorithm. We will seek to present our results at reputable international conferences such as COLT (computational learning theory), ALT (algorithmic learning theory), ICML (machine learning) and at various university seminars. Topology is a rather abstract branch of pure mathematics studying very general geometric objects and their properties. Successful applications of topology to real-life problems such as learning algorithms, classification of patterns, computer vision will certainly stimulate more inter-disciplinary collaboration. Many pure mathematicians feel that their research is far away from commercial developments, probably because there are few examples of the opposite. Most practical algorithms including the neural network program beating human champions in backgammon are based on very elementary facts, e.g. any continuous function can be approximated by a linear combination of sufficiently many step functions. So mathematicians will be delighted in sharing their really deep knowledge with developers of learning algorithms. The proposed parallelised learning algorithm based on the prior knowledge about topological graphs is a step towards reverse-engineering the human brain. Our brains have very slow neural connections (about 200 cps, computations per second) in comparison with the latest supercomputer Cray XT5 Jaguar (1759 teraflops cps). However, the human brain is massively parallel and can identify a familiar face in about 150 milliseconds. Recognising handwritten addresses by conventional kernel methods is widely used in the American mail system. We could reduce costs of the British Royal Mail implementing more advanced algorithms. Other applications are automatic converting papers with handwritten data into electronic files and designing computer glasses that can read signs of directions for blind people.
 
Description A new algorithm was designed to reconstruct persistent topological structures of arbitrary graphs or networks in noisy images given as a cloud of points with only pairwise distances. The algorithm is based on the recent theory of persistent homology in topological data analysis. The evolution of topological features of data at different scales can be summarized in a simple combinatorial object called the persistence diagram. Persistent features have a long life span and can be separated from noisy defects in the persistence diagram. The stability of persistence can theoretically guarantee a correct reconstruction if a given sample sufficiently approximates an unknown graph.
Exploitation Route Since Durham refused to support any research in Data Science and twice rejected the prospective PhD student funded by Intel, this student has started in April 2017 a PhD at the new £82M Materials Innovation Factory within the University of Liverpool, funded through the Intel Parallel Computing Centre. The research has led to numerous grants including the EPSRC £3.5M award "Application-driven Topological Data Analysis" (2018-2023), EP/R018472/1. Since October 2019 I'm supervising one research assistant, 6 PhD students as the 1st supervisor, 6 PhD students as the 2nd supervisor and 2 PhD students as the 3rd supervisor.
Sectors Aerospace, Defence and Marine,Chemicals,Construction,Creative Economy,Digital/Communication/Information Technologies (including Software),Healthcare,Manufacturing, including Industrial Biotechology,Pharmaceuticals and Medical Biotechnology

URL http://kurlin.org
 
Description The research has led to two Knowledge Transfer Secondments at Microsoft Research Cambridge, UK. First: October 2014 - September 2015. Second: June - August 2016. Both secondments were funded by £24K from EPSRC and £75K in-kind by Microsoft. Unfortunately Durham insisted on 50% IP (Intellectual Property), which was unacceptable for Microsoft. As a result Durham has lost the collaboration with Microsoft (as with Rolls-Royce earlier) and, in addition, has refused to support any research in the emerging area of Data Science. That is why I have moved to the new £82M Materials Innovation Factory at the University of Liverpool, where I now lead the Topological Data Analysis group of over 10 researchers, http://kurlin.org/#group.
First Year Of Impact 2014
Sector Chemicals,Construction,Creative Economy,Digital/Communication/Information Technologies (including Software),Healthcare,Manufacturing, including Industrial Biotechology,Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Description Application driven Topological Data Analysis
Amount £2,847,111 (GBP)
Funding ID EP/R018472/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 09/2018 
End 08/2023
 
Description Knowledge Transfer Secondment
Amount £24,000 (GBP)
Organisation Microsoft Research 
Department Microsoft Research Cambridge
Sector Private
Country United Kingdom
Start 09/2014 
End 08/2016
 
Description PhD studentship "A rigorous identification of all metastable polymorphs for better and safer drugs" supervised by Dr Vitaliy Kurlin
Amount £34,000 (GBP)
Organisation Cambridge Crystallographic Data Centre 
Sector Academic/University
Country United Kingdom
Start 10/2020 
End 09/2023
 
Description PhD studentship "Data Driven Discovery of Functional Molecular Co-crystals" of Katerina Vriza co-supervised by Dr Vitaliy Kurlin
Amount £30,000 (GBP)
Organisation Cambridge Crystallographic Data Centre 
Sector Academic/University
Country United Kingdom
Start 10/2018 
End 09/2021
 
Description Royal Society International Exchanges with Prof Edelsbrunner
Amount £12,000 (GBP)
Funding ID IES/R2/170039 
Organisation The Royal Society 
Sector Charity/Non Profit
Country United Kingdom
Start 12/2017 
End 12/2019
 
Description Accelerated Crystal Structure Prediction 
Organisation Cambridge Crystallographic Data Centre
Country United Kingdom 
Sector Academic/University 
PI Contribution Vitaliy Kurlin and his team at the University of Liverpool develops a continuous approach to a similarity of crystals, which were studied in the past only by discrete invariants such as symmetry groups
Collaborator Contribution Cambridge Crystallographic Data Centre has contributed £64K to 2 PhD projects supervised by Vitaliy Kurlin at the University of Liverpool, and provides advice on the state-of-the-art in Crystal Structure Prediction and access to the Cambridge Structural Database.
Impact Two PhD scholarships
Start Year 2019
 
Description Knowledge Transfer Secondment at Microsoft Research Cambridge 
Organisation Microsoft Research
Department Microsoft Research Cambridge
Country United Kingdom 
Sector Private 
PI Contribution Joint work on applications of topological data analysis to computer vision
Collaborator Contribution Joint work on applications of topological data analysis to computer vision
Impact 1) Jeremy Forsythe, Vitaliy Kurlin, Andrew Fitzgibbon. Resolution-independent superpixels based on convex constrained meshes without small angles. Proceedings of ISVC 2016: International Symposium on Visual Computing. Lecture Notes in Computer Science, v. 10072, p. 223-233. 2) Jeremy Forsythe, Vitaliy Kurlin. Convex Constrained Meshes for superpixel segmentations of images. Journal of Electronic Imaging, 26(6), 061609 (13 pages), 2017. 3) Vitaliy Kurlin, Donald Harvey. Superpixels Optimized by Color and Shape (SOCS). Proceedings of EMMCVPR 2017 (Energy Minimization Methods in Computer Vision and Pattern Recognition), 14 pages, LNCS 10746, Springer.
Start Year 2014
 
Description PhD studentship for a topologcal analys of micro-cracks in art paintings 
Organisation The Bowes Museum
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution My PhD students develops automatic tools for analysing micro-cracks in art paintings for conservation purposes.
Collaborator Contribution Bowes museum provides access to their art collection and image data.
Impact Levehulme-funded PhD in Visual Culture
Start Year 2015
 
Title Automatic Cloud Analysis 
Description The C++ code computes a Homologically Persistent Skeleton, which is the first stable-under-noise structure of an organised point cloud. 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact The skeleton is used for developing a new superpixel algorithm jointly with Microsoft Research Cambridge, UK. 
URL http://kurlin.org
 
Title HoPeS: Homologically Persistent Skeleton 
Description For a given 2D cloud of points, the HoPeS algorithm computes a Homologically Persistent Skeleton that optimally captures 1-dimensional cycles corresponding to all dots in the 1D persistence diagram as described in the paper by Philip Smith, Vitaliy Kurlin. Skeletonisation Algorithms with Theoretical Guarantees for Unorganised Point Clouds with High Levels of Noise, https://arxiv.org/abs/1901.03319. 
Type Of Technology Software 
Year Produced 2019 
Impact none yet 
URL https://arxiv.org/abs/1901.03319