iSee - Intelligent Vision for Grasping

Lead Research Organisation: University of Glasgow
Department Name: School of Computing Science

Abstract

Intelligent vision is a key enabler for future robotics technology. Shadow's recent development work on a disruptive universal gripper (launched at Innovate16) has identified a need for better vision to permit the automation of grasping. Building on over 20 years of R&D at the University of Glasgow, the iSee project will establish concrete robot vision benchmarks, based on commercially relevant scenes, and then develop, validate and integrate vision sensors and processing algorithms into the Smart Grasping System (SGS) to enable it to reach significant new markets in automation, logistics and service robotics. The following candidate sensors have been selected for benchmarking:

A. Low-cost Time-of-Flight (ToF) 3D cameras (available from various companies)
B. Stereo pairs of off-the-shelf HD cameras and stereo pairs of embedded vision sensors in conjunction with CVAS's existing custom stereo-pair image matching and photogrammetry software
C. An Asus Xtion RGB-D camera will serve as a benchmark reference sensor

We propose to build an integrated hand-eye system for each sensor listed above, along with appropriate lighting, and to develop complete integrated pipelines to benchmark the different combinations of capture and analysis systems on the specified scenarios. This investigation will allow us to assess 2D and 3D sensing methods in terms of image-capture quality and hand-eye performance.

We also propose to evaluate the new and highly disruptive Deep Convolutional Neural Network (DCNN) technology, which has the potential to leapfrog the best algorithmic vision methods and to provide a fast, accurate and complete vision solution that meets the demands of advanced robotic grasping and manipulation. We will thus augment the evaluation with efficient, high-speed DCNN algorithms for interpreting images from potentially low-cost sensors for:

* Detecting and localising known objects and estimating their pose for grasping purposes
* Estimating depth, size and surface normals directly from single monocular images, using transfer methods
* Recovering depth from binocular and monocular camera systems using stereo matching and structure from motion (optical flow) respectively

Once trained, DCNNs can analyse images very quickly and are now becoming suitable for low-cost embedded platforms, such as smartphones. This aspect of the proposed investigation has the potential to simplify the sensor hardware dramatically: only single cameras, or stereo pairs of cameras, are required in combination with DCNNs as the basis for a vision system that could potentially provide all of the functionality required to control the hand in a wide range of scenarios.
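As a hedged illustration of the kind of DCNN inference step envisaged here (not the project's own networks), the sketch below runs a pre-trained torchvision detector over a single monocular image. The model choice, file name and confidence threshold are assumptions, and a further pose-estimation stage would still be needed to recover a full grasp pose.

    import torch
    import torchvision
    from torchvision.transforms.functional import to_tensor
    from PIL import Image

    # Pre-trained general-purpose detector, used purely as a stand-in for the
    # task-specific networks that would be trained on the project's benchmark data.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()

    image = Image.open("scene.jpg").convert("RGB")   # hypothetical monocular frame
    with torch.no_grad():
        detections = model([to_tensor(image)])[0]

    # Bounding boxes give 2D detection and localisation; a separate pose-estimation
    # stage (not shown) would be needed to recover a 6-DoF grasp pose.
    for box, label, score in zip(detections["boxes"], detections["labels"], detections["scores"]):
        if score > 0.8:                              # assumed confidence threshold
            print(int(label), [round(float(v)) for v in box], float(score))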

Benchmark results will let us develop specific camera-algorithm combinations that improve performance for the specified use cases over a number of test-evaluate-improve iterations. The core 3D sensing approaches will be integrated with the SGS, and we shall evaluate additional off-hand cameras that provide critical ancillary visual input when objects are close to the gripper camera prior to a grasp, or during in-hand manipulation. Both camera systems will be used to acquire different views of the scene, with both mounted on a robot arm for interactive perception of the scene and the objects contained within it.

In parallel, we will develop a showcase demonstration system at Shadow based on Shadow's current grasp planning software coupled to the 3D images captured by the benchmarked 3D vision systems. The developed vision modules will be encapsulated within the "Blocky" programming system to afford a simple and direct method for end-users to take advantage of this capability.

In conclusion, we believe that the robotics hand-eye pipelines proposed within the iSee project have the potential to play an important role in maintaining market leadership in the development of complete robotics system solutions.

Planned Impact

The iSee project will provide a technology that can be utilised in assistive and social settings, that will underpin fundamental research and commercial collaborations, and that will deliver impacts in the Knowledge, Economy, Society and People areas.

Knowledge
In iSee, we will ensure that the understanding of problems and solutions generated by this activity flows in both directions between Shadow and the CVAS group. Direct scientific impacts will occur if the benefits of iSee's vision library lead to its adoption within the Smart Grasping System. Indirect impacts will occur if iSee's integrated smart sensing is adopted in other robotic scenarios outside the current scope of the project -- e.g. assistive and care roles, human-robot interaction, etc.

Robotics and autonomous systems are recognised by the UK as one of the eight great technologies of the future, and iSee's vision and grasping will serve as a founding robotic platform for the design and development of new robotic and autonomous technologies. In the longer term, enhanced and integrated visual sensing technologies within smart grasping systems will encourage the development of new types of robots and robotic systems. Specifically, robots will become capable of working in new areas, such as constrained places in manufacturing that are challenging for humans to access, and of operating in environments not suited to conventional industrial vision.

Economy
Robotic technologies are increasingly used in high-wage economies such as the UK, and robotics is anticipated to be one of the drivers of the fourth industrial revolution. The technology developed in iSee will provide the industrial sector with immediate economic impact through new product sales and profitability, tightly coupled with design, research, production, sales, user feedback and field trials.

It is anticipated that robotics will fundamentally reduce labour costs by replacing a large proportion of routine roles. The Copenhagen study shows that UK adoption of industrial automation will produce a long-term increase in productivity of 22% and a workforce increase of 7.4% as staff are re-skilled and moved to higher-skilled roles. In this context, iSee will be extremely significant, since the project will have an impact on the development of future service robots, unlocking new industries. Vision for service robots is a significant challenge, and if we can deploy an effective sensorised solution, we have the potential to enable a new wave of startups creating vision-enabled service robots across multiple market domains.

People
iSee will facilitate the development of new vision and robotics skills in the research associates and CVAS academic staff. People with robotics and autonomous systems expertise are in high demand in both industry and academia and are significant economic contributors. Likewise, the ability to deploy robots to perform a wider range of repetitive manual tasks will reduce the incidence of industrial injuries due to tiredness and boredom and, hence, improve the quality of people's working environments.

Society
Robots are also key to addressing social challenges in high-wage economies, e.g. increasing healthcare demands and the ageing population. The enhanced reliability that iSee will deliver is essential for disruptive new robotic applications that are not currently possible, such as deployment in hospitals and care facilities. A new generation of robots may be deployed in assistive and care roles, which could have a significant impact on social care and the challenges of an ageing society. End effectors that can see mean that robots can work in areas where illumination cannot be controlled and where access is constrained or dangerous, such as inside storage shelving or large workpieces, or in land-mine clearance and bomb disposal.
 
Description The most significant achievements of the award comprise the construction of a robotic hand-eye testbed system that has demonstrated state-of-the-art performance in detecting and localising objects for robotic grasping and manipulation. A novel training pipeline has been developed that supports the practical use of deep learning technology for learning object appearance and can be deployed to serve the use-cases envisioned for the technology, namely bin-picking, materials handling and order completion in warehouse and manufacturing scenarios. We have demonstrated object recognition using both conventional 3D computer vision techniques and state-of-the-art deep learning methods. We have also shown that these latter Deep Net based methods are capable of segmenting objects in challenging situations where they are partially occluded, piled in heaps or located inside transport totes. In addition to the above, we have demonstrated that it is possible to couple a high-resolution software retina to the deep net to allow large images to be processed in a single pass of the network, greatly improving the efficiency and utility of deep nets. Finally, we have improved our 3D imaging algorithms to a level where they appear to be capturing 3D surface information of a quality rivalling laser scanning technology, opening the potential for very low-cost and highly compact 3D vision systems.
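To illustrate the data-reduction idea behind the software retina finding above, the sketch below uses a crude OpenCV log-polar warp to shrink a large frame to a fixed-size image before it would be passed to a DCNN. This is only a stand-in: the Glasgow software retina uses a biologically derived receptive-field tessellation rather than this warp, and the image sizes and file name are assumptions.

    import cv2

    frame = cv2.imread("large_scene.png")          # e.g. a ~1 Mpixel captured frame
    h, w = frame.shape[:2]
    centre = (w / 2.0, h / 2.0)                    # fixation point (assumed image centre)

    # Log-polar resampling: dense near the fixation point, coarse in the periphery.
    cortical = cv2.warpPolar(frame, (256, 256), centre,
                             maxRadius=min(h, w) / 2.0,
                             flags=cv2.INTER_LINEAR + cv2.WARP_POLAR_LOG)

    # The resampled image has far fewer pixels than the original, so a DCNN can
    # process the whole scene in a single forward pass.
    print(frame.shape, "->", cortical.shape)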
Exploitation Route The technology we have developed could be applied in a wide variety of industries, manufacturing processes and warehouse order-processing systems. The vision components are sufficiently generic to improve the operation of a wide variety of autonomous systems, including drone aircraft and driverless vehicles. The current consortium is bidding for further funding support to take the system we have developed to the next level of industrial integration and to more extensive trials in manufacturing and warehouse order-completion scenarios. Separate consortia are bidding for funds to develop the vision technology in the context of drone aircraft control and navigation, and also driverless vehicles. In a new collaboration with a number of Institutes of Cognitive Neuroscience, we are investigating the use of the retina-brain model explored in iSee to develop functional visual-pathway models linked to real subject data collected using fMRI scanners.
Sectors Aerospace, Defence and Marine,Agriculture, Food and Drink,Chemicals,Communities and Social Services/Policy,Construction,Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Electronics,Energy,Environment,Healthcare,Leisure Activities, including Sports, Recreation and Tourism,Manufacturing, including Industrial Biotechnology,Culture, Heritage, Museums and Collections,Pharmaceuticals and Medical Biotechnology,Retail,Transport

 
Description Our findings are being used to support developments in advanced manufacturing and also in the analysis of aerial landscape images to identify sites of historic interest. However, these are still at the research and development stage, as opposed to deployed practice or products.
First Year Of Impact 2018
Sector Manufacturing, including Industrial Biotechnology,Culture, Heritage, Museums and Collections
Impact Types Cultural,Policy & public services

 
Description Doctoral Training Account
Amount £62,000 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start  
 
Title Deep Net Training Pipeline
Description New training pipeline that allows large numbers of segmented and labelled images depicting views of objects to be extracted from video sequences, providing the large-scale data required to train Deep Nets to recognise objects for robotic grasping and manipulation. (A minimal illustrative sketch follows this entry.)
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? No  
Impact Too early as the method has not yet been published. 
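As a hedged sketch of the idea behind this pipeline (the project's own segmentation method is unpublished), the following uses simple OpenCV background subtraction purely as a stand-in to harvest segmented, labelled object views from a video sequence; the file names and object label are assumptions.

    import cv2

    label = "object_01"                             # assumed object label
    capture = cv2.VideoCapture("object_01.mp4")     # assumed video sequence of the object
    subtractor = cv2.createBackgroundSubtractorMOG2(detectShadows=False)

    index = 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        mask = cv2.medianBlur(subtractor.apply(frame), 5)   # rough foreground mask
        # Store an (image, mask, label) triple as one training example.
        cv2.imwrite(f"{label}_{index:05d}_rgb.png", frame)
        cv2.imwrite(f"{label}_{index:05d}_mask.png", mask)
        index += 1
    capture.release()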
 
Title Deep Software Retina
Description This is an Engineering/Technology project. We have produced a software retina-DCNN combination that improves the speed at which deep nets can process images and enables them to process large images.
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? Yes  
Impact It is now possible to process images of the order of 1M pixels and larger in a single forward pass through a DCNN, allowing real-time computer vision on embedded computer platforms. 
URL http://www.eyewear-computing.org/EPIC_ICCV17/Short_Papers/EPIC17_id21.pdf
 
Title Retina pipeline codes 
Description Codes to implement a model of the retino-cortical transformation for improving the efficiency of Deep Learning artificial neural networks for visual image processing. 
Type Of Material Model of mechanisms or symptoms - human 
Year Produced 2017 
Provided To Others? Yes  
Impact Support for ~20 undergraduate and MSc student projects and a number of PhD projects. Primary URL: https://github.com/Pozimek/RetinaVision 
URL https://github.com/Pozimek/RetinaVision
 
Title Deep Software Retina 
Description Novel combination of a software retina with a Deep Neural Network
Type Of Material Computer model/algorithm 
Year Produced 2017 
Provided To Others? Yes  
Impact Enables deep nets to process large images > 1Mpixel in a single pass and therefore train and execute 10-100x faster. 
URL http://www.eyewear-computing.org/EPIC_ICCV17/Short_Papers/EPIC17_id21.pdf
 
Title Object DCNN Training Set 
Description Collection of segmented and labelled RGB images depicting views of objects for training Deep Nets (7,500 images per object for 10 objects)
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? No  
Impact The ability to recognise and segment objects which are partially occluded, in heaps or in totes. 
 
Description Shadow-Glasgow-iSee 
Organisation Shadow Robot Company
Country United Kingdom 
Sector Private 
PI Contribution The GU research team has demonstrated functional computer vision systems, integrated within the Robot Operating System (ROS), that are capable of detecting, identifying and localising the pose of household objects so that these can be grasped and manipulated using a Shadow SGS (Smart Grasping System) robot manipulator. We have further demonstrated a Deep Learning based vision system capable of identifying and segmenting objects in challenging scenarios (partially occluded, piled in heaps or in a transport tote). We have advanced passive stereo-based 3D sensing to operate in industrial robot scenarios and demonstrated an in-hand wide-angle camera suitable for use with a software retina developed by GU. Under iSee we developed a high-resolution software retina and have demonstrated its potential to accelerate DL vision systems by reducing the input data to the DCNN by a factor of 17, with the potential for greater efficiencies. (A minimal ROS integration sketch follows this collaboration record.)
Collaborator Contribution Shadow developed the Smart Grasping System 3-fingered hand and integrated this with ROS and the camera and vision systems being developed by GU.
Impact An integrated hand-eye robot manipulation system is in the final stages of development.
Start Year 2017
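For illustration of how a vision module of this kind can be wired into a ROS-based hand-eye system such as the one described in the collaboration above, the minimal sketch below subscribes to a camera topic, runs a placeholder detection and pose-estimation stage, and publishes a grasp target. The topic names, node name and detect_pose() function are assumptions, not the project's actual interfaces.

    #!/usr/bin/env python
    import rospy
    from sensor_msgs.msg import Image
    from geometry_msgs.msg import PoseStamped
    from cv_bridge import CvBridge

    bridge = CvBridge()

    def detect_pose(cv_image):
        """Placeholder for the DCNN detection and pose-estimation stage."""
        return None   # would return ((x, y, z), (qx, qy, qz, qw)) when an object is found

    def on_image(msg, pose_pub):
        cv_image = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
        result = detect_pose(cv_image)
        if result is None:
            return
        (x, y, z), (qx, qy, qz, qw) = result
        target = PoseStamped()
        target.header = msg.header                     # keep the camera frame and timestamp
        target.pose.position.x, target.pose.position.y, target.pose.position.z = x, y, z
        (target.pose.orientation.x, target.pose.orientation.y,
         target.pose.orientation.z, target.pose.orientation.w) = qx, qy, qz, qw
        pose_pub.publish(target)

    if __name__ == "__main__":
        rospy.init_node("isee_object_pose")            # hypothetical node name
        pub = rospy.Publisher("grasp_target", PoseStamped, queue_size=1)
        rospy.Subscriber("camera/image_raw", Image, on_image, callback_args=pub)
        rospy.spin()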
 
Description ARM Research Summit 2017 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Presentation at the 2017 ARM Research Summit in Cambridge describing our software retina work and its application to robot vision in the iSee project.
Year(s) Of Engagement Activity 2017
URL https://developer.arm.com/research/summit/previous-summits/2017
 
Description Human Brain Project 2017 Meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation describing the software retina work being undertaken and applications in robotics.
Year(s) Of Engagement Activity 2017
URL https://sos.exo.io/public-website-production/filer_public/e0/1c/e01cdb4e-5590-46fb-ae4b-46c38ce0db1d...
 
Description KESS Public Engagement Lecture 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact The iSee PI gave an invited lecture on 3D vision for robotics to the Kilmarnock Engineering Science Society. This produced debate about future developments in automation and AI.
Year(s) Of Engagement Activity 2018
URL http://www.kess2012.org/
 
Description Keynote presentation opening ICMMI 2017 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Personal invitation to the PI to give the keynote presentation opening ICMMI 2017, the International Conference on Man-Machine Interactions, 3-6 October 2017, Cracow, Poland
Year(s) Of Engagement Activity 2017
URL http://icmmi.polsl.pl/