Human-Machine Classification for Astrophysical Projects

Lead Research Organisation: University of Oxford
Department Name: Oxford Physics

Abstract

A major challenge for twenty-first century science is learning to deal with the large datasets rapidly produced by a wide variety of modern surveys and experiments. This is especially true for astrophysicists, who not only deal with large and diverse datasets but often have to make rapid decisions about which objects to target with future observations. Two separate sets of solutions have been proposed. The first relies on advances in computer vision and machine learning to automate the process of astronomical classification, but much of the recent progress in these fields relies on the availability of large training sets of already classified data, something that is often difficult to supply. Moreover, the datasets are frequently so large that even very accurate computer classification will leave an enormous dataset to be sifted through.

The other solution has been citizen science: collaboration with volunteers to work through large datasets. This has been enormously successful for a wide variety of astronomical problems, from the discovery of extra-solar planets to galaxy classification and space weather forecasting. However, the size of the datasets expected from new astronomical surveys will overwhelm even the largest and most enthusiastic volunteer base.

This proposal aims to demonstrate a flexible system which can combine these two methods. It builds on the successful Zooniverse platform, which has been responsible for many of the most successful citizen science data analysis projects. We will:

1. Produce more efficient citizen science projects by being smart about assigning tasks to particular volunteers. At present, for example, images which are to be classified are shown randomly to volunteers instead of allowing experts to review more difficult cases. Our preliminary research shows this may increase our efficiency by a factor of ten.
2. Include machine and human classifiers together in the same project. As volunteers work their way through a dataset, machines can learn from them. This allows an increasing proportion of the dataset to be automatically processed, reducing the burden on the volunteers.
3. Combine both of these new facilities, allowing us to sort through data in a new way. We will establish a hierarchy of classification tasks, so that volunteers look for the most common types of object first, before moving on to rarer objects. This will allow a cycle of human and machine classification to rapidly search through large and diverse datasets and, critically, will allow us to search for categories of interest that develop during the classification process.
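
The three aims above can be illustrated with a minimal sketch. It is not the project's implementation: the `Subject` structure, the thresholds, the pool names and the assumption that each volunteer has a single known, symmetric accuracy are all hypothetical choices made for illustration. A machine score seeds the probability that an object is of interest, volunteer votes refine it with a Bayes update, confident subjects are retired automatically, and the most ambiguous ones are routed to experienced volunteers.

```python
from dataclasses import dataclass, field


@dataclass
class Subject:
    """One image awaiting classification (hypothetical structure)."""
    subject_id: str
    machine_prob: float  # assumed machine estimate that the object is of interest
    posterior: float = field(init=False)

    def __post_init__(self):
        # Start from the machine's estimate; human votes refine it later.
        self.posterior = self.machine_prob


def update_posterior(prior: float, vote: bool, accuracy: float) -> float:
    """Bayes update for a binary label after one volunteer vote.

    `accuracy` is the assumed probability that this volunteer votes
    correctly, taken to be the same for both classes.
    """
    if not 0.0 < accuracy < 1.0:
        raise ValueError("accuracy must lie strictly between 0 and 1")
    like_true = accuracy if vote else 1.0 - accuracy
    like_false = 1.0 - accuracy if vote else accuracy
    numerator = like_true * prior
    return numerator / (numerator + like_false * (1.0 - prior))


def status(subject: Subject, accept: float = 0.95, reject: float = 0.05) -> str:
    """Retire a subject once the combined estimate is confident enough."""
    if subject.posterior >= accept:
        return "accepted"
    if subject.posterior <= reject:
        return "rejected"
    return "needs_humans"


def choose_pool(subject: Subject, hard_cutoff: float = 0.8) -> str:
    """Send the most ambiguous subjects to experienced volunteers."""
    difficulty = 1.0 - abs(2.0 * subject.posterior - 1.0)  # 1.0 at p = 0.5
    return "experienced" if difficulty > hard_cutoff else "general"


if __name__ == "__main__":
    subject = Subject("candidate-001", machine_prob=0.5)
    print(status(subject), choose_pool(subject))
    for _ in range(3):  # three positive votes from volunteers with accuracy 0.8
        subject.posterior = update_posterior(subject.posterior, True, 0.8)
    print(round(subject.posterior, 3), status(subject))
```

In this sketch a subject that the machine already scores above the acceptance threshold never reaches a volunteer, which is one simple way the volunteer burden could fall as the machine improves.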

For this demonstration project, we will run an example which makes use of all of these features in a real astronomical survey. This will allow us to demonstrate and measure the efficiencies achieved by these improvements, as well as producing valuable science in its own right. As the Zooniverse platform already supports projects across many disciplines, these tools will be made available for use by a wide range of scientists and researchers, accelerating their progress and making the time invested by more than 1.3 million volunteers more useful.

Planned Impact

Citizen science through the Zooniverse has already proven to be a transformational way of engaging the public in science. Since the projects' beginnings in 2007, more than a million people have used our platform to make a real contribution to science. Participants have explored galaxies, worked with ecologists in Antarctica and the Serengeti, and uncovered hidden texts in historical archives. By engaging with the researchers who are leading the projects they are participating in, these volunteers gain a real sense of ownership over the research process; our studies show that Zooniverse volunteers are overwhelmingly more likely to engage with scientific content after encountering our projects. This is especially important because research shows that, having begun, volunteers are equally likely to go on to substantial participation whatever their initial educational level; rather than finding an audience already excited about science, the Zooniverse is creating a new crowd of hungry participants in research.

Nor is this impact limited to the participants themselves. Our projects have featured in museums around the world, and play a regular starring role in the BBC's Stargazing Live series, which reaches an audience of millions. The projects are heavily engaged on social media, with one - Planet Hunters - being amongst the most visited science pages on Facebook. Our volunteers enjoy communicating with their friends and colleagues about their scientific adventures, making them powerful advocates for the scientific process.

We have recently redeveloped our core platform to make it easier for people to build projects, and are already seeing the adoption of citizen science by new audiences. Partnerships with Cancer Research UK, who used our platform to build science-filled games, and with the Natural History Museum demonstrate the uses to which our software can be put. Collaborations with Microsoft Research and with researchers at Google inform our understanding of how participants behave in these projects, and how we can do better. Companies like Imperative Space are adapting Zooniverse's astronomy projects for use in the classroom, giving schools a taste of cutting-edge science, and our platform also supports school-led experiments at CERN.

The platform can also be used for more than research. A recent partnership with an NGO, Rescue Global, and the Earth observation company Planet Labs allowed rescuers to quickly generate new maps of settlements in Nepal following the tragic earthquake there. The work contained in this proposal - which aims at efficient, rapid classification - will be key in enabling us to expand this disaster relief work for future crises.

Zooniverse is a project whose primary goal is to aid science. However, uniquely, at its core is the need to engage a very large company of volunteers, and a methodology which allows for long-lasting and effective transformation in attitudes to science. It is effective science, and highly exciting public engagement.

Publications

Wright Darryl E. (2017) A transient search using combined human and machine classifications in Monthly Notices of the Royal Astronomical Society

Willett Kyle W. (2017) Galaxy Zoo: morphological classifications for 120 000 galaxies in HST legacy imaging in Monthly Notices of the Royal Astronomical Society

Weigel Anna K. (2017) Galaxy Zoo: Major Galaxy Mergers Are Not a Significant Quenching Pathway in The Astrophysical Journal

Walmsley Mike (2019) Identification of low surface brightness tidal features in galaxies using convolutional neural networks in Monthly Notices of the Royal Astronomical Society

Walmsley Mike (2020) Galaxy Zoo: probabilistic morphology through Bayesian CNNs and active learning in Monthly Notices of the Royal Astronomical Society

Smethurst R. J. (2017) Galaxy Zoo: the interplay of quenching mechanisms in the group environment? in Monthly Notices of the Royal Astronomical Society

Smethurst R. J. (2018) SDSS-IV MaNGA: the different quenching histories of fast and slow rotators in Monthly Notices of the Royal Astronomical Society

Simmons B. D. (2017) Supermassive black holes in disc-dominated galaxies outgrow their bulges and co-evolve with their host galaxies in Monthly Notices of the Royal Astronomical Society

Simmons B. D. (2017) Galaxy Zoo: quantitative visual morphological classifications for 48 000 galaxies from CANDELS in Monthly Notices of the Royal Astronomical Society

Robertson Brant E. (2017) Large Synoptic Survey Telescope Galaxies Science Roadmap in arXiv e-prints

Mahabal A (2019) Machine Learning for the Zwicky Transient Facility in Publications of the Astronomical Society of the Pacific

Kruk Sandor J. (2017) Galaxy Zoo: finding offset discs and bars in SDSS galaxies? in Monthly Notices of the Royal Astronomical Society

Kruk Sandor J. (2018) Galaxy Zoo: secular evolution of barred galaxies from structural decomposition of multiband images in Monthly Notices of the Royal Astronomical Society

Kapinska A (2017) Radio Galaxy Zoo: A Search for Hybrid Morphology Radio Galaxies in The Astronomical Journal

Hart Ross E. (2017) Galaxy Zoo and SPARCFIRE: constraints on spiral arm formation mechanisms from spiral arm number and pitch angles in Monthly Notices of the Royal Astronomical Society

Hart Ross E. (2017) Galaxy Zoo: star formation versus spiral arm number in Monthly Notices of the Royal Astronomical Society

Fortson Lucy (2018) Optimizing the Human-Machine Partnership with Zooniverse in arXiv e-prints

Boyajian T. S. (2016) Planet Hunters IX. KIC 8462852 - where's the flux? in Monthly Notices of the Royal Astronomical Society

Boyajian T (2018) The First Post-Kepler Brightness Dips of KIC 8462852 in The Astrophysical Journal

Beck Melanie R. (2018) Integrating human and machine intelligence in galaxy morphology classification tasks in Monthly Notices of the Royal Astronomical Society

 
Description We were able to build and deploy a sophisticated system for incorporating human and machine classifications in the same project; this means faster and more reliable classifications of a variety of astronomical objects are now possible. In particular, our system for detecting transient objects such as supernovae now includes both a trained neural network and human classifications via a Zooniverse project; the combination is more accurate than either alone. More than twenty Zooniverse projects are using the infrastructure which was provided as part of this grant.
Exploitation Route Our software is open source and we encourage researchers to make use of the Zooniverse platform wherever possible.
Sectors Agriculture, Food and Drink; Education; Environment; Healthcare; Culture, Heritage, Museums and Collections; Pharmaceuticals and Medical Biotechnology

 
Description There are a variety of impacts from the improvements to the Zooniverse platform supported by this award. Firstly, the increased range of projects offered means that greater numbers of people are participating, which we know from prior research leads to a greater likelihood of participating in other research-related activities. Secondly, we have spun out the for-profit company 1715 Labs, which is commercialising our software. Finally, we have had an indirect impact on policy on Antarctic fishing through our work with the Penguin Watch team.
First Year Of Impact 2018
Sector Other
Impact Types Cultural, Societal, Economic, Policy & public services

 
Description Crowdsourcing and Machine Learning for Disaster Relief and Resilience
Amount £212,000 (GBP)
Funding ID ST/S00307X/1 
Organisation Science and Technology Facilities Council (STFC) 
Sector Public
Country United Kingdom
Start 04/2019 
End 03/2020
 
Description Google 
Organisation Google
Country United States 
Sector Private 
PI Contribution We are deploying aspects of the Zooniverse platform on Google cloud, developing ways to use TensorFlow and other tools for machine learning.
Collaborator Contribution Providing funding to support the use of Google Cloud tools, including our infrastructure.
Impact Open source code available via the Zooniverse repo.
Start Year 2018
 
Description Microsoft Research 
Organisation Microsoft Research
Country Global 
Sector Private 
PI Contribution We are working with Microsoft Research to develop models which encourage volunteers to remain active on the Zooniverse platform. Through this collaboration, we are building tools which allow for interventions such as messaging when volunteers appear to be flagging.
Collaborator Contribution The MSR team are developing the statistical models which predict user behaviour, and which can be used to direct interventions on the platform.
Impact Code available via the Zooniverse open source repository.
Start Year 2018
 
Description Partnership with Crick Institute 
Organisation Francis Crick Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution We are working on a combined human/machine classification scheme for generic high-resolution microscopy data, which will make use of ConSciCom's understanding of medical and clinical user communities.
Collaborator Contribution Crick are providing data and expertise on machine learning for a suite of such projects.
Impact Projects are currently in beta.
Start Year 2016
 
Title NERO 
Description NERO is the software built on top of the Zooniverse API which provides task assignment and allocation for projects. It is released under an open source license. 
Type Of Technology Webtool/Application 
Year Produced 2017 
Impact The tool has already made possible our partnership with Stargazing Live 
URL https://github.com/zooniverse/nero
 
Company Name 1715 LABS LIMITED 
Description 1715 Labs seeks to make commercial use of the Zooniverse platform, bringing the techniques for crowd management and crowdsourcing we have developed into the commercial realm. 
Year Established 2018 
Impact N/A
Website https://www.1715labs.com/
 
Description Chris Lintott Talk at Bluedot Festival 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Talk about history of citizen science at large festival.
Year(s) Of Engagement Activity 2018
 
Description Chris Lintott Talk at Latitude Festival 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Talk given as part of Latitude Festival - headlining smaller stage.
Year(s) Of Engagement Activity 2018
 
Description Talk: Hammersmith Apollo 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Talk as part of event at Hammersmith Apollo on discovering the unexpected.
Year(s) Of Engagement Activity 2016
 
Description Twitter 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Lintott's Twitter feed covers items of interest in citizen science, and now has more than 25,000 followers.
Year(s) Of Engagement Activity 2017
URL http://twitter.com/chrislintott