LUCID: Clearer Software by Integrating Natural Language Analysis into Software Engineering

Lead Research Organisation: University College London
Department Name: Computer Science

Abstract

Developers spend most of their time maintaining code, with little tool support.
To maintain code, one must understand it. Clear code is easier to read and
understand, and therefore less expensive and risky to evolve and maintain; it
is also notoriously difficult to write. We will help developers write clearer
code to speed maintenance and increase developer productivity. Source code
unites two channels - the programming language and natural language - to
describe algorithms. LUCID will advance the state of the art in software
engineering by developing new analyses that exploit the interconnections
between these channels to find uninformative names, stale comments, and bugs
that manifest as discrepancies between the two channels.
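
As a rough illustration of a cross-channel discrepancy, the sketch below flags one narrow kind of stale comment: a docstring that documents a parameter the function no longer has. It is not LUCID's analysis. The function name `stale_doc_params`, the `:param name:` docstring convention, and the `EXAMPLE` snippet are all assumptions made for this example.

```python
import ast
import re


def stale_doc_params(source: str) -> dict[str, set[str]]:
    """Map each function name to documented parameters missing from its signature."""
    report = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        doc = ast.get_docstring(node) or ""
        # Natural-language channel: names documented as ":param name:" (assumed convention).
        documented = set(re.findall(r":param\s+(\w+)\s*:", doc))
        # Programming-language channel: names that actually appear in the signature.
        args = node.args
        actual = {a.arg for a in args.posonlyargs + args.args + args.kwonlyargs}
        if args.vararg:
            actual.add(args.vararg.arg)
        if args.kwarg:
            actual.add(args.kwarg.arg)
        stale = documented - actual
        if stale:
            report[node.name] = stale
    return report


# Hypothetical input: the docstring still mentions "height" after a rename to "width".
EXAMPLE = '''
def resize(image, width):
    """Resize an image.

    :param image: input image
    :param height: target height
    """
    return image
'''

print(stale_doc_params(EXAMPLE))
```

A real analysis would need to handle free-form comments and many more kinds of mismatch; this check only compares two sets of names.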

Planned Impact

LUCID attacks a core software engineering concern; it will build tools that
help developers to maintain software more quickly, with less risk and less
cost. Thus, the work in this proposal has the potential for enormous economic
benefits in the long term. The UK has one of the strongest software sectors in
Europe. For example, in 2008 the UK accounted for 25% of European software
companies. By making software maintenance cheaper, this project will benefit
companies that sell software by lowering the costs of evolving their code and
releasing new versions. These tools will also benefit the many companies that
evolve and maintain custom software systems for their own in-house use, by
lowering the cost of these infrastructural projects.

Publications

Allamanis M (2018) A Survey of Machine Learning for Big Code and Naturalness in ACM Computing Surveys

Allamanis M (2018) Mining Semantic Loop Idioms in IEEE Transactions on Software Engineering

Hellendoorn V (2018) Deep learning type inference

Meyer A N (2019) Today was a Good Day: The Daily Life of Software Developers in IEEE Transactions on Software Engineering

 
Description Committee Member in Program Committee within the ECOOP Research Papers-track
Geographic Reach Multiple continents/international 
Policy Influence Type Participation in an advisory committee
URL https://2018.ecoop.org/committee/ecoop-2018-research-track-program-committee
 
Description Committee Member in Program Committee within the ISSTA Technical Papers-track
Geographic Reach Multiple continents/international 
Policy Influence Type Participation in an advisory committee
URL https://2018.ecoop.org/committee/issta-2018-technical-papers-program-committee
 
Description Earl Barr on National Science Foundation panel for Software and Hardware Foundations (SHF) Program
Geographic Reach North America 
Policy Influence Type Participation in an advisory committee
URL https://nsf.gov/
 
Description Collaboration with Miltos Allamanis, Microsoft Research Cambridge 
Organisation Microsoft Research
Department Microsoft Research Cambridge
Country United Kingdom 
Sector Private 
PI Contribution We are introducing new conceptual types in programs by studying how identifiers flow to each other through assignments in programs.
Collaborator Contribution Miltos is helping us learn new types for these identifiers by learning over the data on flows across assignments
Impact No outputs yet
Start Year 2017
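
To make the idea of identifiers flowing to each other through assignments concrete, here is a minimal sketch that groups names connected by direct copy assignments. It is only an illustration, not the collaboration's method: the function `flow_groups`, the restriction to plain `a = b` assignments, and the `EXAMPLE` snippet are assumptions, and no learning step is included.

```python
import ast
from collections import defaultdict


def flow_groups(source: str) -> list[set[str]]:
    """Group identifiers that are linked by direct name-to-name assignments."""
    parent: dict[str, str] = {}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(name, name)
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    def union(a: str, b: str) -> None:
        parent[find(a)] = find(b)

    for node in ast.walk(ast.parse(source)):
        # Only plain copies such as "a = b" are treated as flows in this sketch.
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Name):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    union(target.id, node.value.id)

    groups = defaultdict(set)
    for name in parent:
        groups[find(name)].add(name)
    # Sort for a stable, readable result.
    return sorted(groups.values(), key=sorted)


# Hypothetical input: two independent chains of assignments.
EXAMPLE = """
total_price = unit_price
discounted = total_price
user_name = login
"""

for group in flow_groups(EXAMPLE):
    print(sorted(group))
```

Each resulting group is a candidate for a shared conceptual type, for example "price" for the first chain, even though the language's own type system may see every name as the same primitive type.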
 
Description Dr Earl Barr presentation at 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact AIFORSE Conference 2017, the first global conference on Artificial Intelligence (AI) for Software Engineering (SE), will be held on the 10th of November 2017 in Barcelona.
The main purpose of the conference is to build a bridge between the most significant players in the software engineering industry and the most advanced adopters of cutting-edge applied artificial intelligence technologies.
Leaders and experts in software engineering and pioneering innovators of AI in SE will meet on the Communication Stage to accelerate development and increase the efficiency of operations in the industry.
12 hours of networking, discussions, and bright and unique reports from the 10 best speakers in the industry. The speakers are representatives of companies from around the world that already apply AI to software engineering. They will not only disclose the tools that help solve problems faster and decrease costs, but will also define the development vector of the software engineering industry.
Year(s) Of Engagement Activity 2017
URL http://aiforse.org/conference-2017
 
Description Earl Barr - talk at Semmle 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Earl Barr presents his paper To Type or Not to Type: Quantifying Detectable Bugs in JavaScript at Semmle
http://earlbarr.com/publications/typestudy.pdf
Year(s) Of Engagement Activity 2018
URL https://semmle.com/
 
Description Earl Barr attending invitation-only workshop on software security, National Cyber Security Centre 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Invitation-only workshop to discuss the requirements for an international collaborative effort and infrastructure to support large-scale empirical research on software security. The workshop was held in London on the 17th of December and was supported by the National Cyber Security Centre.

The collective goal for the day was to develop a shared understanding of the challenges faced by research on software code analysis for cybersecurity and outline a roadmap for an international testbed infrastructure for large-scale experimental research on software security.
Year(s) Of Engagement Activity 2018
 
Description Earl Barr invited speaker at the CHOOSE Forum in Switzerland 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The CHOOSE Forum 2018 is organized by the Zurich Empirical Software engineering Team (ZEST) at the University of Zurich, on behalf of CHOOSE.
Earl Barr presented his paper Bimodal Software Engineering
Year(s) Of Engagement Activity 2018
URL https://choose.swissinformatics.org/events/choose-forum-2018-software-engineering-and-machine-learni...
 
Description Earl Barr presents Bimodal Software Engineering at FLOC 2018: FEDERATED LOGIC CONFERENCE 2018, MLP PROGRAM 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Earl Barr presents Bimodal Software Engineering at FLOC 2018: FEDERATED LOGIC CONFERENCE 2018 in the MLP PROGRAM
Year(s) Of Engagement Activity 2018
URL https://easychair.org/smart-program/FLoC2018/MLP-program.html
 
Description Earl Barr presents his paper Mining Semantic Loop Idioms @ FSE 2018 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Earl Barr presents his paper Mining Semantic Loop Idioms @ FSE 2018 in the Journal-First track
Sun 4 - Fri 9 November 2018 Lake Buena Vista, Florida, United States
Year(s) Of Engagement Activity 2018
URL https://2018.fseconference.org/event/fse-2018-journal-first-mining-semantic-loop-idioms
 
Description Earl Barr research visit to Monash University and University of Adelaide, Australia 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Earl Barr visited Monash University and University of Adelaide, Australia. He presented Bimodal Software Engineering at both Universities.
Tuesday (Oct. 2nd) - Monash University
09:00-10:00 KEYNOTE (General Seminar, Earl Barr, UCL) - Bimodal Software Engineering
https://www.monash.edu/it/our-research/research-seminars/events/events/2018/earl-barr-bimodal-software-engineering
Year(s) Of Engagement Activity 2018
URL https://www.monash.edu/it/our-research/research-seminars/events/events/2018/earl-barr-bimodal-softwa...
 
Description Huawei Workshop - Dr Earl Barr 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Automated Programming Workshop funded by Huawei on 11th Dec 2017
Professor Daniel Kröning: University of Oxford/DiffBlue
Professor Earl Barr: UCL
Professor Charles Sutton: University of Edinburgh
Professor Hong Zhu: Oxford Brookes University
Dr. Ian Bayley: Oxford Brookes University
Professor Mark Harman: UCL/Facebook
Professor Peter O'Hearn: UCL/Facebook
Professor Alastair F. Donaldson: Imperial College London
Professor Philippa Gardner: Imperial College London
Dr. David White: UCL
Dr. David Kelly: UCL
Dr. Zheng Gao: UCL
Mr. Laifa Zhang: President of RDCC (R&D Competence Center)
Mr. Tony Chang: Chief Scientist, VP of RDCC in US
Mr. Ni Huang (Eric): Senior Director of RDCC Technology Planning Dept.
Mr. Xuewen Gong (Sean): Director of RDCC Technology Cooperation Dept.
Professor Qianxiang Wang: Director of the Software Analysis Lab of Huawei, vice chair of ACM CSOFT (China chapter of SIGSOFT), secretary-general of CCF TCSE (Technical Committee of Software Engineering, China Computer Federation).
Mr. Michael Hill-King: Collaboration Director, Huawei Cambridge Research Centre.
Mr. Duo Wu: Collaboration Manager, Huawei Cambridge Research Centre.
Miss Yuncong Zou: Collaboration Assistant, Huawei Cambridge Research Centre.
Year(s) Of Engagement Activity 2017
 
Description Talk at source{d} paper reading club - Madrid, 16 November 2018 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Deep Learning for Programming Language Type Inference
On Friday, November 16th, as part of source{d} paper reading club [1], we are going to talk about a paper that was recently published at FSE'18: Deep Learning Type Inference [2].

ABSTRACT
Dynamically typed languages such as JavaScript and Python are
increasingly popular, yet static typing has not been totally eclipsed:
Python now supports type annotations and languages like TypeScript
offer a middle-ground for JavaScript: a strict superset of
JavaScript, to which it transpiles, coupled with a type system that
permits partially typed programs. However, static typing has a cost:
adding annotations, reading the added syntax, and wrestling with
the type system to fix type errors. Type inference can ease the
transition to more statically typed code and unlock the benefits of
richer compile-time information, but is limited in languages like
JavaScript as it cannot soundly handle duck-typing or runtime evaluation
via eval. We propose DeepTyper, a deep learning model
that understands which types naturally occur in certain contexts
and relations and can provide type suggestions, which can often
be verified by the type checker, even if it could not infer the type
initially.
Year(s) Of Engagement Activity 2018
URL https://github.com/src-d/reading-club
 
Description The 55th CREST Open Workshop - Bimodal Program Analysis 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Overview:

Software is bimodal: it interlinks two channels, an algorithmic channel aimed at devices and a natural language channel aimed at developers. Most research has focused on one channel or the other, not their interplay. Simultaneously considering both channels promises a new source of constraints for improving program analysis and software engineering tools. For example, names in program text can be exploited to refine a type lattice. The CREST Open Workshop on PL and NLP will explore how to identify and exploit these cross-channel connections.

Organisers:

Earl Barr, CREST Centre, SSE Group, Department of Computer Science, UCL, UK

Santanu Dash, CREST Centre, SSE Group, Department of Computer Science, UCL, UK
Year(s) Of Engagement Activity 2017
URL http://crest.cs.ucl.ac.uk/cow/55/
 
Description The 60th CREST Open Workshop - Those were the DAASE 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Overview:

DAASE has advanced the state of the art in numerous directions: novel, principled techniques for handling class imbalance, optimising energy consumption, automated program repair, and automatically generating product roadmaps, to name a few.

DAASE has achieved breakthroughs, including automated software transplantation, the first approach for transplanting code that dynamically adapts it for a new context, and a human-competitive multi-objective software effort estimator that balances accuracy against variance, both of which won Hummies at GECCO. DAASE has pioneered a new field of research called genetic improvement and produced award-winning work on fitness landscape analysis and visualisation.

DAASE has spawned a number of start-ups, most notably Sapienz, which Facebook acquired and which now tests and automatically repairs code at Internet scale. Heathrow's plane scheduling now relies on a bespoke optimisation algorithm devised by DAASE researchers. Automated software repair using genetic improvement is also now a part of Janus Manager, management software for rehabilitation centres in Iceland.

Join us to review and celebrate these accomplishments, and discuss how to carry them forward.

Day 1 - Monday 3 Dec

10:45 - Pastries

11:15 - Introductions - Earl Barr

11:30 - Bill Langdon, CREST Centre, SSE Group, Department of Computer Science, UCL, UK

Genetic Improvement by Evolving Program Data

12:00 - Darrell Whitley, Colorado State University, USA

Optimal Neuron Selection and Ensemble Based Learning

12:30 - Lunch

13:30 - Mark Harman, CREST Centre, SSE Group, Department of Computer Science, UCL, UK and Facebook

Deploying Search Based Software Engineering with Sapienz at Facebook

14:00 - John Woodward, School of Electronic Engineering & Computer Science, Queen Mary University of London, UK

Genetic Improvement in a Live System

14:30 - Earl Barr, CREST Centre, SSE Group, Department of Computer Science, UCL, UK

Bimodal software engineering

15:00 - Refreshments

15:30 - John Clark, Department of Computer Science, University of Sheffield, UK

Pushing the searchboat out: from quantum software simulation to digital twinning

16:00 - Jeff Kramer, Department of Computing, Imperial College London, UK

The challenge of change

16:30 - Close of day

Day 2 - Tuesday 4 Dec

11:00 - Pastries

11:30 - Justyna Petke, CREST Centre, SSE Group, Department of Computer Science, UCL, UK

Specialising Software Using Genetic Improvement and Code Transplantation

12:00 - Leandro Minku, School of Computer Science, University of Birmingham, UK

A Novel Automated Approach for Software Effort Estimation Based on Data Augmentation

12:30 - Lunch

13:30 - Gabriela Ochoa, Computing Science and Mathematics, University of Stirling, UK

LON Maps: Recent Advances in Local Optima Networks

14:00 - Erwin Pesch, Faculty of Economics and Business Administration, University of Siegen, Germany

Preventing Crane Interferences at Automated Container Terminals

14:30 - David R. White, Department of Computer Science, University of Sheffield, UK

Gin: a Tool for Program Improvement

15:00 - Refreshments

15:30 - Closing remarks

16:00 - Close of day
Year(s) Of Engagement Activity 2019
URL http://crest.cs.ucl.ac.uk/cow/60/
 
Description Visit to Luxembourg University 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact 8 and 9 April 2019
Research visit to Jacques Klein at Interdisciplinary Centre for Security, Reliability and Trust, University of Luxembourg
Year(s) Of Engagement Activity 2019
URL https://wwwen.uni.lu/research/fstc/computer_science_and_communications_research_unit/members/jacques...