Big Data for Law

Lead Research Organisation: The National Archives
Department Name: Legislation Services

Abstract

There are an estimated 50 million words in the statute book, with 100,000 words added or changed every month. Search engines and services like legislation.gov.uk have transformed access to legislation. No longer the preserve of legal professionals, law is accessed by a much wider group of people, the majority of whom are typically not legally trained or qualified. All users of legislation, from lawyers to researchers in history, linguistics and a myriad of other disciplines, are confronted by the volume of legislation, its piecemeal structure, frequent amendments, and the interaction of the statute book with common law and European law.

There is a problem. Researchers typically lack the raw data, the tools, and the methods to undertake research across the whole statute book. Arts and humanities researchers are constrained. Meanwhile, the combination of low cost cloud computing, open source software and new methods of data analysis - the enablers of the big data revolution - is transforming research in other fields.

Big data research is perfectly possible with legislation if only the basic ingredients - the data, the tools and some tried and trusted methods - were as readily available as the computing power and the storage. The vision for this project is to address that gap by providing a new Legislation Data Research Infrastructure at research.legislation.gov.uk. Specifically tailored to researchers' needs, it will consist of downloadable data; online tools for end-users; and open source tools for researchers to download, adapt and use. There has never been a better time for research into the architecture and content of law.

There are three main areas for research:

1. Understanding researchers' needs: to ensure the service is based on evidenced need, capabilities and limitations, putting big data technologies in the hands of non-technical researchers for the first time.

2. Deriving new open data from closed data: No one has all the data that arts and humanities researchers might find useful in a Legislation Data Research Infrastructure. For example, the potentially personally identifiable data about users and usage of legislation.gov.uk cannot be made available as open data but is perfect for processing using big data tools; eg to identify clusters in legislation or "recommendations" datasets of "people who read Act A or B also looked at Act Y or Z". The project will look at whether it is possible to create new open data sets from this type of closed data. An N-Grams dataset and appropriate user interface for legislation or related case law, for example, would contain sequences of words/phrases/statistics about their frequency of occurrence per document. N-Grams are useful for research in linguistics or history, and could also be used to provide a predictive text feature in a drafting tool for legislation.

3. Pattern language for legislation: We need new ways to model the architecture of the statute book if we are to study it using big data. The project will seek to learn from other disciplines, applying the concept of a 'pattern language' to legislation. Pattern languages have revolutionised software engineering over the last twenty years and have the potential to do the same for our understanding of legislation. A pattern language is simply a structured method of describing good design practices, providing a common vocabulary between users and specialists, framed around problems or issues, with a solution. Patterns are not created or invented - they are identified as 'good design' based on evidence about how useful and effective they are. Applied to legislation, this might lead to a common vocabulary between the users of legislation and legislative drafters, to identify more effective drafting practices and design solutions that effect good law. This could also enable a radically different approach to structuring teaching materials or guidance for legislators.
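
To make the N-Grams dataset mentioned under area 2 more concrete, the following is a minimal sketch of counting word sequences per document. The sample texts, the function names and the simple tokenisation rule are illustrative assumptions, not the project's actual method or data.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def ngram_counts(text, n):
    """Count every sequence of n consecutive tokens in one document."""
    tokens = tokenize(text)
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)

def ngram_dataset(documents, n):
    """Map each document identifier to its own n-gram frequency table."""
    return {doc_id: ngram_counts(text, n) for doc_id, text in documents.items()}

# Hypothetical sample texts, invented for illustration only.
documents = {
    "act-a": "The Secretary of State may by regulations make provision.",
    "act-b": "The Secretary of State may grant a licence.",
}
dataset = ngram_dataset(documents, 3)
```

A frequency table of this kind could in principle feed either a linguistic study or a predictive text feature; a real dataset would need more careful tokenisation than the rule assumed here.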

Planned Impact

Legislators and policy makers
How: providing a new evidence base to understand how much legislation is currently in force, what legislation is used by people and what courts most often refer to; using the Pattern Language to start to better manage the statute book by viewing it as a networked system rather than separate texts, improving scrutiny of new legislation, benefitting from improvements made by drafters and by having a framework to assess good law. Opportunity to feed into a potential new Interpretation Act in the fifth session of this Parliament.

Drafters of legislation
How: providing evidence about what types of pattern in legislative drafting aid/impede clarity; providing a model for the deeper understanding of the architecture and content of law; improved drafting guidance. Opportunity to feed into a project for a new browser-based drafting tool for legislation.

Wider public sector
How: Pattern Language provides a bridge for non-legally-trained professionals who currently struggle to understand a complicated, piecemeal, constantly changing statute book, or to understand how the UK statute book interacts with common and European law.

Legal profession and the judiciary
How: providing a clearer understanding of the substantive content of the law and new knowledge that can improve how new laws are made. The legal profession will benefit from an enhanced understanding of: the substantive interrelationships between distinct legislative enactments and their underlying provisions; the relationship between legislative output and judicial output; how the content of the statute book has developed over time and how this has affected the content of the common law; how legislation may be better framed to promote the objects of the legislation and minimise courts' need to interpret ambiguous legislation; how judicial activity affects legislative activity; and how legislation is actually being used.

Established commercial legal publishers
How: use the data outputs from the project to create new products and services. In the short term, legal publishers will be able to use LDRI outputs to add value to existing products, eg applying taxonomy to legislation using machine learning, or optimising search relevance; for example, to map the relationship between legislation and case law and to use this knowledge to produce new value-added commercial/educational/academic products.

Other businesses
How: Commercial companies will also be able to download the tools and resources available through the LDRI and customise them, to support their own R&D, eg in high growth areas such as automated accountability systems in banking and finance. All businesses will benefit from reducing the burden of compliance and enabling efficiency savings in so far as the project facilitates good law and more effective regulation.

Start-ups
How: Opportunity to develop new and innovative products from new open data provided eg mobile apps (this has already happened with young entrepreneurs using legislation.gov.uk data).

Open data, big data and linked data communities
How: developing methods for creating new open data from closed data that can be applied to other content domains. By being an exemplar, collaborative open data project. Through the wider benefits identified, supporting the evidence base that shows the economic value and impact of open data policies.

General public
How: enhanced user experience when viewing legislation on legislation.gov.uk that better contextualizes the legislation; all citizens will benefit from more accessible legislation in so far as the project facilitates good law and better legislation.

Governments around the world
How: providing an exemplar that others can adapt for their own context, by providing leadership, developing and sharing new ideas that demonstrably lead to better legislation thereby supporting the rule of law.

Description 1) We set out to understand researchers' needs from a Legislation Data Research Infrastructure (LDRI).

We conducted research into users' capabilities to work with legislation data and possible limitations (knowledge, equipment, skills). Through a survey, workshops and one to one interviews, we developed a set of user personas for our new service. These represent archetypal users and informed our proposition, tools and designs. We also tested with users a set of wireframe prototypes of the capabilities we had proposed.

We discovered a significant level of ambition and appetite, from a broad range of research perspectives, to conduct data driven analysis of the statute book as a system. However, the overwhelming majority of researchers were not ready or equipped to make good use of the more sophisticated offering, in terms of data, tools and methods, that we originally imagined. We found many potential academic users of the service with interesting research questions but who were not confident in data analysis, statistical methods or using tools to interrogate data. Almost universally researchers lacked the skills and capability to download and work with raw data. These insights transformed our plans for the development of the Legislation Data Research Infrastructure. To address the user needs we uncovered, as well as providing the raw data, tools and documented research methods envisaged by the award proposal, we developed a set of pre-packaged data analyses and some powerful but easy-to-use web-based tools. These enable researchers to ask complex questions of the statute book and produce robust empirical evidence, without requiring advanced programming or data science skills.

2) We set out to explore deriving new open data from closed data, for example data held by our partners, and to make that available through the LDRI.

The project was fortunate in having excellent commercial and not-for-profit partners, with significant data holdings (Lexis Nexis, Thomson Reuters and the Incorporated Council of Law Reporting (ICLR)). We discovered significant interest and appetite to explore big data techniques amongst those holding rich legal data. We also encountered some blockers - in particular regarding access to court judgments. Although this data is owned by the Crown, we were not able to negotiate access to it from the BAILII database.

We wanted to create a new dataset from law reports data supplied to us by ICLR to enable study of the links between case law and statute law. In order to process the law reports data, to identify references or citations to legislation using natural language processing techniques, we first needed to create a gazetteer or list of legislation. The law reports data provided by ICLR was comprehensive, with reports dating from the nineteenth century to the present day. We found none of the project partners had a sufficiently comprehensive and accurate list of legislation to use as the basis of a gazetteer. To address this issue we gathered 21 different lists of legislation, including information held by The National Archives, our partners, and others such as the Parliamentary Archives and the Office of the Parliamentary Counsel. We also digitised some of the Government-owned paper sources of this information, such as the Chronological Tables of Statutes and the discontinued Table of Government Orders. To extract the information we required, we developed a sophisticated technique using a combination of OCR and natural language processing software.
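
As an illustration of why a gazetteer is needed before citations can be identified, here is a minimal sketch that looks up known titles in a passage of text. It uses plain regular-expression matching as a stand-in for the natural language processing the project describes; the three gazetteer entries, the sample passage and all function names are assumptions made for the example.

```python
import re

# Hypothetical gazetteer entries; a real list would hold every known title.
GAZETTEER = [
    "Theft Act 1968",
    "Human Rights Act 1998",
    "European Communities Act 1972",
]

def build_matcher(titles):
    """Compile one pattern that matches any gazetteer title, longest first."""
    ordered = sorted(titles, key=len, reverse=True)
    alternatives = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

def find_citations(text, titles):
    """Return (canonical title, character offset) for each title found in text."""
    canonical = {t.lower(): t for t in titles}
    matcher = build_matcher(titles)
    return [(canonical[m.group(0).lower()], m.start()) for m in matcher.finditer(text)]

# Invented passage standing in for a law report.
report = ("The appellant was convicted under the Theft Act 1968. "
          "The court also considered the human rights act 1998.")
citations = find_citations(report, GAZETTEER)
```

Exact matching of this kind only finds titles that appear in the list, which is why the completeness and accuracy of the gazetteer matters so much; it would not cope with abbreviated or variant citations.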

We modelled each source dataset using RDF (a metadata data model). We then created a range of queries for identifying levels of match across the different datasets supplied. To resolve discrepancies we undertook significant additional research, going back to primary sources, such as Statutes of the Realm. This exhaustive process enabled us to refine the resulting data and produce the first truly comprehensive list of primary and secondary legislation enacted or made in the UK.
The development of a very high quality, validated, core reference dataset of legislation, made available as open data, and maintained by The National Archives as part of legislation.gov.uk, is one of the most important outputs of the research. It will be an invaluable resource for researchers in years to come.
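
The idea of querying for levels of match across source datasets can be sketched as follows. Each source is held as a set of subject/predicate/object triples, loosely mirroring RDF statements, using plain Python tuples rather than an RDF library. The source names, identifiers, titles and the three match categories are illustrative assumptions, not the project's data model or queries.

```python
from collections import defaultdict

# Each source is a set of (subject, predicate, object) triples.
# All names and values below are illustrative only.
SOURCES = {
    "source-1": {
        ("ukpga/1968/60", "title", "Theft Act 1968"),
        ("ukpga/1972/68", "title", "European Communities Act 1972"),
    },
    "source-2": {
        ("ukpga/1968/60", "title", "Theft Act 1968"),
        ("ukpga/1972/68", "title", "European Communities Act, 1972"),
        ("ukpga/1998/42", "title", "Human Rights Act 1998"),
    },
}

def match_report(sources, predicate):
    """Classify each subject by how well the sources agree on one predicate."""
    values = defaultdict(dict)  # subject -> {source name: object value}
    for name, triples in sources.items():
        for subject, pred, obj in triples:
            if pred == predicate:
                values[subject][name] = obj
    report = {}
    for subject, by_source in values.items():
        if len(by_source) < len(sources):
            report[subject] = "missing from some sources"
        elif len(set(by_source.values())) == 1:
            report[subject] = "full match"
        else:
            report[subject] = "discrepancy"
    return report

report = match_report(SOURCES, "title")
```

Items classed as a discrepancy or as missing are the ones that would need the kind of manual checking against primary sources described above.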

We found there to be significant value in using multiple sources to create high quality core reference data. We also learnt that compiling high quality data from multiple sources involves a significant level of manual editorial work to resolve discrepancies. We found that the flexibility of RDF lends itself to managing this kind of data. The data model (the ontology) for what is notionally a simple dataset is surprisingly complicated. The resultant dataset also tells the story of the evolution of the law, Parliament and government, over 750 years.

3) We set out to examine the concept of a pattern language for legislation. Our hypothesis was that there are commonly occurring legal design solutions, in legislation, to commonly occurring policy problems. We wanted to identify these patterns as a way of abstracting or mapping the statute book.

We ran a series of workshops with small groups of drafters of legislation from the Parliamentary Counsel offices and the Government Legal Department. As a result of these workshops and our research, we developed an initial catalogue of design patterns in legislation. Our candidate patterns are mainly in the field of public law and can alternatively be thought of as patterns of decision-making systems, in areas of social or economic activity over which the state seeks to exercise a degree of control. Example patterns include the Licensing Pattern, the Regulator Pattern and the Protector Pattern.
We wanted to find a way of expressing the patterns more formally in terms of legal relationships. To do this we experimented with using Hohfeldian jurisprudence and the Hohfeldian terms to concretely express the patterns. We also experimented with using the Hohfeldian terms as a bridge to finding instances of the patterns. We found that neither traditional search tools nor natural language processing techniques were sufficient to help find instances of a typical pattern in legislation. Often the characteristics of a pattern would involve a set of related provisions, for example in the case of the Licensing Pattern, the creation of an offence and the granting of a power to a public authority to permit a particular activity. To search for instances of these more complex patterns we developed a powerful but easy to use "Query Builder" tool and an associated domain specific query language for legislation. This has the capability to exploit the structure of the legislation data (that Acts are sub-divided into parts, chapters and sections, grouped by cross headings) as part of the search.
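
The following minimal sketch shows the kind of structure-aware search this implies: finding Acts in which different sections together satisfy every characteristic of a pattern. It is not the Query Builder or its query language. The two sample Acts, the regular expressions standing in for the Licensing Pattern and the function name are all hypothetical.

```python
import re

# Hypothetical, heavily simplified Acts: a title plus numbered sections.
ACTS = [
    {"title": "Example Licensing Act", "sections": [
        {"number": 1, "text": "A person who sells widgets without a licence commits an offence."},
        {"number": 2, "text": "The authority may grant a licence to sell widgets."},
    ]},
    {"title": "Example Offences Act", "sections": [
        {"number": 1, "text": "A person who damages a widget commits an offence."},
    ]},
]

# Each characteristic of the pattern is a regular expression tested per section.
LICENSING_PATTERN = {
    "offence": re.compile(r"commits an offence", re.IGNORECASE),
    "power_to_permit": re.compile(r"may grant a licence", re.IGNORECASE),
}

def find_pattern(acts, pattern):
    """Return Acts where every characteristic matches at least one section."""
    results = []
    for act in acts:
        hits = {
            name: [s["number"] for s in act["sections"] if rx.search(s["text"])]
            for name, rx in pattern.items()
        }
        if all(hits.values()):
            results.append((act["title"], hits))
    return results

matches = find_pattern(ACTS, LICENSING_PATTERN)
```

A flat keyword search would return both sample Acts for "offence"; requiring each characteristic to be met somewhere within the same Act is what narrows the result to the one that resembles a licensing regime.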

The Query Builder tool aided us in finding multiple instances of the patterns. We refined the domain specific query language, to include returning both instances (portions of a piece of legislation) and counts, including counts by year. We have found this more generic capability to be extremely powerful and flexible. We included various options to the tool to deal with some of the complexity of the documents, for example that one piece of legislation amends another. There is a facility to either include or exclude amending texts, or to search only the amending texts. There is also a facility to include or exclude inoperative text (headings etc). We have discovered that a very wide range of questions about the evolution of the statute book can now be expressed and answered using the Query Builder tool.
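
The options described above (returning instances or counts by year, and including, excluding or isolating amending text) can be illustrated with a small sketch. The sample provisions, the "amending" flag and the function names are assumptions for the example and do not reflect how the Query Builder is implemented.

```python
from collections import Counter

# Hypothetical provisions: year of enactment, whether the text amends
# another piece of legislation, and the text itself.
PROVISIONS = [
    {"year": 1990, "amending": False, "text": "The Minister may by order designate an area."},
    {"year": 1990, "amending": True, "text": "In section 3, for 'may by order' substitute 'must'."},
    {"year": 1995, "amending": False, "text": "The authority may by order vary the fee."},
    {"year": 1995, "amending": False, "text": "This Act extends to Scotland."},
]

def search(provisions, phrase, amending="include"):
    """Return the provisions containing phrase.

    amending: "include" searches all text, "exclude" skips amending text,
    "only" searches amending text alone.
    """
    if amending not in ("include", "exclude", "only"):
        raise ValueError("amending must be 'include', 'exclude' or 'only'")
    wanted = []
    for p in provisions:
        if amending == "exclude" and p["amending"]:
            continue
        if amending == "only" and not p["amending"]:
            continue
        if phrase.lower() in p["text"].lower():
            wanted.append(p)
    return wanted

def counts_by_year(instances):
    """Summarise a list of matching provisions as a year -> count mapping."""
    return dict(Counter(p["year"] for p in instances))

all_hits = search(PROVISIONS, "may by order")
```

Keeping the instances and deriving the counts from them means a researcher can always move from a number back to the provisions behind it.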
Exploitation Route Academic researchers:
The Legislation Data Research Infrastructure enables new lines of academic enquiry into legislation and the evolution of statute law. By providing raw data as well as easy to use tools, it is truly transformative as an enabler of new types of research. Questions about the volume of legislation, the numbers and nature of powers granted by Parliament to the government, or the types of legal design solution being legislated, can now be rapidly and easily explored. The language of legislation can also be more readily researched. The provision of the data in a wide variety of formats enables the application of other types of analysis to the statute book (e.g. word vector analysis). The tools enable academics from non-computing or non-data orientated backgrounds to conduct this type of research for themselves quickly and easily. The examples we have provided will inspire and stimulate further research questions. The tools are easy enough to be used with undergraduates, as part of a project or assignment, whilst being sophisticated enough to enable entirely new lines of enquiry.

The pattern language provides a new way of analysing the shape and evolution of the statute book. The approach to framing patterns might be refined, new patterns found and new methods of discovering patterns tried.

Policy makers:
The patterns could be used by policy makers to better manage the statute book. By clarifying what legal design solutions are available, policy makers may be able to give more coherent instructions to drafters of legislation. The patterns could be used by drafters to design legislation at a different level of abstraction. They may also be used to aid understanding by readers, making it easier to get the gist of a piece of legislation. Findings from the research could be used to offer strategic insight into drafting techniques. The data and tools might be used by researchers for MPs and lobbyists, as well as by independent organisations, to aid better scrutiny of legislation. Both the Institute for Government and the Full Fact organisation have expressed considerable interest in using the capabilities and tools provided by the project. Policy makers from other jurisdictions (there has been significant interest both in Australasia and Europe) may use our findings to catalyse their own domestic research.

Legal professionals:
The legal profession and the judiciary could use the tools and pre-packaged data to gain a clearer understanding of the substantive content of the law and new knowledge that can improve how new laws are made. They could use the Query Builder tool, for example to explore the substantive interrelationships between distinct legislative enactments and their underlying provisions; the relationship between legislative output and judicial output; how the content of the statute book has developed over time and how this has affected the content of the common law; how legislation may be better framed to promote the objects of the legislation and minimise courts' need to interpret ambiguous legislation; how judicial activity affects legislative activity; and how legislation is actually being used.

Commercial:
The data can be used in its own right to underpin high value commercial products and services. The tools we have developed enable various facts to be derived from legislation data much more easily. For example, information about the relationship between UK and EU legislation can be more readily compiled and used to develop commercial products or services around compliance with EU law. By making the data available as open data, both start-up innovators and traditional legal publishers can use the data or the tools for commercial benefit.
Sectors Communities and Social Services/Policy,Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Government, Democracy and Justice

URL http://research.legislation.gov.uk
 
Description The concept of a pattern language has been taken forward by Parliamentary and legislative counsel from the four UK drafting offices. They set up a group to develop the patterns and have published the results as a set of "Common Legislative Solutions" by the Parliamentary Counsel Office (https://beta.gov.scot/publications/guidance-instructing-counsel-common-legislative-solutions/). The conversion of the legislation data on legislation.gov.uk to Akoma Ntoso has proved the efficacy of this standard sufficiently well that it has now been adopted by the Parliaments and Governments of the UK, and is being used in the new browser based drafting, amending and publishing tool. This in turn has influenced the European Commission who have funded the development of an open source legislation drafting tool (LEOS) through the ISA programme, that uses Akoma Ntoso as a data model. The data download service is fully maintained as part of legislation.gov.uk with weekly updates. The core reference dataset created by the project, that lists all legislation over the last 800 years, is maintained on a daily basis and is at the heart of legislation.gov.uk's Linked Data strategy and service offering. The Query Builder tool and the Words Explorer tool are being used by legislation drafters, policy makers and practitioners in government to identify, count or measure different aspects of the statute book. The Query Builder is being used inside government on a daily basis to help identify legislation impacted by Brexit. The Government Legal Department (SI Hub) are using these tools to identify and measure quality metrics for the drafting of secondary legislation, demonstrating the transformative capability of the new tools in a policy context. The Government Digital Service (GDS) have used the Words Explorer tool to quickly and easily find all of the statutory registers.
This provided them with evidence for the 'discovery' phase of their registers initiative, which is now being taken forward as part of the government data programme and the government as a platform agenda. Another part of GDS have used the tools to help provide evidence for a common licensing digital service for the public sector. The Words Explorer has been used, as a teaching aid, for postgraduate students at the Institute of Advanced Legal Studies to provide empirical evidence for their post-doctoral work. The datasets we have created as part of the new research service have helped us to produce a new core reference dataset of legislation, and the recommended use of this dataset is now included in the official guidance on Statutory Instrument Practice provided to the drafters of secondary legislation. In Europe, we have developed models that will aid the interoperability of UK and EU legislation data, helping researchers to address questions about the impact of EU law on the UK statute book. The findings of the project are to be included in a study for the European Commission on how big data and data analytics offer strategic insight for policy making. The official UK government paper is formally included in the Council and European Council documents, which are made available through its public register. This, and presentations to the European Council E-Law (E-Law) working party ensure that the findings of the project are feeding into future work programmes. Research findings are also being used to shape the direction of key open data initiatives, such as the openlaws.eu initiative which aims to help Europe innovate in the legal field; and the UK Administrative Justice Institute (UKAJI), which aims to encourage more research into administrative justice. Legislation patterns relating to the patterns of decision making, discovered through the research, are particularly relevant to the UKAJI.
The findings of the research have also been presented to government and across the civil service - in civil service round tables, briefings to ministers, in evidence to Parliamentary Select Committees and in evidence to the Speaker's Commission on Digital Democracy, demonstrating how the research findings and tools developed can aid better Parliamentary Scrutiny of legislation.
First Year Of Impact 2017
Sector Communities and Social Services/Policy,Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Government, Democracy and Justice
Impact Types Cultural,Societal,Economic,Policy & public services

 
Description Evidence to a private session of the House of Lords Constitution Committee about opportunities for improved scrutiny through use of advanced legislation data querying tools
Geographic Reach National 
Policy Influence Type Gave evidence to a government review
 
Description Guidance on Instructing Counsel: Common Legislative Solutions
Geographic Reach National 
Policy Influence Type Citation in other policy documents
Impact We catalysed a group of drafters around the legislative pattern language, developed as part of Big Data for Law, to identify common legislative solutions to common policy issues. Drafters from all four legislative drafting offices in the UK applied our research to develop guidance intended to help drafters to develop policy and produce instructions for primary legislation of certain commonly occurring types, such as establishing a new public body or creating a licensing regime. "Guidance on Instructing Counsel: Common Legislative Solutions" (ISBN 9781788513722; https://beta.gov.scot/publications/guidance-instructing-counsel-common-legislative-solutions/) is helping to make the initial instructions for primary legislation as rigorous and detailed as possible, and making the process of drafting Bills more efficient in turn. Their work was shortlisted for "collaboration award" in the Civil Service Awards 2017 (see: http://www.civilserviceawards.com/shortlist). In his foreword to the guidance, Andy Beattie, Chief Parliamentary Counsel for Scotland, writes: "Its genesis is in work undertaken by the National Archives to research patterns which occur in legislation: in other words, common legislative solutions to policy questions or problems which occur frequently".
URL https://beta.gov.scot/publications/guidance-instructing-counsel-common-legislative-solutions/
 
Description Influencing Daniel Thornton, Programme Director for the Institute for Government about use of research outcomes in "Whitehall Monitor"
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
 
Description Providing the tools for policy makers and other interested parties to understand what legislation is in force
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact The Query Builder tool, developed as part of the legislation.gov.uk Research service, has been used to find revocations in legislation in order to establish which of a number of older implementing statutory instruments are still in force. The tool was used to search for references to the enactment in later enactments to identify possible revocations, and is useful for anyone who needs to find out whether a piece of secondary legislation is still in force. For example, from a list of 300 enactments, 200 were identified as revoked, and this reduced the editorial work required in this area by two thirds. The use of the tool, and examples of how it can provide the evidence to change policy and practice, is changing capability and skills across the public sector.
 
Description Provision of training and support to central government departments and legislation drafters
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact The National Archives' legislation services team provided training on the use of the legislation.gov.uk Research service to officials from central government departments, and lawyers responsible for drafting legislation, to help them to undertake research to understand legislative requirements following the UK's departure from the EU. The tool is being used to provide evidence and data to support policy making and drafting.
 
Description Supporting government departments to find out which legislative enactments implement European Directives
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
Impact The Query Builder tool, developed as part of the legislation.gov.uk Research service has been used to find out which enactments implement (or possibly implement) European Directives. This is helping government to predict the impact of the UK leaving the EU on the UK statute book, and is useful to anyone in the wider public sector interested in having an accurate, consolidated list of implementing enactments. It is also potentially of use to the EU Publications Office (who are responsible for delivering EurLex - the European legislation website) because it will help them to identify errors or omissions in their database. The availability of the tool, and examples of how it can be used to gather the evidence necessary for informed policy making is influencing practice in the public sector - this work would have been difficult, or impossible, to achieve without the tools developed as part of the research service.
 
Description Use of query builder by senior policy officials in the Northern Ireland government to aid understanding of implications of Brexit
Geographic Reach Local/Municipal/Regional 
Policy Influence Type Influenced training of practitioners or researchers
Impact John Sheridan briefed senior policy officials in the Northern Ireland government about how they can use the tools developed as part of the Big Data for Law research project to help understand the implications of the UK leaving the EU on Northern Ireland legislation. Further training and instruction was provided on how to use the tools to those officials, which has resulted in a much better understanding of the legal position.
 
Description Using the Query Builder tool to find company-related EU instruments
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact The Query Builder tool, developed as part of the legislation.gov.uk Research service, was used to create a list of company- and EU-related enactments for a central government department, using a structured search to find enactments which relate to companies (in their title or subject) and which refer to the European Communities Act 1972 in their introductory provisions, or refer to implementations/directives/regulations in their explanatory notes. The use of the tool demonstrates impact in providing practitioners with the research tools they need to support evidence based policy making - adding to the capabilities of government and the wider public sector.
 
Description Using the Query Builder tool to find enactments made under the European Communities Act 1972
Geographic Reach Europe 
Policy Influence Type Influenced training of practitioners or researchers
Impact The Query Builder tool, developed as part of the legislation.gov.uk Research service has been used to discover which enactments are made under the European Communities Act 1972 to provide a list of enactments to a central government department. The structured search was used to find enactments which refer to the Act in their introductory provisions. This provides an invaluable resource for government departments, quickly and efficiently finding information that would otherwise have taken very many hours of manual research. The ability to use tools that surface information efficiently and accurately is supporting evidence-based policy making.
 
Description Using the Query Builder tool to find references to Data Protection and privacy
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact The query builder tool was used to provide data to a central government department on the enactments that refer to data protection and privacy related terms. This use of the research tool clearly demonstrates the potential of the tool to support government decision making and evidence-based policy.
 
Title Census of the statute book 
Description Census of the statute book - Our user research revealed that the option many researchers would most like is a set of pre-packaged analyses of the data that they can easily access and use online. To meet this need the new service will provide an online census of the statute book. When most people think of the statute book they think of words rather than numbers, yet the simple act of counting can reveal much about the law, the evolution of policy, politics, history, as well as the evolution of drafting techniques and practice. Imagine, for example, being able to count how many times a legally significant word or phrase has appeared in legislation. What might that reveal about drafting styles and trends? What might counting the number of retrospective provisions reveal? There are so many possibilities and yet so little about the statute book has been measured before. With so many options, the first challenge for the project has been to decide what to count. We tested ideas for core indices with researchers during the user testing. As a result, our annual census will include indices for the number of words, the use of legally significant phrases, the frequency of amendments, the occurrence of internal and external references and the use of powers. An important piece of feedback from the user testing was the need researchers have to know of instances, as well as counts. Users will not only be able to use these indices to discover the numbers, they will also be able to drill down into the instances. So if a word appeared 50 times in a year, for example, the researcher will be able to drill down into the data to find out where those 50 instances were; which specific pieces of legislation. 
Type Of Material Improvements to research infrastructure 
Provided To Others? No  
Impact Developing a service that meets users' needs (based on user research) The ability to track instances as well as counts. The simple act of counting can reveal much about the law, the evolution of policy, politics, history, as well as the evolution of drafting techniques and practice. 
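
The idea of an index that reports counts and also lets the researcher drill down to instances can be sketched as follows. This is a minimal illustration only, not the project's implementation: the toy corpus, the document identifiers and the function name census_index are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy corpus of (year, document identifier, text).
# These are invented sentences, not real legislation data.
CORPUS = [
    (2013, "doc-a", "The Secretary of State may by order make provision. "
                    "The Secretary of State must consult."),
    (2013, "doc-b", "A person commits an offence if the person fails to comply."),
    (2014, "doc-c", "The Secretary of State may by regulations amend this section."),
]


def census_index(corpus, phrase):
    """Count a phrase per year and keep every instance for drill-down.

    Returns {year: {"count": int, "instances": [(doc_id, char_offset), ...]}}.
    Matching is case-insensitive and non-overlapping.
    """
    needle = phrase.lower()
    index = defaultdict(lambda: {"count": 0, "instances": []})
    for year, doc_id, text in corpus:
        haystack = text.lower()
        start = haystack.find(needle)
        while start != -1:
            index[year]["count"] += 1
            index[year]["instances"].append((doc_id, start))
            start = haystack.find(needle, start + len(needle))
    return dict(index)


result = census_index(CORPUS, "Secretary of State")
for year in sorted(result):
    docs = sorted({doc_id for doc_id, _ in result[year]["instances"]})
    print(year, result[year]["count"], docs)
```

Keeping the list of instances alongside each count is what allows a figure such as "50 occurrences in a year" to be traced back to the specific pieces of legislation.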
 
Title Pattern language for legislation 
Description Pattern language for legislation - one of the research questions for the project is to examine the concept of a 'pattern language' for legislation as a way of transforming our understanding of the statute book, bridging the gap between users of legislation, policy makers and drafters. A pattern language is simply a set of design patterns. Each design pattern is framed in terms of a structured method, and generalises a particular good design practice. Pattern languages are most helpful when used to conceptualise and help design or manage large complex systems. We have reviewed the literature around pattern languages, from architecture and software engineering and design and identified the common characteristics of a pattern language and individual design patterns. In our pattern language for legislation each design pattern consists of four elements: a name, a problem, a solution and the consequences of applying the pattern. We have also identified 'candidate' patterns in legislation, holding workshops with a range of individuals and organisations, ranging from commercial legal publishers to drafters and academics. To formalise the patterns rigorously we are also exploring the use of Hohfeldian jural correlatives. 
Type Of Material Improvements to research infrastructure 
Provided To Others? No  
Impact Will help to conceptualise and manage a complex system (the statute book). Outputs can be used to transform our understanding of the statute book, bridging the gap between users of legislation, policy makers and drafters. 
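
The four-element structure of a design pattern described above (name, problem, solution, consequences) could be represented as a simple record. The sketch below is an assumption-laden illustration: the class and function names are hypothetical, and the example pattern is invented for the purpose and is not one of the project's candidate patterns.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class DesignPattern:
    """One design pattern with the four elements named in the description."""
    name: str
    problem: str
    solution: str
    consequences: Tuple[str, ...]


def build_language(patterns: Iterable[DesignPattern]) -> Dict[str, DesignPattern]:
    """A pattern language is a set of patterns; index it by unique name."""
    language: Dict[str, DesignPattern] = {}
    for pattern in patterns:
        if pattern.name in language:
            raise ValueError(f"duplicate pattern name: {pattern.name}")
        language[pattern.name] = pattern
    return language


# Invented example for illustration only.
EXAMPLE = DesignPattern(
    name="Commencement by appointed day",
    problem="Provisions may need to come into force on a date "
            "that is not known when the text is made.",
    solution="Give a named authority power to appoint the day "
             "on which the provisions come into force.",
    consequences=(
        "Readers must consult a separate instrument to learn "
        "whether a provision is in force.",
    ),
)

language = build_language([EXAMPLE])
print(sorted(language))
```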
 
Title Query Builder tool 
Description We discovered that many researchers lacked the ability to work with the complex and sophisticated legislation data we were providing. To bridge this gap we developed a 'query builder' tool. This is a domain specific query language for legislation that enables searching the statute book using the structure of legislative documents. Users can target individual legislative provisions and receive direct links to the matching document components. For example they can search for chapters that contain certain phrases, or sections that contain certain words in their headings. The query builder aids finding instances of legislative patterns, which traditional search tools are not capable of identifying. The results can also be returned as counts; effectively as observations from slices of a multi-dimensional data cube which conceptually sits above the statute book. These counts form the foundation of the Census of the statute book - pre-packaged data that researchers can link to and use. 
Type Of Material Improvements to research infrastructure 
Provided To Others? Yes  
Impact A powerful new tool that enables researchers to ask complex questions of the statute book without needing to learn a programming language. It makes it easy to identify and count or measure different aspects of the statute book - underpinning the Census of the statute book (pre-packaged data that users can link to and use). It revolutionises the research carried out by legal researchers. Framing appropriate queries also generates new datasets, for example enabling powers or offences. The tool is now being used by drafters of legislation. 
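
To illustrate what searching by document structure means, the sketch below finds sections whose heading contains a given word and returns a reference to each matching component. It does not reproduce the Query Builder's own query language, which is not shown in this record. The XML element names (act, section, heading, text) and the identifiers are hypothetical and far simpler than real legislation markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup invented for this example.
SAMPLE_XML = """
<act id="example-act">
  <section id="s1"><heading>Register of members</heading>
    <text>The company must keep a register.</text></section>
  <section id="s2"><heading>Offences</heading>
    <text>A person commits an offence.</text></section>
  <section id="s3"><heading>Inspection of register</heading>
    <text>The register is open to inspection.</text></section>
</act>
"""


def sections_with_heading_word(xml_text, word):
    """Return 'act-id#section-id' for sections whose heading contains word."""
    root = ET.fromstring(xml_text.strip())
    target = word.lower()
    matches = []
    for section in root.iter("section"):
        heading = section.findtext("heading", default="")
        if target in heading.lower().split():
            matches.append(f"{root.get('id')}#{section.get('id')}")
    return matches


hits = sections_with_heading_word(SAMPLE_XML, "register")
print(hits)       # references to the matching components
print(len(hits))  # the same result returned as a count
```

The second print shows the same query returned as a count rather than a list, which is the form that feeds a census-style index.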
 
Title Words Explorer tool 
Description A Words Explorer tool that enables the user to instantly explore the frequency of use of words and phrases in statutes over the last 115 years. It is an Ngrams tool (inspired by a tool originally developed by Google and used to explore the occurrence of words across a large database of literature). In parallel we have created a new legislation Ngrams database - opening up the data that enables the large scale production of empirical evidence about the statute book. The tool has the capability to identify larger n-grams by frequency, through querying the data for a smaller n-gram. It was used by the Government Digital Service to identify all the Statutory Registers (searching for 'register of'). This provided essential evidence for the 'discovery' phase of their registers project, which is now being taken forward as part of the government data programme and the 'government as a platform' agenda (see https://www.gov.uk/service-manual/technology/government-as-a-platform.html). 
Type Of Material Improvements to research infrastructure 
Provided To Others? Yes  
Impact Provision of new data. Provides legal researchers who lack the confidence or skills to work with raw data an easy to use 'starter' tool that allows an immediate exploration of the statute book without requiring technical and statistical expertise. The tool is simple but still produces robust empirical evidence about how our system of statute law is evolving. Without this tool, for example, the Government Digital Service would have found it difficult, if not impossible, to quickly and easily find all of the registers required under statute. With the tool it took a few seconds. 
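
The technique of finding larger n-grams by querying for a smaller one, as in the 'register of' search, can be sketched as below. This is a minimal illustration under stated assumptions: the sample sentences are invented, the tokenizer is deliberately naive, and the function names are hypothetical rather than taken from the Words Explorer tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extend_ngram(texts, query, n):
    """Count the n-grams of size n that begin with the smaller query n-gram."""
    prefix = tuple(tokenize(query))
    counts = Counter()
    for text in texts:
        for gram in ngrams(tokenize(text), n):
            if gram[:len(prefix)] == prefix:
                counts[" ".join(gram)] += 1
    return counts


# Invented sample sentences, not real statute text.
SAMPLE = [
    "The registrar shall keep a register of members.",
    "A register of charges must be kept. "
    "The register of members is open to inspection.",
]

counts = extend_ngram(SAMPLE, "register of", 3)
for phrase, count in counts.most_common():
    print(count, phrase)
```

Ranking the extended n-grams by frequency is what turns a two-word query into a list of candidate register names.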
 
Title Akoma Ntoso 
Description Akoma Ntoso - we have developed a high quality conversion routine from the Crown Legislation Markup Language (CLML, the data format used to store and publish legislation on legislation.gov.uk) to Akoma Ntoso (which is evolving into an international standard for legislation and other legal documents). This means that people can easily use tools that have been designed for working with Akoma Ntoso data, with UK legislation. We have discussed our work with Akoma Ntoso data with leading academics, for example hosting a two-day visit to The National Archives by Fabio Vitali from the University of Bologna. All the legislation data on legislation.gov.uk is now available in Akoma Ntoso. 
Type Of Material Data handling & control 
Year Produced 2015 
Provided To Others? Yes  
Impact The Akoma Ntoso conversion work enabled by the Big Data for Law project proved the efficacy of an Akoma Ntoso model for UK legislation documents sufficiently well that it has now been adopted as a standard by the Parliaments and Governments of the UK for a new browser-based drafting, amending and publishing tool. This tool is being developed by The National Archives, the two Houses of the UK Parliament, the Scottish Parliament, the Office of the Parliamentary Counsel and the Scottish Government's Parliamentary Counsel Office. For more information see the drafting, amending and publishing tool technology choices factsheet which you can download from: http://www.legislation.gov.uk/projects/drafting-tool. 
 
Title Bulk downloads 
Description Bulk downloads - we have written software that constructs a legislation.gov.uk dataset that can be bulk downloaded (approximately 30Gb of data), and routines for this dataset to be maintained on a regular basis as new legislation is made and new revised versions of legislation are published. 
Type Of Material Data handling & control 
Year Produced 2015 
Provided To Others? Yes  
Impact Critical to the development of the legislation data research infrastructure. The availability of UK legislation as bulk data led to the UK achieving number one position in the world for its legislation data, in the Open Knowledge Foundation Open Data Census: http://census.okfn.org/ 
 
Title Datasets 
Description Datasets - we have identified a number of other datasets, in addition to the legislation texts, that we will make available as part of the Big Data for Law service. These include an up-to-date database of legislative amendments, a bibliographic dataset for legislation from 1970 to date, a dataset of government orders and the chronological tables of general and local statutes. We have obtained sample sets of the legislation.gov.uk raw usage data, recorded by Akamai, and are exploring how best to analyse it using Apache Mahout. 
Type Of Material Database/Collection of data 
Provided To Others? No  
Impact Provision of new data. 
 
Title HTML5 
Description HTML5 - we have developed a conversion from Akoma Ntoso to HTML5 (the new version of HTML) with RDFa, which embeds additional metadata using RDF. There are several features of HTML5 that make it very appealing as a mark-up language for legislation, and there is a rapidly expanding base of tools for working with HTML5 data. We have extended Akoma Ntoso so that certain features catered for in both CLML and HTML5 are not lost in the conversion process. We expect most of the users of the data to use this new HTML5 version. 
Type Of Material Data handling & control 
Provided To Others? No  
Impact We expect most data re-users to use the new HTML5 version of legislation that we have developed - there is a rapidly expanding base of tools for working with HTML5 data. 
 
Title Hosting 
Description The project has procured a cloud-based hosting environment for datasets that we will be publishing to bulk download. There will be options to download the data directly or via a torrent (to help manage hosting costs). 
Type Of Material Data handling & control 
Provided To Others? No  
Impact Essential to the delivery of the legislation data research infrastructure - the sustainable output required from the research funding. 
 
Title Linked data/core reference dataset 
Description Linked Data/core reference dataset - we have developed a Linked Data Strategy, which is a key element to releasing a set of connected datasets. The first step has been to construct a core reference dataset that lists all legislation, with identifiers that reference where each piece of legislation can be found. As well as data from legislation.gov.uk, this dataset includes data from Lexis Nexis, the Parliamentary Archives and Thomson Reuters. 
Type Of Material Database/Collection of data 
Year Produced 2016 
Provided To Others? Yes  
Impact Provision of new data and a new canonical identifier and citation scheme that can be used wherever legislation needs to be referenced. Recommended use of this dataset is included in the official guidance on Statutory Instrument Practice provided to the drafters of secondary legislation. 
 
Description Data Science Community of Interest group 
Organisation Data Science Community of Interest Group
Country United Kingdom 
Sector Public 
PI Contribution Judith Riley, a member of The National Archives' Big Data for Law project team, represents the project on the Data Science Community of Interest Group; a cross-government group that shares best practice on big data research, and showcases Big Data projects across Government.
Collaborator Contribution Dissemination; Sharing of expertise; Forums for showcasing the project's work.
Impact Dissemination; Sharing of expertise; Forums for showcasing the project's work.
Start Year 2014
 
Description Partnership with Lexis Nexis 
Organisation LexisNexis UK
Country United Kingdom 
Sector Private 
PI Contribution John Sheridan presented to over 100 legal editors at Lexis Nexis on 6 June 2014. The presentation was wide ranging, discussing the future of legislation, Big Data for Law, and areas of collaboration. The presentation is available on the Lexis Nexis intranet, and as a Vimeo video at: http://vimeo.com/99344104. We have also conducted a series of workshops with Lexis Nexis to frame the development of indices for the annual census of the statute book we aim to deliver as part of the project, and to explore potential legislation patterns.
Collaborator Contribution Lexis Nexis have contributed staff time and expertise, and have embraced the opportunity to collaborate with the Big Data for Law project team.
Impact Promotion of project aims and ambitions, internally and externally; Contribution to research on the concept of pattern languages; Contribution to the development of indices for the annual census of the statute book; Provision of data.
Start Year 2014
 
Description Partnership with Thomson Reuters 
Organisation Thomson Reuters
Country United States 
Sector Private 
PI Contribution We have held a one-day workshop with Thomson Reuters, who have provided us with data to include in our core legislation reference database.
Collaborator Contribution Provision of data
Impact Provision of data for a core reference database
Start Year 2014
 
Description Partnership with the Incorporated Council of Law Reporting (ICLR) 
Organisation Incorporated Council of Law Reporting
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution The Incorporated Council of Law Reporting (ICLR) are active members of the project board, and have provided us with case law data. We are currently working with the ICLR to explore how we can process this currently closed data set to create new open data for the legislation data research infrastructure.
Collaborator Contribution Provision of data; Input into ideas around pattern languages; Creating indices for the census of the statute book.
Impact Input into draft indices for the annual census of the statute book; Input into discovering candidate patterns for the pattern language research.
Start Year 2014
 
Description Partnership with the Office of Parliamentary Counsel (OPC) 
Organisation Cabinet Office
Department Office of Parliamentary Counsel
Country United Kingdom 
Sector Public 
PI Contribution The National Archives has built upon its existing, strong relationship with the Office of Parliamentary Counsel, and the Good Law initiative to help ensure that the project's aims and ambitions are connected to wider government policy drivers (the aims of the Good Law initiative).
Collaborator Contribution The Office of Parliamentary Counsel (OPC) have hosted workshops for us to discuss the wider context of the project (including the Good Law initiative, managed by the OPC and the Cabinet Office) and to discuss key areas of the project in depth, for example exploring technical challenges and patterns in legislation. Drafters at the OPC have agreed to take part in a pattern language workshop. The OPC have also helped us to promote the Big Data for Law project to a wider audience across government.
Impact Promotional activities, for example on social media; Workshops to progress project objectives and research questions.
Start Year 2014
 
Description Partnership with the Parliamentary Archives 
Organisation Government of the UK
Department Parliamentary Archives
Country United Kingdom 
Sector Public 
PI Contribution We have been working closely with the Parliamentary Archives, who have provided us with data for the core legislation reference database.
Collaborator Contribution Provision of data.
Impact Provision of data for a new core reference database
Start Year 2014
 
Title Europe 
Description Europe - we have agreed an approach with the Office of Publications of the European Union (OPEU) to align the UK's use of the Functional Requirements for Bibliographic Records (FRBR) model of works, expressions and manifestations with that used by the OPEU for European legislation in the CELLAR (the database that underpins EUR-Lex). We have mapped the legislation-specific metadata requirements for the third pillar of the European Legislation Identifier initiative to Dublin Core properties. We have agreed an approach to expressing transposition information as data, and we are participating in a European Council project to begin to exploit data in the National Implementing Measures database, working with the Commission and their contractor PWC. 
Type Of Technology New/Improved Technique/Technology 
Year Produced 2014 
Impact Models that will aid the interoperability of UK and EU legislation data; Will help researchers to address questions about the impact of EU law on the UK statute book, which is potentially of considerable research, political and media interest. 
 
Title Natural language processing 
Description Natural language processing - we have identified the legislation-specific natural language processing components that we will make available for researchers to use, with the GATE toolset. These components are capable of identifying internal and external references in legislation documents, as well as the enabling powers in Statutory Instruments, and the main types of textual amendments. 
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact Crucial to the development of the legislation data research infrastructure. 
 
Description Announcement, 6 February 2014 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact The National Archives issued a press release when the award from AHRC was formally announced, and ensured Ministers were formally briefed on the project's scope and ambitions. The press release included quotes from Ministry of Justice Minister Simon Hughes and from Richard Heaton (Permanent Secretary at the Cabinet Office and First Parliamentary Counsel). It was uploaded onto The National Archives' website and generated some coverage (for example Information Daily, Global Banking and Finance Review, and legal informatics website) alongside social media activity. We have also created a project page for Big Data for Law on the legislation.gov.uk website: www.legislation.gov.uk/projects/big-data-for-law

Press coverage; Social media activity; Web presence; Ministerial recognition.
Year(s) Of Engagement Activity 2014
 
Description Civil Service roundtable, 18 September 2014 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact John Sheridan, Principal Investigator for the Big Data for Law project, represented the project at a civil service roundtable about informing policy decisions with Big Data. The roundtable discussed a range of issues: What benefits can analytics bring to policy-making? What kinds of big data might policy makers use to inform policy decisions? What skills will be needed to understand this data? What cultural changes might be needed to ensure that analytics are embedded into policy- and decision-making processes? Can departments use their operational and management data more effectively to improve policy decisions? Are there technical, political or legal barriers to using big data to improve policy making?

Sharing of ideas; Raised awareness of big data for law project amongst an audience of people who can influence big data policy and practice in government.
Year(s) Of Engagement Activity 2014
 
Description Engaging drafters of secondary legislation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact John Sheridan met with Daniel Jenkins, Deputy Director at the Government Legal Department, and lead for the SI Hub. The SI Hub is a new UK Government initiative, set up to consolidate expertise and capability for drafting statutory instruments across the Government Legal Department. It is a centre of excellence in drafting legislation, and its expert lawyers aim to draft 20 percent of all UK secondary legislation. John Sheridan provided access to the new tools developed as part of the Big Data for Law programme, allowing the Hub to identify and measure quality metrics for SI drafting. The use of the tools by the Hub demonstrates the transformative capability of the new tools in a policy context.
Year(s) Of Engagement Activity 2015
URL https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/446638/Business_Plan_2015_...
 
Description Engaging leading international practitioners 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Thomas R Bruce, Research Associate and Director of the Legal Information Institute at Cornell University, is one of the world's leading experts in the field of law and IT. In 2009, the American Bar Association Journal named him one of 50 innovators doing the most to change the American legal profession. Tom is in regular contact with John Sheridan, principal investigator for the Big Data for Law Project, and on 23 April 2015 they met in the UK for an in-depth briefing on the project, and a demonstration of the new tools and methodologies created. This provided an excellent opportunity for knowledge sharing, disseminating ideas internationally and reaching a wider academic audience.
Year(s) Of Engagement Activity 2015
URL http://www.lawschool.cornell.edu/faculty/bio.cfm?id=188
 
Description Engaging senior government stakeholders 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact We have shared work on the project with various stakeholders in Government, including meeting with and providing a briefing for the Permanent Secretary and the new National Statistician, John Pullinger.

Informing and engaging senior policy makers in Government - resulting in positive feedback and desire for ongoing engagement.
Year(s) Of Engagement Activity 2014
 
Description Engaging the Primary Counsel of the four legislative jurisdictions 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Twice-yearly meetings with the four first parliamentary and legislative counsel in the UK provided an opportunity to share and showcase the work of the research, and how it contributes to good law, supporting effective drafting, improved communications between policy makers and drafters, and improved ways of presenting legislation online to help the reader to better understand legislative intent. John Sheridan, Principal Investigator for the Big Data for Law project, demonstrated the new research tools developed, and shared the links to the tools for circulation amongst the legislation drafters in the four jurisdictions.
Year(s) Of Engagement Activity 2015,2016
 
Description Good Law 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact John Sheridan, the Principal Investigator for the project, regularly meets (at least monthly) with Richard Heaton, Permanent Secretary at the Cabinet Office and First Parliamentary Counsel, to update on the Big Data for Law project. The project is actively promoted and discussed as part of the Good Law initiative, which aims to ensure legislation is clear and accessible, changing legislation policy and practice across government.

Wider dissemination of Big Data for Law ideas as part of a broader government initiative (the Good Law initiative); High level advocacy by the Cabinet Office Permanent Secretary; Increased social media activity; Participation in Parliament Week 2014.
Year(s) Of Engagement Activity 2014
 
Description Good Law Hackathon 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact The Good Law Hackathon, which takes place on 22 November 2014 as part of Parliament Week, will showcase some new datasets and the initial version of the service. The Hackathon has been selected as one of the events to promote during Parliament Week 2014, and is being run jointly with the Cabinet Office.

Actively engaging the technical specialists with an interest in law; Promoting new datasets; Creating indices for the census of the statute book and experimenting with methods and tools; Shared, iterative learning.
Year(s) Of Engagement Activity 2014
URL https://www.eventbrite.co.uk/e/the-good-law-hackathon-tickets-13997756667
 
Description Influencing legislative practice 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Daniel Greenberg is a Parliamentary Counsel, and the legal adviser to the Office of Speakers' Counsel. He advises the Joint Committee on Statutory Instruments and gives advice in relation to legislation and Parliamentary law. He is also the editor of Craies on Legislation - a practitioner's guide to the nature, process, effect and interpretation of legislation. Daniel Greenberg met John Sheridan, Principal Investigator for the Big Data for Law project, in May 2015, to better understand the project and its impact on understanding how the statute book works as a system, and improving the quality of legislation. John Sheridan demonstrated the tools and how they could be used to enable more effective Parliamentary scrutiny of statutory instruments.
Year(s) Of Engagement Activity 2015
URL http://www.danielgreenberg.co.uk/
 
Description Internet Newsletter for Lawyers, March 2014 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, Principal Investigator for the Big Data for Law project, authored an article which was published in the Internet Newsletter for Lawyers (http://www.infolaw.co.uk/newsletter/2014/03/big-data-for-law/) in March. The newsletter targets a community of lawyers who are interested in how to make the most of the legal internet - how it presents the law, how it widens access to the law, legal aspects of e-commerce and websites, how lawyers use the internet to market themselves and to sell legal services online and related IT issues.

Promotion and dissemination to a law/tech audience.
Year(s) Of Engagement Activity 2014
URL http://www.infolaw.co.uk/newsletter/2014/03/big-data-for-law/
 
Description Leading a workshop at the Cambridge Digital Humanities Digital Methods Workshop series 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The National Archives' Digital Director, John Sheridan, led a workshop session at the Cambridge Digital Humanities Digital Methods Workshop Series on analysing documents as data - lessons from big data for law. His talk explored researchers' needs when working with a large corpus of documents. What are the issues researchers encounter? What might be some of the solutions to enabling the use of data analytics technologies by non-technical researchers? He presented a domain specific query language for legislation and explored some of the benefits of conducting this type of research, in particular the idea of creating a pattern language for legislation. To what extent do we need new ways of codifying and modelling the architecture of documents if we are to make it easier to research the entirety of a document corpus using big data technologies?
Year(s) Of Engagement Activity 2018
URL https://www.digitalhumanities.cam.ac.uk/
 
Description Legal Information Management Journal, December 2014 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact John Sheridan, Principal Investigator for the Big Data for Law project, has been selected to publish an article about the Big Data for Law project in the Legal Information Management Journal, published by Cambridge University Press. The article has been submitted and will be published in the December edition.

Dissemination; Promotion; Reaching a wider academic research community.
Year(s) Of Engagement Activity 2014
 
Description Meeting data.gov.uk team in Government Digital Service to share the research outputs 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Policymakers/politicians
Results and Impact John Sheridan met the data.gov.uk team in the Government Digital Service to discuss the research outputs of the Big Data for Law project, in particular the evidence gathered around data user needs and how best to meet those needs when focused on a particular field. This provided valuable insights for the data.gov.uk team to build into their own research programme about data user needs.
Year(s) Of Engagement Activity 2016
 
Description Meeting the "Exiting the EU" team at the Financial Conduct Authority 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Policymakers/politicians
Results and Impact John Sheridan met the "Exiting the EU" team at the Financial Conduct Authority to brief them about the use of the tools developed in the Big Data for Law research project to help research the implications of the UK leaving the EU for financial services legislation. As a result a small team from the FCA will be trained in how best to exploit the tools.
Year(s) Of Engagement Activity 2017
 
Description Ministerial briefing 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact We have formally briefed Simon Hughes, the Minister of State responsible for The National Archives, about the project's aims and discussed them in a face-to-face meeting.

Informing and engaging senior policy makers and Ministers in Government - resulting in positive feedback and desire for ongoing engagement.
Year(s) Of Engagement Activity 2014
 
Description Open Data Camp 2015 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact To mark International Open Data Day, the UK hosted its first Open Data Camp on 21 to 22 February 2015 in Hampshire. Two hundred participants from local and central government, start-ups and social enterprises and the private and voluntary sector met to share data, write applications, create visualisations and develop fresh insights. The camp helped to bring the benefits of open data to a wide audience. John Sheridan, principal investigator for the Big Data for Law project, led a session about the new data provided through the project and its potential to transform our understanding of the statute book. The session highlighted the capability of big data technology with legislation, and included a demonstration of the Words Explorer tool, developed as part of the Big Data for Law project. Of particular interest was how to make data available in a way that meets users' needs.
Year(s) Of Engagement Activity 2015
URL https://www.gov.uk/government/news/first-uk-open-data-camp-held-as-government-releases-more-data
 
Description Openlaws.eu workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, principal investigator for the Big Data for Law project, was invited to a half-day workshop at the London School of Economics organised by the openlaws.eu project team. Openlaws.eu is a European-funded initiative aiming to stimulate the wider availability of legal data, with a particular focus on app building, combining both legislation and case law. The workshop provided an opportunity to share work being done by The National Archives in the Big Data for Law project and to identify areas of possible collaboration.
Year(s) Of Engagement Activity 2015
URL http://www.openlaws.eu/?page_id=1004
 
Description Parliamentary Select Committees 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact We have talked in depth about the project's aims with the Justice Select Committee and the Public Administration Select Committee (PASC).

Informing and engaging senior policy makers and Ministers in Government - resulting in positive feedback and desire for ongoing engagement.
Year(s) Of Engagement Activity 2014
 
Description Participation in an educational programme to promote good law hosted by University of Ulster 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Undergraduate students
Results and Impact John Sheridan gave a presentation to, and took questions from, a group of students from the University of Ulster Law School about the government's good law initiative and the findings from the "Big Data for Law" research project. The presentation included an explanation of the design patterns uncovered as part of the research and a demonstration of the tools that have been developed.
Year(s) Of Engagement Activity 2016
 
Description Participation in the government's Data Leaders expert group Data Strategy Workshop 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact John Sheridan participated in a workshop for Data Leaders in government organised by the Government Digital Service, to feed into the data strategy parts of the government's digital strategy. He contributed the evidence gathered as part of the Big Data for Law project about data users' needs, in particular the importance of developing easy-to-use data analytics tools for researchers who lack data science skills but have domain knowledge and strong research questions. He followed up the engagement by providing written evidence of the findings to the GDS team.
Year(s) Of Engagement Activity 2016
 
Description Pattern languages workshop 16 December 2015 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact One of the research questions for the project is to examine the concept of a pattern language for legislation as a way of transforming our understanding of how the statute book works as a system. John Sheridan, principal investigator for the Big Data for Law project, led a workshop with members of the four UK drafting offices to introduce the concept of a pattern language for legislation, to share what we have learned to date and to work with legislation drafters to identify candidate patterns in legislation - common solutions to common legislative problems. Patterns help policy makers and drafters to design legislation at a higher level of abstraction. This could, for example, help policy makers not trained in law to think about what legislative solutions might be available, and improve the delivery of good law. Candidate patterns cover everything from licensing and regulation to registration, protection and prohibition, and have been captured in a legislation pattern catalogue.
Year(s) Of Engagement Activity 2015
URL https://www.gov.uk/government/organisations/office-of-the-parliamentary-counsel
 
Description Presentation to LEX 2016 Summer School 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan presented the outputs of the Big Data for Law project to an international group of legal informatics specialists at the LEX 2016 Summer School, hosted by the University of Bologna. In particular, he demonstrated the domain-specific query language for legislation and initiated a conversation about how this approach may influence the further development of the LegalDocML data standard for legal and legislative documents.
Year(s) Of Engagement Activity 2016
URL http://summerschoollex.cirsfid.unibo.it/?page_id=1463
 
Description Presentation to Public Understanding of Law hosted by the University of Middlesex 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact John Sheridan gave a presentation about the Big Data for Law project as part of a day-long seminar for academics and policy makers about the Public Understanding of Law hosted by the University of Middlesex. He described the main research goals, demonstrated the tools available to academics for their own research and talked through the findings of the project.
Year(s) Of Engagement Activity 2016
 
Description Presentation to the Director General and Senior Leadership of Publications Office of the European Union 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Policymakers/politicians
Results and Impact The Publications Office of the European Union is responsible for the production and dissemination of legal and general publications, managing a range of websites providing EU citizens, governments and businesses with digital access to official information and data from the EU, including the EU Open Data Portal and EUR-Lex, and ensuring long-term preservation of digital content produced by EU institutions and bodies. John Sheridan, principal investigator for the Big Data for Law project, demonstrated some of the tools developed by the Big Data for Law project to the director general and senior leadership. He talked through the implications of the new capabilities developed by the project and led a Q&A session and discussion about big data technology in the context of European legislation.
Year(s) Of Engagement Activity 2015
URL https://publications.europa.eu/en/home
 
Description Reaching a wider academic audience 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact John Sheridan, principal investigator for the Big Data for Law project, met with Andrew Le Soeur, Professor of Constitutional Justice and Deputy Head of the School of Law at the University of Essex. John Sheridan discussed opportunities to measure the trend of legislation drafted to be processed by computer. Key outputs were the dissemination of project ideas to a wider academic research audience - legal researchers interested in legal justice; sharing expertise; increased awareness amongst a specialist audience.
Year(s) Of Engagement Activity 2015
URL http://www.essex.ac.uk/law
 
Description Speaker's Commission on Digital Democracy 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact Along with the Office of Parliamentary Counsel, we gave evidence to the Speaker's Commission on Digital Democracy (set up by the Speaker to investigate the opportunities digital technology can bring for parliamentary democracy in the UK) sharing information about Big Data for Law and how the findings might aid better Parliamentary scrutiny of legislation.

Informing and engaging senior policy makers and Parliamentarians, resulting in positive feedback and desire for ongoing engagement.
Year(s) Of Engagement Activity 2014
URL http://www.parliament.uk/business/commons/the-speaker/speakers-commission-on-digital-democracy/
 
Description Summit on Data Science for Government and Policy making 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact John Sheridan attended the Summit on Data Science for Government and Policy-making held by the University of Oxford in partnership with the Alan Turing Institute. The audience was a mixture of academics and policy makers with an interest in data science. He spoke about the "Big Data for Law" project findings, particularly what had been learnt about data user needs, as part of the "views from the front line" session.
Year(s) Of Engagement Activity 2016
 
Description The Good Law Event 2015 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact The Good Law event in February 2015 brought together Parliamentary Counsel, policy makers from across government and representatives from civil society groups to explore the work done to date to ensure legislation is clear and accessible. John Sheridan, principal investigator for the Big Data for Law project, gave a presentation on developing a digital picture of the UK statute book and the concept of a pattern language for legislation. He gave a demonstration of the Words Explorer tool developed as part of the project, showing drafters how it could reveal changes in the language of legislation. The event provided the opportunity for the wider dissemination of Big Data for Law ideas as part of a broader government initiative which aims to change legislation policy and practice across government.
Year(s) Of Engagement Activity 2015
URL https://www.gov.uk/government/collections/good-law
 
Description UK drafting offices' Pattern Language group 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact As a consequence of the pattern languages workshop led by John Sheridan, and the presentation at the Commonwealth Association of Legislative Counsel, the four UK drafting offices set up a working group to take forward the pattern language for legislation for their own use. The group will meet regularly to explore the patterns in legislation that could support good law - enabling more effective drafting and clearer communication with policy makers. The group is led by Luke Norbury, Parliamentary Counsel for Northern Ireland. Its inaugural meeting was attended by John Sheridan, principal investigator for the Big Data for Law project.
Year(s) Of Engagement Activity 2015,2016
URL https://www.gov.uk/government/organisations/office-of-the-parliamentary-counsel
 
Description UKAJI Advisory Board 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The UK Administrative Justice Institute (UKAJI) aims to encourage more research into administrative justice issues. John Sheridan, principal investigator for the Big Data for Law project, talked to the research team about the implications of the research and the potential of the tools to enable new types of research in the administrative justice arena. Many of the legislation patterns identified by the project are patterns of decision making by the government, so are particularly relevant in the context of administrative justice research. John Sheridan is also a member of the UK Administrative Justice Institute's Board, which includes membership from the senior judiciary and senior civil service - the Board influences the direction of the Institute.
Year(s) Of Engagement Activity 2015,2016
URL https://ukaji.org/
 
Description Understanding research users' needs 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact We carried out initial work to understand users' needs at a workshop in April 2014, led by an expert usability company and involving academic partners and Co-Investigators. We have also carried out in-depth interviews with a wide variety of researchers, including legal and other academics, as well as those with an interest in legislation from a policy perspective, such as the House of Commons Library and the Institute for Government. From this work we have developed three 'personas' for the service we are developing as part of the Big Data for Law project. Personas are fictional characters that are representative of target user groups and their behavioural patterns. The three are: 'Maria Keane', an academic who is very enthusiastic about the possibilities that the legislation data research infrastructure could open up but who needs support when it comes to complex data analysis; 'Adam McCann', who works for a government think tank and has a broad range of interests related to legislation, wants access to raw data and has all of the skills required to handle complex data analysis; and 'Peter Walker', a lawyer with decades of experience, who wants access to material that will support his arguments but does not want to access the raw data or carry out complex analysis. Based on this research we are now developing our ideas for the products we will provide through the service (including an annual census of the statute book) and have developed 'wireframes' for the service.

Development of personas for the new service - fictional characters that are representative of target user groups and their behavioural patterns; Development of wireframes for the new service.
Year(s) Of Engagement Activity 2014
 
Description Using legislation data to answer policy questions 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Policymakers/politicians
Results and Impact We were approached by policy officials in the Cabinet Office to query and analyse a large corpus of legislation data to answer a significant current policy question about the pattern of law making by the UK government in comparison to the devolved administrations. We developed a methodology, conducted the data analysis and gave the results to the relevant policy team. This was the first time such policy questions have been answered using big data.

Developed a new methodology; The first time legislation big data was used to answer a policy question; Very positive feedback from the Cabinet Office - who are advocates for the project across government.
Year(s) Of Engagement Activity 2014
 
Description Visit to Constitutional Law team in the Cabinet Office 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Policymakers/politicians
Results and Impact John Sheridan briefed the Constitutional Law team in the Cabinet Office about the main findings of the Big Data for Law research project and demonstrated how the team could use the tools developed by the project in their own work.
Year(s) Of Engagement Activity 2016
 
Description Visit to The Parliamentary Counsel Offices in Australia 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact As a direct result of his keynote presentation at the CALC conference, John Sheridan, principal investigator for the Big Data for Law project, was invited by the Australian Government to meet with The Parliamentary Counsel Offices in Australia. The Parliamentary Counsel Offices in Australia are responsible for drafting legislation as well as publishing and operating the official Gazettes, for the Commonwealth and the States. He gave three 50-minute presentations to an international drafters' forum including drafters from Australia and the States of Australia, New Zealand, Hong Kong and Singapore. He also presented to the Parliamentary Counsel Committee's IT Forum, and met with the Australian Government's Digital Transformation Office. The visit was an opportunity to share ideas internationally, and to increase engagement and interest.
Year(s) Of Engagement Activity 2015
URL http://www.opc.gov.au/
 
Description XML London 2016 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, principal investigator for the Big Data for Law project, submitted a joint paper with Jim Mangiafico on the structure-aware search of UK legislation. The paper, and attendance at the XML London event, reached a specialist technical audience of XML developers, semantic web and linked data experts and businesses, sharing the underpinning technology approach behind the Query Builder tool developed by the Big Data for Law project. The event provided an opportunity to demonstrate the Query Builder tool and its capacity for transforming legal research.
Year(s) Of Engagement Activity 2016
URL http://xmllondon.com/
 
Description openlaws.eu Advisory Board 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Openlaws.eu is an integrated analysis and data portal development project funded by DG Justice from April 2014 to March 2016. It is helping Europe to innovate in the legal field, providing better access for individuals, businesses, legal experts and public bodies, and creating a network between them. John Sheridan is a member of the Advisory Board, whose role is to influence and advise on activities, network and share expertise, and participate in major public project events, conferences and workshops. He described the main findings of the research, the needs of researchers in the legal field, and the opportunities afforded by big data technologies. His contribution helped to shape the direction of the openlaws.eu initiative.
Year(s) Of Engagement Activity 2014,2015,2016
URL http://openlaws.eu
 
Description www.socialtech.org.uk/blog, May 2014 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact John Sheridan, principal investigator for the Big Data for Law project, authored a blog post for the respected law/tech blog on socialtech.org.uk, on the subject of people, patterns and data, specifically highlighting the work of the Big Data for Law project. The site is run by the Nominet Trust to recognise the pioneers who are using digital technology to make a real difference to millions of lives. It offers the most comprehensive, up-to-date and authoritative collection of the world's most inspiring applications of digital technology for social good.

Dissemination; Promotion; Shared understanding.
Year(s) Of Engagement Activity 2014
URL http://www.socialtech.org.uk/blog/people-patterns-and-data-the-story-of-legislation-gov-uk/