Digital records as evidence to underpin global development goals. (HN)

Lead Research Organisation: University of London
Department Name: Inst of Commonwealth Studies

Abstract

This project examines the crucial role of records management (and increasingly digital records) in the attainment of current development goals. The launch of the UN Sustainable Development Goals (SDGs) in September 2015 has made it increasingly essential to recognise and address weaknesses in the way governments' official evidence base is managed and used. Plans to measure and monitor the SDGs are based on the assumption that it will be possible to access meaningful data as a basis for benchmarking and eradicating poverty. Unfortunately, it is often the case that records are incomplete, inaccurate, inaccessible or lost completely, with the result that data and statistics derived from the records are flawed.

The rapid transition to the use of digital records has encouraged and been closely linked to 'open government' initiatives, aimed at making records and data publicly available. Again, this has a direct bearing on the achievement of current development agendas in ensuring that government is more transparent and accountable. Yet in the digital environment, records created by computerised information systems do not remain reliable or accessible, even for short periods of time, without a control regime of laws, policies, practices and skills as defined in international standards. If digital repositories are not developed, where digital records and data can be managed securely through time, there is no guarantee that they will remain available or reliable. Unfortunately, it is precisely those countries that stand to gain most from progress towards the SDGs that have the greatest problems putting effective data management systems in place.

The objective of this project, which builds on the expertise within the School of Advanced Study (SAS), its networks and partner organisations, is to bring together academics, records managers and archivists and a range of NGOs to consider how the management of digital records and data can assist in the achievement of international development goals, particularly the SDGs, in sub-Saharan Africa. An important element in the achievement of these goals is the combatting of corruption, something that severely hinders justice, good governance and development in many countries. Managing records efficiently will not in itself halt corruption, but anti-corruption goals cannot be achieved without trustworthy records as evidence. As such, this will be an important thematic focus of the project. It aims to disseminate its findings via a dedicated web resource and an edited volume of articles. The project will also consider the feasibility of a globally-applicable set of protocols for digital records management to help ensure that records management policies, principles and practices become a core aspect of development in the digital environment.

A key feature of the project is a two-stage cross-fertilisation of expertise. In the first stage, the workshops will encourage a dialogue between experts in the area of records management (from both the developed and the developing world) and academic experts on the politics, economics and history of post-colonial Africa. The second stage will allow them to engage with those actively involved in implementing international development goals. As such, measurable outcomes will be the extent of the follow-up initiatives that emerge from these dialogues, and the extent to which the immediate outputs of the research network are fed into the policy-making process.

Planned Impact

As the Case for Support explains, the effective preservation and management of digital records has the potential to make a major impact on tracking the achievement of a range of development goals, particularly in terms of the fight against corruption. Sub-Saharan Africa, on which the project focuses, faces major challenges in terms of the resourcing of its records management systems and the broader impediments to anti-corruption efforts. As such, we believe that the project has a broad range of potential non-academic beneficiaries, not just in Africa, but across the developing world.

The CI, Dr Anne Thurston, has worked closely in the past with the Open Government Partnership (OGP), and the final report of the project, on 'Digital Records Protocols', will be disseminated through the OGP's many influential partner organizations to ensure maximum impact.
The report will also be distributed via the partners of the United Nations Development Programme (UNDP) Global Anti-Corruption Initiative Network. These include the African Development Bank Group, the Council of Europe and the International Monetary Fund.

The project recognises the need to reach different users through different forms of output. The main project report, which will have clear practical recommendations for the more effective management of digital records, is aimed principally at policy-makers and members of NGOs. Along with the main report, there will be an executive summary, which will be circulated to the international media by the SAS communications office at the time of the report's release at the international symposium. There is also a broad academic community including scholars working on records management and international development, and historians and political scientists working in the area of African studies, who stand to benefit greatly from the findings of the research network. The edited collection of articles is more closely focused on their needs. This is expected to be published in a special edition of a leading peer-reviewed journal. The project website, which will contain all significant materials generated by the project, is aimed at both sets of users.

The ICwS will also utilize its close partnership with the Commonwealth Secretariat to disseminate the project's findings. We will be in close touch with the Secretary General's Office to brief its staff on the project from an early stage. The Commonwealth Heads of Government Meeting 2018 (which is due to be held in the UK) provides an excellent and inexpensive opportunity to acquaint Ministers, officials and members of affiliated organisations from across the Commonwealth with the project's preliminary findings. The Institute also has close links with the FCO and DFiD, which it will utilise to generate interest in the findings of the project. The International Records Management Trust, of which Dr Thurston is director, has worked directly with a wide range of governments, international organisations and development bodies, particularly in Africa, and will use those contacts to disseminate the project's findings. Additionally, the project seeks to maximise its impact within sub-Saharan Africa itself by working closely with its partner organisation, the School of Information Science at Moi University in Kenya. Its Professor of Records and Archival Science, Professor Justus Wamukoya, is one of the leading figures in records and archives management in Africa. He has extensive contacts in the worlds of academia, government and records management across East, Central and West Africa. These will be vital to the dissemination of the project findings.

Publications

Description Development of a deeper understanding of practical issues affecting the integrity and credibility (quality, trustworthiness and authenticity) of statistics, data and records, the implications for measuring the SDGs and the implications for trust between government and citizens particularly in lower resource countries, especially in Africa.
The major findings of the study are broadly that:
* Poorly managed records make it hard to verify the quality and integrity of data generated to measure SDG indicators; this will undermine the government's efforts to report on progress to the UN and jeopardise its ability to make good use of the findings.
* It is therefore vital that governments develop procedures to ensure that records documenting data and statistics activities are captured, managed and integrated with procedures for conducting surveys, analysing data, merging data and reporting statistics.
Exploitation Route A second workshop was held in May of 2018 to broaden the discussion that had taken place at the previous workshop in April 2017. The groundwork was prepared for an edited book (2020) and for the report, 'A Matter of Trust', published in November 2018. The report set out in practical terms the procedures governments need to set in place to ensure that development policy is carried out on a sound evidential basis and is measured using robust statistics. The edited book, published online on open access in December 2020, discusses these issues in greater detail. It explores, through a series of case studies, the substantial challenges for assembling reliable data and statistics to address pressing development challenges, particularly in Africa. By highlighting the enormous potential value of creating and using high quality data, statistics and records as an interconnected resource and describing how this can be achieved, the book aims to contribute to defining meaningful and realistic global and national development policies in the critical period to 2030. There is ample scope for follow-up projects, exploring in greater detail how the trends identified in the book have been played out and providing updates on subsequent developments.
Sectors Agriculture, Food and Drink,Communities and Social Services/Policy,Digital/Communication/Information Technologies (including Software),Education,Energy,Environment,Healthcare,Government, Democracy and Justice,Transport

URL https://humanities-digital-library.org/index.php/hdl/catalog/book/amot
 
Description Blog Posts 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A series of blog posts demonstrating the potential damage that can be done by poor data as policy and funding streams become increasingly dependent on data as evidence of both need and progress. Published in the run-up to the Open Government Partnership Summit in Paris in 2016, this blog series addressed Information Integrity through Metadata: Publishing Data with Context; Information Integrity through Systems: Building Audit Trails; and Information Integrity through Web Archiving: Capturing Data Releases. The blogs argued that the tools and techniques developed in fields such as data curation, records management and digital preservation offer approaches to establishing and protecting the integrity of information. The posts called for the incorporation of these tools and techniques into open government initiatives, creating and publishing robust data. Information with integrity provides the evidence needed for accountability and participation.
Year(s) Of Engagement Activity 2016
URL https://www.opengovpartnership.org/stories/summit-series-information-integrity
 
Description Event at the House of Lords to launch 'A Matter of Trust: Records as the Foundation for Building Integrity and Accountability into Data and Statistics to Support the UN Sustainable Development Goals' 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Event at the House of Lords on 1 November 2018, hosted by the Earl of Kinnoull, to launch 'A Matter of Trust: Records as the Foundation for Building Integrity and Accountability into Data and Statistics to Support the UN Sustainable Development Goals'. There was a panel discussion which included academics and practitioners from the UK and Kenya who had taken part in the two workshops that had helped to shape the contents of the report. The event was attended by the Vice-Chancellor of the University of London, Professor Peter Kopelman, and brought together representatives from a number of domestic and international organizations with an interest in data management and development, including the Commonwealth Secretariat, the Open Data Institute, the Institute for Internet and Society and the Institute for Advanced Studies on Science, Technology and Society.
Year(s) Of Engagement Activity 2018
 
Description Workshop - Managing Digital Information as Evidence to underpin Global Development Goals 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact As the project is focussing on the quality and integrity of public information available to measure the goals, the workshop (in April 2017) explored the relationship between statistics, data and records as primary types of information for measuring the goals, and sought to initiate an interdisciplinary dialogue between humanities scholars, development experts and information professionals, including statisticians, data experts and records managers.

The discussions focussed on practical issues affecting the integrity and credibility (quality, trustworthiness and authenticity) of statistics, data and records, the implications for measuring the SDGs and the implications for trust between government and citizens particularly in lower resource countries, especially in Africa. The papers presented at the workshop and the discussions are summarized as follows:

Difficulties of Gathering Reliable Data and Statistics

The Expert Advisory Group on the Data Revolution has had a wide range of contributing specialists, most of them representing international agencies but also including a number of national statisticians and experts from civil society and academia. Many did not have experience of the realities and causes of broken systems, so the indicators do not always reflect what is realistically achievable. Nevertheless, the data revolution focuses on improving the way that data is produced, collected and used, on closing data gaps to prevent discrimination, and on building capacity and data literacy for both small data and big data analytics.

Measuring the SDGs will be a fundamentally challenging task. Official statistics gathered by statistical offices are often so weak that in most developing areas of the world there is little prospect of using them to measure the goals in the foreseeable future. In addition to major gaps, there are fundamental integrity issues. Census data tend to be incomplete, limited, out of date, inaccurate, irretrievable or simply have not survived; undocumented changes in data structures and data entry errors make it difficult to compare data through time.

Morten Jerven, in his study of economic development statistics in Africa, Poor Numbers, noted, 'International development actors are making judgements based on erroneous statistics. Governments are not able to make informed decisions because the existing data are weak, or the data they need do not exist.' As one of the workshop participants noted, 'I can cite some countries where there has been 10 years or 20 without doing a census. That is going to be a real problem in terms of achieving or monitoring development goals. If there is a volatile situation, an economic crisis, most developing countries would not be able to provide any data about the current situation.' The SDG approach, therefore, is twofold: firstly, strengthen national statistical offices to increase their capacity to collect data in a timely and efficient manner, and secondly, gather and amalgamate data from a range of government, civil society and corporate sources.

Training statistical staff to capture high-quality data and interrogate it to evaluate the indicators in low resource environments is a lengthy process, both in terms of identifying funding sources and of delivering relevant and comprehensive training programmes. Statisticians need to follow rigorous codes of practice for applying, analysing and measuring issues in relation to error rates, and they need to be able to demonstrate how the statistics were compiled. Without that level of rigorous methodology, the results can be questionable. If the SDGs are to provide an accurate basis for evidence-based policy decisions or for donors to make aid allocations, there is a need for a major and sustained investment of resources.

In the meantime, governments and the data community must take a blended approach to compiling statistics, bringing together different data sources in new ways, including data from GIS applications, social media platforms, crowd sourcing, satellite videos, mobile devices and a range of other sources. As a participant pointed out, 'Since the 1980s, lots of data and records that previously would have been created by state actors in developing countries have been developed by non-state actors, such as international agencies and local and international NGOs.'

The UN is hosting a series of workshops at the country level as a basis for building country data road maps. The workshops aim to galvanize political commitment, align strategic priorities, foster collaboration, spur innovation and support the process of combining data in new ways. In 2016, for instance, workshops in Kenya and Tanzania generated awareness about the SDGs by bringing together several hundred different stakeholders, including staff of National Statistics Offices, government officials, civil society partners and academics, to examine the roles of different stakeholders and facilitate understanding about the emerging data ecosystem, including capacity and budget aspects. They highlighted the value of overlaying the SDG indicators and national development plans in order to assess data gaps. Awareness has also been expanded through a series of 15 sub-national pilots organised through the Open Government Partnership and through local community workshops, which aimed not only to support the SDG process but to open information to citizens, governments and businesses.

Initially, the emphasis was on making the data available, rather than on how it was compiled. Now, the approach is to move toward quality data that can be trusted and reused. Essentially, data quality rests on how the data has been managed. As one of the participants noted, 'We need to be asking: Where has the data come from? How was it compiled? Why was it compiled? What was the sample set? How was it amalgamated? What methods were used to analyse and interrogate it? What algorithms were used to enable its interpretation? Who published the data and when? When multiple data sets are combined, who owns the amalgamated data set? Has the process been carried out transparently? How effectively has the data been anonymised?'

Moreover, data must be documented transparently if it is to drive policy and service delivery. Amalgamating data from two or more sets of administrative or survey data from different organisations can enhance the quality of existing data and maximise its value for research and statistical purposes. However, mapping together different data structures from multiple data sets, many of them with broken data linkages, to arrive at reliable composite statistical findings can be a challenging and complex process. For instance, as a participant remarked, 'Data sets are often created in silos. It is quite difficult to find a way to unify them to make better sense of what the data is telling us. The fact is that these data sets are on different servers in different locations.' There may also be multiple versions of the same data sets.

If the statistics are flawed, they may not provide a reliable basis for policy decisions or for determining aid priorities. The situation is complicated by the fact that many members of the data community are self-trained and may not follow rigorous data science methodologies. As a participant noted, 'Doubts about the identity and integrity of the data are introduced by partial metadata, opaque provenance, undocumented custody, particularly during aggregation as well as by the lack of information about the systems for the data's management.' In the future, it would be valuable if metadata indicating the controls used to manage data integrity could be captured, at least for key data sets.

Common core metadata vocabularies, which are key to reliability and integrity, are starting to be created, as people from the open data community join forces with technology experts and experts in the data management field. The emerging norm is that each piece of information should be associated with agreed core metadata. There is also an emerging requirement to document who was in charge of the data, with alternative contacts in case the person leaves. 'When you open up a data set, you start to go through a process of: Where did this come from? Does it meet metadata criteria? Have I actually been able to document it? Is it free of privacy and security concerns?' Once the data is online, it is seen by many people from many backgrounds, who ask questions that help to strengthen the validity of the data. As a participant noted, 'The philosophy is that if you are able to have multiple sector participants, multiple organisations, individuals actually interacting with data, you will improve its quality and you'll actually be able to figure out what its usefulness is.'
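
The idea of associating each data set with agreed core metadata, including a named contact and an alternative contact, can be illustrated with a minimal Python sketch. The field names below are hypothetical and are not drawn from any vocabulary discussed at the workshop; a real list would be agreed by the data community concerned.

```python
# Hypothetical core metadata fields (an assumption for illustration only);
# a real vocabulary would be agreed by the data community concerned.
CORE_FIELDS = (
    "title",
    "source",
    "compiled_by",
    "method",
    "published",
    "contact",
    "alternate_contact",
)


def missing_core_metadata(description):
    """Return the core fields that are absent or blank in a data set description."""
    return [
        field
        for field in CORE_FIELDS
        if not str(description.get(field) or "").strip()
    ]


# Illustrative description of a data set: the method is blank and
# no alternative contact has been recorded.
example = {
    "title": "Household survey 2016",
    "source": "National statistics office",
    "compiled_by": "Survey unit",
    "method": "",
    "published": "2017-03-01",
    "contact": "data@example.org",
}

print(missing_core_metadata(example))
```

A check of this kind only confirms that the fields are present; it says nothing about whether their contents are accurate, which still depends on how the underlying records were managed.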

As public sector transparency, accountability and openness have emerged as predominant international development themes, it has been widely accepted that opening data to citizens will enable them to participate in state affairs, monitor how government money is spent, hold public officials accountable for their actions, and participate in good decision-making. By focussing on non-personal data, non-proprietary data and data to which national security restrictions do not apply, open data can bypass the restrictions on opening records that could cause harm or embarrassment to individuals, making it possible to move beyond official secrets acts and 20- or 30-year rules to immediate use.

As governments turn to big data analytics, the same principles apply: the methodologies used to interrogate the data need to be documented and transparent and the sample should be clear. Without understanding the methodology used to collect and interpret the statistics, visualisations can reflect distorted pictures. For instance, censuses in Africa sometimes exclude nomadic tribes, and even though the portion of the population may be small, without this information the picture can be distorted.

The UN is well aware of the importance of creating an environment where the public trust the use of big data for official statistics and where privacy and confidentiality of personal information can be assured. Among the most significant issues that need to be addressed are: Who owns the data? Who is responsible for managing it? How will it be preserved through time?


Relationship between Reliable Records and Data

Records make an essential contribution to sustainable development that is complementary to, but different from, data and statistics. Data and statistics document trends and patterns, with personal information anonymised to protect the individual. Records do the opposite. They document individuals' rights and entitlements, and they provide specific evidence to document accountability and transparency. The metadata associated with records describes their context, custodianship, content and structure and their management through time. It gives them value as authoritative evidence of specific policies, actions, decisions, precedents and transactions, which are the foundation for the rule of law.

Well-kept official records contribute fundamentally to all aspects of national development. They provide the audit trail for official financial transactions and the documentary evidence for pay and personnel management, income tax collection, corruption control and land ownership. For instance, records are the basis for loans, whether this is documented by records of credit histories or by records collected by private sector actors. They are a key to land management and development. Subsequent changes to a record, when documented through metadata, create an audit trail that makes it possible to identify fraud or illegal actions.

Moreover, only the records profession has developed the means of protecting, preserving and accessing digital information through time. Neither data analysts nor statisticians have the skills or have developed the structures needed to preserve digital information so that it can be used reliably in the future. In the digital environment, it is the synergies between the professional approaches to managing records and data that provide the key to maximising the use of the information and building confidence that the public can trust it. Managing records and data in a similar light is of fundamental importance.

When digital records began to be created in growing quantities, few people in the records profession, in government or in international agencies realised how quickly digital information would become a major source of government documentation or how easily it could be lost. Citizens and governments rapidly came to rely on digital records created on desktop computers, in databases, in email, on mobile devices, on websites and via social media platforms, but there tended to be little understanding of the skills and structures needed to manage these records, or even of which government agency should be responsible. Often management responsibility was split between several government agencies, for instance the one responsible for ICT development, the one responsible for access to information and the one responsible for culture.

Governments and donors worldwide tend to believe that information produced in computerised systems will offer the basis for planning, monitoring and measuring national and international development goals. Most do not realise that IT systems create records but lack the full functionality needed to keep them reliable and authentic for as long as they are needed. As a result, IT systems have been developed without the supporting framework of policies and systems needed to protect, preserve and make digital evidence available through time.

Digital records are fragile. If they are not managed professionally, their integrity and their value as legal and historical evidence can be compromised or they can be lost completely. Their integrity depends upon a quickly changing array of hardware and software. Digital media deteriorate, software changes and hardware becomes obsolete. Digital records may be stored on personal drives, un-networked computers, unmanaged network drives or mobile devices, which can make them unavailable as a national resource and unlikely to survive. Different versions of digital records may be stored without adequate identification, making it difficult to rely on the evidence as authoritative. They can be altered, deleted, fragmented or corrupted through malicious interference or inadequate management; their meaning may be lost when metadata is not captured, is imprecise or becomes separated from the records when technology changes. They can be difficult to retrieve after a few years, months or even days.
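
One common digital preservation technique for detecting the kind of alteration or corruption described above is a fixity check: a checksum is recorded when a record is captured and recomputed later. The following Python sketch is an illustration under that assumption, not a method described by the workshop; the sample record text and function names are invented for the example.

```python
import hashlib


def fixity(content: bytes) -> str:
    """Return the SHA-256 digest of a record's bytes as a hex string."""
    return hashlib.sha256(content).hexdigest()


def verify(content: bytes, recorded_digest: str) -> bool:
    """Check whether the record still matches the digest stored at capture."""
    return fixity(content) == recorded_digest


# Invented sample record; the digest would normally be stored as metadata
# alongside the record at the time it is captured.
original = b"Payment voucher 0042: 1,500 paid to supplier A"
recorded = fixity(original)

# A later copy in which a single figure has been changed.
altered = b"Payment voucher 0042: 9,500 paid to supplier A"

print(verify(original, recorded))  # unchanged record matches its digest
print(verify(altered, recorded))   # altered record no longer matches
```

A checksum shows only that the bytes have changed, not who changed them or why; that context still depends on the metadata and audit trail kept with the record.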

Very large volumes of digital records are being lost regularly in many countries through the lack of systematic approaches to preservation. The World Bank's 2016 World Development Report noted that it 'is fair to say that long-term preservation of digital records and information in most countries in the world is at serious risk'. Fortunately, the international records community has worked steadily to develop standards, requirements and management tools for protecting and preserving the integrity of records, and this work is available as international standards that can be shared. Unfortunately, it is little known or understood outside the records profession and sometimes not even within it. Development planners sometimes think of records as outdated sources of government information, not realising that many records are now 'born digital' and must be managed from the time that they are created if they are to survive.

Records are defined in international standards as 'information created, received, and maintained as evidence and as an asset by an organization or person, in pursuit of legal obligations or in the transaction of business. They may be in any medium, form, or format.' Nevertheless, the words data and records now tend to be used almost interchangeably, and what the records community has called records are often referred to as data. For instance, records of disease rates compiled in hospitals are often referred to as disease data, and birth and death records are referred to as bi
Year(s) Of Engagement Activity 2017