Developing a data discovery and sharing infrastructure for quantitative and qualitative socio-economic data via the WISERD GeoPortal

Lead Research Organisation: CARDIFF UNIVERSITY
Department Name: Sch of Social Sciences

Abstract

The Wales Institute for Social and Economic Research, Data and Methods (WISERD) was established in 2008 with the aim of boosting the strength of social science research in Wales. The concept behind WISERD was to create a large group of researchers spread throughout the five main universities in Wales with the aim of drawing on collaborative inter-institutional working and wide-ranging expertise to improve the sustainability of social science research in Wales, and to produce major innovations in methods of conducting research.
One of the key parts of WISERD's research agenda concerns data, which is a hugely important resource for social science research projects. WISERD recognises that, while there are a large number of diverse datasets in Wales (and the UK) that are potentially available to social scientists, economists and analysts, there is no effective mechanism for discovering these data. As a result, potentially useful data that have been collected (often at great expense) are often under-used or not used at all for research and analysis. Given the current economic climate, it is crucial that the value of the data held in various repositories throughout Wales is optimised. In response to this, WISERD is developing computer software that will help researchers discover socio-economic research data for Wales and beyond, and in doing so encourage the re-use of existing data and help stimulate collaborative research between researchers.
The software that WISERD are developing is called the 'WISERD GeoPortal' - 'Geo' because the datasets that are contained in its database can be linked to real-world locations and can be searched for using a map, and 'Portal' since the interface to the software takes the form of a web portal/web site. At the heart of the WISERD GeoPortal is a large database of socio-economic datasets that has been built up over the past three years. The database contains detailed information, or 'meta-data', describing the various datasets that exist for Wales, including government surveys; administrative data such as school records; unofficial 'grey' data such as local authority surveys; and data collected as part of academic research. The web interface of the WISERD GeoPortal provides access, via a map or text search, to this 'meta-database', allowing researchers to discover what data is 'out there' and supplying them with a wealth of information about data resources, including details on how to access their source data.
This ESRC Follow-on Fund application is seeking an investment to continue the development of the WISERD GeoPortal to enable it to be made more widely available to researchers and analysts in Wales and the UK, and to investigate the potential for long-term development and funding opportunities. WISERD believe that the GeoPortal has the potential to become a major resource for social science research and this proposed project will aim to ensure, via a set of clearly defined objectives, that this potential is realised.
The objectives of this next phase of GeoPortal development are to i) enhance the current content and functionality of the GeoPortal (it is still in its 'prototype' stages of development) by continuing technical development while working closely with partners and members of the potential user base; ii) establish it as an indispensable tool for researchers and analysts by creating user panels and technical working groups and engaging in training, 'marketing' and GeoPortal dissemination activities; iii) explore further methodological and technical enhancements, including exchanging data with other data repositories via the Internet and improving the computing infrastructure of the GeoPortal in order to improve its stability and performance; iv) develop a business plan to ensure that development of the GeoPortal is sustained into the future; v) evaluate the impact of the GeoPortal as it is rolled out amongst end users and stakeholders.

Planned Impact

The Geoportal will have a wide range of beneficiaries amongst public and third sector research and analytical communities. Demonstrations of the prototype and discussions with public and third sector stakeholders in Wales (including the Welsh Government, Wales Council for Voluntary Action, Welsh Local Government Association) have evidenced the potential value of the Geoportal in helping analysts to locate Welsh socio-economic data in support of a number of national and local interests, including policy development and evaluation, service and community planning. Key impacts include:-

- significant improvements in the availability and accessibility of social science data relating to Wales
- increased use and re-use of existing socio-economic data - the "collect once, use many times" principle
- stronger evidence base for new data collections and reduction in respondent burden including supporting the 'Beyond 2011' initiative
- support of Welsh Government data responsibilities, including survey control and the duty to provide a catalogue of Welsh data sources
- greater cross-sector collaboration amongst analysts and researchers on data access/infrastructure issues.

A key long-range policy impact is stronger, more evidence based policymaking and service delivery, in turn leading to better socio-economic outcomes for the people of Wales.

The on-going involvement of stakeholders will be key. In terms of strategic input, key stakeholders will sit on a Geoportal Steering Group. The Steering Group will oversee the further development and roll-out of the Geoportal and develop a business plan for its long-term sustainability. The Steering Group will be an important mechanism for ensuring the Geoportal is developed in tandem with other Wales and UK data initiatives.

More operational input will come from user panels. These panels, which will be drawn from existing research networks, will advise on content (including sources not previously made publicly available), design/functionality and dissemination. Panel members will keep their networks informed of progress with the Geoportal. The Welsh Government also envisage using the user panels for their own consultations on data issues, an added impact which further strengthens links between the two organisations.

The Welsh Government will have an important role to play given their responsibilities for developing Welsh data infrastructure. How they will work in partnership with WISERD is outlined in their letter of support.

It will also be important to encourage the use of the Geoportal via a number of dissemination mechanisms, including:-

- Development of a dedicated marketing and publicity campaign for the GeoPortal
- Enhancing the GeoPortal's presence on the WISERD website, and facilitating access via plug-ins on user group websites
- Holding a series of workshops across Wales to showcase and demonstrate the GeoPortal to users in a range of organisational settings
- Providing training to potential users
- Working with Welsh Government to promote the GeoPortal as part of the publicity campaign for the release of initial findings from 'Census 2011' / 'New National Survey for Wales'
- Producing a series of Find the Facts briefings, summarising the data available on different topics, its potential and limitations.

Evaluating impacts

Impact evaluation will be undertaken in two stages. A baseline will be taken when the Geoportal goes "live" and a follow-up evaluation will be undertaken 12 months later. The evaluation will cover:-

- Geoportal usage
- User and non-user feedback
- Level of demand for source data for Wales, and re-use of existing data sources for Wales in published reports/journals
- Evaluation of workshops and training events
- Strategic assessment (via key interviews with senior stakeholders) covering the extent to which the Geoportal is achieving its high level objectives

Publications

 
Description The main output of the research is a web-based software application (WISERD DataPortal - WDP) that allows users to search rich meta-data in order to discover and access social and economic data sources relating to Wales.



A key finding of the project is that social science researchers and policy makers find it difficult to know what data exist that are relevant to their research. This will no doubt become more pressing as the data landscape in Wales and the UK becomes more complex.



The workshops and user engagement events demonstrated that there is significant appetite for tools and methods that make the search, discovery and access of secondary data easier, as this not only increases the likelihood of re-use and re-purposing of existing data sources but also reduces the costs associated with social science research. Knowing what data are available and how these can be accessed can help in the formulation of research ideas and the development of methodologies, as well as in the analytical side of the research.



A clear outcome of the knowledge exchange event with academic users was that the WDP is a very useful tool for generating and managing meta-data for Ph.D. research and other academic projects, and for archiving and eventually depositing these data at the UK Data Service.



Feedback from the organised events demonstrated the importance of simple user-interfaces and intelligent displays of outputs. In this context, outputs should be ranked in some way and perhaps categorised to help the user decide which data source is important. Currently meta-data output is displayed in the WDP in the order in which it is pulled from the database by the search criteria. This is evidently not the best way to display the data and sorting is needed to ensure that the most relevant meta-data is prominent in any list of results.
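The ranking idea raised in this feedback can be illustrated with a minimal sketch. It is not the WDP implementation: the `MetadataRecord` fields, the `title_weight` parameter and the scoring rule (counting query terms, with title matches weighted more heavily) are all assumptions made for illustration, and punctuation is not stripped from words.

```python
from dataclasses import dataclass


@dataclass
class MetadataRecord:
    # Hypothetical, simplified record; the real WDP schema is not described here.
    title: str
    description: str


def relevance(record: MetadataRecord, query: str, title_weight: int = 3) -> int:
    """Count query-term occurrences, weighting title matches more heavily."""
    terms = query.lower().split()
    title_words = record.title.lower().split()
    description_words = record.description.lower().split()
    score = 0
    for term in terms:
        score += title_weight * title_words.count(term)
        score += description_words.count(term)
    return score


def rank(records: list[MetadataRecord], query: str) -> list[MetadataRecord]:
    """Return matching records, most relevant first; drop records that score zero."""
    scored = [(relevance(record, query), record) for record in records]
    matches = [(score, record) for score, record in scored if score > 0]
    # list.sort is stable, so tied records keep their original (database) order.
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in matches]


if __name__ == "__main__":
    # Invented example records, not real WDP content.
    records = [
        MetadataRecord("School census", "Pupil records for Wales"),
        MetadataRecord("Housing survey", "Housing conditions and housing costs"),
        MetadataRecord("Labour force", "Includes a housing tenure question"),
    ]
    for record in rank(records, "housing"):
        print(record.title)
```

A simple term-count score like this is only a starting point; the categorisation of results suggested by users would need additional fields that this sketch does not model.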



Being able to access rich meta-data is an important starting point. However, it became clear that what most researchers want is easy access to the source data. Work on data feeds and linking the meta-data to external APIs such as the ONS API is an important step forward here, although our early work indicates that more needs to be done on both the implementation of APIs and encouraging the consistent use of UK data standards by national data providers. More can be found on the WISERD DataPortal development blog.



The WDP is continuing to be promoted via WISERD events and publication channels and will take on an enhanced role with the launch of the ESRC funded WISERD / Civil Society research centre. Here the emphasis will be to increase engagement with civil society stakeholders and organisations in relation to their secondary data needs.



In terms of future research, the WDP has been costed into the successful WISERD/Civil Society programme of work and will also have a role in the new ESRC funded Wales Administrative Data Research Centre (ADRC) in developing links between administrative and survey meta-data and in running knowledge exchange and training events associated with searching and discovering administrative data.
Exploitation Route The WDP knowledge exchange events were aimed at policy makers, public sector practitioners and third sector social researchers. These demonstrated that the WDP was a very useful tool for allowing these user groups to search, discover and access socio-economic data for Wales. A potential impact is that these organisations will use secondary data more in their research and monitoring activities.



We are continuing to develop links with these non-academic organisations and discussing with a small number of them how they can use the WDP as a way of creating and exposing meta-data relating to their own data. This could encourage collaboration within the user community via data and knowledge sharing and also potentially reduce duplication of data and the need to carry out new surveys. The latter is an important issue for the Welsh Government.
The knowledge exchange events and publication channels such as the development blog and Twitter feed have increased engagement across different user groups, including users outside of Wales. The users of the WDP were generally positive and enthusiastic with regard to accessing social science meta-data, and there were follow-on meetings with stakeholders from Shelter Cymru and Sport Wales about using the WDP for their own meta-data. The Welsh Government has also expressed an interest in using the WDP to help inform their survey work, especially survey version control with regard to questions asked and response categories, and identifying duplication across the various surveys they undertake. These engagements are on-going via WISERD networks and activities.
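The kind of cross-survey duplication check mentioned here can be sketched as follows. This is an illustrative assumption rather than a description of how the WDP works: the survey names and questions are invented, and the sketch only catches questions that are identical after lower-casing and punctuation removal, so reworded questions would be missed.

```python
import string
from collections import defaultdict


def normalise(question: str) -> str:
    """Lower-case, strip ASCII punctuation and collapse whitespace."""
    cleaned = question.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def shared_questions(surveys: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each normalised question to the surveys that ask it.

    Only questions asked in more than one survey are returned.
    """
    asked_in: dict[str, set[str]] = defaultdict(set)
    for survey_name, questions in surveys.items():
        for question in questions:
            asked_in[normalise(question)].add(survey_name)
    return {q: sorted(names) for q, names in asked_in.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical surveys and questions, for illustration only.
    surveys = {
        "Survey A": ["Do you own your home?", "How old are you?"],
        "Survey B": ["do you own your home", "What is your postcode?"],
    }
    for question, names in shared_questions(surveys).items():
        print(question, "->", ", ".join(names))
```

Tracking response categories and wording changes between survey years, as described above, would need versioned records rather than the flat lists assumed here.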



The WDP forms the basis of WISERD's data management plan and provides tools to allow researchers across the five WISERD universities to create, manage and share meta-data associated with their projects, and to help researchers share and archive these data at the UK Data Service when their project is complete. This will be developed further in terms of tools and training under the auspices of the WISERD Civil Society research centre.



The WDP team has been awarded up to £15,000 in development monies as part of the second stage of the ESRC UK Data Service Innovation funding process. The intention is to develop a proposal to re-purpose the WDP interface as a cartographic search, discovery and mapping front-end for UK Data Service data, with the emphasis on Census 2011 data via the INFUSE API.
Sectors Communities and Social Services/Policy,Digital/Communication/Information Technologies (including Software),Education,Transport

URL http://www.wiserd.ac.uk/resources/wiserd-dataportal/
 
Description Developing a data discovery and sharing infrastructure for quantitative and qualitative socio-economic data via the WISERD GeoPortal

Background

The knowledge exchange project ran from December 2012 to December 2013 and had two distinct work packages: i) the continued development of the WISERD DataPortal as a data discovery and sharing infrastructure for quantitative and qualitative socio-economic data and ii) the promotion of the DataPortal through training materials, resources and knowledge exchange events with the academic, policy and third sectors. For the twelve months after the funding period, basic maintenance of the DataPortal has been supported through the WISERD Hub but without resources for its active promotion or for training events. The DataPortal has now been successfully re-funded as part of the WISERD Civil Society large centre bid. There is a cross-cutting programme of work dedicated to developing and re-orientating the DataPortal so it aligns more strongly with data on civil society for users within civil society. The work starts in April 2015 for 2 years and has a dedicated knowledge exchange programme of events aimed at civil society organisations. Hence the impact of the DataPortal is on-going and this narrative simply reports the impact within the first twelve months after the end of the funding, with the expectation that this will continue and become more significant over the next two years as the new programme of work begins.

How have your findings been used?

It is often difficult to know how the different users who access the DataPortal use the information that they find to inform their own work, as this is rarely acknowledged in their outputs; however, it is possible to gain an insight into the impact that the DataPortal has had within the first year after the completion of the grant by looking at the types of users accessing the DataPortal and at three indicative case studies.

Registered users and user activity

To date there are around 100 unique registered users on the DataPortal. Around half of users registered within the last 14 months, principally between October 2013 and February 2014. Activity in the past twelve months was concentrated within the first half of the year. Both the new registrations and the activity reflect the impacts of the DataPortal Knowledge Exchange events that ran in 2013. Fewer users accessed the portal in the second half of this year, reflecting the decline in DataPortal promotional activities and events as a result of limited funding and resources. Analysis of user activity within the past year showed that there has been an increase in the number of users associated with civil society and third sector organisations, probably as a result of WISERD Civil Society initial activities. However, it is not currently possible to map what users search for when they access the portal or how they subsequently use this information. One of the first tasks in the re-funded period of the DataPortal is to survey existing users to discover this information and to use it to inform the DataPortal's subsequent development.

Impact in the policy sector

We do have an indication of how the DataPortal has had an impact in the policy sector: the Welsh Government have been appraising and piloting the DataPortal as a tool to help them manage and consolidate their numerous social surveys across Wales. By identifying questions and responses from existing Welsh surveys and seeing how these map onto other Welsh surveys and Welsh localities, the Welsh Government can determine whether a new survey is needed or whether data from existing surveys can be re-used and repurposed to plug the data gap.
The DataPortal has also been appraised by the Welsh Government as a versioning tool to help them track the changes in questions and responses within annual surveys so as to maintain consistency and also help interpret analysis of the data over time.

Impact in the third sector

The DataPortal has been used by both Sport Wales and Shelter Cymru as a tool for generating and curating meta-data for their own organisations. Sport Wales undertakes large annual surveys and the DataPortal has been piloted as a method for maintaining a question bank that can be used internally and also promoted externally to encourage data re-use and collaboration. Shelter Cymru have trialled the DataPortal as a tool for maintaining meta-data on data it collects routinely on homelessness. Again, this can be used internally to improve in-house research and analysis and also promoted externally to encourage collaboration.

Impact in Civil Society

The focus of the DataPortal on civil society is new so the impact thus far is limited. Nonetheless, the DataPortal has been the focus of some early collaboration with the Wales Council for Voluntary Action (WCVA) in terms of geocoding and mapping their routinely collected data. This has added value and insight into their data and has helped shape some early collaborative work between the WCVA and WISERD.

Other impacts

The DataPortal has been used to inform Cardiff University's Research Data Management Programme, especially with regard to meta-data and meta-data generation tools, the experience of social scientists in creating, sharing and curating meta-data and research data, and the use of meta-data discovery tools outside of academia. The DataPortal was also the basis of a bid to the UK Data Service Innovation Fund 2014 for developing a cartographic front-end to Census 2011 data, where we were successful in procuring development funds but unsuccessful in the final stage in winning full funding.
First Year Of Impact 2013
Sector Communities and Social Services/Policy,Education,Leisure Activities, including Sports, Recreation and Tourism,Government, Democracy and Justice
Impact Types Policy & public services

 
Title CaCHE Data Navigator meta-database 
Description CaCHE Data Navigator meta-database stores searchable meta-data records on over 150 key sources of housing and housing-related data in the UK. It conforms to international meta-data standards. The meta-database will develop during the lifetime of the project in terms of both its size and content. 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact The CaCHE meta-database is still in development and will not have notable impact until later in the project 
 
Title CaCHE Data Navigator 
Description The CaCHE Data Navigator is a web-based software application which will enhance a researcher's ability to discover housing and housing-related data, with the aim of encouraging re-use and re-purposing of existing data. It is being designed to make the search and discovery of these data easier by providing access to meta-data on a wide variety of housing data sets including Government Surveys, routinely collected administrative data, private sector data and industry data. The data will be for a variety of spatial scales from the whole of the UK down to the four nations, regions, local authorities and smaller spatial units. As well as providing access to meta-data, where possible the CaCHE Data Navigator will also provide access to existing open data through various APIs and Data Centres. The CaCHE Data Navigator will also host a small number of bespoke open source data sets that have been identified as being valuable to the housing research community. The CaCHE Data Navigator is still in development and its technical specification will change over time.
Type Of Technology Webtool/Application 
Year Produced 2018 
Open Source License? Yes  
Impact The CaCHE Data Navigator is a beta version and has been released for user testing. The user testing is currently being evaluated and updates are being made to the software.
URL https://cache-web-live.cf.ac.uk/
 
Title WISERD Data Portal 
Description The WISERD DataPortal is a web-based software application which will enhance a researcher's ability to discover socio-economic research data, with the aim of encouraging re-use and re-purposing of existing data. It uses free and open-source software (FOSS) components and services to capture standards-compliant metadata for a variety of socio-economic data sources, and provides map-based and text-based search tools for accessing this database. Thus far the software has been developed with a focus on data relating to Wales; however, it is hoped that this will expand in the near future.
Type Of Technology Webtool/Application 
Year Produced 2012 
Impact The Dataportal formed a basis for the data management and dissemination tool for the successful WISERD Civil Society large centre bid. The software is being developed to allow access to civil society data relating to the large centre projects and also to data collated by civil society organisations. 
URL http://data.wiserd.ac.uk/dataportal/
 
Description WISERD DataPortal Social Research Association seminar 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Participants in your research and patient groups
Results and Impact A seminar hosted by the Social Research Association where we gave a live demonstration of the WISERD DataPortal as an example of a free data tool for social science research.

After the demonstration, the number of registrations and the amount of activity on the DataPortal increased.
Year(s) Of Engagement Activity 2013
 
Description WISERD DataPortal Wales Statistical Liaison Committee seminar 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A seminar providing an overview of the WISERD DataPortal to members of the Wales Statistical Liaison Committee, including a live demonstration.

After the seminar and demonstration, the number of registrations and the amount of activity on the DataPortal increased, and the PI was invited to become a member of the Welsh Index of Multiple Deprivation (WIMD) Advisory Group to advise on Welsh data issues regarding the new 2014 index.
Year(s) Of Engagement Activity 2013