Data Management through e-Social Science: Case studies, Provision and Support (DAMES)

Lead Research Organisation: University of Stirling
Department Name: Applied Social Science

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Publications

Turner K (2014) Workflows for quantitative data analysis in the social sciences in International Journal on Software Tools for Technology Transfer

Lambert P (2014) Using occupation-based social classifications in Work, Employment and Society

Sinnott, R.O. (2008) Towards a virtual anonymisation grid for unified access to remote clinical data in 6th International Healthgrid Conference

Lambert P (2008) The importance of specificity in occupation-based social classifications in International Journal of Sociology and Social Policy

Blum, J. (2008) The DAMES metadata approach

Sinnott, R.O. (2009) Supporting security-oriented, inter-disciplinary research: crossing the social, clinical and geospatial domains in Fifth International Conference on e-Social Science

Sinnott RO (2008) Supporting grid-based clinical trials in Scotland in Health Informatics Journal

Lambert, P.S. (2009) Standards setting when standardizing categorical data in Fifth International Conference on e-Social Science

Watt, J. (2010) Security-oriented infrastructures for social simulation in UK e-Science All Hands Meeting

Watt, J. (2008) Portal-based access to advanced security infrastructures in UK e-Science All Hands Meeting

Dougall, N. (2012) Modelling health and social risk factors for suicide in Scotland: a 30 year record linkage study in Population Health Methods and Challenges conference

Dougall, N. (2011) Modelling health and social risk factors for suicide in Scotland: a 30 year record linkage study in Scottish Health Informatics Platform conference

 
Description The main resources generated by the project are a group of free online 'portals' which provide access to specialist information on occupations, ethnicity and educational qualifications (all available from www.dames.org.uk); training materials on important but under-studied aspects of 'data management' (see also the above website); and innovations in computer science relevant to social science data management (including a system for metadata curation and organisation, new tools for summarising workflows, and new systems demonstrating the secure handling of complex data resources). Substantive research was undertaken alongside these resource provisions and generated publications on occupational and educational inequalities, suicide and other mental health outcomes, and social care provision and needs.



Much of the Node's work addressed methodological issues. A long-standing challenge in social science research concerns 'getting the best' out of data resources: many rich data resources are available to researchers, but not all analyses do a good job in taking advantage of the information held on them. We argue that this often occurs because researchers simply aren't aware of alternative strategies for handling the data, and/or lack skills in enhancing and documenting data processing, with the end result being that we quite often see analysis based on unsatisfactory simplifications to complex information resources. There also remain certain unresolved but important methodological challenges in social science data, particularly concerned with making appropriate comparisons over time or between countries when using large scale survey data, or of comparing between alternative options for measuring and analysing popular social science concepts such as 'class' or 'ethnicity'. At the start of the Node, we claimed that there would be ways of exploiting emergent approaches from computer science research which would help us to improve upon the exploitation of data resources in social science research. During the project, we developed new online resources to contribute to data management practices, and we undertook new research on both social and computer science topics which involved significant volumes of data management and served to test out our new provisions as well as to generate new research findings.



The Node included a significant component of computer science research. Original research was conducted in areas covering secure handling of complex data resources, the development of portal systems for social science data organisation and collaboration, and workflow modelling. As one example, the research on the design and support of workflows for social science led to the CRESS methodology and associated toolset for workflows (http://www.cs.stir.ac.uk/cress). The approach supports the definition of Web-based and Grid-based workflows as high-level combinations of individual social science solutions. A variety of social science applications have been used to demonstrate the usefulness of the work and a complete methodology is supported whereby workflows are described graphically, are automatically checked for errors, and are automatically realised as online applications.
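
As an illustration only, the idea of automatically checking a workflow for errors can be sketched in a few lines of Python. This is not the CRESS notation or toolset, whose details are not given here; the function name `check_workflow`, the step names, and the dictionary representation of a workflow are all assumptions made for the sketch. It treats a workflow as a set of named steps with dependencies, reports steps that depend on something undefined, and reports circular dependencies that would prevent the workflow from ever completing.

```python
from collections import deque


def check_workflow(steps):
    """Check a workflow given as {step name: set of step names it depends on}.

    Hypothetical representation for illustration; returns (order, errors),
    where order is a valid execution order for the steps that can run and
    errors is a list of human-readable problem descriptions.
    """
    errors = []

    # Error type 1: a step refers to a dependency that was never defined.
    for name in steps:
        for dep in sorted(steps[name]):
            if dep not in steps:
                errors.append(f"{name}: depends on undefined step {dep}")

    # Work on copies restricted to defined steps, so the input is not mutated.
    remaining = {
        name: {dep for dep in deps if dep in steps}
        for name, deps in steps.items()
    }

    # Kahn's algorithm: repeatedly run steps whose dependencies are all done.
    ready = deque(sorted(name for name, deps in remaining.items() if not deps))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for name in sorted(remaining):
            deps = remaining[name]
            if current in deps:
                deps.remove(current)
                if not deps:
                    ready.append(name)

    # Error type 2: anything never scheduled is part of (or behind) a cycle.
    if len(order) < len(steps):
        stuck = sorted(set(steps) - set(order))
        errors.append("circular dependency among: " + ", ".join(stuck))

    return order, errors


# Hypothetical three-step analysis: load data, recode a variable, fit a model.
example = {
    "load_survey": set(),
    "recode_occupation": {"load_survey"},
    "fit_model": {"recode_occupation"},
}
order, errors = check_workflow(example)
print(order, errors)
```

A real workflow tool would check much more than this (for example, that the data types passed between steps are compatible), but the same principle applies: the workflow is described as data, so it can be validated before anything is run.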
Exploitation Route Several contributions from the project involve resource provision and knowledge exchange which are relevant to researchers involved in the use of social science data across different sectors. Relevant provisions cover information and resources concerned with access to and handling of specialist data resources associated with occupations, educational qualifications and ethnicity; dissemination of training materials covering handling large and complex quantitative data; the publication of information on special features of data associated with mental health records and with research on social care; and the publication of generic materials on computing approaches to metadata, workflows and security infrastructures. All of these contributions offer resources relevant to users from outside the academic research sector; in most instances accessible materials have been made available online via www.dames.org.uk, facilitating easy access to resources.





Certain specialist topics within the Node have direct relevance to non-academic practitioners. For instance, the Node's work on linked and secure e-Health data focussing on the theme of mental health has relevance for various stakeholders in health research, and there are various non-academic groups directly involved in using specialist data covered by the three GESDE services (e.g. the UK's Office for National Statistics and local authorities who use social statistics in these areas). Indeed, the general model used by the GESDE services has now been replicated in other UK data services which provide support for non-academic and academic users alike: in health survey research (see the 'Methodbox' project, also part of the UK's e-Social Science programme, to which DAMES contributed inputs and suggestions) and for users of the Administrative Data Liaison Service (which developed the 'P-ADLS' service after suggestions from, and collaborative meetings with, members of the DAMES Node).
Since the Node was concerned with providing online resources in a range of social science data scenarios, there are obvious potential exploitation routes in using the online resources to the benefit of further research or understanding. Access to the online resources is free to all, and the online 'portals' have a basic 'guest' level of access from which any user can search the system and download the resources found. The portals also have a 'registered user' level of access for which individual authentication is required, after which certain other resources can be made available, including the important opportunity to deposit data for dissemination to other researchers.



It remains difficult to demonstrate in a systematic way the scale of further exploitation of online resources provided by the Node, as we do not have representative data on how researchers exploit resources from the Node, and from other sources, for data management. Our webpages have recorded 'guest level' hits on a daily basis since the portals have been available, but the number of registered users of the three 'GESDE' portals is not as high as we would have hoped (there are currently 17 registered users for the GEODE and GEEDE services combined, and 25 for the GEMDE resources). Guest users may download data from the services, but only registered users may deposit new data, and accordingly the volume of deposited resources in the three 'GESDE' portals is also not as high as we would hope (approximately 300 unique resources at GEODE and GEEDE combined, and 47 unique resources at GEMDE, most of which have been deposited by members of the DAMES Node itself).



Hitherto, registered users have all been from academic research organisations, from the UK and from other EU countries, but this need not follow automatically. The scope of the resources covered through the Node is very wide, covering data from many different countries, and covering all time periods for which social science data is available (for instance, several resources at GEODE concern data on occupations from the 19th Century). The Node has already enjoyed some productive cross-national data collaborations (e.g. in collaborative meetings with representatives from CESSDA, including authoring reports for that important international project (see deliverables D11.1a and D11.ab at http://www.cessda.org/project/deliverables.html), and in research collaborations with a funded project led by Dr Erik Bihagen at the University of Stockholm, e.g. Lambert and Bihagen 2012). Equally there are clear further exploitation possibilities involving uptake of, and contributions to, resources by researchers from other nations.



One feature of the Node was its use of collaborative meetings to facilitate further research connections. Numerous experts visited the Node during its lifetime, leading to further collaborations with important academic staff and other research organisations (e.g. the UK's Office for National Statistics and 'Scotcen', a branch of the National Centre for Social Research). The project has also led to the establishment of further long-term collaborative research groups, such as the University of Stirling's 'E-Health Data Linkage Research Group', chaired by Maxwell and involving over 15 staff from four different University schools, which in turn led to Maxwell and Lambert being included as collaborators within the E-HIRC bid for a Scottish e-Health Research Centre in 2012.



We are able to point to various examples where social science researchers have exploited the resources generated by DAMES (see the 'Impact report'). Nevertheless, an important lesson from the Node's work concerned the challenge of moving from technical capacity to practical uptake. Whilst we have a compelling argument that our new resources offer clear benefits in terms of scientific quality through their support for activities such as sensitivity analysis, aspects of the resources that we have developed are probably still quite challenging for many users. Additionally, our own online portals have experienced more functional errors during their development than we anticipated, which must also have been off-putting to prospective users. Comments received in feedback forms have noted that the new approaches which we advocate (e.g. more sensitivity analysis and greater attention to 'variable operationalisations') are effectively asking other researchers to do something significantly different, and apparently harder, than they are used to. All of these factors may mean that extended exploitation of the online resources requires further inducement and support, rather than simply making the resources available. To this end, members of the Node have been very active in pursuing further funding opportunities to allow continued work on maintaining and promoting the online services. Ideally, funds will be secured which will allow dedicated staff commitment to monitor use manually, to respond to any emergent requirements (such as when other software is upgraded), and to support efforts to promote the resources to applied researchers.



The computer science research within the Node has the potential to make many further contributions to research. Findings have been summarised in papers, including Warner et al. (2010) on data fusion, Jones et al. (2011) on metadata organisation, McCafferty et al. (2010) on secure data infrastructures, and Turner and Tan (2012) on social science workflows. Taking the example of the social science workflow research described above, the work has led to an improved understanding of social science workflows, to a researcher-friendly graphical notation for workflows, to new techniques for analysing and implementing workflows, and to a comprehensive software package that supports these. The results have been widely disseminated to social science and computer science researchers and the outcomes are also significant for other disciplines: for example, work has begun on adapting the approach for use in environmental science and in neuroscience.
Sectors Communities and Social Services/Policy, Digital/Communication/Information Technologies (including Software), Education

URL http://www.dames.org.uk/
 
Description MSc level training provisions
Geographic Reach National 
Policy Influence Type Influenced training of practitioners or researchers
Impact Inputs of materials on data management and on social networks and social distance to course materials for MSc level training at the Essex Summer School in Social Science Data Analysis (2010 onwards) and the MSc Social Statistics and Social Research (University of Stirling, 2014 onwards). Participants in these initiatives are now exposed to materials on advanced data management and on social networks and social distance, which has an impact on their understanding and research capacity.
 
Description SOC 2010 revision steering group (Paul Lambert)
Geographic Reach National 
Policy Influence Type Participation in advisory committee
Impact Contribution to updating Office for National Statistics measurement instruments (Standard Occupational Classification 2010)
URL http://www.ons.gov.uk/ons/guide-method/classifications/current-standard-classifications/soc2010/inde...
 
Description AQMeN : Applied Quantitative Methods Network
Amount £1,300,000 (GBP)
Funding ID ES/G032408/1 
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start  
 
Description Australian Urban Research Infrastructure Network (AURIN)
Amount £13,400,000 (GBP)
Organisation Australian Research Council 
Sector Public
Country Australia
Start  
 
Description DAMES node linked PhD studentships
Amount £240,000 (GBP)
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 10/2009 
End 09/2013
 
Description E-stat node
Amount £850,000 (GBP)
Funding ID ES/G034834/1 
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 09/2009 
End 12/2012
 
Description ESRC Centre for Population Change
Amount £5,000,000 (GBP)
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start  
 
Description NeISS : national e-infrastructure for social simulation
Amount £1,350,000 (GBP)
Organisation Jisc 
Sector Public
Country United Kingdom
Start 04/2009 
End 03/2012
 
Description SIMSAM/SUNSTRAT, guest research program : visitorship
Amount £4,000 (GBP)
Organisation Swedish Initiative for research on Microdata in the Social and Medical Sciences 
Sector Academic/University
Country Sweden
Start 08/2012 
End 09/2012
 
Description Scottish health informatics platform
Amount £4,000,000 (GBP)
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start  
 
Description Social networks and occupational structure
Amount £199,000 (GBP)
Funding ID ES/H030360/1 
Organisation Economic and Social Research Council 
Sector Public
Country United Kingdom
Start 10/2010 
End 09/2012
 
Title GESDE resources 
Description A series of 'Grid Enabled Specialist Data Environments' were developed that made available a large volume of data and metadata that was collected and constructed during the project and was concerned with the analysis of occupational records, of educational qualifications, and of ethnicity. The data was initially made available via an online portal system, but has more recently been available as a downloadable collection of resources from standard format websites. Most of the information is potentially available from other online sources as well, but the coordinated model used in these services makes locating data easier. 
Type Of Material Database/Collection of data 
Year Produced 2007 
Provided To Others? Yes  
Impact We know of numerous researchers who accessed information from these systems and used it in their own research. 
URL http://www.dames.org.uk/themes.html#theme1_1
 
Description Microclasses : how useful are they? 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Invited presentation: An illustration of detailed parental occupational differences and their effects on children's educational attainment in Britain

Evidence presented raised new issues, and interest from attendees in approaches that had not been previously exploited.
Year(s) Of Engagement Activity 2012
 
Description Security-oriented portals for the life sciences 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Delivered by Richard Sinnott at the First International Workshop on Portals for Life Sciences
Year(s) Of Engagement Activity 2010
 
Description Tool support for security-oriented virtual research collaborations 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Delivered by Richard Sinnott at the 2009 IEEE International Workshop on Security in e-Science and e-Research
Year(s) Of Engagement Activity 2009