Description |
The main resources generated by the project involve a group of free online 'portals' which provide access to specialist information on occupations, ethnicity and educational qualifications (all available from www.dames.org.uk); the distribution of training materials on important but under-studied aspects of 'data management' (also available from the above website); and innovations in computer science relevant to social science data management (including development of a system for metadata curation and organisation, new tools for summarising workflows, and new systems demonstrating the secure handling of complex data resources). Substantive research was undertaken alongside these resource provisions and generated publications on occupational and educational inequalities, suicide and other mental health outcomes, and social care provision and needs.
Much of the Node's work addressed methodological issues. A long-standing challenge in social science research concerns 'getting the best' out of data resources: many rich data resources are available to researchers, but not all analyses do a good job of taking advantage of the information held in them. We argue that this often occurs because researchers are simply not aware of alternative strategies for handling the data, and/or lack skills in enhancing and documenting data processing, with the end result that we quite often see analysis based on unsatisfactory simplifications of complex information resources. There also remain certain unresolved but important methodological challenges in social science data, particularly concerned with making appropriate comparisons over time or between countries when using large scale survey data, or with comparing alternative options for measuring and analysing popular social science concepts such as 'class' or 'ethnicity'. At the start of the Node, we claimed that emergent approaches from computer science research would help us to improve the use of data resources in social science research. During the project, we developed new online resources to contribute to data management practices, and we undertook new research on both social and computer science topics which involved significant volumes of data management and served to test out our new provisions as well as to generate new research findings.
The Node included a significant component of computer science research. Original research was conducted in areas covering secure handling of complex data resources, the development of portal systems for social science data organisation and collaboration, and workflow modelling. As one example, the research on the design and support of workflows for social science led to the CRESS methodology and associated toolset for workflows (http://www.cs.stir.ac.uk/cress). The approach supports the definition of Web-based and Grid-based workflows as high-level combinations of individual social science solutions. A variety of social science applications have been used to demonstrate the usefulness of the work and a complete methodology is supported whereby workflows are described graphically, are automatically checked for errors, and are automatically realised as online applications.
|
Exploitation Route |
Several contributions from the project involve resource provision and knowledge exchange which are relevant to researchers involved in the use of social science data across different sectors. Relevant provisions cover information and resources concerned with access to and handling of specialist data resources associated with occupations, educational qualifications and ethnicity; dissemination of training materials covering handling large and complex quantitative data; the publication of information on special features of data associated with mental health records and with research on social care; and the publication of generic materials on computing approaches to metadata, workflows and security infrastructures. All of these contributions offer resources relevant to users from outside the academic research sector; in most instances accessible materials have been made available online via www.dames.org.uk, facilitating easy access to resources.
Certain specialist topics within the Node have direct relevance to non-academic practitioners. For instance, the Node's work on linked and secure e-Health data focussing on the theme of mental health has relevance for various stakeholders in health research, and there are various non-academic groups directly involved in using specialist data covered by the three GESDE services (e.g. the UK's Office for National Statistics and local authorities who use social statistics in these areas). Indeed, the general model used by the GESDE services has now been replicated in other UK data services which provide support for non-academic and academic users alike: in health survey research (see the 'Methodbox' project, also part of the UK's e-Social Science programme, and to which DAMES contributed inputs and suggestions) and for users of the Administrative Data Liaison Service (which developed the 'P-ADLS' service after suggestions from, and collaborative meetings with, members of the DAMES Node). Since the Node was concerned with providing online resources in a range of social science data scenarios, there are obvious potential exploitation routes in using the online resources to the benefit of further research or understanding. Access to the online resources is free to all, and the online 'portals' have a basic 'guest' level of access from which any user can search the system and download resources obtained. The portals also have a 'registered user' level of access for which individual authentication is required, after which certain other resources can be made available, including the important opportunity to deposit data for dissemination to other researchers.
It remains difficult to demonstrate in a systematic way the scale of further exploitation of online resources provided by the Node, as we do not have representative data on how researchers exploit resources from the Node, and from other sources, for data management. Our webpages have recorded 'guest level' hits on a daily basis since the portals have been available, but the number of registered users of the three 'GESDE' portals is not as high as we would have hoped (there are currently 17 registered users for the GEODE and GEEDE services (combined) and 25 for the GEMDE resources). Guest users may download data from the services, but only registered users may deposit new data, and accordingly the volume of deposited resources in the three 'GESDE' portals is also not as high as we would hope (approximately 300 unique resources at GEODE and GEEDE (combined) and 47 unique resources at GEMDE, most of which have been deposited by members of the DAMES Node itself).
Hitherto, registered users have all been from academic research organisations, from the UK and from other EU countries, but this need not remain the case. The scope of the resources covered through the Node is very wide, covering data from many different countries, and covering all time periods for which social science data is available (for instance, several resources at GEODE concern data on occupations from the 19th Century). The Node has already enjoyed some productive cross-national data collaborations (e.g. in collaborative meetings with representatives from CESSDA, including authoring reports for that important international project - see deliverables D11.1a and D11.ab at http://www.cessda.org/project/deliverables.html - and in research collaborations with a funded project led by Dr Erik Bihagen at the University of Stockholm, e.g. Lambert and Bihagen 2012). Equally there are clear further exploitation possibilities involving uptake of, and contributions to, resources by researchers from other nations.
One feature of the Node was its use of collaborative meetings to facilitate further research connections. Numerous experts visited the Node during its lifetime, leading to further collaborations with important academic staff and other research organisations (e.g. the UK's Office for National Statistics and 'Scotcen', a branch of the National Centre for Survey Research). The project has also led to the establishment of further long-term collaborative research groups, such as the University of Stirling's 'E-Health Data Linkage Research Group', chaired by Maxwell and involving over 15 staff from four different University schools, which in turn led to Maxwell and Lambert being included as collaborators within the E-HIRC bid for a Scottish e-Health Research Centre in 2012.
We are able to point to various examples where social science researchers have exploited the resources generated by DAMES (see the 'Impact report'). Nevertheless, an important lesson from the Node's work concerned the challenge of moving from technical capacity to practical uptake. Whilst we have a compelling argument that our new resources offer clear benefits in terms of scientific quality through their support for activities such as sensitivity analysis, aspects of the resources that we have developed are probably still quite challenging for many users. Additionally, our own online portals have experienced more functional errors during their development than we anticipated, which must also have been off-putting to prospective users. Comments received in feedback forms have noted that the new approaches which we advocate (e.g. more sensitivity analysis and greater attention to 'variable operationalisations') are effectively asking other researchers to do something significantly different, and apparently harder, than they are used to. All of these factors may mean that extended exploitation of the online resources would benefit from further inducement and support rather than simply from making the resources available - to this end, members of the Node have been very active in pursuing further funding opportunities to allow continued work on maintaining and promoting the online services. Ideally, funds will be secured which will allow dedicated staff commitment to provide manual monitoring of use and corrections in response to any emergent requirements (such as when other software is upgraded), and to support efforts to promote the resources to applied researchers.
The computer science research within the Node has the potential to make many further contributions to research. Findings have been summarised in papers, including Warner et al. (2010) on data fusion, Jones et al. (2011) on metadata organisation, McCafferty et al. (2010) on secure data infrastructures, and Turner and Tan (2012) on social science workflows. Taking the example of the workflow research described above, the work has led to an improved understanding of social science workflows, to a researcher-friendly graphical notation for workflows, to new techniques for analysing and implementing workflows, and to a comprehensive software package supporting these. The results have been widely disseminated to social science and computer science researchers and the outcomes are also significant for other disciplines - for example, work has begun on adapting the approach for use in environmental science and in neuroscience.
|