Developing trusted data systems for citizen science

Lead Research Organisation: Lancaster University
Department Name: Computing & Communications

Abstract

Numerous data sources are available to advance scientific knowledge, yet there is an even greater number of avenues through which uncertainty and trust can be brought into question. The primary aim of this research is therefore to understand how users of secondary data come to place trust in these contemporary sources, which they have not collected themselves and which may contain uncertainties, and how we can foster well-placed trust in these sources.

- How do researchers view trust in data? Are they cognizant of the trust and uncertainty issues of data? Having seen from the literature that the definition of trust varies, how do researchers define trust in this context?
- How is trust performed versus how it is verbalised? What do researchers actually do when they use these data sources? Are they apprehensive and cautious in these scenarios, or do they take a pragmatic approach and use the data (regardless of trust) when it is a necessity?
- How do researchers account for potentially uncertain data in methods and methodologies? Uncertain or untrustworthy data could affect results and should be accounted for if research is to be rigorous and reproducible. Are researchers aware of the effects of uncertainty and ambiguity in data and data science techniques?
- Under which conditions is this data sufficiently trustworthy for the purposes for which one might hope to use it? Do these conditions vary across contexts and purposes?
- Finally, I will seek to understand the communication of uncertainty and trust. What forms of supplementary information do data users need in order to form trust and to use this data in their research? How can this information be effectively presented and communicated?

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/R512564/1       -             -             30/09/2017  29/09/2022  -
2080703            Studentship   EP/R512564/1  30/09/2017  30/05/2022  Lauren Victoria Thornton
 
Description: Trust and uncertainty are both complex concepts that can have many definitions depending on a person and their context. Within environmental data science there are many different forms of trust, for instance in data and models, in decisions and policy, or in people, and there are different reasons for each. Given that this is such a large area, this work has focused primarily on how scientific researchers view trust, how they determine whether something is trustworthy, and how we can design technology to promote well-placed trust.

There are many definitions of trust. A key finding of this work is that, whilst settling on one single definition is challenging, several common themes have emerged: transparency, provenance, level of detail, reputation, and evidence of trustworthiness. These themes can help in designing technology even in the absence of a single definition. Additionally, the research has found that trustworthiness and trust vary from person to person and from context to context. For instance, different levels of understanding are required: a deep level of understanding is needed in order to trust data that is being used for scientific work, whereas in other cases a general level of understanding suffices, where not all details are required and only the 'bigger picture' is needed.

For researchers, a key finding is that evidence of trustworthiness comes in many different types. There are different levels of documentation: metadata, for instance, is often technical and contains a great deal of specific information. Supplementary information may also be available, which includes more contextual material and may explain the decisions that were made and why certain things were done. Current empirical work is also seeking to understand abstractions of this information for different stakeholders who may not need to see all of it, e.g. because it is beyond their expertise or they have little time to read it, but who would nonetheless want to attain a certain level of understanding.

Finally, this work has led to the creation of a theory concerned with the design of technology to promote well-placed trust via different features and properties of a system. This theory, 'trust affordances', can be used to consider a range of different stakeholders, what they may need in order to be able to trust, and how to balance these interests when they compete, e.g. one feature may afford trust for one type of user while the same feature may not afford trust for another type of user.

Exploitation Route: Both the empirical data and the theory related to this funding may be taken forward and used by others who are designing technological systems. The research is specifically applied to the domain of environmental data science, but will be applicable to other domains. The theory in particular may be useful as it encourages careful consideration and a nuanced approach throughout the design process, ensuring that the trustworthiness of the system and its features is approached in a user-centred way.

Sectors: Digital/Communication/Information Technologies (including Software)