Semi-automatic Data Tours to Support Data Exploration and Visualisation Literacy for Novice Analysts

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Informatics

Abstract

Data analysis is key to understanding timely phenomena from climate change to social media, from diseases to political conflicts, from the human brain to migration. To complement statistical analysis and modern machine learning approaches, visualisation techniques and interactive interfaces support human-in-the-loop control over these systems. They also support human sensemaking where data is uncertain, where a broader overview is needed to generate hypotheses, and where findings must be communicated effectively to larger audiences. While more and more tools, such as Tableau, Gephi, or Microsoft's PowerBI, are democratising the use of data visualisation, using visualisations to their full extent requires training novice analysts in tools, techniques, and interactive exploration, as well as in communication and presentation.

This project aims to free analysts from the burden of exploring a data set from scratch while having to choose among tools, learn their workflows, and create visualisations themselves. Instead, it aims to support novice analysts through a system that automatically displays information about a data set while explaining the visualisation techniques used and the findings shown. In such a "data tour", an analyst starts as a passive reader, following a sequence of visualisations with accompanying textual explanations. As the analyst becomes familiar with the visualisations and their data, they are invited to explore the data themselves through an interactive interface and to tell the system which aspects interest them most.

Creating effective data tours draws inspiration from previous work on using comics for data-driven storytelling (http://datacomics.net), visualisation cheatsheets (http://visualizationcheatsheets.github.io), and approaches to data visualisation literacy, data mining for networks, and human-computer interaction.

To provide specific data sets and access to novice analysts for evaluating our tool, this project involves collaborators in history, archeology, sociology, and network science, along with their complex geo-temporal networks, including social networks, archeological trading networks, family networks, and Twitter networks.

To create compelling data tours for these data sets, we need a much better understanding of
- current exploration strategies employed by analysts and their barriers to analysis,
- ways of automatically extracting and annotating patterns-of-interest in networks, and
- ways of creating meaningful explanatory sequences and high-level structures for data tours.

This research involves a coordinated approach of field studies, visualisation and interface design, implementation, and user-centered evaluation. During a brief first phase, we will work closely with experts in Humanities research to create effective visualisations for their networks; in a second phase, we will mine and present insights from these data sets; and in the last phase, we will investigate ways to structure and present findings in data tours.

Our research will open new questions about how far storytelling and the explanation of visualisations can be supported by intelligent agents, i.e., computer programs that partner with humans and engage in a dialogue. Our research may inspire new forms of intelligent interfaces that anticipate an analyst's tasks and understand their specific interests in the data. Researchers in the digital humanities, social sciences, and network analysis will benefit from better support for visualising their geo-temporal networks and from semi-automatic ways to analyse them, leading to a better understanding of their data and to new collaborative research agendas using visual analysis. Our project aims to provide impulses for commercial products and recommendation engines and will provide companies with knowledge and techniques to build customised data tours for their clients.
 
Description 1) Guidance: We were able to build and verify one specific way to guide novice analysts, i.e., people without specific training in network analysis. The results show that our user interface and recommender system are successful, which verifies one of the core working hypotheses of our project.
2) Toolkit design: We were able to build a highly expressive and flexible network visualization toolkit that enables us to create a wide range of visualization techniques. This supports a second working hypothesis, namely that visualizations can be created quickly. This achievement helps us disseminate our visualizations to (novice) analysts and address our last open research question / work package.
Exploitation Route - Our Toolkit is open source and will help developers quickly build network visualizations for diverse applications (e.g., neuroscience, social networks, ... ). Researchers will find the toolkit useful to run more studies on the perceptual characteristics of specific visualizations it supports. Likewise, they are invited to contribute to the toolkit and extend its expressiveness and functions.

- Our guidance mechanisms show the potential of this approach and provide ideas for further investigation. Since the general concept seems promising, it can inspire future research on how best to support novice analysts in their data exploration.
Sectors Digital/Communication/Information Technologies (including Software), Education

URL https://networknarratives.github.io
 
Description We were able to apply our toolkit in a related research project to design and build a network application. The resulting visualization interface will soon be live online and will inform people about social networks in peace process negotiations. With further external funding, we are currently applying the toolkit to two more case studies.
First Year Of Impact 2023
Sector Security and Diplomacy
Impact Types Policy & public services

 
Description Collaboration with Hong Kong University 
Organisation The Hong Kong University of Science and Technology
Country Hong Kong 
Sector Academic/University 
PI Contribution Planning, overseeing, and designing the NetworkNarratives tool (see Software contributions), as well as helping with its implementation and linking it to The Vistorian (see Software contributions). We are also heavily involved in the ongoing evaluation and paper writing.
Collaborator Contribution Help with large parts of the implementation and some parts of the design. Close collaboration on planning and conducting the evaluation, and on paper writing.
Impact Publication in progress after failed submission earlier in 2021.
Start Year 2021
 
Description Inria Collaboration 
Organisation Inria Saclay - Île-de-France Research Centre
Country France 
Sector Public 
PI Contribution Designing and implementing The Vistorian, implementing interaction log functionality, collecting data, running workshops, analyzing data, paper writing.
Collaborator Contribution Helping with high-level research advice, helping with log implementation, helping with findings discussions.
Impact In progress
Start Year 2021
 
Title Network Visualization Grammar 
Description This is a JavaScript toolkit that allows network visualizations to be specified declaratively by what they show, rather than through lower-level details such as layouts and specific visual encodings. The library uses JSON to formalize a network visualization specification, and the system then renders the network according to that specification. We are currently in the process of extending the grammar and user interface. As outlined in the project proposal, this grammar will help us quickly create the wide variety of network visualizations required for our other tools, such as The Vistorian and NetworkNarratives. The tool is scheduled for publication as a full paper this summer.
Type Of Technology Webtool/Application 
Year Produced 2022 
Open Source License? Yes  
Impact In progress 
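
To illustrate the declarative idea behind the Network Visualization Grammar described above, the following is a minimal sketch only. It does not reproduce the toolkit's actual grammar or API: the field names (`data`, `show`), the intent values, the `TECHNIQUES` table, and the `resolveSpec` function are all hypothetical and chosen purely for illustration. The sketch shows a JSON-style specification that states what the view should convey, and a toy interpreter that derives the lower-level layout and encoding from it.

```javascript
// Hypothetical specification: every field name here is an assumption made
// for illustration and is not taken from the actual toolkit.
const spec = {
  data: {
    nodes: [{ id: "a" }, { id: "b" }, { id: "c" }],
    links: [
      { source: "a", target: "b" },
      { source: "b", target: "c" },
    ],
  },
  show: "connectivity", // what the view should convey, not how to draw it
};

// Toy lookup table mapping a high-level intent to concrete drawing choices,
// so the author of a spec never picks a layout or encoding directly.
const TECHNIQUES = {
  connectivity: { layout: "force-directed", mark: "node-link" },
  density: { layout: "ordered", mark: "adjacency-matrix" },
};

// Validate the spec and resolve it into a concrete (hypothetical) render plan.
function resolveSpec(spec) {
  const technique = TECHNIQUES[spec.show];
  if (!technique) {
    throw new Error(`Unknown intent: ${spec.show}`);
  }
  const ids = new Set(spec.data.nodes.map((node) => node.id));
  for (const link of spec.data.links) {
    if (!ids.has(link.source) || !ids.has(link.target)) {
      throw new Error(`Link references unknown node: ${JSON.stringify(link)}`);
    }
  }
  return {
    ...technique,
    nodeCount: ids.size,
    linkCount: spec.data.links.length,
  };
}

// Round-trip through JSON to show that the spec is plain, serialisable data.
const plan = resolveSpec(JSON.parse(JSON.stringify(spec)));
console.log(plan);
```

In this sketch, changing `show` from `"connectivity"` to `"density"` switches the resolved technique without touching the data, which is the kind of separation a declarative grammar aims for.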
 
Title NetworkNarratives: Guided Tours for Network Exploration 
Description NetworkNarratives is an online application and module for The Vistorian. It is based on the concept of curated walkthroughs that represent common explorations in network analysis. The tool presents a user interface and underlying mechanisms to create these walkthroughs. Example walkthroughs include exploring an ego-network, comparing communities, and understanding a network over time. From a given network, the system extracts facts (node degree, clusters, etc.). Each fact is then shown on a slide in a slide show; the user can flick through the slides, stopping and dwelling on the slides / facts of most interest to them. The tool is currently under evaluation and planned for submission as a full paper at the end of March 2022.
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact In progress 
URL https://networknarratives.github.io
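
The fact-extraction step described for NetworkNarratives can be sketched roughly as follows. This is a simplified illustration, not the tool's implementation: the input format (a list of `[source, target]` pairs for an undirected network), the function names `nodeDegrees` and `degreeSlides`, the slide structure, and the choice to rank facts by degree are all assumptions. Only node degree, one of the fact types named in the description, is covered.

```javascript
// Hypothetical input: an undirected network as a list of [source, target] pairs.
const links = [
  ["Ann", "Bob"],
  ["Ann", "Cat"],
  ["Ann", "Dan"],
  ["Bob", "Cat"],
];

// Extract one kind of fact: the degree (number of connections) of each node.
function nodeDegrees(links) {
  const degree = new Map();
  for (const [source, target] of links) {
    degree.set(source, (degree.get(source) ?? 0) + 1);
    degree.set(target, (degree.get(target) ?? 0) + 1);
  }
  return degree;
}

// Turn the facts into an ordered list of slides, highest degree first.
// Ties are broken alphabetically so that the slide order is deterministic.
function degreeSlides(links, limit = 3) {
  return [...nodeDegrees(links)]
    .sort(([nameA, degA], [nameB, degB]) => degB - degA || nameA.localeCompare(nameB))
    .slice(0, limit)
    .map(([name, deg], index) => ({
      slide: index + 1,
      text: `${name} has ${deg} connection${deg === 1 ? "" : "s"}.`,
    }));
}

const slides = degreeSlides(links);
console.log(slides);
```

A real tour would combine several fact types (for example clusters or temporal changes) and order them into a meaningful explanatory sequence; ranking by degree alone is only a stand-in for that step.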
 
Title The Vistorian: An open platform for multivariate network visualization 
Description The Vistorian is an open application and research platform for interactive network visualizations. It supports networks with geographic, temporal, and multidimensional data. The platform provides a data upload wizard, a data management interface, a set of four interactive network visualizations, and data import routines for a variety of common network formats. The tool is meant both to popularize novel network visualizations from our research and to build an international community of users to inform future research questions around network visualizations. It is written entirely in JavaScript and WebGL using state-of-the-art libraries such as d3.js and package managers such as NPM. The Vistorian is extensible in that new visualizations can be added easily, using the common API that holds the network data. The platform does not require a server; instead, it stores users' data persistently in local storage, which avoids transmitting and storing potentially sensitive user data. The website contains information about how to use the tools and about any workshops and courses we run. A dedicated mailing list distributes updates.
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact We are currently working with users to assess the impact of the tool, i.e., running workshops and courses and tracking users' interactions with the tool.
URL http://vistorian.net
 
Description 6-week online Network Visualization Course
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This 6-week course featured 6 interactive sessions to explain exploratory network visualization to 24 novice analysts. The course featured the visualization platform we are building as part of this grant. The course is part of an ongoing evaluation of challenges in network exploration. We also use the pool of participants to evaluate the first stage of a novel tool for guidance in networks, as outlined in the proposal. The course revealed many challenges, which we are currently preparing for publication. These challenges, together with our other observations during the course, help us better understand analysts new to visualization and better inform future tools. We plan to re-run the course later this year to increase the user base of our visualization tools.
Year(s) Of Engagement Activity 2022
URL https://vistorian.github.io/courses.html
 
Description DH RSE summer school 2021 Vistorian Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact We ran an afternoon workshop on an earlier version of our network visualization tool, The Vistorian, at a summer school for Digital Humanities. We taught the audience about network visualization and visual network exploration and explained our tools. The discussions and engagement following the workshop informed our current research on better understanding existing exploration behavior, as well as the design of our tool NetworkNarratives (see Software contributions).
Year(s) Of Engagement Activity 2021
URL https://www.cdcs.ed.ac.uk/news/DH-RSE-summer-school-2021
 
Description Network visualization coaching sessions 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact These sessions help us better understand novice analysts and how they manage data, which in turn helps us fine-tune our research questions.
Year(s) Of Engagement Activity 2023
URL https://vistorian.github.io/upcoming_courses.html