To what extent are Google search queries predictive of user intent?
Lead Research Organisation:
University of Oxford
Department Name: Oxford Internet Institute
Abstract
In 2014, a headline in The Atlantic stated: "Google knows you better than you know yourself" [1]. This expression refers to the lack of reservation with which online users expose their inquisitiveness, interests and desires to the online environment, especially to search engines. However, are users' intentions unambiguously reflected in their search behaviour, and can they easily be derived from it? Furthermore, can search queries be used as distinct predictors of users' subsequent online actions?
I propose the following three key research questions, which need to be addressed in order to answer my main research question:
Q1: Which statistical methods are most applicable for generating search queries with a predictive ability for user intent?
Q2: To what extent does search query behaviour depend on temporal effects?
Q3: Does web search behaviour represent an unrestricted reflection of user intent?
Organisations
People |
ORCID iD |
Scott Hale (Primary Supervisor) | |
Katharina Anders (Student) |
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
ES/P000649/1 | | | 01/10/2017 | 30/09/2027 | |
1923258 | Studentship | ES/P000649/1 | 01/10/2017 | 25/08/2021 | Katharina Anders |
Description | During the investigations into the Cambridge Analytica scandal at the beginning of 2018, it was found that personal Facebook profile data had been used to target political adverts. Consequently, the lack of transparency about data privacy rights on online media platforms was criticised, and there were demands for stricter regulation of how user-identifiable data are used online and how this use is communicated to users. Although the scandal only involved Facebook data, it is unclear whether the general perception of data protection on online platforms has been affected. In particular, I am interested in the effect that data privacy scandals have on users' perceived anonymity online and how this leads to a change in online search behaviour, particularly regarding sensitive topics. Furthermore, I want to understand what implications a change in search behaviour has for social sensing models that use search volume indices as explanatory input factors. Milestones: (1) Identified a gap in the literature on the impact of privacy loss after major data scandals on online behaviour. (2) Compiled a set of sensitive topics that are searched for online, to be examined for changes in search volume after a data privacy event. (3) Set out a research plan in my transfer of status document. |
Exploitation Route | An overall significant decrease in search volume after each of the data privacy scandals, in at least one of the query selection methods, would provide evidence that the scandals might have influenced users' online search behaviour for sensitive topics. This would align with existing theory (Lufkin 2017; Nissenbaum 2016; Walther 1996) and recent surveys (Economist Intelligence Unit 2019; Pew 2013; Rainie et al. 2013) regarding how I expect human behaviour to change when perceived anonymity decreases because of the privacy scandals. However, it would contradict the 'privacy paradox' (Baruh, Secinti and Cemalcilar 2017) if I detect that users actually act differently online after privacy scandals. I will also be able to detect the number of weeks for which a scandal influenced search behaviour and, with that, when the impact of data privacy scandals fades. This can be compared with the findings of Preibusch (2015). |
Sectors | Digital/Communication/Information Technologies (including Software); Financial Services, and Management Consultancy; Healthcare; Pharmaceuticals and Medical Biotechnology; Retail |
URL | https://docs.google.com/document/d/1TBCDrAZkhUy5PshCMqj7oxvK3t74Nsec_gvYz1sVUAI/edit# |
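The before/after comparison described under Exploitation Route could be sketched roughly as follows. This is only an illustrative sketch, not the project's actual method: the function name `weeks_below_baseline`, the one-standard-deviation threshold and the weekly index values are all assumptions made up for this example, and a simple threshold rule like this is not a significance test.

```python
from statistics import mean, stdev


def weeks_below_baseline(volumes, event_index, threshold_sd=1.0):
    """Count consecutive post-event weeks with volume below a pre-event cutoff.

    volumes: weekly search volume index values (hypothetical data).
    event_index: position of the first week after the privacy event.
    threshold_sd: number of pre-event standard deviations below the
        pre-event mean that counts as a drop (an assumed rule of thumb).
    """
    baseline = volumes[:event_index]
    if len(baseline) < 2:
        raise ValueError("need at least two pre-event weeks")
    cutoff = mean(baseline) - threshold_sd * stdev(baseline)

    weeks = 0
    for value in volumes[event_index:]:
        if value >= cutoff:
            break  # volume has recovered; the effect is treated as faded
        weeks += 1
    return weeks


# Made-up index values: six weeks before the event, six weeks after it.
example = [100, 102, 98, 101, 99, 100, 80, 82, 85, 97, 100, 101]
affected_weeks = weeks_below_baseline(example, event_index=6)
print(affected_weeks)
```

The returned count is one crude way to express how many weeks an event appears to depress search volume before it recovers; a real analysis would also need to account for trend, seasonality and statistical significance.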
Description | Data Science Summit Presentation |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Gave a presentation about how to run and automate experiments. |
Year(s) Of Engagement Activity | 2019 |
Description | Google Maps Analytics Presentation |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Gave a presentation about my research to a group of Google Maps analysts to get their feedback. |
Year(s) Of Engagement Activity | 2020 |
Description | Panelist Data Career Discussion |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | I served as a panelist on an expert data scientist career panel for young professionals who want to pursue an analyst career path. |
Year(s) Of Engagement Activity | 2020 |