Self-attention context and content embeddings for better and more explainable prediction of news effects on stockmarkets

Lead Research Organisation: University of Oxford
Department Name: Engineering Science

Abstract

Confidence from the general public and from the investor market is something companies constantly attempt to build and secure, while also trying to predict its evolution. Conversely, public and private investors are keen to forecast how well a company will do in light of what is said about it in the press. Thus, forecasting the impact of "news which moves the market" (Chiang 2010) on a company's bottom line and stock price is of timely interest to industry, investors and businesses alike. Moreover, research has shown that the effect of news on a given company correlates with its effect on companies in similar or interlinked industries (Ran et al. 2019). This implies that studying how news impacts stock prices is of great importance from the standpoint of structural market stability, and thus of quantitative financial research. Existing forecasting methods employ content-based information (drawn from the content of the news piece). More recent techniques explore the network structure through which news pieces spread, since news which spreads faster may have more impact on a company's stock price. However, these techniques are very rarely combined in news impact prediction for finance, even though in closely related news-mining fields such as disinformation detection, this combination of techniques from natural language processing and graph theory has shown its worth. My proposed solution is to merge content-based and context-based representations of a text to train a machine learning classifier from the family of self-attention models, which have shown both high accuracy and a high degree of explainability. Once trained, my solution has great potential for application across finance and business, empowering companies to better predict their stock price and investors to build their portfolios.
Moreover, the self-attention architecture I propose makes my model more easily interpretable than existing forecasting techniques, as self-attention models contain a representation of what information in the input data the model "looks at" to make its prediction. This is far from a token feature: it is key to being able to back up forecasts based on news sentiment. Indeed, without it, forecasts of stock changes based on news article sentiment may be resisted because they are perceived as a black box. Thus, given this explainability, my solution would also have useful applications in governance and central banking institutions, helping them foresee and explain potential market downturns and proactively shape fiscal, monetary and financial policy.
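
To make the interpretability argument concrete, the following is a minimal sketch of how attention weights can be read off a self-attention layer. It is not the project's implementation. Several things here are assumptions made purely for illustration: the toy embedding values, merging content and context embeddings by simple concatenation, treating each news article as one input element, and using identity query/key/value projections (a trained model would learn these projections).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(items):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the inputs themselves
    (identity projections), which keeps the sketch dependency-free.
    Returns (outputs, weights); weights[i][j] is how much item i
    attends to item j.
    """
    dim = len(items[0])
    outputs, weights = [], []
    for query in items:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in items
        ]
        row = softmax(scores)
        weights.append(row)
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, items)) for i in range(dim)]
        )
    return outputs, weights


# Hypothetical toy data for three news articles.
# content: embedding of the article text; context: embedding of how it spread.
content = [[0.9, 0.1], [0.2, 0.8], [0.85, 0.15]]
context = [[0.7, 0.3], [0.1, 0.9], [0.6, 0.4]]

# Assumed merge strategy: concatenate the two representations per article.
merged = [c + g for c, g in zip(content, context)]

outputs, weights = self_attention(merged)
for i, row in enumerate(weights):
    print(f"article {i} attends to:", [round(w, 3) for w in row])
```

Each row of `weights` sums to one and can be inspected directly, which is the property the abstract relies on when it describes the model as showing what it "looks at". In this toy data, article 0 places more weight on article 2 than on article 1, because their merged embeddings are more similar.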

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
ES/P000649/1                                   01/10/2017  30/09/2027
2597209            Studentship   ES/P000649/1  01/10/2021  30/09/2024  Dragos Gorduza