Big Data Complex Query Analytics and Query Processing
Lead Research Organisation: University of Glasgow
Department Name: School of Computing Science
Abstract
There is no doubt that we are living in the Big Data era. Vast amounts of data are available
to us, and the volume of data grows every day. This has given rise to platforms such as
Hadoop/MapReduce and Spark, which are systems designed specifically to tackle the
computational issues that arise when using conventional platforms. However, running
complex analytics queries over huge datasets remains a problem. The aim of this
project is to use advanced Machine Learning models to make such queries more efficient.
These models can be trained using a query-driven approach, meaning that their training
is based on previously executed queries. This approach would reduce costs,
as it reduces the number of calls to a cloud management platform. It would also greatly
improve efficiency in such systems and maintain the same level of performance
as data size grows.
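The query-driven idea in the abstract can be illustrated with a minimal sketch. It is not the project's actual model: it assumes one-dimensional data, range COUNT queries described by a hypothetical `(center, radius)` pair, and a simple distance-weighted k-nearest-neighbours predictor standing in for the "advanced Machine Learning models". The names `run_exact_count` and `QueryDrivenEstimator` are invented for illustration.

```python
import math
import random


def run_exact_count(data, center, radius):
    """Exact (expensive) execution: count values within [center - radius, center + radius]."""
    return sum(1 for v in data if abs(v - center) <= radius)


class QueryDrivenEstimator:
    """Illustrative stand-in model: predicts an aggregate from previously executed queries."""

    def __init__(self, k=3):
        self.k = k
        self.history = []  # list of ((center, radius), exact_result)

    def observe(self, query, result):
        """Store one executed query together with its exact result."""
        self.history.append((query, result))

    def predict(self, query):
        """Distance-weighted average of the results of the k most similar past queries."""
        if not self.history:
            raise ValueError("no executed queries to learn from yet")
        nearest = sorted(self.history, key=lambda item: math.dist(item[0], query))[: self.k]
        weights = [1.0 / (math.dist(q, query) + 1e-9) for q, _ in nearest]
        return sum(w * r for w, (_, r) in zip(weights, nearest)) / sum(weights)


# Train only on past queries and their results; the data is never scanned at prediction time.
random.seed(0)
data = [random.uniform(0.0, 100.0) for _ in range(2000)]
estimator = QueryDrivenEstimator(k=3)
for _ in range(300):
    past_query = (random.uniform(10.0, 90.0), random.uniform(1.0, 10.0))
    estimator.observe(past_query, run_exact_count(data, *past_query))

new_query = (50.0, 5.0)
estimate = estimator.predict(new_query)
exact = run_exact_count(data, *new_query)
print(f"estimate={estimate:.1f} exact={exact}")
```

The point of the sketch is that `predict` touches only the stored query history, so its cost does not depend on the size of `data`; how accurate the estimate is depends on how well past queries cover the new one.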
People | ORCID iD |
---|---|
Peter Triantafillou (Primary Supervisor) | |
Fotis Savva (Student) | |
Publications
Savva F (2018) Explaining Aggregates for Exploratory Analytics
Savva F (2019) Aggregate Query Prediction under Dynamic Workloads
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
EP/N509668/1 | | | 30/09/2016 | 29/09/2021 | |
1804140 | Studentship | EP/N509668/1 | 02/10/2016 | 30/03/2020 | Fotis Savva |
Description | Developed a new way of estimating the results of queries executed over large datasets. The method leverages Machine Learning to make query execution highly efficient: it can return a result in milliseconds, whereas conventional execution typically takes minutes or hours. Because the method essentially learns the query patterns exhibited by the user, a mechanism was also developed to detect changes in those patterns and adapt the current Machine Learning models accordingly. |
Exploitation Route | Large technology organizations can use our findings to adopt Machine Learning for large-scale query execution. They can also use our proposed mechanisms to adapt Machine Learning models that are already in production. |
Sectors | Digital/Communication/Information Technologies (including Software) |
URL | https://arxiv.org/abs/1812.11346 |
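
The Description above mentions a mechanism that detects when query patterns change and adapts the models. The record does not say how this is done, so the following is only a sketch under stated assumptions: a sliding window of relative prediction errors (measured on queries that are occasionally executed exactly), a fixed threshold, and retraining on the most recent queries. `DriftMonitor`, `adapt`, and their parameters are hypothetical names.

```python
from collections import deque


class DriftMonitor:
    """Illustrative detector: flags drift when the mean recent relative error is too high."""

    def __init__(self, window=20, threshold=0.25):
        self.errors = deque(maxlen=window)  # keeps only the latest `window` errors
        self.threshold = threshold

    def record(self, predicted, actual):
        """Record the relative error of one prediction against its exact result."""
        denom = max(abs(actual), 1e-9)  # guard against division by zero
        self.errors.append(abs(predicted - actual) / denom)

    def drift_detected(self):
        """Report drift only once the window is full, to avoid reacting to a few outliers."""
        full = len(self.errors) == self.errors.maxlen
        return full and sum(self.errors) / len(self.errors) > self.threshold


def adapt(history, keep_last):
    """Assumed adaptation step: keep only the most recent executed queries for retraining."""
    return history[-keep_last:]


monitor = DriftMonitor(window=5, threshold=0.25)
for _ in range(5):
    monitor.record(predicted=100.0, actual=102.0)  # model still matches the workload
stable = monitor.drift_detected()

for _ in range(5):
    monitor.record(predicted=100.0, actual=300.0)  # workload shifted, errors grow
shifted = monitor.drift_detected()
print(stable, shifted)
```

A windowed mean is used here because it is simple to reason about; a production system might prefer a statistical change-detection test, which this sketch does not attempt.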