Dig that lick: Analysing large-scale data for melodic patterns in jazz performances
Lead Research Organisation:
Queen Mary University of London
Department Name: Sch of Electronic Eng & Computer Science
Abstract
The recorded legacy of jazz spans a century and provides a vast corpus of data documenting
its development. Recent advances in digital signal processing and data analysis technologies
enable automatic recognition of musical structures and their linkage through metadata to
historical and social context. Automatic metadata extraction and aggregation give unprecedented
access to large collections, fostering new interdisciplinary research opportunities.
This project aims to develop innovative technological and music-analytical methods to gain
fresh insight into jazz history by bringing together renowned scholars and results from several
high-profile projects. Musicologists and computer scientists will together create a deeper and
more comprehensive understanding of jazz in its social and cultural context. We exemplify our
methods via a full cycle of analysis of melodic patterns, or "licks", from audio recordings to an
aesthetically contextualised and historically situated understanding.
Planned Impact
The study of jazz requires insights from, and feeds knowledge back into, African American Studies,
Anthropology, Art History, Literary Studies, Music, Philosophy, Political Science, and Sociology.
A thorough analysis of a century's worth of jazz recordings, and the practices the music entails,
is now possible thanks to recent advances in the computational analysis of audio content, or
Music Information Retrieval (MIR), and to progress in processing large datasets and information
management with Semantic Web technologies. The former enables the automatic description
of audio recordings in terms of high-level or structural musical aspects, and the latter allows
such analyses to be linked to discographic metadata, distributed over multiple sites, describing
performers and composers, listeners, performance venues, and production and consumption
factors, and general historic, cultural and geographic information from external resources. These
technologies can now facilitate access to large collections by researchers from the many disciplines
interested in the evolution of musical expression.
The Dig that Lick project will: enhance infrastructures for semantic audio analyses
of large collections; facilitate access to large collections of audio and associated metadata via
interfaces for content selection, semantic analysis, and aggregation of results that humanities
researchers can easily use; develop this infrastructure to analyse melodic patterns across large
corpora of jazz audio; and relate the results to metadata and background knowledge in order
to trace and interpret musical influence across time and space as well as cultures and societies.
We will develop tool sets and resources that allow researchers to perform studies over wide
time-spans and geographic locations, for example to trace the evolution or spread of certain
musical phenomena. This will enable cross-historical or comparative geographical music research
with direct reference to data and metadata on music performance and creation, an approach
rarely attempted in the musicology of jazz, or of non-notated music.
Our target audiences include: academic communities in jazz studies, whether in music, cultural
studies, social sciences, or business management; MIR practitioners in engineering or library and
information sciences as they relate to music; the J-DISC user community, that is, researchers
and educators who require structured, comprehensive search capabilities in investigating the
cultural background and social networks of jazz performance, accessed via the recorded legacy
of jazz; the Jazzomat community of musicologists and engineers interested in musical and cognitive
questions derived from jazz solo analysis; jazz musicians interested in a topic they wish to document
or explore for professional reasons; and jazz fans or students wishing to know more about an artist they follow.
We will engage with our target audiences via academic publications and presentations, special
events, software and data releases, and communication to the public via non-academic
channels, including the Dig that Lick web site, blogs, outreach events, social media and press
releases, as appropriate.
Our audiences will benefit in the following ways. Jazz researchers: our tools and resources
will provide a powerful new paradigm for evidence-based research. Jazz musicians: as they are an
unusually analytical community, we expect many of them to be interested in using the software
developed by this project to investigate and reflect on their own performances, helping to develop
a better understanding of the process of transmission and assimilation of patterns that are
sometimes conscious, but often opaque to the artists themselves. Jazz aficionados: our software
tools and resources will be available through the web platform operated by the Center for Jazz
Studies, giving them a deeper understanding of their favorite artists.
Publications
D. Basaran
(2018)
Main Melody Estimation with Source-Filter NMF and CRNN
Frieler K.
(2018)
Two web applications for exploring melodic patterns in jazz solos
in Proceedings of the 19th International Society for Music Information Retrieval Conference, ISMIR 2018
T. Weyde
(2019)
Dig That Lick: A Technical Primer for Big Data Jazz Studies
F. Höger
(2019)
Dig That Lick: Exploring Melodic Patterns in Jazz Improvisation
K. Gabbard
(2019)
What We Are Digging Out of the Data?
K. Frieler
(2019)
Towards a History of Melodic Patterns in Jazz Performance
Description | Initial work involved analysis of a set of ~450 transcriptions of jazz solos (the Weimar Jazz Database), which identified a large number of melodic patterns repeated within and between improvisations. This work has been extended by automatic analysis of a collection of over 1000 jazz recordings, where the solos were transcribed by computer. We developed and tested an automatic transcription method that works reasonably well for the purposes of this project. The metadata for these tunes was identified and linked with various online resources, according to a new ontology that we developed, the Jazz Ontology. This ontology was populated with metadata from several large-scale audio and bibliographic corpora (the Jazz Encyclopedia, the Jazz Discography), and the resulting datasets were merged and linked to existing Linked Open Data resources. These datasets are publicly available and have been integrated into the main showcase of our project, the Dig That Lick Pattern Similarity Search web site. Users can search for patterns across several datasets, and view the metadata and listen to the relevant excerpts for all instances of the query pattern found in the datasets. This online application is being used by jazz researchers and music lovers for the systematic study of jazz. |
Exploitation Route | This work is being taken forward in studies of patterns automatically extracted from audio recordings. Musicians could use these resources to learn about jazz, and in particular to learn idiomatic phrases that could be built into their own improvisations. |
Sectors | Creative Economy; Digital/Communication/Information Technologies (including Software); Education; Leisure Activities, including Sports, Recreation and Tourism; Culture, Heritage, Museums and Collections |
URL | http://dig-that-lick.eecs.qmul.ac.uk/index.html |
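The idea of patterns "repeated within and between improvisations" can be illustrated with a minimal sketch. It is not the project's actual pipeline: it assumes each solo is available as a list of MIDI pitch numbers, represents patterns as semitone-interval n-grams so that transposed copies of a lick match, and uses hypothetical names (`intervals`, `shared_patterns`) and invented toy data.

```python
from collections import defaultdict


def intervals(pitches):
    """Convert a list of MIDI pitches to semitone intervals between successive notes."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def shared_patterns(solos, n=4, min_solos=2):
    """Return interval n-grams that occur in at least `min_solos` different solos.

    `solos` maps a solo identifier to its list of MIDI pitches (an assumed format).
    """
    occurrences = defaultdict(set)
    for solo_id, pitches in solos.items():
        ivs = intervals(pitches)
        for i in range(len(ivs) - n + 1):
            occurrences[tuple(ivs[i:i + n])].add(solo_id)
    return {
        pattern: sorted(ids)
        for pattern, ids in occurrences.items()
        if len(ids) >= min_solos
    }


# Invented toy data: solo_b opens with the same lick as solo_a, transposed up a fifth.
solos = {
    "solo_a": [60, 62, 64, 65, 67, 65],
    "solo_b": [67, 69, 71, 72, 74, 70],
    "solo_c": [60, 59, 57, 55, 53, 52],
}

print(shared_patterns(solos))
```

Working on intervals rather than absolute pitches is one common design choice for this kind of analysis; a pitch-based or contour-based representation would slot into the same loop.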
Description | Since the release of the Dig That Lick Pattern Similarity Search web site, we have been contacted by a number of amateur musicians who are using the site to explore jazz improvisation and the dataset that we have analysed. The feedback received was very positive. Likewise professional archivists and jazz industry workers responded enthusiastically to demonstrations of our research outputs at a workshop and expert panel session. |
First Year Of Impact | 2019 |
Sector | Creative Economy; Leisure Activities, including Sports, Recreation and Tourism; Culture, Heritage, Museums and Collections |
Impact Types | Cultural |
Description | New Directions in Digital Jazz Studies: Music Information Retrieval and AI Support for Jazz Scholarship in Digital Archives |
Amount | £199,685 (GBP) |
Funding ID | AH/V009699/1 |
Organisation | Arts & Humanities Research Council (AHRC) |
Sector | Public |
Country | United Kingdom |
Start | 02/2021 |
End | 08/2023 |
Title | The Jazz Ontology: Ontology and software for processing jazz metadata |
Description | Jazz is a musical tradition with about 100 years of history; unlike in other Western musical traditions, improvisation plays a central role in jazz. Modelling the domain of jazz poses some ontological challenges due to specificities in musical content and performance practice, such as prevalence of recording sessions, band lineup fluidity and importance of short melodic patterns for improvisation. The Jazz Ontology is a semantic model that addresses these challenges, and also describes workflows for annotating melody transcriptions and for pattern search. The Jazz Ontology incorporates existing standards and ontologies such as FRBR and the Music Ontology. The ontology has been assessed by examining how well it supports describing and merging existing datasets and whether it facilitates novel discoveries in a music browsing application. The Jazz Ontology has been populated with the metadata from several large scale audio and bibliographic corpora (the Jazz Encyclopedia, the Jazz Discography). The resulting RDF datasets were merged and linked to existing Linked Open Data resources. These datasets are publicly available and are driving an online application that is being used by jazz researchers and music lovers for the systematic study of jazz. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | The Jazz Ontology is used in the Dig That Lick Pattern Similarity Search website, which is one of the major deliverables of our 2-year research project. It can be found here: https://dig-that-lick.hfm-weimar.de/similarity_search/ |
URL | https://osf.io/rqk7z/ |
Title | History of Recorded Jazz: DTL1000, 1920-2020 |
Description | We present the DTL1000 dataset, which was created in the "Dig That Lick" project and covers the history of recorded jazz with a sample of 1,750 improvisations extracted from 1,060 audio tracks. The dataset contains a mixture of collected (editorial metadata), manually annotated (structure, style), and automatically generated (main melody transcriptions of solos) data describing the recordings. The motivation for creating this dataset was the study of patterns in jazz improvisation, but there are many other applications for this resource. The accompanying paper presents the dataset creation process, data structure and contents with descriptive statistics and discusses the origin and process of the annotations, as well as general use cases and specifically the case of pattern analysis. These components and their combinations enable a number of use cases for jazz studies as well as algorithm development for music analysis. The DTL1000 dataset provides a rich resource for a variety of disciplines, and constitutes a contribution to a field where large datasets with rich annotations are scarce. |
Type Of Material | Database/Collection of data |
Year Produced | 2021 |
Provided To Others? | Yes |
Impact | The dataset is incorporated in the Pattern Search and Pattern Similarity Search web sites for exploring patterns in jazz solos. These are public-facing resources for researchers, students and enthusiasts to analyse the use of melodic patterns in jazz improvisation. See https://dig-that-lick.hfm-weimar.de/pattern_search/ and https://dig-that-lick.hfm-weimar.de/similarity_search/ . |
URL | https://dx.doi.org/10.5255/UKDA-SN-854781 |
Title | The Dig That Lick Pattern Similarity Search website |
Description | The Dig That Lick Pattern Similarity Search website is one of the major deliverables of our 2-year research project. Key features: Currently we support four melodic databases: (1) the new DTL1000 database, comprising 300,000 tone events in 1,736 monophonic solos from over 600 jazz tunes spanning the 100 years of jazz history, with solos extracted automatically from audio using a newly developed CRNN-based algorithm specialised for jazz; (2) the well-known Weimar Jazz Database, with about 200,000 tone events from 456 monophonic solos by 78 jazz masters; (3) the Charlie Parker Omnibook, with about 18,000 tones taken from 52 solos by the co-inventor of bebop; and (4) the Essen Folk Song Collection, comprising about 350,000 notes from 7,352 folk songs. Similarity search can be carried out using interval, refined contour, and pitch patterns (n-grams). The underlying similarity measure is based on the Levenshtein distance, which gives a reasonable approximation to true perceptual similarity. Various user-definable search parameters, a virtual keyboard for query input, and extensive metadata filters are also available. The result list shows all pattern instances for a given query in the user-defined similarity range, with essential metadata and audio snippets for quick aural control. The search results can be grouped by performer (or folk song collection) and by pattern. Users can display extra information for each pattern instance as needed. Result sets can be exported to CSV files with a single click. Furthermore, we provide several visualisation options for result sets, such as a pattern timeline and various kinds of pattern networks. Global and personal search histories are available for quick retrieval of previous searches and for exploration of other users' queries. Extensive documentation is also available. |
Type Of Material | Computer model/algorithm |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | The algorithms, interfaces and data provided by this web site enable investigation of musical improvisation by musicologists and fans of jazz. |
URL | https://dig-that-lick.hfm-weimar.de/similarity_search/ |
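The Levenshtein-based similarity search described in this entry can be sketched roughly as follows. This is an illustration under stated assumptions, not the website's implementation: the normalisation to a 0..1 similarity score, the fixed-length sliding window, the default threshold, and the names `levenshtein`, `similarity` and `search` are all hypothetical, and the corpus is invented toy data in interval representation.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed normalisation: 1.0 for identical sequences, 0.0 for maximally different."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def search(query, corpus, min_similarity=0.7):
    """Slide a window of the query's length over each solo and keep close matches.

    Returns (solo_id, start_index, similarity) tuples, best matches first.
    """
    n = len(query)
    hits = []
    for solo_id, ivs in corpus.items():
        for i in range(len(ivs) - n + 1):
            score = similarity(query, ivs[i:i + n])
            if score >= min_similarity:
                hits.append((solo_id, i, score))
    return sorted(hits, key=lambda hit: -hit[2])


# Invented toy corpus of semitone-interval sequences.
corpus = {
    "solo_a": [2, 2, 1, 2, -2],
    "solo_b": [2, 2, 2, 2, -4],
}
query = [2, 2, 1, 2]

for solo_id, start, score in search(query, corpus):
    print(solo_id, start, score)
```

A production system would also need to compare windows of different lengths and index the n-grams rather than scanning every solo; the sketch only shows how an edit-distance threshold turns exact pattern lookup into a similarity range.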
Description | Final workshop for New Directions in Digital Jazz Studies (also known as JazzDAP, the Jazz Digital Archives Project), Jan 12 2024 at the Institute of Jazz Studies, Rutgers University |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Third sector organisations |
Results and Impact | A workshop was held for invited leaders of jazz archives and libraries in the US, to report and discuss the outputs of our current project New Directions in Digital Jazz Studies and the previous project Dig That Lick. The participants were enthusiastic about the potential of the technologies developed in and demonstrated from our projects, and we plan to build new collaborations with some of the participants in a future grant proposal. |
Year(s) Of Engagement Activity | 2024 |
Description | Panel discussion at the Jazz Congress 2024 (Lincoln Centre, New York City, NY, USA, 11 Jan 2024): Artificial Intelligence and New Directions in Digital Jazz Studies |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A panel session was held at the Jazz Congress 2024 to present and discuss outputs from the project New Directions in Digital Jazz Studies, which builds on the project Dig That Lick. Our projects use state-of-the-art music information retrieval and artificial intelligence algorithms for the analysis of jazz recordings and linked data to enable novel approaches to co-creative use of materials in the archival collections of the Institute of Jazz Studies and Scottish Jazz Archive. The session featured a presentation of the AI tools in action, as well as a broad discussion with the research team and the audience on new directions in jazz scholarship and performance. Moderator: Adriana Cuervo (Institute of Jazz Studies, Rutgers University, USA). Panelists: Tillman Weyde (Principal Investigator UK, Department of Computer Science, City University of London, UK), Gabriel Solis (Principal Investigator US, Dean, College of Arts & Sciences, University of Washington, USA), Haftor Medboe (Scottish Jazz Archives, Edinburgh Napier University, UK), Pedro Cravinho (Keeper of the Archives, Faculty of Arts, Design and Media, Birmingham City University, UK), Simon Dixon (Queen Mary University of London). |
Year(s) Of Engagement Activity | 2024 |
URL | https://jazzcongress.org/sessions |