Optical Music Recognition from Multiple Sources

Lead Research Organisation: Lancaster University

Department Name: Lancaster Inst for the Contemporary Arts

Abstract

Digital images of scores are now available in very large numbers from services such as IMSLP (International Music Scores Library Project), which has images of about a quarter of a million scores of over 70,000 pieces of music. Most of this information is opaque to computers, because a score image is a picture rather than a source of musical information. To get at the musical information, the score needs to be 'read'. While software to read digital images of words, turning them into encoded text (Optical Character Recognition, OCR), is now common and underlies such services as Google Books, the equivalent task for music (Optical Music Recognition, OMR) has proven to be very difficult. The accuracy of the results is often low, and some researchers have reported that the time taken to find and correct errors is so great as to make it just as quick to enter the data by hand. In other cases, considerable time is required in training OMR software to get good results from a particular source.

This project aims to turn the problem of large quantities of inaccessible data into an advantage, and make use of the quantity of information available to improve the accuracy of OMR. Multiple images of the score of a piece of music are often available, and using more than one image increases the information available for the OMR task. This project will take two approaches to improving OMR through the use of multiple sources. The first, a post-processing approach, will result in software to take the outputs of multiple runs of OMR software on different sources for the same piece of music, and combine these outputs into a single, more accurate, representation of the piece. The second approach will use information from multiple sources within the OMR software itself (adapting existing open-source software) to improve the accuracy of recognition. A part of one score which is difficult to read will become easier to read if information can be used from part of another score which represents the same part of the piece.
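
The post-processing idea can be illustrated with a minimal sketch. It is not the project's own algorithm: it assumes the outputs of the different OMR runs have already been aligned symbol by symbol (a real system would need a sequence-alignment step first), and the token format ("C4:q" for a crotchet C4), the gap marker "-" and the function name majority_vote are all hypothetical.

```python
from collections import Counter


def majority_vote(aligned_outputs, gap="-"):
    """Combine pre-aligned symbol sequences by keeping the most common
    symbol at each position. Positions where the gap marker wins are dropped."""
    if not aligned_outputs:
        return []
    length = len(aligned_outputs[0])
    if any(len(seq) != length for seq in aligned_outputs):
        raise ValueError("all sequences must be aligned to the same length")

    combined = []
    for column in zip(*aligned_outputs):
        # most_common(1) returns the symbol with the highest count;
        # ties go to the symbol seen first in the column.
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != gap:
            combined.append(symbol)
    return combined


# Three hypothetical OMR outputs for the same bar, each with a different error.
outputs = [
    ["C4:q", "D4:q", "E4:h", "-"],     # missed the last note
    ["C4:q", "D4:q", "F4:h", "G4:q"],  # misread the third pitch
    ["C4:q", "B4:q", "E4:h", "G4:q"],  # misread the second pitch
]
result = majority_vote(outputs)
print(result)
```

Each run makes a different mistake, so the vote at every position recovers the reading on which two of the three runs agree. Voting of this kind only helps when the errors of the individual runs are largely independent.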

The outcome will be new techniques and software modules for OMR (both post-processing of results and new multi-input OMR software), protocols for the use of OMR with multiple sources, and information about the levels of accuracy which can be expected from OMR software.

Planned Impact

There is enormous public interest in music, and while much of this is directed at music which, like much popular music, does not rely heavily on scores, there remains substantial interest in music for which the score is central. Making Music, an umbrella group for voluntary music in the UK, has over 3,000 member organisations. A survey in 2008 in the USA found that 3.1% of the adult population was involved in performing or creating classical music and 5.2% in choirs or choruses (2008 Survey of Public Participation in the Arts, National Endowment for the Arts, Washington DC, 2009, p.44). Access to scores and the information in scores is vital for these activities. Reliable and widespread OMR has the potential to transform access to and use of scores in the same way that OCR has transformed access to books. Peachnote.com already gives some flavour of what can be achieved by allowing scores on IMSLP to be searched (to some degree) by a melodic sequence of pitches. Better OMR would both allow this to be more accurate and allow other kinds of searching. A choir director who knows that the sopranos can barely sing above a G but who has some basses capable of singing very low notes could search for music by the ranges of the voices, for example.
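
The choir director's search could look something like the sketch below, assuming each piece has already been reduced (for example by accurate OMR) to the lowest and highest pitch of each voice. The catalogue, the piece titles, the Piece class and the function search_by_range are invented for illustration; pitches are given as MIDI note numbers (G5 = 79, E2 = 40).

```python
from dataclasses import dataclass


@dataclass
class Piece:
    title: str
    ranges: dict  # voice name -> (lowest, highest) MIDI note number


def search_by_range(pieces, soprano_top, bass_bottom):
    """Return titles where the sopranos never go above soprano_top
    and the basses go at least as low as bass_bottom."""
    return [
        p.title
        for p in pieces
        if p.ranges["soprano"][1] <= soprano_top
        and p.ranges["bass"][0] <= bass_bottom
    ]


# Hypothetical catalogue entries, not real data.
catalogue = [
    Piece("Piece A", {"soprano": (60, 79), "bass": (36, 60)}),
    Piece("Piece B", {"soprano": (62, 81), "bass": (41, 62)}),
    Piece("Piece C", {"soprano": (60, 77), "bass": (43, 60)}),
]

# Sopranos limited to G5 (79); look for pieces that use basses down to E2 (40).
matches = search_by_range(catalogue, soprano_top=79, bass_bottom=40)
print(matches)
```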

There are similar potential benefits for music education. At the moment a teacher who wants to give an example of a particular contrapuntal pattern or procedure needs to know of examples in advance. OMR on a large dataset such as IMSLP would allow examples to be found by search. Students will also benefit from being able to search for music by content.

New ways of working will be facilitated in the music industry. It is common practice now for music for video and film to be selected from existing sources, either 'production music' (pre-recorded short sequences intended to suit particular circumstances, such as 'Scottish open road') or existing compositions. (Newly composed music is typically only used for the biggest productions, for reasons of cost.) This has led to the repeated use of a small number of pieces of music, and the prevalence of low-quality, low-impact 'canned' music. The ability to search scores by content will give easier access to a wider range of music, and hopefully lead to better film and video music.

In the creative field, access to scores by content could lead to new ways of putting together pieces, perhaps in a kind of 'score mashup' by analogy with audio and video mashups made from selecting sounds and images by content. (See for example the research on 'audio mosaicing', special issue of Journal of New Music Research, 2006.)

We make a case above that the access to large datasets, which accurate OMR would facilitate, will lead to better musicology. The consequences of this will feed through to impact for the wider community also. We can expect an indirect impact of this research to be better music education and better wider understanding of music.

Funded Value:

£77,811

Funded Period:

Jan 14 - Mar 15

Funder:

AHRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

AH/L009870/1

Principal Investigator:

Alan Marsden

Research Subject:

Info. & commun. Technol. (50%)

Music (50%)

Research Topic:

Classical Music (25%)

Image & Vision Computing (50%)

Musicology (25%)

Organisations

Lancaster University (Lead Research Organisation)

People	ORCID iD
Alan Marsden (Principal Investigator)
Kia Ng (Co-Investigator)
Alex McLean (Researcher)

Publications

Padilla V (2015) Improving optical music recognition by combining outputs from multiple sources

Padilla V (2014) Improving OMR for Digital Music Libraries with Multiple Recognisers and Multiple Sources

Nazarova A (2014) How can Typography be Represented in an Alter-modernity Context?

Key Findings

Description	Optical music recognition (OMR) is the process by which a computer reads a score and puts the information it contains into a machine-readable form such as MusicXML. By using several editions of a score, plus both parts and score when available, and by combining the outputs of four different pieces of OMR software, we have been able to approximately halve the number of errors in pitch and rhythm typically made by such software. However, a substantial number of errors remain (accuracy is typically 85-95%), and we therefore believe that a new approach to OMR is required.
Exploitation Route	Our improvements will be useful for musicologists and all those who want to enter music into a computer from a score. Our results suggest that the time necessary for this will be approximately halved. However, the accuracy of the results remains low, and we believe our findings indicate that a new approach to OMR is required.
Sectors	Creative Economy; Culture, Heritage, Museums and Collections

Software and Technical Products

Title	MultiOMR
Description	This software achieves more accurate results from optical music recognition software by combining the outputs from several recognition engines and several sources. The software includes components to pre-process scans of musical scores, to submit those to multiple optical-music-recognition engines, and to combine the resulting MusicXML files to produce a single, more accurate, result.
Type Of Technology	Software
Year Produced	2015
Open Source License?	Yes
Impact	Impacts are pending.
URL	https://github.com/MultiOMR
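
As a rough illustration of working with the MusicXML files that such recognition engines produce, the sketch below reduces a small uncompressed MusicXML fragment to a flat list of pitch and duration tokens, a form in which outputs could then be compared. It is an assumption-laden example using only Python's standard library, not code from MultiOMR: the sample fragment, the token format and the function note_tokens are invented here, and real files may need handling for chords, ties, multiple voices and compressed .mxl archives.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written MusicXML fragment (hypothetical, for illustration only).
SAMPLE = """<score-partwise version="3.1">
  <part-list>
    <score-part id="P1"><part-name>Voice</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <note><pitch><step>C</step><octave>4</octave></pitch><duration>1</duration></note>
      <note><pitch><step>F</step><alter>1</alter><octave>4</octave></pitch><duration>1</duration></note>
      <note><rest/><duration>2</duration></note>
    </measure>
  </part>
</score-partwise>"""


def note_tokens(musicxml_text):
    """Return one 'pitch:duration' token per <note>, in document order."""
    root = ET.fromstring(musicxml_text)
    tokens = []
    for note in root.iter("note"):
        # Grace notes carry no <duration>; mark those with "?".
        duration = note.findtext("duration", default="?")
        pitch = note.find("pitch")
        if pitch is None:
            # Rests (and unpitched notes) have no <pitch> element.
            tokens.append(f"rest:{duration}")
            continue
        step = pitch.findtext("step")
        octave = pitch.findtext("octave")
        # <alter> is optional; this sketch assumes whole semitones only.
        alter = int(float(pitch.findtext("alter", default="0")))
        accidental = "#" * alter if alter > 0 else "b" * -alter
        tokens.append(f"{step}{accidental}{octave}:{duration}")
    return tokens


tokens = note_tokens(SAMPLE)
print(tokens)
```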
