Abstract
A rhythm transcription algorithm converts a MIDI performance into a music score representation. We compare three HMM-based algorithms: the note HMM [1,2], the metrical HMM [3,4], and the merged-output HMM [5]. As an evaluation measure, we use the rhythm correction cost (RCC), defined as the least number of operations needed to correct an estimated result. For details, see our papers [5,6]. You can download the source code and see the list of training data here.
References
[1] T. Otsuki, N. Saitou, M. Nakai, H. Shimodaira and S. Sagayama, "Musical Rhythm Recognition Using Hidden Markov Model (in Japanese)," J. Information Processing Society of Japan, 43(2), pp. 245-255, 2002.
[2] H. Takeda, T. Nishimoto and S. Sagayama, "Rhythm and Tempo Analysis Toward Automatic Music Transcription," Proc. ICASSP, vol. 4, pp. 1317-1320, 2007.
[3] C. Raphael, "Automated Rhythm Transcription," Proc. ISMIR, pp. 99-107, 2001.
[4] M. Hamanaka, M. Goto, H. Asoh and N. Otsu, "A Learning-Based Quantization: Unsupervised Estimation of the Model Parameters," Proc. ICMC, pp. 369-372, 2003.
[5] E. Nakamura, K. Yoshii and S. Sagayama, "Rhythm Transcription of Polyphonic MIDI Performances Based on a Merged-Output HMM for Multiple Voices," Proc. SMC, pp. 338-343, 2016.
[6] E. Nakamura, K. Yoshii and S. Sagayama, "Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices," IEEE/ACM TASLP, 25(4), pp. 794-806, 2017.
Test data sets
The data consist of two sets, the polyrhythmic data set and the non-polyrhythmic data set, each containing 30 excerpts of classical piano music. The polyrhythmic data set is composed of pieces that contain 2 against 3 or 3 against 4 polyrhythmic passages, and the non-polyrhythmic data set is composed of pieces that contain no polyrhythmic passages.
Polyrhythmic data set
- Brahms: Intermezzo Op. 118-2 Bar 48-55
- Chopin: Fantaisie Impromptu Bar 13-16
- Debussy: Arabesque No. 1 Bar 6-9
- Debussy: Arabesque No. 1 Bar 19-21
- Debussy: Arabesque No. 1 Bar 76-79
- Debussy: Arabesque No. 1 Bar 89-94
- Brahms: Intermezzo Op. 118-2 Bar 48-55 (2nd time)
- Brahms: Intermezzo Op. 118-2 Bar 64-71
- Chopin: Fantaisie Impromptu Bar 5-8
- Chopin: Fantaisie Impromptu Bar 31-34
- Beethoven: 32 Variations, Var 9
- Beethoven: 32 Variations, Var 16
- Brahms: Paganini Variations Book 1 Var 5 Bar 1-8
- Brahms: Paganini Variations Book 1 Var 9 Bar 1-8
- Brahms: Paganini Variations Book 1 Var 12 Bar 5-15
- Brahms: Paganini Variations Book 2 Var 1 Bar 5-8
- Brahms: Paganini Variations Book 2 Var 1 Bar 9-16
- Brahms: Paganini Variations Book 2 Var 2 Bar 1-8
- Brahms: Paganini Variations Book 2 Var 5 Bar 5-10
- Brahms: Paganini Variations Book 2 Var 14 Coda
- Brahms: Paganini Variations Book 1 Var 5 Bar 9-16
- Brahms: Paganini Variations Book 1 Var 9 Bar 5-12
- Brahms: Paganini Variations Book 2 Var 2 Bar 9-24
- Rachmaninoff: Corelli Variations Coda
- Chopin: Piano Sonata No. 3 Mov. 1 Bar 41-50
- Schubert: Piano Sonata No. 21 Mov. 1 Bar 167-172
- Schubert: Piano Sonata No. 21 Mov. 4 Bar 309-324
- Liszt: Dante Sonata, Piu Tosto
- Ravel: Jeux d'eau Bar 41-45
- Ravel: Jeux d'eau Bar 46-47
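The 2 against 3 and 3 against 4 passages in the polyrhythmic set can be pictured in terms of note onsets. The following is a minimal sketch, not taken from the authors' code: it assumes two voices that each play evenly spaced notes over the same span (taken here as one time unit) and lists the inter-onset intervals of the merged onset stream. The function names `voice_onsets` and `merged_intervals` are illustrative only.

```python
from fractions import Fraction


def voice_onsets(notes_per_span):
    """Onset times of one voice playing evenly spaced notes in one span."""
    step = Fraction(1, notes_per_span)
    return [i * step for i in range(notes_per_span)]


def merged_intervals(n_upper, n_lower):
    """Inter-onset intervals after merging the onsets of two voices."""
    # Coinciding onsets (e.g. the shared downbeat) are counted once.
    onsets = sorted(set(voice_onsets(n_upper) + voice_onsets(n_lower)))
    onsets.append(Fraction(1))  # close the last interval at the end of the span
    return [b - a for a, b in zip(onsets, onsets[1:])]


two_against_three = merged_intervals(3, 2)
three_against_four = merged_intervals(4, 3)

print([str(x) for x in two_against_three])   # 1/3, 1/6, 1/6, 1/3
print([str(x) for x in three_against_four])  # 1/4, 1/12, 1/6, 1/6, 1/12, 1/4
```

Each voice on its own has a regular rhythm, while the merged stream mixes several interval lengths, which gives a rough picture of why handling the notes as two voices can be natural for such passages.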
Non-polyrhythmic data set
- Bach: Invention No. 1 Bar 15-18
- Bach: Invention No. 2 Bar 19-22
- Bach: Well Tempered Clavier Book 1 Prelude No. 13 Bar 24-30
- Bach: Well Tempered Clavier Book 2 Fugue No. 2 Bar 14-23
- Bach: Well Tempered Clavier Book 2 Fugue No. 19 Bar 1-6
- Bartok: Romanian Dance No. 2 Bar 1-16
- Bartok: Romanian Dance No. 4 Bar 1-18
- Beethoven: Piano Sonata No. 14 Mov. 1 Bar 1-14
- Beethoven: Piano Sonata No. 14 Mov. 2 Bar 1-16
- Beethoven: Piano Sonata No. 14 Mov. 2 Bar 56-64
- Beethoven: Piano Sonata No. 17 Mov. 1 Bar 29-45
- Beethoven: Piano Sonata No. 17 Mov. 1 Bar 52-63
- Beethoven: Piano Sonata No. 20 Mov. 1 Bar 15-19
- Beethoven: Piano Sonata No. 20 Mov. 1 Bar 20-28
- Beethoven: Piano Sonata No. 23 Mov. 2 Bar 1-8
- Beethoven: Piano Sonata No. 23 Mov. 2 Bar 17-24
- Chopin: Ballade No. 2 Bar 1-10
- Chopin: Etude No. 23 Bar 5-12
- Chopin: Mazurka Op. 7-1 Bar 25-28
- Debussy: Prelude No. 8 Bar 1-7
- Mozart: Piano Sonata No. 11 Mov. 1 Bar 1-8
- Mozart: Piano Sonata No. 17 Mov. 2 Bar 1-4
- Beethoven: 32 Variations, Var 15
- Schumann: Kreisleriana Op. 16-1, Bar 1-8
- Schumann: Kreisleriana Op. 16-8, Bar 1-16
- Schumann: Carnaval Op. 9-2, Bar 1-31
- Schumann: Carnaval Op. 9-4, Bar 1-16
- Schumann: Carnaval Op. 9-16, Bar 1-25
- Mendelssohn: Variations Serieuses Var 10
- Mendelssohn: Variations Serieuses Var 14
Demonstrations
Examples of rhythm transcription are shown. Each algorithm outputs quantised score times of the notes in an input MIDI performance. Score typesetting (e.g. beaming) is done manually to make the results clearer. For the merged-output HMM, the notes are grouped into two voices shown in the upper and lower staves.
Good examples
Examples in which the merged-output HMM outperformed the other HMMs. These examples contain 2 against 3 or 3 against 4 polyrhythms.
Example 1 (Chopin: Fantaisie Impromptu)
・Input performance ( mp3 MIDI )
・Correct transcription (original score) ( mp3 MIDI )
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI Lower staff: mp3 MIDI ) RCC = 7
・Result by note HMM ( mp3 MIDI ) RCC = 66
・Result by metrical HMM ( mp3 MIDI ) RCC = 64
Example 2 (Debussy: Arabesque No. 1)
・Input performance ( mp3 MIDI )
・Correct transcription (original score) ( mp3 MIDI )
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI Lower staff: mp3 MIDI ) RCC = 1
・Result by note HMM ( mp3 MIDI ) RCC = 12
・Result by metrical HMM ( mp3 MIDI ) RCC = 12
Example 3 (Beethoven: 32 Variations, Var. 9)
・Input performance ( mp3 MIDI )
・Correct transcription ( mp3 MIDI )
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI Lower staff: mp3 MIDI ) RCC = 7
・Result by note HMM ( mp3 MIDI ) RCC = 75
・Result by metrical HMM ( mp3 MIDI ) RCC = 96
Bad examples
Examples in which the merged-output HMM performed worse than the other HMMs.
Example 4 (Bach: Two-part invention No. 1 C-dur)
・Input performance ( mp3 MIDI )
・Correct transcription (original score) ( mp3 MIDI )
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI Lower staff: mp3 MIDI ) RCC = 2
・Result by note HMM ( mp3 MIDI ) RCC = 0
・Result by metrical HMM ( mp3 MIDI ) RCC = 0
Example 5 (Bach: The well tempered clavier book II No. 19 A-dur Fugue)
・Input performance ( mp3 MIDI )
・Correct transcription (original score) ( mp3 MIDI )
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI Lower staff: mp3 MIDI ) RCC = 5
・Result by note HMM ( mp3 MIDI ) RCC = 2
・Result by metrical HMM ( mp3 MIDI ) RCC = 0
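The RCC values of the five examples can be collected for a side-by-side comparison. The sketch below only tabulates the numbers listed on this page and reports which algorithm has the lowest RCC in each example; the dictionary layout and the helper `best_models` are illustrative and not part of the released code.

```python
# RCC values copied from Examples 1-5 on this page (lower is better).
rcc = {
    "Example 1": {"merged-output HMM": 7, "note HMM": 66, "metrical HMM": 64},
    "Example 2": {"merged-output HMM": 1, "note HMM": 12, "metrical HMM": 12},
    "Example 3": {"merged-output HMM": 7, "note HMM": 75, "metrical HMM": 96},
    "Example 4": {"merged-output HMM": 2, "note HMM": 0, "metrical HMM": 0},
    "Example 5": {"merged-output HMM": 5, "note HMM": 2, "metrical HMM": 0},
}


def best_models(scores):
    """Return the names of all algorithms sharing the lowest RCC, sorted."""
    lowest = min(scores.values())
    return sorted(name for name, value in scores.items() if value == lowest)


summary = {example: best_models(scores) for example, scores in rcc.items()}

for example, models in summary.items():
    lowest = min(rcc[example].values())
    print(f"{example}: lowest RCC = {lowest} ({', '.join(models)})")
```

These five examples were selected for demonstration, so the comparison is not a substitute for the full evaluation reported in [5,6].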
Contact
Eita Nakamura
Research Bldg. No. 7 Room 417, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501, Japan
e-mail: enakamura[at]sap.ist.i.kyoto-u.ac[dot]jp