Polyphonic rhythm transcription

Abstract

A rhythm transcription algorithm converts a MIDI performance into a music score representation. We compare three HMM-based algorithms: the note HMM [1,2], the metrical HMM [3,4], and the merged-output HMM [5]. As an evaluation measure, we use the rhythm correction cost (RCC), defined as the least number of operations needed to correct an estimated result. For details, see our papers [5,6].

You can download the source code and see the list of training data here.

References

 [1] T. Otsuki, N. Saitou, M. Nakai, H. Shimodaira and S. Sagayama, "Musical Rhythm Recognition Using Hidden Markov Model (in Japanese)," J. Information Processing Society of Japan, 43(2), pp. 245-255, 2002.
 [2] H. Takeda, T. Nishimoto and S. Sagayama, "Rhythm and Tempo Analysis Toward Automatic Music Transcription," Proc. ICASSP, vol. 4, pp. 1317-1320, 2007.
 [3] C. Raphael, "Automated Rhythm Transcription," Proc. ISMIR, pp. 99-107, 2001.
 [4] M. Hamanaka, M. Goto, H. Asoh and N. Otsu, "A Learning-Based Quantization: Unsupervised Estimation of the Model Parameters," Proc. ICMC, pp. 369-372, 2003.
 [5] E. Nakamura, K. Yoshii and S. Sagayama, "Rhythm Transcription of Polyphonic MIDI Performances Based on a Merged-Output HMM for Multiple Voices," Proc. SMC, pp. 338-343, 2016.
 [6] E. Nakamura, K. Yoshii and S. Sagayama, "Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices," IEEE/ACM TASLP, 25(4), pp. 794-806, 2017.

Test data sets

The test data consist of two sets, the polyrhythmic data set and the non-polyrhythmic data set, each containing 30 excerpts of classical piano music. The polyrhythmic data set is composed of pieces that contain 2 against 3 or 3 against 4 polyrhythmic passages, and the non-polyrhythmic data set is composed of pieces that do not contain polyrhythmic passages.
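To illustrate what a 2 against 3 passage looks like as score times, here is a minimal sketch in Python. It is not part of the published code; the function names (`voice_onsets`, `merged_onsets`, `inter_onset_intervals`) are hypothetical and chosen only for this illustration. It assumes two voices playing evenly spaced notes within one beat and shows the onset sequence obtained when the two voices are merged into a single stream.

```python
from fractions import Fraction

def voice_onsets(notes_per_beat, beats=1):
    """Score times (in beats) of a voice playing evenly spaced notes."""
    return [Fraction(k, notes_per_beat) for k in range(notes_per_beat * beats)]

def merged_onsets(*voices):
    """Sorted union of the onset times of all voices, as one stream."""
    return sorted(set().union(*voices))

def inter_onset_intervals(onsets, end):
    """Durations between consecutive onsets, closing the span at `end`."""
    points = list(onsets) + [Fraction(end)]
    return [b - a for a, b in zip(points, points[1:])]

# 2 against 3 within a single beat (illustrative values only)
two = voice_onsets(2)                    # 0, 1/2
three = voice_onsets(3)                  # 0, 1/3, 2/3
merged = merged_onsets(two, three)       # 0, 1/3, 1/2, 2/3
iois = inter_onset_intervals(merged, 1)  # 1/3, 1/6, 1/6, 1/3

print([str(t) for t in merged])
print([str(d) for d in iois])
```

Each voice on its own has a uniform interval (1/2 or 1/3 of a beat), whereas the merged stream has the uneven interval pattern 1/3, 1/6, 1/6, 1/3. The same functions applied to `voice_onsets(3)` and `voice_onsets(4)` give the 3 against 4 case.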

Polyrhythmic data set

  1. Brahms: Intermezzo Op. 118-2 Bar 48-55
  2. Chopin: Fantaisie Impromptu Bar 13-16
  3. Debussy: Arabesque No. 1 Bar 6-9
  4. Debussy: Arabesque No. 1 Bar 19-21
  5. Debussy: Arabesque No. 1 Bar 76-79
  6. Debussy: Arabesque No. 1 Bar 89-94
  7. Brahms: Intermezzo Op. 118-2 Bar 48-55 (2nd time)
  8. Brahms: Intermezzo Op. 118-2 Bar 64-71
  9. Chopin: Fantaisie Impromptu Bar 5-8
  10. Chopin: Fantaisie Impromptu Bar 31-34
  11. Beethoven: 32 Variations, Var 9
  12. Beethoven: 32 Variations, Var 16
  13. Brahms: Paganini Variations Book 1 Var 5 Bar 1-8
  14. Brahms: Paganini Variations Book 1 Var 9 Bar 1-8
  15. Brahms: Paganini Variations Book 1 Var 12 Bar 5-15
  16. Brahms: Paganini Variations Book 2 Var 1 Bar 5-8
  17. Brahms: Paganini Variations Book 2 Var 1 Bar 9-16
  18. Brahms: Paganini Variations Book 2 Var 2 Bar 1-8
  19. Brahms: Paganini Variations Book 2 Var 5 Bar 5-10
  20. Brahms: Paganini Variations Book 2 Var 14 Coda
  21. Brahms: Paganini Variations Book 1 Var 5 Bar 9-16
  22. Brahms: Paganini Variations Book 1 Var 9 Bar 5-12
  23. Brahms: Paganini Variations Book 2 Var 2 Bar 9-24
  24. Rachmaninoff: Corelli Variations Coda
  25. Chopin: Piano Sonata No. 3 Mov. 1 Bar 41-50
  26. Schubert: Piano Sonata No. 21 Mov. 1 Bar 167-172
  27. Schubert: Piano Sonata No. 21 Mov. 4 Bar 309-324
  28. Liszt: Dante Sonata, Piu Tosto
  29. Ravel: Jeux d'eau Bar 41-45
  30. Ravel: Jeux d'eau Bar 46-47

Non-polyrhythmic data set

  1. Bach: Invention No. 1 Bar 15-18
  2. Bach: Invention No. 2 Bar 19-22
  3. Bach: Well Tempered Clavier Book 1 Prelude No. 13 Bar 24-30
  4. Bach: Well Tempered Clavier Book 2 Fugue No. 2 Bar 14-23
  5. Bach: Well Tempered Clavier Book 2 Fugue No. 19 Bar 1-6
  6. Bartok: Romanian Dance No. 2 Bar 1-16
  7. Bartok: Romanian Dance No. 4 Bar 1-18
  8. Beethoven: Piano Sonata No. 14 Mov. 1 Bar 1-14
  9. Beethoven: Piano Sonata No. 14 Mov. 2 Bar 1-16
  10. Beethoven: Piano Sonata No. 14 Mov. 2 Bar 56-64
  11. Beethoven: Piano Sonata No. 17 Mov. 1 Bar 29-45
  12. Beethoven: Piano Sonata No. 17 Mov. 1 Bar 52-63
  13. Beethoven: Piano Sonata No. 20 Mov. 1 Bar 15-19
  14. Beethoven: Piano Sonata No. 20 Mov. 1 Bar 20-28
  15. Beethoven: Piano Sonata No. 23 Mov. 2 Bar 1-8
  16. Beethoven: Piano Sonata No. 23 Mov. 2 Bar 17-24
  17. Chopin: Ballade No. 2 Bar 1-10
  18. Chopin: Etude No. 23 Bar 5-12
  19. Chopin: Mazurka Op. 7-1 Bar 25-28
  20. Debussy: Prelude No. 8 Bar 1-7
  21. Mozart: Piano Sonata No. 11 Mov. 1 Bar 1-8
  22. Mozart: Piano Sonata No. 17 Mov. 2 Bar 1-4
  23. Beethoven: 32 Variations, Var 15
  24. Schumann: Kreisleriana Op. 16-1, Bar 1-8
  25. Schumann: Kreisleriana Op. 16-8, Bar 1-16
  26. Schumann: Carnaval Op. 9-2, Bar 1-31
  27. Schumann: Carnaval Op. 9-4, Bar 1-16
  28. Schumann: Carnaval Op. 9-16, Bar 1-25
  29. Mendelssohn: Variations Serieuses Var 10
  30. Mendelssohn: Variations Serieuses Var 14

Demonstrations

Examples of rhythm transcription are shown below. Each algorithm outputs quantised score times for the notes in an input MIDI performance. Score typesetting (e.g. beaming) was done manually to make the results clearer. For the merged-output HMM, the notes are grouped into two voices, shown in the upper and lower staves.
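To make "quantised score times" concrete, the following Python sketch rounds performed onset times to a fixed grid. It is a naive baseline for illustration only and is not any of the three HMM-based algorithms compared here, which are probabilistic models. The function name `quantise_onsets`, its parameters, and the sample onset values are all assumptions introduced for this example; in particular, the sketch assumes a constant, known tempo.

```python
from fractions import Fraction

def quantise_onsets(onset_seconds, seconds_per_beat, divisions_per_beat=12):
    """Round each onset to the nearest grid point.

    Returns score times in beats, measured from the first onset.
    Assumes a constant, known tempo (hypothetical simplification).
    """
    if not onset_seconds:
        return []
    t0 = onset_seconds[0]
    score_times = []
    for t in onset_seconds:
        beats = (t - t0) / seconds_per_beat       # performed position in beats
        step = round(beats * divisions_per_beat)  # nearest grid index
        score_times.append(Fraction(step, divisions_per_beat))
    return score_times

# Made-up onset times (seconds) of a slightly uneven performance
performed = [0.00, 0.26, 0.49, 0.77, 1.01]
score_times = quantise_onsets(performed, seconds_per_beat=0.5,
                              divisions_per_beat=4)
print([str(t) for t in score_times])
```

With these sample values the onsets land on the grid points 0, 1/2, 1, 3/2 and 2 beats. A fixed grid of this kind depends on knowing the tempo and on choosing a grid resolution in advance, which is one reason to prefer model-based approaches such as those compared on this page.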

Good examples

In the following examples, the merged-output HMM outperformed the other HMMs. These examples contain 2 against 3 or 3 against 4 polyrhythms.

Example 1 (Chopin: Fantaisie Impromptu)

・Input performance ( mp3 MIDI )
Picture
・Correct transcription (original score) ( mp3 MIDI )
Picture
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI  Lower staff: mp3 MIDI ) RCC = 7
Picture
・Result by note HMM ( mp3 MIDI ) RCC = 66
Picture
・Result by metrical HMM ( mp3 MIDI ) RCC = 64
Picture

Example 2 (Debussy: Arabesque No. 1)

・Input performance ( mp3 MIDI )
Picture
・Correct transcription (original score) ( mp3 MIDI )
Picture
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI  Lower staff: mp3 MIDI ) RCC = 1
Picture
・Result by note HMM ( mp3 MIDI ) RCC = 12
Picture
・Result by metrical HMM ( mp3 MIDI ) RCC = 12
Picture

Example 3 (Beethoven: 32 Variations, Var. 9)

・Input performance ( mp3 MIDI )
Picture
・Correct transcription ( mp3 MIDI )
Picture
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI  Lower staff: mp3 MIDI ) RCC = 7
Picture
・Result by note HMM ( mp3 MIDI ) RCC = 75
Picture
・Result by metrical HMM ( mp3 MIDI ) RCC = 96
Picture

Bad examples

In the following examples, the merged-output HMM performed worse than the other HMMs.

Example 4 (Bach: Two-part Invention No. 1 in C major)

・Input performance ( mp3 MIDI )
Picture
・Correct transcription (original score) ( mp3 MIDI )
Picture
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI  Lower staff: mp3 MIDI ) RCC = 2
Picture
・Result by note HMM ( mp3 MIDI ) RCC = 0
Picture
・Result by metrical HMM ( mp3 MIDI ) RCC = 0
Picture

Example 5 (Bach: Well Tempered Clavier Book 2 Fugue No. 19 in A major)

・Input performance ( mp3 MIDI )
Picture
・Correct transcription (original score) ( mp3 MIDI )
Picture
Upper staff: mp3 MIDI
Lower staff: mp3 MIDI
・Result by merged-output HMM ( Upper staff: mp3 MIDI  Lower staff: mp3 MIDI ) RCC = 5
Picture
・Result by note HMM ( mp3 MIDI ) RCC = 2
Picture
・Result by metrical HMM ( mp3 MIDI ) RCC = 0
Picture

Contact

Eita Nakamura
Research Bldg. No. 7 Room 417, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501, Japan
e-mail: enakamura[at]sap.ist.i.kyoto-u.ac[dot]jp