High-resolution Piano Transcription with Pedals by Regressing Onset and Offset Times

Kong, Qiuqiang; Li, Bochen; Song, Xuchen; Wan, Yuan; Wang, Yuxuan

arXiv:2010.01815 (cs)

[Submitted on 5 Oct 2020 (v1), last revised 31 Jul 2021 (this version, v3)]


Abstract: Automatic music transcription (AMT) is the task of transcribing audio recordings into symbolic representations. Recently, neural network-based methods have been applied to AMT and have achieved state-of-the-art results. However, many previous systems detect note onsets and offsets only frame-wise, so the transcription resolution is limited to the frame hop size. There is a lack of research on using different strategies to encode onset and offset targets for training. In addition, previous AMT systems are sensitive to misaligned onset and offset labels in audio recordings. Furthermore, there is limited research on sustain pedal transcription on large-scale datasets. In this article, we propose a high-resolution AMT system trained by regressing precise onset and offset times of piano notes. At inference, we propose an algorithm to analytically calculate the precise onset and offset times of piano notes and pedal events. We show that our AMT system is more robust to misaligned onset and offset labels than previous systems. Our proposed system achieves an onset F1 of 96.72% on the MAESTRO dataset, outperforming the previous Onsets and Frames system, which achieves 94.80%. Our system achieves a pedal onset F1 score of 91.86%, which is the first benchmark result on the MAESTRO dataset. We have released the source code and checkpoints of our work at this https URL.
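The difference between frame-wise detection and regressing precise times can be illustrated with a small sketch. It is not the authors' implementation: the triangular target shape, the 10 ms hop, the half-width `J`, and the function names `regression_target` and `decode_onset` are all assumptions chosen for illustration. The sketch encodes an onset as a per-frame value that falls off linearly with the distance to the true onset time, then recovers the sub-frame time analytically from a noiseless target.

```python
def regression_target(onset_time, num_frames, hop=0.01, J=5):
    """Per-frame target that decays linearly with distance to the onset.

    Assumed shape (illustrative): 1 at the exact onset time, 0 once a
    frame is more than J hops away from it.
    """
    targets = []
    for i in range(num_frames):
        delta = abs(i * hop - onset_time)  # distance from frame i to the onset
        targets.append(max(0.0, 1.0 - delta / (J * hop)))
    return targets


def decode_onset(targets, hop=0.01, J=5):
    """Recover the onset time from a noiseless triangular target."""
    # Frame closest to the onset (first one on ties).
    peak = max(range(len(targets)), key=targets.__getitem__)
    # Invert the linear decay to get the distance from the peak frame.
    dist = (1.0 - targets[peak]) * J * hop
    # The larger neighbor tells us on which side of the peak the onset lies.
    left = targets[peak - 1] if peak > 0 else 0.0
    right = targets[peak + 1] if peak + 1 < len(targets) else 0.0
    sign = 1.0 if right >= left else -1.0
    return peak * hop + sign * dist


onset = 0.123                       # true onset, between frames 12 and 13
frame_wise = round(onset / 0.01) * 0.01   # frame-wise estimate, quantized to the hop
precise = decode_onset(regression_target(onset, num_frames=30))
print(f"frame-wise: {frame_wise:.3f} s, regressed: {precise:.3f} s")
```

With a 10 ms hop, the frame-wise estimate can only land on a multiple of 10 ms, while the regressed target carries enough information to place the onset between frames. A trained network would produce noisy outputs, so a practical decoder would need to be more robust than this idealized inversion.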

Comments:	12 pages
Subjects:	Sound (cs.SD); Audio and Speech Processing (eess.AS)
Cite as:	arXiv:2010.01815 [cs.SD]
	(or arXiv:2010.01815v3 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.2010.01815

Submission history

From: Qiuqiang Kong [view email]
[v1] Mon, 5 Oct 2020 06:57:11 UTC (379 KB)
[v2] Sat, 17 Oct 2020 15:06:31 UTC (379 KB)
[v3] Sat, 31 Jul 2021 13:30:30 UTC (480 KB)
