TY - GEN
T1 - The DeepScoresV2 dataset and benchmark for music object detection
AU - Tuggener, Lukas
AU - Satyawan, Yvan Putra
AU - Pacha, Alexander
AU - Schmidhuber, Jürgen
AU - Stadelmann, Thilo
N1 - Generated from Scopus record by KAUST IRTS on 2022-09-14
PY - 2020/1/1
Y1 - 2020/1/1
N2 - In this paper, we present DeepScoresV2, an extended version of the DeepScores dataset for optical music recognition (OMR). We improve upon the original DeepScores dataset by providing much more detailed annotations, namely (a) annotations for 135 classes including fundamental symbols of non-fixed size and shape, increasing the number of annotated symbols by 23%; (b) oriented bounding boxes; (c) higher-level rhythm and pitch information (onset beat for all symbols and line position for noteheads); and (d) a compatibility mode for easy use in conjunction with the MUSCIMA++ dataset for OMR on handwritten documents. These additions open up the potential for future advancement in OMR research. Additionally, we release two state-of-the-art baselines for DeepScoresV2 based on Faster R-CNN and the Deep Watershed Detector. An analysis of the baselines shows that regular orthogonal bounding boxes are unsuitable for objects which are long, small, and potentially rotated, such as ties and beams, which demonstrates the need for detection algorithms that naturally incorporate object angles. The dataset, code and pre-trained models, as well as user instructions, are publicly available at https://zenodo.org/record/4012193.
UR - https://ieeexplore.ieee.org/document/9412290/
UR - http://www.scopus.com/inward/record.url?scp=85110419500&partnerID=8YFLogxK
U2 - 10.1109/ICPR48806.2021.9412290
DO - 10.1109/ICPR48806.2021.9412290
M3 - Conference contribution
SN - 9781728188089
SP - 9188
EP - 9195
BT - Proceedings - International Conference on Pattern Recognition
PB - Institute of Electrical and Electronics Engineers Inc.
ER -