The goal of the SeReCo2 project is to develop matrix decomposition and source separation techniques for decomposing a music recording into musically meaningful (e.g., note-related) sound components. The project is funded by the German Research Foundation (DFG). On this website, we summarize the project's main objectives and provide links to project-related resources (data, demonstrators, websites) and publications.
Source Separation and Restoration of Sound Components in Music Recordings (SeReCo)
This follow-up project continues the previous DFG-funded project "Source Separation and Restoration of Drum Sound Components in Music Recordings" [MU 2686/10-1], which aimed at developing techniques for separating and restoring sound events as they occur in complex music recordings. In the first phase ([MU 2686/10-1]), we focused on percussive sound sources, decomposing drum recordings into individual drum sound events. Using Non-Negative Matrix Factor Deconvolution (NMFD) as our central methodology, we studied how to generate and integrate audio- and score-based side information to guide the decomposition. We tested our approaches within concrete application scenarios, including audio remixing (redrumming) and swing ratio analysis of jazz music. In the second phase of the project ([MU 2686/10-2]), our goals are significantly extended. First, we go beyond the drum scenario by considering other challenging music scenarios, including piano music (e.g., Beethoven Sonatas, Chopin Mazurkas), piano songs (e.g., Klavierlieder by Schubert), and string music (e.g., Beethoven String Quartets). In these scenarios, our goal is to decompose a music recording into individual note-related sound events. As our central methodology, we develop a unifying audio decomposition framework that combines classical signal processing and machine learning with recent deep learning (DL) approaches. Furthermore, we adopt generative DL techniques to improve the perceptual quality of restored sound events. As a general goal, we investigate how prior knowledge, such as score information, can be integrated into DL-based learning to improve the interpretability of the trained models.
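To make the NMFD methodology mentioned above concrete, the following is a minimal, illustrative sketch (not the project's actual implementation): NMFD factors a magnitude spectrogram V into R spectral templates with a temporal extent of T frames and their activations, using multiplicative updates for the Kullback–Leibler divergence. All names and parameter choices here are our own for illustration.

```python
import numpy as np

def shift(X, t):
    """Shift the columns of X by t frames (t > 0: right, t < 0: left), zero-padding."""
    out = np.zeros_like(X)
    if t > 0:
        out[:, t:] = X[:, :-t]
    elif t < 0:
        out[:, :t] = X[:, -t:]
    else:
        out[:] = X
    return out

def nmfd_approx(W, H):
    """Reconstruct the spectrogram: Lambda = sum_t W[t] @ shift(H, t)."""
    return sum(W[t] @ shift(H, t) for t in range(W.shape[0]))

def nmfd(V, R=3, T=8, n_iter=50, eps=1e-12, seed=0):
    """Non-Negative Matrix Factor Deconvolution: factor a magnitude
    spectrogram V (K x N) into R templates spanning T frames
    (W: T x K x R) and activations (H: R x N) via KL multiplicative updates."""
    rng = np.random.default_rng(seed)
    K, N = V.shape
    W = rng.random((T, K, R)) + eps
    H = rng.random((R, N)) + eps
    ones = np.ones_like(V)
    for _ in range(n_iter):
        # update activations H
        Q = V / np.maximum(nmfd_approx(W, H), eps)
        num = sum(W[t].T @ shift(Q, -t) for t in range(T))
        den = sum(W[t].T @ shift(ones, -t) for t in range(T))
        H *= num / np.maximum(den, eps)
        # update templates W (one slice per frame offset t)
        Q = V / np.maximum(nmfd_approx(W, H), eps)
        for t in range(T):
            Ht = shift(H, t)
            W[t] *= (Q @ Ht.T) / np.maximum(ones @ Ht.T, eps)
    return W, H
```

For T = 1, this reduces to standard KL-NMF; score-informed side information, as studied in the project, would additionally constrain the initialization or zero out entries of W and H.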
Quellentrennung und Wiederherstellung von Klangkomponenten in Musikaufnahmen
This project is a continuation of [MU 2686/10-1], with the goal of developing techniques for separating and restoring sound events as they occur in complex music recordings. In the first phase ([MU 2686/10-1]), we focused on separating drum recordings into individual drum sound components. Using non-negative matrix decomposition techniques, we systematically studied how audio- and score-based side information can be generated, integrated, and exploited to guide the decomposition. Our methods were tested in the context of concrete application scenarios such as audio remixing (redrumming) and swing analysis of jazz music. In the second project phase ([MU 2686/10-2]), we substantially extend our goals. First, we go beyond the drum scenario by considering other complex music scenarios, including piano music (e.g., Beethoven Sonatas, Chopin Mazurkas), piano songs (e.g., Klavierlieder by Schubert), and string music (e.g., Beethoven String Quartets). In these scenarios, our goal is to decompose a music recording into individual note-related sound events. As our central methodology, we combine classical techniques from signal processing and machine learning with recent deep learning (DL) approaches. Furthermore, we develop generative DL-based methods to improve the perceptual quality of the separated sound events. As an overarching goal, we address the question of how musical prior knowledge can be integrated into DL-based learning procedures to improve the interpretability of the trained models.
The following list provides an overview of the most important publicly accessible resources created in the SeReCo2 project:
Notewise Evaluation for Music Source Separation: A Case Study for Separated Piano Tracks (ISMIR 2024)
Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques (TASLP 2024)
libsoni: Python Toolbox for Sonifying Music Annotations and Feature Representations (JOSS 2024)
Piano Concerto Dataset (PCD): A Multitrack Dataset of Piano Concertos (TISMIR 2023)
Source Separation of Piano Concertos with Test-Time Adaptation (ISMIR 2022)
Sync Toolbox: Python package with reference implementations for efficient, robust, and accurate music synchronization based on dynamic time warping (JOSS 2021)
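The Sync Toolbox listed above builds on dynamic time warping (DTW) for music synchronization. Purely to illustrate the underlying principle (this is a toy sketch, not the toolbox's API), a minimal classical DTW over feature sequences, e.g. chroma-like features, with a cosine distance and steps (1,1), (1,0), (0,1) could look as follows:

```python
import numpy as np

def dtw(X, Y):
    """Align two feature sequences X (d x N) and Y (d x M) via classical DTW.
    Returns the accumulated cost matrix and an optimal warping path."""
    # pairwise cosine distance between normalized feature columns
    Xn = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-9)
    Yn = Y / (np.linalg.norm(Y, axis=0, keepdims=True) + 1e-9)
    C = 1.0 - Xn.T @ Yn                      # (N x M) local cost matrix
    N, M = C.shape
    D = np.full((N + 1, M + 1), np.inf)      # accumulated cost, padded border
    D[0, 0] = 0.0
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            D[n, m] = C[n - 1, m - 1] + min(D[n - 1, m - 1], D[n - 1, m], D[n, m - 1])
    # backtrack from the end to recover the optimal warping path
    path = [(N - 1, M - 1)]
    n, m = N, M
    while (n, m) != (1, 1):
        steps = [(n - 1, m - 1), (n - 1, m), (n, m - 1)]
        n, m = min(steps, key=lambda s: D[s])
        path.append((n - 1, m - 1))
    return D[1:, 1:], path[::-1]
```

The Sync Toolbox itself uses a multi-resolution, memory-restricted variant (MrMsDTW) to make this alignment efficient and accurate on full-length recordings; the quadratic-time sketch above only conveys the basic recursion.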
The following publications reflect the main scientific contributions of the work carried out in the SeReCo2 project.
@inproceedings{OezerBASM24_NotewiseEvalPiano_ISMIR, author = {Yigitcan {\"O}zer and Hans-Ulrich Berendes and Vlora Arifi-M{\"u}ller and Fabian{-}Robert St{\"o}ter and Meinard M{\"u}ller}, title = {Notewise Evaluation for Music Source Separation: A Case Study for Separated Piano Tracks}, booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR}) (to appear)}, address = {San Francisco, USA}, year = {2024}, url-demo = {https://www.audiolabs-erlangen.de/resources/MIR/2024-ISMIR-PianoSepEval}, }
@article{OezerBSM24_SonificationToolbox_JOSS, author = {Yigitcan {\"O}zer and Leo Br{\"u}tting and Simon Schw{\"a}r and Meinard M{\"u}ller}, title = {libsoni: {A} {P}ython Toolbox for Sonifying Music Annotations and Feature Representations}, journal = {Journal of Open Source Software ({JOSS})}, volume = {9}, number = {96}, year = {2024}, pages = {1--6}, doi = {10.21105/joss.06524}, url-demo = {https://github.com/groupmm/libsoni}, url-pdf = {2024_OezerBSM_SonificationToolbox_JOSS_ePrint.pdf} }
@article{OezerM24_PianoSourceSep_TASLP, author = {Yigitcan {\"O}zer and Meinard M{\"u}ller}, title = {Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques}, journal = {{IEEE}/{ACM} Transactions on Audio, Speech, and Language Processing}, volume = {32}, pages = {1214--1225}, year = {2024}, doi = {10.1109/TASLP.2024.3356980}, url-demo = {https://audiolabs-erlangen.de/resources/MIR/PCD}, url-pdf = {2024_OezerM_PCSeparation_TASLP_ePrint.pdf} }
@article{OezerSAJEM_PCD_TISMIR, author = {Yigitcan {\"O}zer and Simon Schw{\"a}r and Vlora Arifi-M{\"u}ller and Jeremy Lawrence and Emre Sen and Meinard M{\"u}ller}, title = {Piano Concerto Dataset ({PCD}): A Multitrack Dataset of Piano Concertos}, journal = {Transactions of the International Society for Music Information Retrieval ({TISMIR})}, volume = {6}, number = {1}, pages = {75--88}, year = {2023}, doi = {10.5334/tismir.160}, url-details = {https://transactions.ismir.net/articles/10.5334/tismir.160}, url-pdf = {2023_OezerSALSM_PianoConcertoDataset_TISMIR_ePrint.pdf}, url-demo = {https://audiolabs-erlangen.de/resources/MIR/PCD} }
@inproceedings{TamerOMS23_ViolinTranscription_ISMIR, author = {Nazif Can Tamer and Yigitcan {\"O}zer and Meinard M{\"u}ller and Xavier Serra}, title = {High-Resolution Violin Transcription Using Weak Labels}, booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})}, address = {Milano, Italy}, year = {2023}, pages = {223--230}, doi = {10.5281/ZENODO.10265263}, url-details = {https://doi.org/10.5281/zenodo.10265263}, url-pdf = {2023_TamerOMS_ViolinTranscription_ISMIR_ePrint.pdf} }
@inproceedings{OezerM23_PianoTracks_DAGA, author = {Yigitcan {\"O}zer and Meinard M{\"u}ller}, title = {A Computational Approach for Creating Orchestra Tracks from Piano Concerto Recordings}, booktitle = {Proceedings of the {D}eutsche {J}ahrestagung f{\"u}r {A}kustik ({DAGA})}, address = {Hamburg, Germany}, year = {2023}, pages = {1370--1373}, url-pdf = {2023_OezerM_PCPipeline_DAGA_ePrint.pdf} }
@inproceedings{TamerSOM23_TAPE_ICASSP, author = {Nazif Can Tamer and Xavier Serra and Yigitcan {\"O}zer and Meinard M{\"u}ller}, title = {{TAPE}: {A}n End-to-End Timbre-Aware Pitch Estimator}, booktitle = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})}, address = {Rhodes Island, Greece}, year = {2023}, pages = {1--5}, doi = {10.1109/ICASSP49357.2023.10096762} }
@article{MuellerBN22_MusicDL_DagstuhlReport, author = {Meinard M{\"u}ller and Rachel Bittner and Juhan Nam and Michael Krause and Yigitcan {\"O}zer}, title = {Deep Learning and Knowledge Integration for Music Audio Analysis ({D}agstuhl {S}eminar 22082)}, pages = {103--133}, journal = {Dagstuhl Reports}, ISSN = {2192-5283}, year = {2022}, volume = {12}, number = {2}, editor = {Meinard M{\"u}ller and Rachel Bittner and Juhan Nam}, publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik}, address = {Dagstuhl, Germany}, URL = {https://drops.dagstuhl.de/opus/volltexte/2022/16933}, doi = {10.4230/DagRep.12.2.103}, url-pdf = {2022_MuellerBN_DagRep22082_ePrint.pdf}, url-details={https://www.dagstuhl.de/22082} }
@inproceedings{OezerM22_PianoSepAdapt_ISMIR, author = {Yigitcan {\"O}zer and Meinard M{\"u}ller}, title = {Source Separation of Piano Concertos with Test-Time Adaptation}, booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})}, address = {Bengaluru, India}, year = {2022}, pages = {493--500}, url-demo = {https://www.audiolabs-erlangen.de/resources/MIR/2022-PianoSep/}, url-pdf = {2022_OezerM_PianoSepAdapt_ISMIR_ePrint.pdf} }
@inproceedings{OezerKM21_SyncToolbox_ISMIR-LBD, author = {Yigitcan {\"O}zer and Michael Krause and Meinard M{\"u}ller}, title = {Using the Sync Toolbox for an Experiment on High-Resolution Music Alignment}, booktitle = {Demos and Late Breaking News of the International Society for Music Information Retrieval Conference ({ISMIR})}, address = {Online}, year = {2021}, url-pdf = {2021_OezerKM_SyncToolbox_ISMIR-LBD.pdf} }
@inproceedings{OezerIAM22_ActivationMusicSync_ISMIR, author = {Yigitcan {\"O}zer and Matej Istvanek and Vlora Arifi-M{\"u}ller and Meinard M{\"u}ller}, title = {Using Activation Functions for Improving Measure-Level Audio Synchronization}, booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})}, address = {Bengaluru, India}, year = {2022}, pages = {749--756}, url-pdf = {2022_OezerIAM_MusicSync_ISMIR_ePrint.pdf} }
@inproceedings{OezerHZM22_NAE_EUSIPCO, author = {Yigitcan {\"O}zer and Jonathan Hansen and Tim Zunner and Meinard M{\"u}ller}, title = {Investigating Nonnegative Autoencoders for Efficient Audio Decomposition}, booktitle = {Proceedings of the European Signal Processing Conference ({EUSIPCO})}, year = {2022}, pages = {254--258}, url-details = {https://ieeexplore.ieee.org/document/9909787} }
@article{MuellerZ21_SyncToolbox_JOSS, author = {Meinard M{\"u}ller and Yigitcan {\"O}zer and Michael Krause and Thomas Pr{\"a}tzlich and Jonathan Driedger}, title = {{S}ync {T}oolbox: {A} {P}ython Package for Efficient, Robust, and Accurate Music Synchronization}, journal = {Journal of Open Source Software ({JOSS})}, volume = {6}, number = {64}, year = {2021}, pages = {3434:1--4}, doi = {10.21105/joss.03434}, url-pdf = {2021_MuellerOKPD_SyncToolbox_JOSS.pdf}, url-demo = {https://github.com/meinardmueller/synctoolbox} }
@phdthesis{Oezer24_Thesis_PhD, author = {Yigitcan {\"O}zer}, title = {Source Separation of Piano Music Recordings}, school = {Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg (FAU)}, address = {Erlangen, Germany}, year = {2024}, url-pdf = {2024_Oezer_PianoSourceSeparation_ThesisPhD.pdf}, url-details = {https://open.fau.de/handle/openfau/31319} }