Source Separation and Restoration of Sound Components in Music Recordings (SeReCo)


The goal of the SeReCo2 project is to develop matrix decomposition and source separation techniques for decomposing a music recording into musically meaningful (e.g., note-related) sound components. The project is funded by the German Research Foundation (DFG). On this website, we summarize the project's main objectives and provide links to project-related resources (data, demonstrators, websites) and publications.

Project Description


Source Separation and Restoration of Sound Components in Music Recordings (SeReCo)

This follow-up project continues the previous DFG-funded project "Source Separation and Restoration of Drum Sound Components in Music Recordings" [MU 2686/10-1], which aimed at developing techniques for separating and restoring sound events as they occur in complex music recordings. In the first phase ([MU 2686/10-1]), we focused on percussive sound sources and decomposed drum recordings into individual drum sound events. Using Non-Negative Matrix Factor Deconvolution (NMFD) as our central methodology, we studied how to generate and integrate audio- and score-based side information to guide the decomposition. We tested our approaches within concrete application scenarios, including audio remixing (redrumming) and swing ratio analysis of jazz music.

In the second phase of the project ([MU 2686/10-2]), we significantly extend our goals. First, we go beyond the drum scenario by considering other challenging music scenarios, including piano music (e.g., Beethoven Sonatas, Chopin Mazurkas), piano songs (e.g., Klavierlieder by Schubert), and string music (e.g., Beethoven String Quartets). In these scenarios, our goal is to decompose a music recording into individual note-related sound events. As our central methodology, we develop a unifying audio decomposition framework that combines classical signal processing and machine learning with recent deep learning (DL) approaches. Furthermore, we adopt generative DL techniques for improving the perceptual quality of restored sound events. As a general goal, we investigate how prior knowledge, such as score information, can be integrated into DL-based learning to improve the interpretability of the trained models.
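
To make the phase-one methodology a bit more concrete, the following sketch shows plain non-negative matrix factorization (NMF), the non-convolutive special case of NMFD, applied to a magnitude spectrogram using multiplicative updates. This is only an illustrative sketch, not the project's implementation; the function name, the toy input, and the way score information is injected (zero-initializing activations, which multiplicative updates preserve) are assumptions made for this example.

import numpy as np

def nmf_decompose(V, num_components, num_iterations=200, H_init=None, seed=0, eps=1e-9):
    # Factor a non-negative magnitude spectrogram V (frequency x time) into
    # spectral templates W and time-varying activations H such that V ~ W @ H,
    # using multiplicative updates for the generalized Kullback-Leibler divergence.
    rng = np.random.default_rng(seed)
    num_bins, num_frames = V.shape
    W = rng.random((num_bins, num_components)) + eps
    H = H_init.astype(float) if H_init is not None else rng.random((num_components, num_frames)) + eps
    ones = np.ones_like(V)
    for _ in range(num_iterations):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)   # update activations
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (ones @ H.T + eps)   # update templates
    return W, H

# Toy usage: decompose a random non-negative "spectrogram" into three components.
# In a score-informed setting, H_init would be set to zero wherever the score
# indicates that a note (or drum) is inactive; since multiplicative updates
# preserve zeros, the score then constrains the decomposition.
V = np.random.default_rng(1).random((1025, 400))
W, H = nmf_decompose(V, num_components=3)
print(W.shape, H.shape)  # (1025, 3) (3, 400)

NMFD, as used in the first project phase, extends this scheme by replacing each single-column template with a short sequence of spectral frames, which is essential for capturing the characteristic attack-decay structure of drum sounds.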


Project-Related Resources and Demonstrators

The following list provides an overview of the most important publicly accessible resources created in the SeReCo2 project:

Project-Related Publications

The following publications reflect the main scientific contributions of the work carried out in the SeReCo2 project.

  1. Yigitcan Özer, Hans-Ulrich Berendes, Vlora Arifi-Müller, Fabian-Robert Stöter, and Meinard Müller
    Notewise Evaluation for Music Source Separation: A Case Study for Separated Piano Tracks
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR) (to appear), 2024. Demo
    @inproceedings{OezerBASM24_NotewiseEvalPiano_ISMIR,
    author    = {Yigitcan {\"O}zer and Hans-Ulrich Berendes and Vlora Arifi-M{\"u}ller and Fabian{-}Robert St{\"o}ter and Meinard M{\"u}ller},
    title     = {Notewise Evaluation for Music Source Separation: A Case Study for Separated Piano Tracks},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR}) (to appear)},
    address   = {San Francisco, USA},
    year      = {2024},
    url-demo  = {https://www.audiolabs-erlangen.de/resources/MIR/2024-ISMIR-PianoSepEval},
    }
  2. Yigitcan Özer, Leo Brütting, Simon Schwär, and Meinard Müller
    libsoni: A Python Toolbox for Sonifying Music Annotations and Feature Representations
    Journal of Open Source Software (JOSS), 9(96): 1–6, 2024. PDF Demo DOI
    @article{OezerBSM24_SonificationToolbox_JOSS,
    author    = {Yigitcan {\"O}zer and Leo Br{\"u}tting and Simon Schw{\"a}r and Meinard M{\"u}ller},
    title     = {libsoni: {A} {P}ython Toolbox for Sonifying Music Annotations and Feature Representations},
    journal   = {Journal of Open Source Software ({JOSS})},
    volume    = {9},
    number    = {96},
    year      = {2024},
    pages     = {1--6},
    doi       = {10.21105/joss.06524},
    url-demo  = {https://github.com/groupmm/libsoni},
    url-pdf   = {2024_OezerBSM_SonificationToolbox_JOSS_ePrint.pdf}
    }
  3. Yigitcan Özer and Meinard Müller
    Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques
    IEEE/ACM Transactions on Audio, Speech, and Language Processing, 32: 1214–1225, 2024. PDF Details Demo DOI
    @article{OezerM24_PianoSourceSep_TASLP,
    author      = {Yigitcan {\"O}zer  and Meinard M{\"u}ller},
    title       = {Source Separation of Piano Concertos Using Musically Motivated Augmentation Techniques},
    journal     = {{IEEE}/{ACM} Transactions on Audio, Speech, and Language Processing},
    volume      = {32},
    pages       = {1214--1225},
    year        = {2024},
    doi         = {10.1109/TASLP.2024.3356980},
    url-details = {},
    url-demo = {https://audiolabs-erlangen.de/resources/MIR/PCD},
    url-pdf = {2024_OezerM_PCSeparation_TASLP_ePrint.pdf}
    }
  4. Yigitcan Özer, Simon Schwär, Vlora Arifi-Müller, Jeremy Lawrence, Emre Sen, and Meinard Müller
    Piano Concerto Dataset (PCD): A Multitrack Dataset of Piano Concertos
    Transactions of the International Society for Music Information Retrieval (TISMIR), 6(1): 75–88, 2023. PDF Details Demo DOI
    @article{OezerSAJEM_PCD_TISMIR,
    author = {Yigitcan {\"O}zer and Simon Schw{\"a}r and Vlora Arifi-M{\"u}ller and Jeremy Lawrence and Emre Sen and Meinard M{\"u}ller},
    title = {Piano Concerto Dataset ({PCD}): A Multitrack Dataset of Piano Concertos},
    journal = {Transactions of the International Society for Music Information Retrieval ({TISMIR})},
    volume = {6},
    number = {1},
    pages = {75--88},
    year = {2023},
    doi = {10.5334/tismir.160},
    url-details = {https://transactions.ismir.net/articles/10.5334/tismir.160},
    url-pdf   = {2023_OezerSALSM_PianoConcertoDataset_TISMIR_ePrint.pdf},
    url-demo = {https://audiolabs-erlangen.de/resources/MIR/PCD}
    }
  5. Nazif Can Tamer, Yigitcan Özer, Meinard Müller, and Xavier Serra
    High-Resolution Violin Transcription Using Weak Labels
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 223–230, 2023. PDF Details DOI
    @inproceedings{TamerOMS23_ViolinTranscription_ISMIR,
    author    = {Nazif Can Tamer and Yigitcan {\"O}zer and Meinard M{\"u}ller and Xavier Serra},
    title     = {High-Resolution Violin Transcription Using Weak Labels},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address   = {Milano, Italy},
    year      = {2023},
    pages     = {223--230},
    doi       = {10.5281/ZENODO.10265263},
    url-details = {https://doi.org/10.5281/zenodo.10265263},
    url-pdf   = {2023_TamerOMS_ViolinTranscription_ISMIR_ePrint.pdf}
    }
  6. Yigitcan Özer and Meinard Müller
    A Computational Approach for Creating Orchestra Tracks from Piano Concerto Recordings
    In Proceedings of the Deutsche Jahrestagung für Akustik (DAGA): 1370–1373, 2023. PDF
    @inproceedings{OezerM23_PianoTracks_DAGA,
    author    = {Yigitcan {\"O}zer and Meinard M{\"u}ller},
    title     = {A Computational Approach for Creating Orchestra Tracks from Piano Concerto Recordings},
    booktitle = {Proceedings of the {D}eutsche {J}ahrestagung f{\"u}r {A}kustik ({DAGA})},
    address   = {Hamburg, Germany},
    year      = {2023},
    pages     = {1370--1373},
    url-pdf   = {2023_OezerM_PCPipeline_DAGA_ePrint.pdf}
    }
  7. Nazif Can Tamer, Xavier Serra, Yigitcan Özer, and Meinard Müller
    TAPE: An End-to-End Timbre-Aware Pitch Estimator
    In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP): 1–5, 2023. DOI
    @inproceedings{TamerSOM23_TAPE_ICASSP,
    author    = {Nazif Can Tamer and Xavier Serra and Yigitcan {\"O}zer and Meinard M{\"u}ller},
    title     = {{TAPE}: {A}n End-to-End Timbre-Aware Pitch Estimator},
    booktitle = {Proceedings of the {IEEE} International Conference on Acoustics, Speech, and Signal Processing ({ICASSP})},
    address   = {Rhodes Island, Greece},
    year      = {2023},
    pages     = {1--5},
    doi       = {10.1109/ICASSP49357.2023.10096762}
    }
  8. Meinard Müller, Rachel Bittner, Juhan Nam, Michael Krause, and Yigitcan Özer
    Deep Learning and Knowledge Integration for Music Audio Analysis (Dagstuhl Seminar 22082)
    Dagstuhl Reports, 12(2): 103–133, 2022. PDF Details DOI
    @article{MuellerBN22_MusicDL_DagstuhlReport,
    author =    {Meinard M{\"u}ller and Rachel Bittner and Juhan Nam and Michael Krause and Yigitcan {\"O}zer},
    title = {Deep Learning and Knowledge Integration for Music Audio Analysis ({D}agstuhl {S}eminar 22082)},
    pages = {103--133},
    journal =   {Dagstuhl Reports},
    ISSN =  {2192-5283},
    year =  {2022},
    volume =    {12},
    number =    {2},
    editor =    {Meinard M{\"u}ller and Rachel Bittner and Juhan Nam},
    publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
    address =   {Dagstuhl, Germany},
    URL =       {https://drops.dagstuhl.de/opus/volltexte/2022/16933},
    doi =       {10.4230/DagRep.12.2.103},
    url-pdf   = {2022_MuellerBN_DagRep22082_ePrint.pdf},
    url-details={https://www.dagstuhl.de/22082}
    }
  9. Yigitcan Özer and Meinard Müller
    Source Separation of Piano Concertos with Test-Time Adaptation
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 493–500, 2022. PDF Demo DOI
    @inproceedings{OezerM22_PianoSepAdapt_ISMIR,
    author    = {Yigitcan {\"O}zer and Meinard M{\"u}ller},
    title     = {Source Separation of Piano Concertos with Test-Time Adaptation},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address   = {Bengaluru, India},
    year      = {2022},
    pages     = {493--500},
    doi       = {},
    url-demo = {https://www.audiolabs-erlangen.de/resources/MIR/2022-PianoSep/},
    url-pdf   = {2022_OezerM_PianoSepAdapt_ISMIR_ePrint.pdf}
    }
  10. Yigitcan Özer, Michael Krause, and Meinard Müller
    Using the Sync Toolbox for an Experiment on High-Resolution Music Alignment
    In Demos and Late Breaking News of the International Society for Music Information Retrieval Conference (ISMIR), 2021. PDF
    @inproceedings{OezerKM21_SyncToolbox_ISMIR-LBD,
    author      = {Yigitcan {\"O}zer and Michael Krause and Meinard M{\"u}ller},
    title       = {Using the Sync Toolbox for an Experiment on High-Resolution Music Alignment},
    booktitle   = {Demos and Late Breaking News of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address     = {Online},
    year        = {2021},
    url-pdf     = {2021_OezerKM_SyncToolbox_ISMIR-LBD.pdf}
    }
  11. Yigitcan Özer, Matej Istvanek, Vlora Arifi-Müller, and Meinard Müller
    Using Activation Functions for Improving Measure-Level Audio Synchronization
    In Proceedings of the International Society for Music Information Retrieval Conference (ISMIR): 749–756, 2022. PDF DOI
    @inproceedings{OezerIAM22_ActivationMusicSync_ISMIR,
    author    = {Yigitcan {\"O}zer and Matej Istvanek and Vlora Arifi-M{\"u}ller and Meinard M{\"u}ller},
    title     = {Using Activation Functions for Improving Measure-Level Audio Synchronization},
    booktitle = {Proceedings of the International Society for Music Information Retrieval Conference ({ISMIR})},
    address   = {Bengaluru, India},
    year      = {2022},
    pages     = {749--756},
    doi       = {},
    url-pdf   = {2022_OezerIAM_MusicSync_ISMIR_ePrint.pdf}
    }
  12. Yigitcan Özer, Jonathan Hansen, Tim Zunner, and Meinard Müller
    Investigating Nonnegative Autoencoders for Efficient Audio Decomposition
    In Proceedings of the European Signal Processing Conference (EUSIPCO): 254–258, 2022. Details
    @inproceedings{OezerHZM22_NAE_EUSIPCO,
    author    = {Yigitcan {\"O}zer and Jonathan Hansen and Tim Zunner and Meinard M{\"u}ller},
    title     = {Investigating Nonnegative Autoencoders for Efficient Audio Decomposition},
    booktitle = {Proceedings of the European Signal Processing Conference ({EUSIPCO})},
    year      = {2022},
    pages     = {254--258},
    url-details   = {https://ieeexplore.ieee.org/document/9909787}
    }
  13. Meinard Müller, Yigitcan Özer, Michael Krause, Thomas Prätzlich, and Jonathan Driedger
    Sync Toolbox: A Python Package for Efficient, Robust, and Accurate Music Synchronization
    Journal of Open Source Software (JOSS), 6(64): 1–4, 2021. PDF Demo DOI
    @article{MuellerZ21_SyncToolbox_JOSS,
    author    = {Meinard M{\"u}ller and Yigitcan {\"O}zer and Michael Krause and Thomas Pr{\"a}tzlich and Jonathan Driedger},
    title     = {{S}ync {T}oolbox: {A} {P}ython Package for Efficient, Robust, and Accurate Music Synchronization},
    journal   = {Journal of Open Source Software ({JOSS})},
    volume    = {6},
    number    = {64},
    year      = {2021},
    pages     = {3434:1--4},
    doi         = {10.21105/joss.03434},
    url-pdf   = {2021_MuellerOKPD_SyncToolbox_JOSS.pdf},
    url-demo = {https://github.com/meinardmueller/synctoolbox}
    }

Project-Related Ph.D. Theses

  1. Yigitcan Özer
    Source Separation of Piano Music Recordings
    PhD Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2024. PDF Details
    @phdthesis{Oezer24_Thesis_PhD,
    author    = {Yigitcan {\"O}zer},
    title     = {Source Separation of Piano Music Recordings},
    school    = {Friedrich-Alexander-Universit{\"a}t Erlangen-N{\"u}rnberg (FAU)},
    address   = {Erlangen, Germany},
    year      = {2024},
    url-pdf   = {2024_Oezer_PianoSourceSeparation_ThesisPhD.pdf},
    url-details = {https://open.fau.de/handle/openfau/31319}
    }