Peter Meier, Meinard Müller, Stefan Balke
This website is related to the following publication:
@inproceedings{MeierMB25_PitchCrosstalkChoraleBricks_SMC,
  author    = {Peter Meier and Meinard M{\"u}ller and Stefan Balke},
  title     = {Analyzing Pitch Estimation Accuracy in Cross-Talk Scenarios: A Study with Wind Instruments},
  booktitle = {Proceedings of the Sound and Music Computing Conference ({SMC})},
  address   = {Graz, Austria},
  pages     = {},
  year      = {Accepted, 2025}
}
Intonation accuracy is crucial for wind instrument ensembles, where pitch deviations affect harmonic coherence. Music Information Retrieval (MIR) techniques, particularly pitch estimation, offer potential for real-time intonation monitoring. However, in natural ensemble settings, microphone cross-talk can compromise pitch accuracy. In this article, we systematically investigate the impact of cross-talk on pitch estimation for wind instruments using the ChoraleBricks [1] dataset, which contains multi-track recordings of isolated chorale performances. By simulating cross-talk scenarios with Gaussian noise as well as single- and multi-instrument interference, we assess the robustness of lightweight, real-time capable estimators like YIN and SWIPE against more advanced methods like PYIN and CREPE. Our results show that pitch estimation accuracy declines significantly below an SNR threshold of 15 dB. To address this, we identify instrument-specific challenges and propose frequency filtering to mitigate cross-talk interference. These findings inform the development of robust, real-time intonation monitoring systems for wind ensembles, with applications in music education, performance analysis, and rehearsal optimization.
In the first experiment, we add Gaussian noise at various SNR levels to all tracks in the ChoraleBricks [1] dataset. For this specific audio example, we use a clarinet as the target instrument for the chorale "Auf, auf, mein Herz, mit Freuden" by Crüger.
(Note that an SNR value of -10 dB indicates that the interfering noise signal is 10 dB above the level of the target instrument.)
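The SNR convention in this note can be made concrete with a minimal sketch. It assumes white Gaussian noise that is scaled against the mean power of the whole target track. The function names and the synthetic sine tone standing in for the clarinet are hypothetical and not taken from the paper, and only the Python standard library is used so that the example stays self-contained.

```python
import math
import random


def power(signal):
    """Mean power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def add_gaussian_noise(target, snr_db, seed=0):
    """Return (mix, scaled_noise) with white Gaussian noise at the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in target]
    # Scale the noise so that 10 * log10(P_target / P_noise) equals snr_db.
    gain = math.sqrt(power(target) / (power(noise) * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in noise]
    mix = [t + n for t, n in zip(target, scaled)]
    return mix, scaled


# Hypothetical stand-in for a clarinet track: one second of a sine tone.
fs = 22050
target = [0.5 * math.sin(2 * math.pi * 354.0 * n / fs) for n in range(fs)]

mix, noise = add_gaussian_noise(target, snr_db=-10)
measured = 10 * math.log10(power(target) / power(noise))
print(f"measured SNR: {measured:.2f} dB")
```

With `snr_db=-10`, the noise carries ten times the power of the target, which matches the reading of negative SNR values given above.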
In our second experiment, we incorporated an additional instrument to interfere with the target. In this particular audio example, we selected a clarinet as the target instrument and a trumpet as the interfering instrument. Once again, we used the chorale "Auf, auf, mein Herz, mit Freuden" by Crüger for this purpose.
In our final experiment, we transition from single-instrument to multi-instrument interference, simulating typical ensemble performances where instruments are positioned closely together. In this specific audio example, we chose the clarinet as the target instrument while using the trumpet, trombone, and tuba as the interfering instruments. We once again selected the chorale "Auf, auf, mein Herz, mit Freuden" by Crüger.
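The single- and multi-instrument interference scenarios can be sketched with one helper. This is a sketch under stated assumptions, not the procedure used in the paper: the interfering tracks are summed with equal gain, and the sum is then scaled so that the target-to-interference power ratio matches the requested SNR. The function names, the sine tones, and their frequencies are hypothetical placeholders for real recordings.

```python
import math


def power(signal):
    """Mean power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_with_interferers(target, interferers, snr_db):
    """Return (mix, scaled_interference) for one or more interfering tracks."""
    # Assumption: all interferers are summed with equal gain before scaling.
    interference = [sum(samples) for samples in zip(*interferers)]
    gain = math.sqrt(power(target) / (power(interference) * 10 ** (snr_db / 10)))
    scaled = [gain * x for x in interference]
    mix = [t + i for t, i in zip(target, scaled)]
    return mix, scaled


def sine(freq_hz, fs, n_samples, amp=0.5):
    """Synthetic test tone used in place of a real instrument track."""
    return [amp * math.sin(2 * math.pi * freq_hz * n / fs) for n in range(n_samples)]


fs = 22050
target = sine(354.0, fs, fs)  # hypothetical clarinet stand-in

# Single-instrument interference (e.g. one trumpet track).
mix_single, interf_single = mix_with_interferers(target, [sine(233.0, fs, fs)], snr_db=5)

# Multi-instrument interference (e.g. trumpet, trombone, and tuba tracks).
others = [sine(f, fs, fs) for f in (233.0, 117.0, 58.0)]
mix_multi, interf_multi = mix_with_interferers(target, others, snr_db=5)

snr_single = 10 * math.log10(power(target) / power(interf_single))
snr_multi = 10 * math.log10(power(target) / power(interf_multi))
print(f"single: {snr_single:.2f} dB, multi: {snr_multi:.2f} dB")
```

Scaling the summed interference rather than each track separately keeps the SNR definition identical across both scenarios, so results for one and for several interferers remain comparable.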
Pitch estimators such as SWIPE are typically designed to identify the lowest pitch in a signal. However, this task becomes challenging when target instruments are mixed with interfering instruments that play significantly lower notes. A straightforward solution to this problem is to apply a high-pass filter to the mix, which favors the target instrument.
For each mix, we apply a high-pass filter with a cut-off frequency set at the median frequency of the target instrument and a slope of 24 dB per octave. For instance, in our audio example where the clarinet is the target instrument, it has a median frequency of 354 Hz (as shown in Figure 2 of the SMC paper). Consequently, all subsequent mixes are filtered with this cut-off frequency.
(Note that the high-pass filter reduces the energy of lower-frequency notes from interfering instruments, making the target instrument easier to hear.)
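A slope of 24 dB per octave corresponds to a fourth-order filter, since each filter order contributes about 6 dB per octave. The text does not name the filter type, so the following sketch assumes a Butterworth response, built from two cascaded second-order sections with the standard bilinear-transform design. All function names are hypothetical, and only the standard library is used to keep the example self-contained.

```python
import cmath
import math


def highpass_biquad(cutoff_hz, fs, q):
    """Normalized coefficients (b0, b1, b2, a1, a2) of a second-order high-pass."""
    w0 = 2 * math.pi * cutoff_hz / fs
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b0 = (1 + cos_w0) / 2 / a0
    b1 = -(1 + cos_w0) / a0
    return b0, b1, b0, -2 * cos_w0 / a0, (1 - alpha) / a0


def butterworth_highpass_4(cutoff_hz, fs):
    """Two cascaded sections: fourth order, i.e. 24 dB per octave roll-off."""
    # Q values of the two pole pairs of a fourth-order Butterworth filter.
    qs = [1 / (2 * math.cos(math.pi / 8)), 1 / (2 * math.cos(3 * math.pi / 8))]
    return [highpass_biquad(cutoff_hz, fs, q) for q in qs]


def apply_filter(sections, signal):
    """Run the signal through each section in turn (direct form I)."""
    out = list(signal)
    for b0, b1, b2, a1, a2 in sections:
        x1 = x2 = y1 = y2 = 0.0
        filtered = []
        for x in out:
            y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1 = x1, x
            y2, y1 = y1, y
            filtered.append(y)
        out = filtered
    return out


def gain_db(sections, freq_hz, fs):
    """Magnitude response of the cascade at one frequency, in dB."""
    z_inv = cmath.exp(-1j * 2 * math.pi * freq_hz / fs)
    h = 1.0
    for b0, b1, b2, a1, a2 in sections:
        h *= (b0 + b1 * z_inv + b2 * z_inv ** 2) / (1 + a1 * z_inv + a2 * z_inv ** 2)
    return 20 * math.log10(abs(h))


fs = 22050
sections = butterworth_highpass_4(354.0, fs)  # cut-off at the clarinet's median frequency
for f in (88.5, 177.0, 354.0, 708.0):
    print(f"{f:6.1f} Hz: {gain_db(sections, f, fs):6.1f} dB")
```

With these assumptions, the response is about 3 dB down at the cut-off and roughly 24 dB down one octave below it, so notes from lower instruments such as trombone or tuba are attenuated much more strongly than the target's own range.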
@article{BalkeBM24_ChoraleBricks_TISMIR,
  author  = {Stefan Balke and Axel Berndt and Meinard M{\"u}ller},
  title   = {{ChoraleBricks}: A Modular Multitrack Dataset for Wind Music Research},
  journal = {Transactions of the International Society for Music Information Retrieval ({TISMIR})},
  year    = {2025}
}
The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institute for Integrated Circuits IIS. This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Grant No. 500643750 (MU 2686/15-1).