This is the accompanying website for the paper "Extending Harmonic-Percussive Separation of Audio Signals" by Jonathan Driedger, Meinard Müller and Sascha Disch.
In recent years, methods to decompose an audio signal into a harmonic and a percussive component have received a lot of interest and are frequently applied as a processing step in a variety of scenarios. One problem is that the computed components are often not purely harmonic or percussive but also contain sounds that are neither clearly harmonic nor clearly percussive. Furthermore, depending on the parameter settings, one can often observe a leakage of harmonic sounds into the percussive component and vice versa. In this paper, we present two extensions to a state-of-the-art harmonic-percussive separation procedure to target these problems. First, we introduce a separation factor parameter into the decomposition process that allows for tightening the separation results and for enforcing the components to be clearly harmonic or percussive. As a second contribution, inspired by the classical sines+transients+noise (STN) audio model, this novel concept is exploited to add a third, residual component to the decomposition, which captures the sounds that lie between the clearly harmonic and the clearly percussive sounds of the audio signal.
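The role of the separation factor and the residual component can be illustrated with a small sketch. It assumes that the underlying separation procedure is median filtering of a magnitude spectrogram along time and along frequency, and that binary masks are formed by comparing the two filtered versions against the separation factor β. The function names, the edge handling and the small constant `eps` are illustrative choices, not taken from this page, and plain Python lists are used only to keep the sketch self-contained.

```python
from statistics import median


def median_filter_1d(values, length):
    """Median-filter a list, shrinking the window at the edges."""
    half = length // 2
    n = len(values)
    return [median(values[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]


def hpr_masks(mag, len_time, len_freq, beta=2.0, eps=1e-12):
    """Binary harmonic, percussive and residual masks for mag[k][m].

    k indexes frequency bins, m indexes time frames. beta is the
    separation factor; beta >= 1 keeps the three masks disjoint.
    """
    if beta < 1:
        raise ValueError("beta must be >= 1")
    n_bins, n_frames = len(mag), len(mag[0])
    # Harmonic-enhanced spectrogram: median along time in each row.
    harm = [median_filter_1d(row, len_time) for row in mag]
    # Percussive-enhanced spectrogram: median along frequency in each column.
    cols = [median_filter_1d([mag[k][m] for k in range(n_bins)], len_freq)
            for m in range(n_frames)]
    perc = [[cols[m][k] for m in range(n_frames)] for k in range(n_bins)]

    # A bin is "clearly harmonic" only if it beats the percussive estimate
    # by more than the factor beta, and vice versa.
    m_h = [[int(harm[k][m] / (perc[k][m] + eps) > beta)
            for m in range(n_frames)] for k in range(n_bins)]
    m_p = [[int(perc[k][m] / (harm[k][m] + eps) >= beta)
            for m in range(n_frames)] for k in range(n_bins)]
    # Everything that is neither clearly harmonic nor clearly percussive.
    m_r = [[1 - m_h[k][m] - m_p[k][m]
            for m in range(n_frames)] for k in range(n_bins)]
    return m_h, m_p, m_r


# Toy input: a sustained tone in bin 2 and a click in frame 3.
toy = [[1.0 if (k == 2 or m == 3) else 0.0 for m in range(7)]
       for k in range(5)]
mask_h, mask_p, mask_r = hpr_masks(toy, len_time=3, len_freq=3, beta=2.0)
```

In this toy case the tone is assigned to the harmonic mask, the click to the percussive mask, and the bin where both overlap falls into the residual. In a full pipeline the masks would be applied to the complex spectrogram and each masked spectrogram inverted to a time-domain signal. Larger values of β move more bins into the residual.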
Example decompositions computed with our proposed iterative procedure.
Item Name | Original | Decomposition |
---|---|---|
CastanetsViolinApplause | | |
Stepdad | | |
Heavy | | |
Bongo | | |
Glockenspiel | | |
Winterreise | | |
The parameters used are:
Nh = 4096, Np = 256, βh = 2, βp = 2, filter length horizontal = 200 ms, filter length vertical = 500 Hz.
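The filter lengths above are given in physical units (ms and Hz), so they have to be converted to a number of frames and frequency bins before filtering a spectrogram. The following sketch shows one way to do this. The sampling rate of 22050 Hz and the hop size of a quarter frame are hypothetical values chosen for illustration; neither is stated on this page. Rounding up to the next odd integer, so that the filter window is centered, is likewise a design choice of this sketch.

```python
def to_odd(n):
    """Round to the nearest integer >= 1 and bump even values to odd."""
    n = max(1, int(round(n)))
    return n if n % 2 == 1 else n + 1


def filter_lengths(time_ms, freq_hz, sample_rate, frame_size, hop_size):
    """Convert filter lengths from (ms, Hz) to (frames, bins)."""
    # One frame advances hop_size samples, i.e. hop_size / sample_rate seconds.
    frames = time_ms / 1000.0 * sample_rate / hop_size
    # Adjacent frequency bins are sample_rate / frame_size Hz apart.
    bins = freq_hz * frame_size / sample_rate
    return to_odd(frames), to_odd(bins)


# Assumed values (not from the page): 22050 Hz audio, hop = frame size / 4.
SAMPLE_RATE = 22050
len_h = filter_lengths(200, 500, SAMPLE_RATE, frame_size=4096, hop_size=1024)
len_p = filter_lengths(200, 500, SAMPLE_RATE, frame_size=256, hop_size=64)
```

The same physical filter lengths correspond to very different window sizes for Nh = 4096 and Np = 256, which is why specifying them in ms and Hz keeps the settings comparable across frame sizes.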
Fixed parameters: β = 1, N = 1024
freq \ time | 50 ms | 100 ms | 200 ms | 500 ms | 1000 ms |
---|---|---|---|---|---|
1000 Hz | |||||
500 Hz | |||||
200 Hz | |||||
100 Hz | |||||
50 Hz |
Fixed parameters: filter length horizontal = 200 ms, filter length vertical = 500 Hz
N \ β | 1.0 | 1.5 | 2.0 | 3.0 | 5.0 | 10.0 |
---|---|---|---|---|---|---|
4096 | ||||||
2048 | ||||||
1024 | ||||||
512 | ||||||
256 | ||||||
128 |
Fixed parameters: Nh = 4096, Np = 256, filter length horizontal = 200 ms, filter length vertical = 500 Hz
βp \ βh | 1.5 | 2.0 | 3.0 | 5.0 | 10.0 |
---|---|---|---|---|---|
10.0 | |||||
5.0 | |||||
3.0 | |||||
2.0 | |||||
1.5 |
Signal | SDR: BL | SDR: HP | SDR: HP-I | SDR: HPR | SDR: HPR-I | SDR: HPR-IO | SIR: BL | SIR: HP | SIR: HP-I | SIR: HPR | SIR: HPR-I | SIR: HPR-IO | SAR: BL | SAR: HP | SAR: HP-I | SAR: HPR | SAR: HPR-I | SAR: HPR-IO |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Violin | -3.10 | -5.85 | 0.08 | 8.23 | 7.65 | 8.85 | -3.10 | -5.09 | 1.08 | 17.69 | 14.58 | 21.65 | 274.25 | 8.33 | 9.44 | 8.82 | 8.78 | 9.11 |
Castanets | -2.93 | 3.58 | 2.86 | 8.29 | 9.14 | 9.28 | -2.93 | 6.06 | 10.45 | 22.34 | 20.66 | 24.41 | 274.25 | 8.14 | 4.07 | 8.49 | 9.50 | 9.44 |
Applause | -3.04 | - | -7.03 | 4.25 | 4.93 | 5.00 | -3.04 | - | 14.69 | 8.41 | 12.80 | 9.04 | 274.25 | - | -6.85 | 6.95 | 5.93 | 7.69 |
Table 1. Objective evaluation measures. All values are given in dB.
For comments and feedback, please contact Jonathan Driedger (jonathan (at) audiolabs-erlangen.de).
page last modified Friday, 11 April 2014 - 09:00