AdaBit-TasNet: Speech Separation with Inference Adaptable Precision

M. Elminshawi, S. Chetupalli, and E. A. P. Habets

Published in the Proc. of the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2025

Abstract

Deploying advanced neural network-based speech separation (SS) models on resource-constrained devices is challenging due to their high computational and memory demands. Conventional network compression techniques, such as pruning and quantization, can alleviate these demands without significantly compromising performance. However, they lack the flexibility to select the compression factor at run-time to suit varying operating conditions, such as changing computational and energy budgets in battery-powered devices. In this paper, we introduce AdaBit-TasNet, an adaptable-precision network (APN) for SS that enables flexible bit-width selection during inference. Experimental evaluation on the Libri2Mix dataset demonstrates that AdaBit-TasNet achieves performance comparable to that of individually trained fixed-precision networks at several bit-widths.
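To make the idea of run-time bit-width selection concrete, the following is a minimal sketch in which one shared set of full-precision weights is quantized on demand to a requested bit-width. It is not the quantizer used in AdaBit-TasNet: the symmetric uniform scheme, the helper names `quantize_uniform` and `mse`, the choice to treat 32 bits as "no quantization", and the random stand-in weights are all assumptions made for illustration. Please refer to the paper for the actual method.

```python
import random


def quantize_uniform(weights, bits):
    """Symmetric uniform quantization of a list of floats to `bits` bits.

    Hypothetical helper for illustration only; not the paper's quantizer.
    """
    if bits >= 32:
        return list(weights)  # assumption: 32 bits means full precision
    levels = 2 ** (bits - 1) - 1  # largest integer code, e.g. 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0.0:
        return list(weights)
    scale = max_abs / levels  # step size between adjacent quantization levels
    # Round to the nearest level, clamp to the code range, map back to floats.
    return [max(-levels, min(levels, round(w / scale))) * scale for w in weights]


def mse(a, b):
    """Mean squared error between two equally long lists of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Random stand-in for one layer's shared weights (not real model weights).
random.seed(0)
shared_weights = [random.gauss(0.0, 1.0) for _ in range(1024)]

# The same weights serve every precision; only the bit-width argument changes.
errors = {}
for bits in (32, 16, 8, 4):
    errors[bits] = mse(shared_weights, quantize_uniform(shared_weights, bits))
    print(f"{bits:2d} bits: mean squared quantization error = {errors[bits]:.3e}")
```

In this sketch the quantization error grows as the bit-width shrinks, which is the trade-off that a run-time precision switch exposes: a device can move to a lower bit-width when its computational or energy budget tightens, at some cost in fidelity.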

Audio Examples

Note: In the examples below, FPN denotes the individually trained fixed-precision networks, and the proposed APN corresponds to the configuration (SCL-W/A). Please refer to the paper for details.

Example #1

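Audio clips (players not reproduced here): the mixture, ground truth s1, and ground truth s2, followed by the separated s1 and s2 outputs of the FPN and the proposed APN at bit-widths of 32, 16, 8, and 4.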


Example #2

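Audio clips (players not reproduced here): the mixture, ground truth s1, and ground truth s2, followed by the separated s1 and s2 outputs of the FPN and the proposed APN at bit-widths of 32, 16, 8, and 4.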


Example #3

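Audio clips (players not reproduced here): the mixture, ground truth s1, and ground truth s2, followed by the separated s1 and s2 outputs of the FPN and the proposed APN at bit-widths of 32, 16, 8, and 4.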