Generalization Challenges for Neural Architectures in Audio Source Separation

Mobin, Shariq; Cheung, Brian; Olshausen, Bruno

arXiv:1803.08629 (cs)

[Submitted on 23 Mar 2018 (v1), last revised 27 May 2018 (this version, v2)]

Abstract: Recent work has shown that recurrent neural networks can be trained to separate individual speakers in a sound mixture with high fidelity. Here we explore convolutional neural network models as an alternative and show that they achieve state-of-the-art results with an order of magnitude fewer parameters. We also characterize and compare how robust these approaches are and how well they generalize under three test conditions: longer time sequences, the addition of intermittent noise, and datasets not seen during training. For the last condition, we create a new dataset, RealTalkLibri, to test source separation in real-world environments. We show that the acoustics of the environment have a significant impact on the structure of the waveform and on the overall performance of neural network models, with the convolutional model showing a superior ability to generalize to new environments. The code for our study is available at this https URL.

Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Signal Processing (eess.SP)
Cite as:	arXiv:1803.08629 [cs.SD]
	(or arXiv:1803.08629v2 [cs.SD] for this version)
	https://doi.org/10.48550/arXiv.1803.08629

Submission history

From: Shariq Mobin
[v1] Fri, 23 Mar 2018 01:26:39 UTC (2,833 KB)
[v2] Sun, 27 May 2018 17:03:09 UTC (1,418 KB)
