OveNet: A Hyper-Range U-Net for Singing Voice Separation

Chi Sheng Wu, Shiang Lee, Von Wun Soo

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In audio source separation, most deep-learning-based approaches ignore higher-frequency signals due to the lack of an efficient data compression method. We propose a new model, OvertoneNet (OveNet), that adopts two novel concepts, frequency 1x1 convolution layers and complex-spectrogram channels, to handle 44.1 kHz audio signals (Hi-Res audio) containing the full overtone range. The results show that OveNet performs well in both objective and subjective evaluations of interference, using limited training data from SiSEC 2018.
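The two ingredients named in the abstract can be illustrated in isolation. Below is a minimal PyTorch sketch, not the authors' implementation: the STFT sizes, layer widths, and the exact placement of the 1x1 convolution are assumptions made only for illustration.

```python
# Minimal sketch (not the authors' code): complex-spectrogram channels and a
# frequency-wise 1x1 convolution, under assumed shapes and layer sizes.
import torch
import torch.nn as nn


def complex_spectrogram_channels(waveform, n_fft=4096, hop_length=1024):
    """Stack the real and imaginary STFT parts as input channels.

    waveform: (batch, samples) mono audio, e.g. sampled at 44.1 kHz.
    Returns: (batch, 2, freq_bins, frames) tensor.
    """
    window = torch.hann_window(n_fft, device=waveform.device)
    spec = torch.stft(waveform, n_fft=n_fft, hop_length=hop_length,
                      window=window, return_complex=True)
    return torch.stack([spec.real, spec.imag], dim=1)


class Frequency1x1Conv(nn.Module):
    """A 1x1 convolution that mixes information across frequency bins.

    Here the frequency axis is moved into the channel dimension so a 1x1
    kernel acts as a learned linear map over all frequency bins at each
    time frame -- one plausible reading of "frequency 1x1 convolution
    layers"; the paper's exact formulation may differ.
    """

    def __init__(self, freq_bins, out_bins):
        super().__init__()
        self.conv = nn.Conv1d(freq_bins, out_bins, kernel_size=1)

    def forward(self, x):
        # x: (batch, channels, freq_bins, frames)
        b, c, f, t = x.shape
        x = x.reshape(b * c, f, t)    # treat frequency bins as channels
        x = self.conv(x)              # 1x1 conv mixes across frequency
        return x.reshape(b, c, -1, t)


if __name__ == "__main__":
    wav = torch.randn(1, 44100)                   # one second at 44.1 kHz
    spec = complex_spectrogram_channels(wav)      # (1, 2, 2049, frames)
    layer = Frequency1x1Conv(freq_bins=spec.shape[2], out_bins=1024)
    print(layer(spec).shape)                      # (1, 2, 1024, frames)
```

Keeping real and imaginary parts as separate channels preserves phase information that a magnitude-only spectrogram would discard, which is what allows the model to work on the full 44.1 kHz band rather than a downsampled one.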

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Symposium on Multimedia, ISM 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 148-151
Number of pages: 4
ISBN (Electronic): 9781728156064
DOIs
State: Published - 12 2019
Externally published: Yes
Event: 21st IEEE International Symposium on Multimedia, ISM 2019 - San Diego, United States
Duration: 09 12 2019 - 11 12 2019

Publication series

Name: Proceedings - 2019 IEEE International Symposium on Multimedia, ISM 2019

Conference

Conference: 21st IEEE International Symposium on Multimedia, ISM 2019
Country/Territory: United States
City: San Diego
Period: 09/12/19 - 11/12/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

Keywords

  • U-Net
  • audio source separation
  • convolutional neural networks
