RAT: Radial Attention Transformer for Singing Technique Recognition

Guan Yuan Chen*, Ya Fen Yeh*, Von Wun Soo*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Singing techniques are important skills in professional vocal performance that usually involve deliberate fluctuations of timbre, pitch, duration, and loudness. Recognizing types of singing techniques can be quite challenging because 1) the time-frequency features of singing are highly dynamic and may span a long range of the audio signal; 2) different singing techniques, such as vibrato and trill, tend to have similar local features; and 3) the distribution of singing technique datasets suffers from the long-tailed issue. To manage these problems, we proposed a novel Radial Attention Transformer (RAT) with a Radial Attention (RA) module that can capture fine-grained local features as well as the long-range inter-dependency of audio features. The experimental results showed that the proposed method, RAT with an Adaptive Logit Adjustment (ALA) loss, significantly outperformed previous state-of-the-art models (Convolutional Neural Networks and Deformable CNN) on the recognition of singing technique categories.
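The abstract does not detail the Adaptive Logit Adjustment (ALA) loss, but the long-tailed issue it targets is commonly handled with logit-adjusted cross-entropy, where each class logit is shifted by the log of that class's prior frequency so rare classes get a larger effective margin. The sketch below shows a standard fixed-temperature version of that idea; the `tau` parameter, `class_priors` argument, and function name are illustrative assumptions, not the paper's actual ALA formulation (which presumably adapts the adjustment).

```python
import numpy as np

def logit_adjusted_loss(logits, labels, class_priors, tau=1.0):
    """Cross-entropy with additive logit adjustment for long-tailed data.

    Adds tau * log(prior_c) to the logit of each class c before the
    softmax, which enlarges the effective margin of rare classes.
    NOTE: this is a generic fixed-tau sketch; the paper's ALA loss
    likely adapts the adjustment and is not reproduced here.
    """
    adjusted = logits + tau * np.log(class_priors)        # shape (N, C)
    # numerically stable log-softmax
    z = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    # mean negative log-likelihood of the true labels
    return -log_probs[np.arange(len(labels)), labels].mean()
```

With `tau=0` this reduces to plain cross-entropy; larger `tau` pushes the decision boundary away from head classes, which is the standard remedy for the long-tailed class distribution the abstract describes.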

Original language: English
Title of host publication: ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781728163277
DOIs
State: Published - 2023
Event: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023 - Rhodes Island, Greece
Duration: 04/06/2023 – 10/06/2023

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2023-June
ISSN (Print): 1520-6149

Conference

Conference: 48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023
Country/Territory: Greece
City: Rhodes Island
Period: 04/06/23 – 10/06/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Keywords

  • local attention
  • logit adjustment loss
  • singing technique
  • sparse global attention
  • transformer
