VAE+NN: Interpolation Composition by Direct Estimation of Encoded Vectors Against Linear Sampling of Latent Space

Pablo L. Diéguez, Von Wun Soo

Research output: Contribution to journal › Journal Article › peer-review

Abstract

In this paper, we introduce a machine learning technique to estimate the vector encoded by a Variational Autoencoder (VAE) model without the need to explicitly sample the vector from the VAE's latent space. The feasibility of our approach is evaluated in the field of music interpolation composition, using the Hsinchu Interpolation MIDI Dataset created for this study. A novel dual architecture consisting of a VAE plus an additional neural network (VAE+NN) is proposed to generate a polyphonic harmonic bridge between two given songs, smoothly varying the pitches and dynamics of the interpolation. In terms of reconstruction MSE loss, the interpolations generated by the VAE+NN model surpass a random-data baseline, a bidirectional LSTM model, and the state-of-the-art interpolation approach in automatic music composition (a VAE model with linear sampling of the latent space). Furthermore, a subjective evaluation was conducted to confirm the validity of the metric-based results.
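
The abstract does not give implementation details, so the following is only a minimal, hypothetical sketch (in PyTorch) of the contrast it describes: linearly sampling between two encoded vectors in the latent space versus letting an additional neural network estimate the interpolation's encoded vector directly. The module names, layer sizes, piano-roll dimensions, and the (z_a, z_b, t) input format below are illustrative assumptions, not the authors' published architecture.

# Hypothetical sketch of the VAE+NN idea: an extra network estimates the
# encoded vector of the bridge segment instead of sampling it linearly
# from the VAE latent space. All sizes and inputs are assumptions.
import torch
import torch.nn as nn

LATENT_DIM = 64  # assumed latent size

class Encoder(nn.Module):
    """Toy VAE encoder: maps a flattened piano-roll segment to (mu, logvar)."""
    def __init__(self, input_dim=128 * 16, latent_dim=LATENT_DIM):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = self.body(x)
        return self.mu(h), self.logvar(h)

class InterpolationNet(nn.Module):
    """The additional NN: given the two endpoint codes and an interpolation
    position t in [0, 1], estimate the encoded vector of the bridge segment."""
    def __init__(self, latent_dim=LATENT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * latent_dim + 1, 256), nn.ReLU(),
            nn.Linear(256, latent_dim),
        )

    def forward(self, z_a, z_b, t):
        return self.net(torch.cat([z_a, z_b, t], dim=-1))

# Baseline compared against in the paper: linear sampling of the latent space.
def linear_interpolation(z_a, z_b, t):
    return (1.0 - t) * z_a + t * z_b

if __name__ == "__main__":
    enc = Encoder()
    interp_nn = InterpolationNet()
    song_a = torch.rand(1, 128 * 16)   # stand-in piano-roll segments
    song_b = torch.rand(1, 128 * 16)
    z_a, _ = enc(song_a)
    z_b, _ = enc(song_b)
    t = torch.tensor([[0.5]])          # midpoint of the bridge
    z_linear = linear_interpolation(z_a, z_b, t)   # VAE baseline estimate
    z_direct = interp_nn(z_a, z_b, t)              # VAE+NN direct estimate
    print(z_linear.shape, z_direct.shape)

Either estimated vector would then be passed to the VAE decoder to produce the interpolated segment; training the interpolation network against ground-truth bridge encodings (as the dataset presumably provides) is what would allow it to be compared to the linear baseline on reconstruction MSE.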

Original language: English
Pages (from-to): 517-529
Number of pages: 13
Journal: Journal of Information Science and Engineering
Volume: 38
Issue number: 3
DOIs
State: Published - May 2022
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2022 Institute of Information Science. All rights reserved.

Keywords

  • VAE
  • composition
  • encoded vector
  • interpolation
  • latent space
  • polyphonic music
  • variational autoencoders
