Abstract
In this paper, we introduce a machine learning technique that estimates the vector encoded by a Variational Autoencoder (VAE) without explicitly sampling it from the VAE's latent space. The feasibility of the approach is evaluated on music interpolation composition, using the Hsinchu Interpolation MIDI Dataset created for this work. We propose a novel dual architecture, a VAE coupled with an additional neural network (VAE+NN), that generates a polyphonic harmonic bridge between two given songs, smoothly varying the pitches and dynamics of the interpolation. In terms of reconstruction MSE loss, the interpolations generated by the VAE+NN model surpass a random-data baseline, a bidirectional LSTM model, and the state-of-the-art interpolation approach in automatic music composition (a VAE with linear sampling of the latent space). A subjective evaluation was also conducted to corroborate the metric-based results.
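The core contrast in the abstract — linearly sampling the latent space (the baseline) versus an additional network that directly estimates the bridge latent — can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the single-layer NN, the latent dimensionality, and the hand-set weights are all assumptions chosen for clarity; in the paper the NN's parameters would be learned by minimising reconstruction MSE.

```python
# Hedged sketch of the VAE+NN idea: instead of sampling a latent vector
# between two songs, a small network predicts it from the two endpoint
# latents. Everything below is illustrative, not the published model.

def linear_interpolation(z_a, z_b, alpha=0.5):
    """Baseline: linear sampling between two latent vectors (plain VAE)."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]

def nn_estimate(z_a, z_b, W, b):
    """Hypothetical single-layer network mapping the concatenated endpoint
    latents to an estimated bridge latent, replacing explicit sampling."""
    x = z_a + z_b  # concatenate the two latent vectors
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_j
            for row, b_j in zip(W, b)]

# Toy 2-D latents standing in for the VAE encodings of two songs.
z_a, z_b = [0.0, 1.0], [1.0, 0.0]

# Linear sampling of the latent space gives the midpoint.
z_lin = linear_interpolation(z_a, z_b)

# With averaging weights the toy NN reproduces the midpoint; a trained
# NN would instead learn weights that minimise reconstruction MSE.
W = [[0.5, 0.0, 0.5, 0.0],
     [0.0, 0.5, 0.0, 0.5]]
b = [0.0, 0.0]
z_nn = nn_estimate(z_a, z_b, W, b)
```

In the full pipeline, the estimated latent would then be passed through the VAE decoder to produce the polyphonic harmonic bridge between the two songs.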
Original language | English |
---|---|
Pages (from-to) | 517-529 |
Number of pages | 13 |
Journal | Journal of Information Science and Engineering |
Volume | 38 |
Issue number | 3 |
DOIs | |
State | Published - May 2022 |
Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2022 Institute of Information Science. All rights reserved.
Keywords
- VAE
- composition
- encoded vector
- interpolation
- latent space
- polyphonic music
- variational autoencoders