LVNS-RAVE: Diversified audio generation with RAVE and Latent Vector Novelty Search

Guo, Jinyue; Christodoulou, Anna-Maria; Laczko, Balint; Glette, Kyrre

doi:10.1145/3638530.3654432

(Submitted on 22 Apr 2024)

Abstract: Evolutionary Algorithms and Generative Deep Learning have been two of the most powerful tools for sound generation tasks. However, they have limitations: Evolutionary Algorithms require complicated designs, posing challenges in control and achieving realistic sound generation. Generative Deep Learning models often copy from the dataset and lack creativity. In this paper, we propose LVNS-RAVE, a method to combine Evolutionary Algorithms and Generative Deep Learning to produce realistic and novel sounds. We use the RAVE model as the sound generator and the VGGish model as a novelty evaluator in the Latent Vector Novelty Search (LVNS) algorithm. The reported experiments show that the method can successfully generate diversified, novel audio samples under different mutation setups using different pre-trained RAVE models. The characteristics of the generation process can be easily controlled with the mutation parameters. The proposed algorithm can be a creative tool for sound artists and musicians.
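The loop the abstract describes (mutate latent vectors, decode them to audio, embed the audio, keep the most novel candidates) can be illustrated with a minimal sketch. This is not the authors' implementation: `decode` and `embed` are hypothetical stand-ins for a pre-trained RAVE decoder and a VGGish embedding, the novelty score is the generic "mean distance to the k nearest neighbours" used in novelty search, and the mutation parameters `rate` and `sigma` are assumed names chosen for illustration.

```python
import math
import random


def decode(latent):
    """Hypothetical stand-in for a RAVE decoder: latent vector -> 'audio' samples."""
    return [math.sin(0.1 * (i + 1) * z) for z in latent for i in range(8)]


def embed(audio):
    """Hypothetical stand-in for a VGGish embedding: 'audio' -> feature vector."""
    n = len(audio)
    mean = sum(audio) / n
    energy = sum(a * a for a in audio) / n
    crossings = sum(1 for a, b in zip(audio, audio[1:]) if a * b < 0) / n
    return [mean, energy, crossings]


def novelty(feature, others, k=3):
    """Mean Euclidean distance from `feature` to its k nearest vectors in `others`."""
    nearest = sorted(math.dist(feature, o) for o in others)[:k]
    return sum(nearest) / len(nearest) if nearest else 0.0


def mutate(latent, rng, rate=0.5, sigma=0.3):
    """Add Gaussian noise to each dimension with probability `rate` (assumed scheme)."""
    return [z + rng.gauss(0.0, sigma) if rng.random() < rate else z for z in latent]


def novelty_search(dim=4, pop_size=8, generations=10, seed=0):
    """Evolve latent vectors, selecting for novelty in the embedding space."""
    rng = random.Random(seed)
    population = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(pop_size)]
    archive = []  # feature vectors of the most novel individual of each generation
    for _ in range(generations):
        children = [mutate(p, rng) for p in population]
        candidates = population + children
        features = [embed(decode(c)) for c in candidates]
        # Score each candidate against all other candidates plus the archive.
        scores = [
            novelty(f, features[:i] + features[i + 1:] + archive)
            for i, f in enumerate(features)
        ]
        ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
        survivors = ranked[:pop_size]
        population = [candidates[i] for i in survivors]
        archive.append(features[survivors[0]])
    return population, archive
```

In this sketch, raising `sigma` or `rate` makes each generation move further through latent space, which is one way mutation parameters can steer how quickly the generated material drifts. With real models, `decode` and `embed` would be replaced by calls into the pre-trained networks.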

Comments:	Accepted to GECCO 24 Companion
Subjects:	Sound (cs.SD); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Audio and Speech Processing (eess.AS)
DOI:	10.1145/3638530.3654432
Cite as:	arXiv:2404.14063 [cs.SD]
	(or arXiv:2404.14063v1 [cs.SD] for this version)

Submission history

From: Jinyue Guo
[v1] Mon, 22 Apr 2024 10:20:41 GMT (5181kb,D)
