have been exploring retrieval-based voice conversion (RVC) today. i would like to train a model on my own voice. it is fun to transform things into other voices (though i am more interested in the timbral adjustment than in just using another voice), but a model of my own voice would be a great tool for generating vocal pad backing tracks, harmonies, or even lead vocals when i'm unable to provide them (tts -> tune/time in melodyne -> RVC model of my voice).
@msx i wonder how it'd sound if the model was trained on drum sounds instead of regular speech (or non-human sounds, for that matter)
@Gumball2415 that's the next stop. the inverse (applying voices to non-voice sounds) is usually hilarious, and i'm very curious about what happens if the training set is non-voice and the model is then applied to voice. i know it does some ML extraction of vocal "components", which may make it super interesting