#AI #GenerativeAI #VoiceCloning #Audiobox: "The researchers published a scientific paper explaining some of their methodology and rationale for taking an SSL approach, writing “Because labeled data are not always available or of high quality, and data scaling is the key to generalization, our strategy is to train this foundation model using audio without any supervision, such as transcripts, captions, or attribute labels, which can be found in larger quantities.”
Of course, most leading generative AI models are heavily dependent on human-generated data for training how to create new content, and Audiobox is no exception. The FAIR researchers relied upon “160K hours of speech (primarily English), 20K hours of music and 6K hours of sound samples.”"