Just ran Demucs completely locally on my system (RX 6700 XT / 16 GB RAM).

Demucs is an open source AI model for music source separation, developed by Meta. It can split a full song into individual stems like vocals, drums, bass, and other instruments, making it useful for remixing, transcription, and audio analysis.

Test track: Fear of the Dark by Iron Maiden
(https://www.youtube.com/watch?v=bePCRKGUwAY)

Setup:

- Demucs installed via pip
- Model: htdemucs (default)
- Input converted to WAV using ffmpeg
- GPU acceleration via ROCm
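
For anyone who wants to reproduce the setup, here is a minimal Python sketch of the two steps (convert, then separate). The helper names, file names, and the "separated" output folder are placeholders of mine, not part of the original run. The "cuda" device string is an assumption based on ROCm builds of PyTorch exposing the GPU under that name.

```python
import subprocess

def ffmpeg_to_wav_cmd(src, dst):
    # -y overwrites an existing output file, -i names the input.
    return ["ffmpeg", "-y", "-i", str(src), str(dst)]

def demucs_cmd(wav, model="htdemucs", device="cuda", out_dir="separated"):
    # -n selects the model, -d the torch device, -o the output folder.
    # ROCm builds of PyTorch address the GPU as "cuda" (assumption for this setup).
    return ["demucs", "-n", model, "-d", device, "-o", str(out_dir), str(wav)]

def run_pipeline(src, wav):
    # Hypothetical helper: convert first, then separate. Raises if a step fails.
    subprocess.run(ffmpeg_to_wav_cmd(src, wav), check=True)
    subprocess.run(demucs_cmd(wav), check=True)
```

Calling run_pipeline("input.webm", "song.wav") would run both steps, provided ffmpeg and demucs are on the PATH.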

Setup is tricky because Demucs is tightly pinned to older PyTorch versions, so a plain pip install can replace your (ROCm) PyTorch build. To avoid that, install the dependencies manually and install Demucs itself with "--no-deps".
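
A quick sanity check after installing with "--no-deps" is to confirm that the ROCm build of PyTorch is still the one being imported. This is a rough sketch, not a full diagnostic. The function name is mine, and it relies on torch.version.hip being set on ROCm builds and None otherwise.

```python
def torch_build_info():
    # Report which PyTorch build is importable, without failing if none is.
    try:
        import torch
    except ImportError:
        return {"installed": False}
    return {
        "installed": True,
        "version": torch.__version__,
        # Set to a version string on ROCm builds, None on CUDA/CPU builds.
        "hip": getattr(torch.version, "hip", None),
        # ROCm GPUs are reported through the torch.cuda API.
        "gpu_available": torch.cuda.is_available(),
    }

print(torch_build_info())
```

If "hip" comes back as None after installing Demucs, the ROCm build was probably overwritten.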

Result:
Very clean vocal separation in most parts. Some artifacts appear during very loud or distorted sections (e.g. emotional peaks or shouting).

Next steps / possibilities:

- Normalize and filter audio before separation
- Extract vocals for transcription or remixing
- Create karaoke / instrumental versions
- Combine with Whisper for lyrics
- Batch processing for datasets
- Model: htdemucs_ft (higher quality)
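
A possible starting point for the karaoke and batch ideas, as a sketch only: the demucs CLI has a "--two-stems=vocals" option that writes a vocals file plus a "no_vocals" file, and it accepts several tracks in one call. The function names and the "separated" folder are placeholders of mine, and the output layout (output folder / model / track name) is what I expect from the CLI defaults rather than something verified in this run.

```python
from pathlib import Path

def karaoke_cmd(tracks, model="htdemucs_ft", out_dir="separated"):
    # One demucs call for many tracks; --two-stems=vocals yields
    # vocals.wav and no_vocals.wav instead of four stems.
    return ["demucs", "--two-stems=vocals", "-n", model,
            "-o", str(out_dir), *[str(t) for t in tracks]]

def expected_instrumental(track, model="htdemucs_ft", out_dir="separated"):
    # Assumed default layout: <out_dir>/<model>/<track name>/no_vocals.wav
    return Path(out_dir) / model / Path(track).stem / "no_vocals.wav"
```

For a dataset folder, something like karaoke_cmd(sorted(Path("dataset").glob("*.wav"))) would build the batch command, which can then be passed to subprocess.run.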

Video workflow:

- Recorded with OBS
- Edited in Kdenlive
- Transcoded with VAAPI (H.264)

No cloud, real hardware.
Everything runs on Linux, so anyone can set this up.
Works on CPU as well, but much slower.

#Demucs #AI #MachineLearning #AudioSeparation #MusicAI #OpenSource #Linux #ROCm #AMD #DeepLearning #AudioProcessing #Vocals #Karaoke #StemSeparation #SelfHosted #NoCloud #FOSS #Tech #LocalAI #MetaAI
It'd be nice if there were a #Demucs model that could separate laugh tracks from sitcom episodes. I know an #AI laugh track remover exists already, but to be honest, I wasn't impressed at all by the demo. It sounded like it just turned the episode all the way down whenever a laugh track came in. Unfortunately, I think the reason this can't happen easily yet is that there are few, if any, public domain crowd sounds out there to train AI on, at least to my knowledge. #ML
Using AI to turn YouTube videos into Karaoke

Jackson Geller's personal website

#Demucs vs #Logic 11, short comparison:
Well damn. By using '-d mps' on my M1 Max, I got #Demucs to run at about 21 seconds of audio per second.
A tip for those wanting to try #OpenVino with #Audacity. If using #Demucs, after the process is complete, unselect all tracks in the project except the very first one, which is the original file before processing. Then, press Alt + T, then hit V to remove the selected track. Otherwise, the file will clip like hell!

#ostripdkpr

#Demucs 4HT

huh ?
Guess I need to do some #audioripping sometime soon again ... 

So, we now have GPU support from PyTorch on M1, but how can one utilize this for #Demucs if at all? I am not able to find any references!
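
One way this could work, sketched under assumptions: PyTorch exposes the Apple GPU as the "mps" device, and the demucs CLI takes a device name through "-d". The helper below is hypothetical. It just picks the best available device name and falls back to "cpu", so the result can be passed on as "-d <device>".

```python
def pick_device():
    # Return a torch device name usable with `demucs -d <name>`.
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"  # NVIDIA, or AMD through ROCm
    mps = getattr(torch.backends, "mps", None)  # missing on older PyTorch
    if mps is not None and mps.is_available():
        return "mps"  # Apple Silicon GPU
    return "cpu"

print(pick_device())
```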