Mastodawn

Jeff Moss Apr 12, 2024

New GPU ordered for #InfoCon and #DEFCON archive work, crazy how much they have evolved in the past few years.

For #HEVC / #AV1 encoding it is about how many NVENC encoders the GPU has, not how many #CUDA cores it has or whatever.

For voice to text it is about FP16 performance.

I haven't found a good voice to voice translation #LLM yet.

I'd love a GPU that is designed for these kinds of media operations and doesn't need ray tracing, video shaders, 4 video outputs, etc.
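As a rough illustration of the NVENC point above, here is a minimal sketch that builds an ffmpeg command line using the `hevc_nvenc` or `av1_nvenc` encoders. It assumes an ffmpeg build with NVENC support and a GPU that offers the chosen codec. The helper name `nvenc_command` and the file names are hypothetical, not something from the post.

```python
import shlex

# Encoder names as exposed by ffmpeg builds compiled with NVENC support.
NVENC_ENCODERS = {"hevc": "hevc_nvenc", "av1": "av1_nvenc"}


def nvenc_command(src, dst, codec="hevc"):
    """Build an ffmpeg argument list that encodes video with an NVENC encoder.

    Hypothetical helper: it only assembles the arguments, it does not run ffmpeg.
    """
    encoder = NVENC_ENCODERS[codec]
    return [
        "ffmpeg",
        "-i", src,
        "-c:v", encoder,  # video is handled by the dedicated NVENC hardware
        "-c:a", "copy",   # audio stream is copied without re-encoding
        dst,
    ]


if __name__ == "__main__":
    # Placeholder file names, for illustration only.
    print(shlex.join(nvenc_command("talk.mp4", "talk_hevc.mkv")))
```

The resulting list could be handed to `subprocess.run` on a machine where ffmpeg and a suitable GPU are available.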

zitterbewegung May 3, 2024

@thedarktangent do you want to preserve the lecturer’s voice in the voice to voice translation? Why not do voice to text, then translation, and then text to speech? If you don’t want a GPU you can look at accelerators or workstation cards like an A8000. This selector is targeted toward deep learning cards but will recommend accelerators: https://nanx.me/gpu/ An A8000 still has encoding support also.

Deep Learning GPU Selector

Choose the best GPU for your deep learning workflow with this interactive selector.

Jeff Moss May 3, 2024

@zitterbewegung Voice to text (English), then translating that text into subtitles in other languages, seems to be the way.

I’d love to find a way to allow voice translation in the original author’s tone and inflection. I’ve read articles about it existing but haven’t found an open model or service to do it with.
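The voice to text, then translate, then subtitles flow described above could be sketched roughly like this. It assumes the speech-to-text step has already produced timed segments, and it takes the translation step as a plain callable, since the thread names no specific model or service for either. `Segment`, `srt_timestamp` and `to_srt` are hypothetical names chosen for illustration.

```python
from typing import Callable, Iterable, Tuple

# One transcribed segment: (start seconds, end seconds, text).
Segment = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used by SRT files."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments: Iterable[Segment], translate: Callable[[str], str]) -> str:
    """Translate each transcribed segment and render the result as SRT text."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{translate(text)}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    # Placeholder data and a stand-in "translator" (upper-casing), both
    # assumptions; a real setup would plug in an actual translation backend.
    demo = [(0.0, 2.5, "Welcome to the talk."), (2.5, 5.0, "Let's begin.")]
    print(to_srt(demo, translate=str.upper))
```

Keeping the translator as a parameter means the same segments can be rendered once per target language without re-running the transcription.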

zitterbewegung May 3, 2024

@thedarktangent Facebook has an unreleased model: https://voicebox.metademolab.com Per Reddit, https://elevenlabs.io is what is publicly available.

VoiceBox

Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale

Jeff Moss

@zitterbewegung Thank you!