I'm working on reviving my old podcast searching system using OpenAI's Whisper engine (https://github.com/openai/whisper).

The results so far are amazing. I can run the transcription right on my Mac at roughly 5X realtime, and the accuracy is super impressive. It even gets brand names and weird words right nearly every time.

For example, this segment from The Talk Show where @marcoarment and @gruber argue about how to pronounce databases was perfectly transcribed, down to even the mispronunciations. 🤯

@_Davidsmith @marcoarment @gruber

holyshitholyshitholyshit.

a) holyshit
b) i so need to get this running and indexing the podcasts I listen to (because none have transcripts, and I so often want to pull quotes from them)
c) holy shit
d) How do we make it easy for _every_ podcast to add this to their site?!

@masukomi Check out the Snipd podcast player. It has already integrated the Whisper engine and makes it easy to save quotes, export, etc.
@ruan oooh. I was wondering if there was something like that. Thank you. :D