I'm working on reviving my old podcast searching system using OpenAI's Whisper engine (https://github.com/openai/whisper).

The results so far are amazing. I can run the transcription right on my Mac at roughly 5X realtime, and the accuracy is super impressive. It even gets brand names and weird words right nearly every time.

For example, this segment from The Talk Show where @marcoarment and @gruber argue about how to pronounce databases was perfectly transcribed, down to even the mispronunciations. 🤯

@_Davidsmith I must have a misconfiguration somewhere; I’m getting 1x realtime at best on an M1 MBP. The quality is still amazing, but the speed is underwhelming.
@donw Try this C++ port of Whisper. I believe it is much faster than the Python-based version. Then tweak the threads variable to make the best use of your machine: https://github.com/ggerganov/whisper.cpp
@_Davidsmith @donw Has anybody figured out how to transform microphone PCM16 into whatever Whisper is expecting? It works out of the box for WAV files, but with microphone input I just get mumbling as a result.
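
One possible cause, offered as a guess rather than a diagnosis: as far as I know, Whisper works on 16 kHz mono audio as float32 samples in the range [-1, 1], while microphones often deliver 44.1 or 48 kHz signed 16-bit integers, sometimes in stereo. A minimal sketch of that conversion using only the Python standard library follows. The function names `pcm16_to_float32` and `resample_linear` are made up for illustration, the input is assumed to be little-endian interleaved PCM16, and the linear-interpolation resampler is a crude stand-in (no anti-aliasing filter) for a proper resampling library.

```python
import sys
from array import array

# Assumption: the model wants 16 kHz mono float32 samples in [-1, 1].
WHISPER_SAMPLE_RATE = 16_000


def pcm16_to_float32(pcm_bytes: bytes, channels: int = 1) -> array:
    """Convert little-endian interleaved PCM16 bytes to mono float32 samples."""
    ints = array("h")  # signed 16-bit integers
    ints.frombytes(pcm_bytes)
    if sys.byteorder == "big":
        ints.byteswap()  # the input is assumed little-endian
    frames = len(ints) // channels
    # Average the channels of each frame, then scale into [-1, 1).
    return array(
        "f",
        (
            sum(ints[i * channels:(i + 1) * channels]) / (channels * 32768.0)
            for i in range(frames)
        ),
    )


def resample_linear(samples, src_rate: int,
                    dst_rate: int = WHISPER_SAMPLE_RATE) -> array:
    """Crude linear-interpolation resampler (no anti-aliasing filter)."""
    if src_rate == dst_rate or len(samples) < 2:
        return array("f", samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    last = len(samples) - 1
    out = array("f")
    for i in range(n_out):
        pos = i * src_rate / dst_rate  # position in the source signal
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1.0 - frac) + samples[hi] * frac)
    return out
```

If the microphone already records 16 kHz mono, only the first function should be needed. For a 48 kHz stereo capture, something like `resample_linear(pcm16_to_float32(raw, channels=2), 48_000)` would produce the samples to hand to the model. The result can be wrapped with `numpy.asarray(..., dtype=numpy.float32)` if the consumer expects a NumPy array.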