Continuing the conversation about using #whisper in #podcast production with @choochus

We've been talking about transcribing the backlog. Here's what I do with new shows (n=2, early days).

I record the show, then take the raw, unedited file and run whisper with the default small model. I use that transcript to write show notes as I edit and put things together.

After mastering, I take the final show and run whisper again with the medium model. This is what I put in #Obsidian for future reference.
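
A minimal sketch of that two-pass routine, assuming the stock `whisper` command-line tool is installed and on the PATH. The file and folder names are hypothetical placeholders, and the helper only builds the commands, it does not run them.

```python
from pathlib import Path


def whisper_command(audio, model, out_dir):
    """Build the argument list for one whisper CLI run (nothing is executed here)."""
    return [
        "whisper", str(audio),
        "--model", model,
        "--output_dir", str(out_dir),
        "--output_format", "txt",
    ]


# Hypothetical paths, for illustration only.
# Pass 1: raw recording, small model, used for drafting show notes.
raw_cmd = whisper_command(Path("raw") / "episode.wav", "small", Path("notes") / "drafts")
# Pass 2: mastered show, medium model, saved into a folder inside the Obsidian vault.
final_cmd = whisper_command(Path("final") / "episode.mp3", "medium", Path("vault") / "Transcripts")

if __name__ == "__main__":
    # Only prints the commands; nothing is transcribed at this point.
    print(" ".join(raw_cmd))
    print(" ".join(final_cmd))
```

To actually transcribe, either list can be handed to `subprocess.run(cmd, check=True)`.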

@geniodiabolico @choochus Any idea how #whisper stacks up against the transcription available in OtterAI? I’ve not yet compared the two, but wondering if anyone else has.

@AnthonyBaker @choochus I have not done a head-to-head quality comparison. Since whisper runs locally, transcribing a backlog of 600+ shows is much easier to automate with it than it would be with Otter. I have used Otter but not paid for it, so I am not exactly sure what it can do on that end.

For me, whisper is "good enough" and runs in the right place to simplify the tasks of finding information encoded in my audio. Others might be as good or better but I don't know that yet.
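
Because whisper runs locally, the backlog job can be a short script. The sketch below is one possible shape, not the setup used in this thread: the folder layout and the list of audio extensions are assumptions, and it relies on whisper naming each text output after the audio file's stem. It skips episodes that already have a transcript, so an interrupted run can simply be restarted.

```python
from pathlib import Path

# Assumed set of audio extensions; adjust to whatever the archive contains.
AUDIO_SUFFIXES = {".mp3", ".wav", ".m4a", ".flac"}


def pending_episodes(audio_dir, transcript_dir):
    """Return audio files in audio_dir that have no <stem>.txt in transcript_dir yet."""
    audio_dir, transcript_dir = Path(audio_dir), Path(transcript_dir)
    todo = []
    for audio in sorted(audio_dir.iterdir()):
        if not audio.is_file() or audio.suffix.lower() not in AUDIO_SUFFIXES:
            continue  # not an audio file
        if (transcript_dir / f"{audio.stem}.txt").exists():
            continue  # already transcribed
        todo.append(audio)
    return todo


def backlog_commands(audio_dir, transcript_dir, model="small"):
    """Build one whisper CLI command per episode that still needs a transcript."""
    return [
        ["whisper", str(audio), "--model", model,
         "--output_dir", str(transcript_dir), "--output_format", "txt"]
        for audio in pending_episodes(audio_dir, transcript_dir)
    ]
```

Each command in the returned list can be passed to `subprocess.run(cmd, check=True)` in a loop, one show at a time.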

@AnthonyBaker @choochus Out of curiosity I'm running the test now on the podcast I published yesterday. Will let y'all know the results.
@AnthonyBaker @choochus Actually, I picked the wrong show. The one from January 20th I did on small, medium and large as a test. I'm running that one through Otter right now. When it is done, I will toot 500ish character snippets of all four. Anthony, I'm happy to send you the full files of each if you want them.
@geniodiabolico @choochus If it’s just one episode that’d be awesome. I’m looking to also switch to Whisper but the comparison was something I was definitely interested in. Thanks a ton!
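
The 500ish character snippets mentioned above could be cut with a small helper like this one. It is a sketch under assumptions: the function name and the choice to trim back to the last whole word are illustrative, not how the snippets below were actually produced.

```python
def snippet(text, limit=500):
    """Return roughly the first `limit` characters, cut back to the last whole word."""
    # Collapse line breaks and repeated spaces so line-per-sentence output reads as one paragraph.
    flat = " ".join(text.split())
    if len(flat) <= limit:
        return flat
    # Look one character past the limit so a word ending exactly at the limit is kept.
    head, sep, _ = flat[:limit + 1].rpartition(" ")
    # If there is no space at all, fall back to a hard cut.
    return head if sep else flat[:limit]
```

Running the same helper, with the same starting offset, over all four transcripts keeps the toots comparable.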

@AnthonyBaker @choochus Here are the results of the same podcast episode transcribed by #otterai and by #whisper with small, medium and large models:

https://www.dropbox.com/sh/639zdujwd7ty78r/AAAXHU6Le6RHh3IlAdIaGn5Ja?dl=0

@AnthonyBaker @choochus #otterai snippet:

it is not only a good song, it's good advice. You don't have to believe every single thought that tumbles out of your head. Just because it sounds like you talking. You can take that one of the crank wives. The song is turned out the lights from the album Fox war, I believe this I don't I have done very little research. I think their entire catalogue is like self produced self published.

@AnthonyBaker @choochus #whisper small:

It is not only a good song.
It's good advice.
You don't have to believe every single thought that tumbles out of your head just because it sounds like you talking.
You can take that one to the bank.
The Crane Wives.
The song has turned out the lights from the album Fox Lore.
I believe this.
I don't.
I have done very little research.
I think their entire catalog is like self-produced self-published.

@AnthonyBaker @choochus #whisper medium:

It is not only a good song, it's good advice.
You don't have to believe every single thought that tumbles out of your head just because
it sounds like you talking.
You can take that one to the Crane Wives.
The song is Turn Out The Lights from the album Fox Lore.
I believe this.
I don't.
I have done very little research.
I think their entire catalog is self-produced, self-published.

@AnthonyBaker @choochus #whisper large, which perplexingly did the worst, missing a big chunk. I took this snippet two sentences farther. I believe the first two lines are actually the end of the song.

You don't have to believe every single thought
That tumbles through your head
Just cause it sounds like you talking
Self-produced, self-published, like, I don't believe there's any record label concern in any of this
And this shit sounds so good
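
For a rough number on how far two of these transcripts diverge, a word-level comparison with Python's standard `difflib` is one option. This is only a sketch: the helper names are made up for illustration, and the score measures agreement between two transcripts, not which one is correct.

```python
import difflib
import re


def words(text):
    """Lowercase the text and split it into words, ignoring punctuation and line breaks."""
    return re.findall(r"[a-z0-9']+", text.lower())


def similarity(a, b):
    """Return a 0..1 ratio of how closely the word sequences of a and b match."""
    matcher = difflib.SequenceMatcher(None, words(a), words(b), autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    # The same sentence as transcribed by the small and medium models above.
    small = "You can take that one to the bank."
    medium = "You can take that one to the Crane Wives."
    print(round(similarity(small, medium), 2))
```

Feeding it the full text files rather than single sentences gives a pairwise score for each combination of Otter, small, medium and large.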

@geniodiabolico @AnthonyBaker thanks for sharing! I'll be trying out some dual voice and panel discussions this weekend. Curious what that will look like 😉
@choochus @AnthonyBaker The thing otter does that whisper does not is speaker identification. That show I recorded with y'all about Game of Thrones was kind of a mess in transcript form for example.
@geniodiabolico @choochus Yeah I was figuring it wouldn’t. And I like with Otter that it can identify speakers and you can then name each and it applies their name across the doc. Do wish they had better export options though — they have a lot of structured data you can’t get out of it.