RE: https://mastodon.nz/@leighelse/116149727745113480

I worked on a large scale project testing medical transcription. (Maybe one of the largest ones) Hundreds of doctors reviewed the output and called out the issues.

It was not, and still is not, ready. Public health teams that roll this out without red teaming and remediation and feedback and a way to influence the weights of models are irresponsible.

@skinnylatte

Soo... using that tool, a doctor could see one more patient per shift.

I have this idea. Sounds a bit like science fiction, so bear with me.

What if... we simply recorded the audio, then paid a person to listen to that audio and type up what they heard?

@wakame @skinnylatte

*In general* AI transcription tools are now comparable to or better than human transcription in raw accuracy. Humans aren't very good at sustained effort like this, but may be better at understanding nuance.
Much depends on the system and context; the original article notes that it's dealing with a New Zealand accent, which I'd expect to be currently underrepresented.
A hybrid model is likely to be optimal for precision.

https://www.healos.ai/blog/the-truth-about-ai-medical-scribe-accuracy-rates-2025-healos-achieves-98-performance

AI Medical Scribe Accuracy Rates 2025: HealOS Guide

Discover AI medical scribe accuracy rates and how HealOS outperforms competitors. Compare error rates, performance metrics, and implementation benefits.

@mmalc @wakame I do independent third party testing and evaluations of these types of claims. other than accents, there are aspects in which this type of tool can harm patients and treatment plans without a plan to extensively and continuously test them.

@skinnylatte @wakame
I’m not arguing against that; I'm arguing against the assertion that “just” using humans would be better.

I started in speech recognition research in the late 1980s (and Noel Sharkey was a later colleague). I'm well aware of the issues. Using any basically “safety critical" system without adequate checks is foolhardy.

@mmalc you would know, then, that safe, accurate medical transcription has been about the same distance into the future as *safe* full self driving, but for much longer.

Both require hyper-vigilant supervision, for the same reason - anything missed by either can easily be fatal.

Voice recognition is not getting better, now that it is being called #AI, but it probably is being trusted by decision makers more than it should be.

@skinnylatte @wakame

@wakame
How about paying for sufficient doctors on each shift so they aren't too stressed to create their own documentation? And the support staff so they don't have to do non-medical.admin?
@skinnylatte