With LLM output, the error rate is often less a matter of errors per minute than errors per sentence.