Cohere Transcribe: Speech Recognition

https://cohere.com/blog/transcribe

Cohere Transcribe: state-of-the-art speech recognition

Unmatched accuracy and speed. Transcribe converts your business’ audio data into precise text for search, analytics, and automation.

> Limitations
>
> Timestamps/Speaker diarization. The model does not feature either of these.

What a shame. Is WhisperX still the best choice if you want timestamps and diarization?

I would try Qwen-ASR: https://qwen.ai/blog?id=qwen3asr

See the very bottom of the page for a transcription with timestamps.
