#CommonVoice

The latest versions of Common Voice Scripted Speech (v26.0) and Spontaneous Speech (v4.0) are available.

Bonus: Tashlhiyt (shi) is included in Spontaneous Speech 🚀

https://discourse.mozilla.org/t/release-live-mcv-scripted-speech-v26-0-and-spontaneous-speech-v4-0/148687

Release live: MCV Scripted Speech v26.0 and Spontaneous Speech v4.0

Hello everyone! The latest versions of Common Voice Scripted Speech (v26.0) and Spontaneous Speech (v4.0) are available for download at Mozilla Data Collective: https://mozilladatacollective.com/organization/cmfh0j9o10006ns07jq45h7xk. In this release, 10 new datasets and 7 new languages have been added!

Highlights:

Scripted Speech
- An additional 598 hours (~593K clips) of audio in the datasets
- Welcoming 4 more datasets: Abaza (abq), Khakas (kjh), Khmer (km), and Afaan Oromo (om)
- The fa...

Mozilla Discourse

On my phone, I use the FUTO Keyboard app in both English and Turkish. I also use voice dictation whenever I can.

Yesterday, when I meant to use the English dictation, I didn't notice that my keyboard was in Turkish mode, so I dictated in English into the Turkish keyboard with its multi-language dictation model. What it typed out was the Turkish translation of my English dictation.

I just tested this: for basic phrases it does not type out a Turkish spelling of the English sounds, but a translation.
For instance, saying "How are you?" into the Turkish keyboard types out "Nasılsınız?" instead of "hav ar yu".

More complex English sentences may produce the English spelling of the phrase, but not very reliably. Phrases in other languages may produce a translation or the phrase itself in its own language.

In my testing, the larger and slower model does more translation, while the medium model stays true to the language that is spoken. I should mention, though, that I speak English and Turkish and only a few phrases in French, so my testing was quite limited.

For those of you not sure what I'm talking about, FUTO Keyboard is an Android keyboard that aims to be a privacy-respecting alternative to Google's keyboard by providing the same functionality, but completely on-device, without relying on any cloud functionality.

#FUTO #Keyboard #Voice #Dictation
#Mozilla #CommonVoice

Join Mozilla's activities during the @geekfaeries: lend your voice to #CommonVoice, edit PDFs in your browser, and discover the latest #Firefox innovations a few days before its new release, #thunderbird https://blog.mozfr.org/post/2026/06/geek-faeries-2026-en-ligne-de-mire

@cwebber

Yeah. You're referring to Mozilla Common Voice ( https://commonvoice.mozilla.org )

I believe that the HomeAssistant voice assistant uses it in a couple of ways, some non-obvious, i.e. not just for TTS. From memory, part of their voice recognition training comes from taking CommonVoice samples, adding all kinds of noise and distortion that would likely also occur in the real world (background noise, being muffled by objects, etc.), and then training on those distorted samples.
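
To make the idea concrete, here is a minimal sketch of one common form of this kind of augmentation: mixing background noise into a clean clip at a chosen signal-to-noise ratio. This is not HomeAssistant's actual pipeline, which I haven't looked at. The names `rms` and `mix_at_snr` are made up for illustration, and the audio is assumed to be plain lists of float samples rather than real CommonVoice clips.

```python
import math
import random


def rms(samples):
    """Root-mean-square level of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean clip so the mix has the requested SNR in dB.

    Hypothetical helper: both inputs are lists of floats at the same
    sample rate. The noise is looped or trimmed to the clip length.
    """
    if len(noise) < len(clean):
        # Loop the noise so it covers the whole clip.
        noise = noise * (len(clean) // len(noise) + 1)
    noise = noise[:len(clean)]

    clean_rms = rms(clean)
    noise_rms = rms(noise)
    if clean_rms == 0 or noise_rms == 0:
        # Nothing meaningful to scale against; return the clip unchanged.
        return list(clean)

    # SNR(dB) = 20 * log10(clean_rms / noise_rms), solved for the noise level.
    target_noise_rms = clean_rms / (10 ** (snr_db / 20))
    gain = target_noise_rms / noise_rms
    return [c + gain * n for c, n in zip(clean, noise)]


# Example: a 440 Hz tone standing in for speech, white noise as "background".
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
background = [rng.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(tone, background, snr_db=10.0)
```

A training setup along these lines would presumably pick a random noise recording and a random SNR for each clip, so the model hears the same sentence under many different conditions.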

I contribute occasionally. There's an app on F-Droid that makes it quick and easy.

It's a great project and one I wish Mozilla would focus more on instead of all the other junk.

#Mozilla #CommonVoice #HomeAssistant #TTS #STT

Mozilla Common Voice

@AlexUnder

If you have the time or the inclination for volunteer work, there's Mozilla's #CommonVoice project, which is directly related to linguistics...

#Mozilla #CommonVoice 25.0 released.

Common Voice Corpus, the world's largest free dataset of human voices, maintained by Mozilla, has been released as v25.0. We have already reported several times on this project, which is licensed under the #Creative_Commons #CCO license. The Common Voice project, which has existed since 2017, thereby supports the speech recognition market as an alternative to the big commercial providers such as Amazon, Apple, Google, and Microsoft...

https://linuxnews.de/mozilla-common-voice-25-0-veroeffentlicht/

Mozilla Common Voice 25.0 released

With Common Voice 25.0, Mozilla recently released the largest free dataset of human voices, maintained since 2017. We think that in times…

LinuxNews.de

This is why well-functioning open-source datasets such as #commonvoice are so important... Quote: "Underlying that efficiency is a decision to build on existing foundations. IBM trained its speech model by modality aligning Granite 4.0 to speech on publicly available open-source corpora rather than building a separate speech stack."

I'm looking forward to chatting with the good folks at #SLVlabs this Wednesday lunchtime about @mozilla #CommonVoice, linguistic diversity and the intersection of that with #GLAM - as well as @mozilladatacollective, ethical datasets and fair value exchange.

https://lab.slv.vic.gov.au/participate/technologist-talk-kathy-reid-mozilla-common-voice

Technologist Talk: Kathy Reid on Mozilla Common Voice | SLV LAB

Explore how libraries can advance linguistic diversity, ethical AI and sustainable data stewardship through Common Voice.

SLV LAB

A thought of mine on #DUT: independent projects could use your contribution too. Sure, nobody has time for extra hobbies anymore, but often it's just little things:
- Your voice as an audio snippet for #CommonVoice
- Your favorite pub in #OpenStreetMap
- Improving articles about your hobby on #Wikipedia
- ...

You can also improve the digital world now and then 🙃️

#DUTgemacht #DIDit