Mastodawn

reading about #pocketsphinx,
https://cmusphinx.github.io/2022/08/pocketsphinx-continuous/

author talks about #sox
>and it is so perfect that development on it largely ceased

BASED. THIS is the attitude that software sorely needs!
a project *completing* should be a good thing, but for some reason the neophilia is so strong that people can't imagine a state other than "incomplete" or "abandoned".

Why I Removed pocketsphinx_continuous And What You Can Do About It, Part One

CMUSphinx is an open source speech recognition system for mobile and server applications. Supported languages: C, C++, C#, Python, Ruby, Java, JavaScript. Supported platforms: Unix, Windows, iOS, Android, hardware.

CMUSphinx Open Source Speech Recognition

Richard Emling (DO9RE) May 1, 2025

I'm exploring ways to improve audio preprocessing for speech recognition for my [midi2hamlib](https://github.com/DO9RE/midi2hamlib) project. Do any of my followers have expertise with **SoX** or **speech recognition**? Specifically, I’m seeking advice on: 1️⃣ Best practices for audio preparation for speech recognition. 2️⃣ SoX command-line parameters that can optimize audio during recording or playback.
https://github.com/DO9RE/midi2hamlib/blob/main/tests/speech_menu.sh #SoX #SpeechRecognition #OpenSource #AudioProcessing #ShellScripting #Sphinx #PocketSphinx #Audio Retoot appreciated.
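
As one possible starting point for the first question (a sketch, not advice taken from the thread): PocketSphinx's stock acoustic models generally expect 16 kHz, mono, 16-bit PCM audio, so a common preparation step is to have SoX convert recordings to that format. The helper names `sox_resample_command` and `convert` are hypothetical; `-r`, `-c`, and `-b` are standard SoX format options that apply to the output file when placed before its name.

```python
import subprocess

def sox_resample_command(src, dst, rate=16000, channels=1, bits=16):
    """Build a SoX command line that rewrites src as dst in the target format.

    Format options placed before the output filename apply to the output.
    The 16 kHz / mono / 16-bit defaults are an assumption about what the
    recognizer's acoustic model expects; check the model you actually use.
    """
    return ["sox", str(src), "-r", str(rate), "-c", str(channels),
            "-b", str(bits), str(dst)]

def convert(src, dst):
    """Run SoX (must be installed and on PATH); raises if SoX exits non-zero."""
    subprocess.run(sox_resample_command(src, dst), check=True)
```

Calling `convert("raw.wav", "clean.wav")` would write a resampled copy and leave the original untouched, which keeps the preprocessing step easy to compare against the unprocessed audio.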

cyclical_obsessive Feb 23, 2025

@avlcharlie

For inspiration Carl - Total “life” today: 53536 hrs

https://youtube.com/watch?v=vhAQSvxJHTU

#GoPiGo3 #RaspberryPi #Nyumaya #AutomaticSpeechRecognition #pocketsphinx #Robots

Raspberry Pi 3 GoPiGo3 Robot Carl Meets A Squirrel

YouTube

Asheville Charlie Feb 22, 2025

Ok..so my #ai #robotics project is just about done. I do these things as personal challenges and I've succeeded on all the major points, now just tweaking and putting them together.

I think I'll work on the speech response today and then integrate the movements and this will be a done one.

Here are some of the pieces parts for the nerds..
*Everything is local*

#Raspberrypi 4 2G
#Pca9685 i2c pwm controller
#Mg90s servos
#Pocketsphinx STT
#Festival light TTS
#python lots of python

coucou Jan 11, 2025

@bmzimmermann
provided you have ~ 4GB free disk space and significantly more than 8GB RAM, the results so far are mind blowing. In contrast to #pocketsphinx which comes with lots of limitations in terms of file formats, sampling rates, recognized languages etc.
of course, you have to speak clearly and audibly. :)

Tuxicoman Apr 28, 2023

@older @homeassistant

Which boils down to #pocketsphinx and #espeak?

Mike Stone Jun 30, 2022

@Corvus You can. There are several different ways to configure the wake word. You can use a predefined model, you can define a new wake word using phonemes and #PocketSphinx, or you can even train your own model using #Mycroft's Precise software. It's open source and provides the best accuracy. #Precise is based on a neural network that is trained on sound patterns rather than word patterns. It does take a lot more work though.

https://mycroft-ai.gitbook.io/docs/using-mycroft-ai/customizations/wake-word

Using a Custom Wake Word

You might want to change the Wake Word to a phrase that's easier for you to speak, is more culturally appropriate, or just more personal and fun for you.

hackaday unofficial Mar 28, 2022

Say Friend And Have This Box Open For You

Handcrafted gifts are special, and this one's no exception. [pender] made a Tolkien-inspired box for his son and shared the details with us on Hackaday.io. This one-of-a-kind handcrafted box fulfills one role and does it perfectly - just like with the Doors of Durin, you have to say 'friend' in Elvish, and the box shall unlock for you.

This box, carefully engraved and with attention paid to its surface finish, stands on its own as a gift. However, with the voice recognition function, it's a project complicated enough to cover quite a few fields at once - woodworking, electronics, and software. The electronics are laid out in CNC-machined channels, and LED strips illuminate the "Say Friend And Come In" inscriptions once the box is ready to listen. If you're wondering how the unlocking process works, the video embedded below shows it all.

Two solenoids keep the lid locked, and in its center is a Pi Zero, the brains of the operation. With small batteries and a power-hungry board, power management is a bit intricate. Two capacitive sensors and a small power management device are always powered up. When both of the sensors are touched, a power switch module from Pololu wakes the Pi up. It, in turn, takes its sweet time, as fully-fledged Linux boards do, and lights up the LED strip once it's listening.

[Pender] didn't want to go for any cloud-based voice recognition service - such a gift requiring a pre-established WiFi connection would be no fun at all. Instead, he set up and used PocketSphinx for offline voice recognition. The box is a surprise for his son, thus, [pender] couldn't quite ask for voice samples. Thankfully, PocketSphinx recognizes phrases without pre-training. There's also a contingency mechanism - the electronics aren't accessible until you open the box. The secret is to apply voltage to two pads, which will unlock the solenoids directly, just in case the battery dies or the Pi glitches out.

We don't quite know what awaits when the box is opened, but we can only hope that the son's name isn't Balin. We thank [pender] for documenting the details, as a reminder that our hacker skills let us build unique and beautiful things for the people we value. Hacker-built gifts are great - we've seen guestbooks for weddings, couple's proximity sensing LED badges, ESP32-based CTFs, and even meteorite turntable display stands.

#art #raspberrypi #hackadayio #pocketsphinx #powermanagement #voicerecognition #woodworking

hackaday unofficial Sep 25, 2021

Making Linux Offline Voice Recognition Easier

For just about any task you care to name, a Linux-based desktop computer can get the job done using applications that rival or exceed those found on other platforms. However, that doesn't mean it's always easy to get it working, and speech recognition is just one of those difficult setups.

A project called Voice2JSON is trying to simplify the use of voice workflows. While it doesn't provide the actual voice recognition, it does make it easier to get things going and then use speech in a natural way.

The software can integrate with several backends to do offline speech recognition including CMU’s pocketsphinx, Dan Povey’s Kaldi, Mozilla’s DeepSpeech 0.9, and Kyoto University’s Julius. However, the code is more than just a thin wrapper around these tools. The fast training process produces both a speech recognizer and an intent recognizer. So not only do you know there is a garage door, but you gain an understanding of the opening and closing of the garage door.

In addition, the tools are all made to work in Unix-style pipelines which is refreshing. Here's an example configuration from the project's website:

[GarageDoor]
open the garage door
close the garage door

[LightState]
turn on the living room lamp
turn off the living room lamp

There are templating features so you can specify optional words and alternative words in a single rule. There are other features like mapping an object like living room lamp into something more computer-friendly.
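
To make the templating idea concrete, here is a minimal sketch of how one rule with optional and alternative words can stand for several sentences. It assumes square brackets mark an optional word and a parenthesized, pipe-separated group marks alternatives; the `expand` function is a hypothetical illustration that handles only flat (non-nested) templates and is not Voice2JSON's actual parser.

```python
import itertools
import re

# Hypothetical mini-expander for illustration only; not Voice2JSON code.
# Matches "[optional]", "(alt1 | alt2)", or a run of literal text.
TOKEN = re.compile(r"\[([^\]]*)\]|\(([^)]*)\)|([^\[\]()]+)")

def expand(template):
    """Return every sentence a flat (non-nested) template can produce."""
    choices = []
    for optional, alternatives, literal in TOKEN.findall(template):
        if optional:
            # The word may be present or absent.
            choices.append([optional.strip(), ""])
        elif alternatives:
            # Exactly one of the alternatives is used.
            choices.append([a.strip() for a in alternatives.split("|")])
        elif literal.strip():
            choices.append([literal.strip()])
    # Combine one pick from each slot, dropping the empty "absent" picks.
    return [" ".join(part for part in combo if part)
            for combo in itertools.product(*choices)]

for sentence in expand("turn (on | off) [the] living room lamp"):
    print(sentence)
```

A single rule here yields four sentences (on/off, with and without "the"), which is why a templated rule file can stay much shorter than a list of every phrasing.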

Overall, this looks like a fun tool to have in your kit. If you do something interesting with it, be sure to drop us a tip so we can cover it. Meanwhile, we've been watching Linux speech for quite a while. Of course, what we really want is speech commands like the USS Enterprise, and we have to admit it is getting closer.

#linuxhacks #deepspeech #json #linux #pocketsphinx #speech #voice

lain Feb 4, 2021

Soon, the 'Gossip Booth' of radioslumber.net. A speech recognition online tool that filters 'secrets' by returning in-between-the-words expressions.
Laughter, pauses, uncertain expressions, breaths, sounds of discomfort and pleasure.
...
...
#pocketsphinx #speech_recognition #gossip #radioslumber #web_audio
...
...
...
Gossip in any language can be a fast, low-volume way of speaking that sounds like murmuring and often includes slang. Spoken between immigrants, it may sound blurry to outside ears. Without an understanding of the words, the speech becomes a melody, a sound missing its logical meaning. This may be annoying for outsiders, but at the same time it is a familiar, joyful, shared and supportive space for those who had to leave their home, even more so for those who often lack public platforms to share their concerns and solve their problems, and who are not supported by the systems (legal, economic, cultural, political) they live in. The 'Gossip Gaps' embraces this discomfort in the ears of some listeners and the fast murmurings of despair and pleasure.