TIL: There is an open source "Alexa replacement" project

https://libretechni.ca/post/426637


> As Snowden told us, video and audio recording capabilities of your devices are NSA spying vectors. OSS/Linux is a safeguard against such capabilities. The massive datacenter investments in US will be used to classify us all into a patriotic (for Israel)/Oligarchist social credit score, and every mega tech company can increase profits through NSA cooperation, and are legally obligated to cooperate with all government orders.
>
> Speech to text and speech automation are useful tech, though always listening state sponsored terrorists is a non-NSA targeted path for sweeping future social credit classifications of your past life.
>
> Some small LLMs that can be used for speech to text: https://modal.com/blog/open-source-stt

Time to get a mic for my home server!
Home Assistant has been heavily working on that sort of functionality lately.
Home Assistant continues to be fantastic. It feels like only fairly recently all we had was OpenHAB, and although it was fine, it was a bit of an uphill struggle to do anything.
There were, like, about two years between OpenHAB and HA being released. The former debuted in 2011; HA saw its first release in 2013.

Oh really? I could have sworn HA was a fair bit later than that

I think I used OpenHAB between about 2013 and 2018, then switched to HA around then after discovering it and reading about it for a couple of weeks.

Must have just had my head in the sand then!

To be fair, in the early days HA wasn't too usable. Even around 2018-19, the integrations were limited and the core logic was quite wonky. I'd say around 2020 it became mature enough for daily use for non-tinkerers.
I mean, there are many. STT, TTS and self-hosted automation are huge in the local LLM scene.
I do wish there was a smaller LongCat model available. My current AI node has a hard 16GB VRAM limit (yay AMD UMA limitations), so 27B can't really fit. An 8B dynamically loaded model would fit, and run much better.

You can do hybrid inference of Qwen 30B omni for sure. Or Vibevoice Large (9B). Or really a huge array of models.
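To make "hybrid inference" concrete: llama.cpp-style runtimes let you offload only as many transformer layers as fit in VRAM and run the rest on CPU. Here's a rough back-of-envelope sketch of picking that layer count; all the numbers (model size, layer count, overhead) are illustrative assumptions, not measured values:

```python
def gpu_layers_that_fit(model_size_gib: float, n_layers: int,
                        vram_gib: float, overhead_gib: float = 2.0) -> int:
    """Estimate how many transformer layers fit in VRAM.

    Assumes layers are roughly uniform in size and reserves
    `overhead_gib` for the KV cache, scratch buffers, etc.
    """
    per_layer = model_size_gib / n_layers
    usable = max(vram_gib - overhead_gib, 0.0)
    return min(n_layers, int(usable // per_layer))

# Illustrative: a ~17 GiB quant of a 30B model with 48 layers on a
# 16 GiB VRAM budget -> offload ~39 layers, the CPU handles the rest.
print(gpu_layers_that_fit(17.0, 48, 16.0))  # 39
```

In practice you'd feed a number like this to the runtime's GPU-layer setting and then nudge it down if you hit out-of-memory errors.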

…The limiting factor is free time, TBH. Just sifting through the sea of models, seeing if quantization works and such is a huge timesink, especially if you are trying to load stuff with ROCm.

And I am on ROCm - specifically on an 8945HS, which is advertised as a Ryzen AI APU yet is completely unsupported as a target, with major issues around queuing and more complex models (the new 7.0 betas have been promising, but TheRock's flip-flopping with their Docker images has been making me go crazy...).

Ah. On an 8000 APU, to be blunt, you’re likely better off with Vulkan + whatever GGML supports. Last I checked, token generation is faster and prompt processing is close to ROCm.

…And yeah, that was total misadvertisement on AMD’s part. They’ve completely diluted the term, kinda like TV makers did with ‘HDR’.

The thing is, if AMD actually added proper support for it, given it has a somewhat powerful NPU as well... For the total TDP of the package it's still one of the best perf per watt APU, just the damn software support isn't there.

Feckin AMD.

The iGPU is more powerful than the NPU on these things anyway. The NPU is more for ‘background’ tasks, like Teams audio processing or whatever it’s used for on Windows.

Yeah, in hindsight, AMD should have put (and still should put) a few devs on popular projects (and pushed NPU support harder), but GGML support is good these days. It’s gonna be pretty close to RAM speed-bound for text generation.
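A quick sanity check on "RAM speed-bound": every generated token has to stream all the active weights through memory once, so the decode ceiling is roughly bandwidth divided by bytes read per token. The numbers below are illustrative, not benchmarks of any particular APU:

```python
def max_tokens_per_sec(bandwidth_gib_s: float, active_bytes_gib: float) -> float:
    """Upper bound on decode speed when memory-bandwidth bound.

    Each token reads all active weights once, so throughput can't
    exceed bandwidth / bytes-read-per-token.
    """
    return bandwidth_gib_s / active_bytes_gib

# Illustrative: ~85 GiB/s of dual-channel DDR5 feeding ~5 GiB of
# active weights (e.g. a small quant) -> ceiling of 17 tokens/s.
print(max_tokens_per_sec(85.0, 5.0))  # 17.0
```

This is also why MoE models are attractive on APUs: only the active experts count toward the bytes per token.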

Aye, I was actually hoping to use the NPU for TTS/STT while keeping the LLM systems GPU bound.

It still uses memory bandwidth, unfortunately. There’s no way around that, though NPU STT/TTS would still be neat.

…Also, generally, STT responses can’t be streamed, so you might as well use the iGPU anyway. TTS can be chunked, I guess, but do the major implementations do that?

Piper does chunking for TTS, and could utilise the NPU with the right drivers.
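To sketch what that kind of chunking looks like (a hypothetical splitter for illustration, not Piper's actual code): split the text at sentence boundaries, so each chunk can be synthesised and played while the next one is still being generated:

```python
import re

def chunk_for_tts(text: str, max_chars: int = 120) -> list[str]:
    """Split text into sentence-aligned chunks for incremental TTS.

    Sentences are packed greedily into chunks of at most `max_chars`,
    so playback of chunk N can start while chunk N+1 synthesises.
    """
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks

print(chunk_for_tts("Timer set. It will go off in ten minutes. "
                    "Anything else?", max_chars=30))
```

The win is latency: time-to-first-audio depends only on the first chunk, not the whole response.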

And the idea of running them on the NPU is not about memory usage but hardware capacity/parallelism. Although I guess it would have some benefits when I don't have to constantly load/unload GPU models.

Yeah… Even if the LLM is RAM speed constrained, simply using another device so as not to interrupt it would be good.

Honestly AMD’s software dev efforts are baffling. They’ve sicced a few on libraries precisely no-one uses, like this: github.com/amd/Quark

While ignoring issues holding back entire sectors (like broken flash-attention) with devs screaming about it at the top of their lungs.


Oh, I forgot!

You should check out Lemonade:

github.com/lemonade-sdk/lemonade

It supports Ryzen NPUs via two different runtimes… though apparently not the 8000 series yet?

I've actually been eyeing Lemonade, but the lack of Dockerisation is still an issue... guess I'll just DIY it at some point.

It’s all C++ now, so it doesn’t really need docker!

You might consider Arch (dockerless) ROCm as well; it looks like 7.1 is in the staging repo right now.

Because I'm running Unraid on the node in question, I kinda do need Docker. I want to avoid messing with the core OS as much as possible, plus a Dockerised app is always easier to restore.
It’s not a lack of software, it’s a lack of hardware. Home Assistant is ready, as are others, but there’s no good cheap mic/speaker/ESP-in-a-box hardware.
No, Home Assistant very much is not ready to replace an Alexa device. Home Assistant mainly does automation of smart devices, and as far as I can see nothing else. One of the main things people use Alexa for is playing music from services like Spotify, and Home Assistant doesn't appear to do that.
Sorry… my experience has been trying to move my Google Home to something open with no cloud… it’s not been perfect for me after moving. Definitely things missing, but lots of things are better. Spotify does work with Home Assistant… maybe look again or send a PR.

It isn't listed anywhere on their homepage or example demos or anywhere listing its capabilities, so I did a web search and found that it sort of can do Spotify, but (1) that isn't listed anywhere on the Home Assistant webpage, which shows just how not ready for the mass market it is, and (2) it takes a ridiculous amount of very techie setup just to get it to work.

https://www.home-assistant.io/integrations/spotify/

And also, out of the box can i ask it to:

  • tell me the weather?

  • set a timer?

  • set an alarm?

I don't see anything on the website that says it can do these things. And even if it can (which doesn't appear to be the case from their website), the fact that the website doesn't say so is a problem in itself that shows it isn't ready for the mass market.

Just look at the webpage for Alexa vs. Home Assistant and it's clear that Alexa has a very wide variety of abilities and is designed to be easy to use by anyone, while the home assistant website only shows it doing smart device automation and looks like it's not for regular folks

https://www.amazon.com/dp/B0DCCNHWV5

https://www.home-assistant.io/

I would LOVE to replace my Alexa devices with a local FOSS system, but unfortunately home assistant isn't close to being able to do that yet


I'm sorry, what?

Googling "home assistant Spotify" results in the very link you've provided.

And you can hardly expect a project like Home Assistant, with THOUSANDS of first party integrations, to cater to your specific needs, or to provide preferential treatment to companies like Spotify, who provide absolutely no support to the project.

It also doesn't require a "techie setup", but following a quite straightforward guide, that culminates in clicking about maybe a dozen buttons (most of them being "I accept" to various terms and policies), then copying a handful of readily provided strings into the right fields. It's simple enough that even my tech illiterate father can do it.

Home Assistant at the end of the day is NOT an Alexa (or other voice assistant) replacement, but a smarthome control hub OS. That it provides a voice assistant interface is quite secondary to its main mission.

Home Assistant has a voice assistant feature
It does, but it still has the same limitations as the screen interface
You very clearly don't understand Home Assistant.

The HA Voice Preview is a pretty solid device, but you're right, there isn't really any ready-made Echo/Google Home Mini replacement device - primarily because all those devices are generally sold at a loss, or at cost at best, and subsidised by your data being sold.

You won't be able to make a Google Home Mini contender for below $50, and at that price most people will just opt for Google's offering. Good quality speakers, microphones, and local processing (like the XMOS chip in the Voice Preview) all cost money, and there's no subsidy to be had. Some older Echo devices are rootable, but the hardware tends to be somewhat exotic (meaning no open source support for specialised components), and there's little ongoing third party support (the focus has been on the display-equipped models, and on running Android on them).

All in all, "cheap" and "fully local open source voice assistant" don't really coexist.

The issue with that is there isn’t an expensive option either. The only thing close is the home assistant voice preview and it’s still very “preview”. There’s not really any way to do it well at any price point right now.

Well yeah, the availability of these more advanced hardware bits is pretty new - for example, all the older GH Minis and Echo devices were running a quite pared down Linux distro with software processing for e.g. wake words.

Transplanting all that to MCUs takes time, but now we have a solid base, a handful of devices/boards that utilise the various XMOS chips, and soon we will be seeing more and more consumer level devices - but again that takes time when there's no big megacorp behind the project pushing it to completion with bottomless finances and hundreds of engineers.

But you're not exactly correct on there being no other options. There's the Satellite1 smart speaker, which might be a DIY kit but it does exist. Then there's the Seeed Studio ReSpeaker Lite with an ESP32-S3, to which you can attach a speaker (either directly, or a powered speaker through the audio jack). In fact the ReSpeaker lineup has a handful more options for smart speakers, all utilising the various XMOS chips.

Just keep in mind that these speakers are DIY mainly for two reasons:

  • the technology is pretty new
  • there's no big corpo push behind it to deliver profitable (in some way) consumer products

There WILL be consumer products (hopefully soon) on the market, but again, this is being done by volunteers and small startups with just a handful of people, it takes more time to get them on the market than it does for companies the size of Amazon or Google.


Awesome! I’m very much looking forward to it.

Hopefully there’s some more good mostly consumer-ready devices soon. The software side I’m happy to play with and write my own assistant logic. The gaps in the home assistant software I can code myself but the hardware stuff is outside my wheelhouse.

There also used to be an open source Alexa-like smart speaker called Mycroft AI. They were doing crowdfunding, I believe, but that didn’t go anywhere, so they eventually stopped working on it. You can still find their stuff on YouTube though: www.youtube.com/@MycroftAIForEveryone/videos
I have one! It was a really cool project! www.openvoiceos.org is the community fork carrying it forward

I need to play with HomeAssistant more. My last bit of hesitation was I was struggling to find a replacement for the announcement and intercom functionality, which is half of what my family uses Alexa for.

It looks like it got announcements with the “broadcast” intent in February; for the intercom, there may be a plugin. This seems like it might have me covered on the intercom front: github.com/JoeHogan/ha-intercom

Perhaps I’ll mess around with it again once the semester’s over; a lot of my family would really like to jump the Amazon ship and certainly be willing to try it if I give them the option.


We’ve got local LLMs, we’ve got local text-to-speech; isn’t it a matter of time until someone puts in the work to build one? It shouldn’t be surprising.

If you follow programming communities, the most popular thing beginners say they want to build these days is “local AI chat assistant” or some variant of the concept.

I’m extremely interested in the speech-to-text component, not necessarily for a little AI bot buddy but more so for archiving my own spoken word to augment my shitty memory
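For that kind of always-on archiving, a simple energy gate in front of the STT model saves a lot of compute: only the frames where you're actually talking get transcribed. A toy sketch in pure Python (frame size and threshold are made-up values; a real setup would use a proper VAD model):

```python
def speech_frames(samples: list[float], frame_len: int = 160,
                  threshold: float = 0.02) -> list[bool]:
    """Flag frames whose RMS energy exceeds a threshold.

    Frames marked True would be buffered and sent on to the STT
    model; everything else (silence) is dropped before transcription.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = (sum(x * x for x in frame) / frame_len) ** 0.5
        flags.append(rms > threshold)
    return flags

# Illustrative: one frame of silence, one of "speech", one of silence.
signal = [0.0] * 160 + [0.1] * 160 + [0.0] * 160
print(speech_frames(signal))  # [False, True, False]
```

A pure energy threshold will trip on any loud noise, which is why real pipelines layer a trained voice activity detector on top; but even this crude gate keeps the transcription archive from filling up with silence.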