The Local Alternative

#webcomic #krita #miniFantasyTheater

@davidrevoy Oh man... I feel this SO SO much...
@davidrevoy 🤣 Are you reading my mind? I JUST got a LocalAI container running and I'm trying to figure out how to make it significantly less slow.
@davidrevoy And still unable to be decoupled from the fact that it was created by The Cabal Of The Evilest Wizards You Can Imagine, No, Eviler Than That
About the only project that cares about divesting is this one: huggingface.co/jadael/comma-v0…
jadael/comma-v0.1-2t-GGUF · Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

@davidrevoy she should try an MoE (magic of elements) model, they run pretty nicely on consumer stones
@davidrevoy Is it an African or an European swallow?
@LappenjammerDieZweite @davidrevoy 11 m/s? Must be an American swallow that just stopped at a Burger King.
@davidrevoy 🦜 A...f...r...i...c...a...n... ...o...r... ...E...u...r...o...p...e...a...n...?
@davidrevoy the trick is to ask it to do something and then work on something else, while it is busy
@davidrevoy Also its cooling is running like crazy while it's almost overheating :D
@davidrevoy IE as a bird. 🐌
@susannelilith 🤣 Poor IE, long time dead, and still receiving nukes. (but deserved 🤭 ).
@davidrevoy https://github.com/ggml-org/llama.cpp check this with the vulkan backend 
GitHub - ggml-org/llama.cpp: LLM inference in C/C++

LLM inference in C/C++. Contribute to ggml-org/llama.cpp development by creating an account on GitHub.

GitHub
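For anyone wanting to try that suggestion, the build steps are roughly the following (a sketch based on llama.cpp's documented CMake options; the model path is a placeholder, and you'll need the Vulkan SDK installed):

```shell
# Clone and build llama.cpp with the Vulkan backend enabled
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# Run a local GGUF model, offloading as many layers as possible to the GPU
# (-ngl 99 = "all layers that fit"); model.gguf is a placeholder path.
./build/bin/llama-cli -m model.gguf -ngl 99 -p "Hello"
```

Offloading layers to the GPU via Vulkan is usually what turns "painfully slow" into "usable" on consumer hardware, though mileage varies a lot with model size and quantization.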
@davidrevoy I feel that, running fast-whisper and speech-to-phrase for HomeAssistant Voice locally. Superb Comic! 👌
@davidrevoy Tuboquant may fix that problem
Mom, @davidrevoy is being mean to AI bros again! Mooom!!!
@davidrevoy like my rpi5 with a 3B model 🤣🤣🤣
@davidrevoy 4060 Ti => biggest AI bang for the buck. And Apple seems to be mirroring its function+capability going forward. #GoodEnuf
@davidrevoy Great Comic as usual. It's fun to see the story progress every week. Can't wait to see next week's where Pepper still isn't done./sarcasm
@davidrevoy And that is exactly the big lie. The magic of brute-force token-jumping your way to a patch only works if you have gigawatts of Nvidia-filled data centers powered by methane jet engines. And nobody knows the actual cost of any of this. You try it locally and it just makes your M4 laptop, which normally churns out Blender Cycles renders in seconds, crazy hot to the touch, and it takes hours instead of seconds. It seems way less efficient than ... blockchain.
@jimmac True, my local tests here were mostly with https://flathub.org/en/apps/com.jeffser.Alpaca (the advantage being that it's easy to remove and leaves the computer clean after the test). I tried a few of the available models, but even on my workstation it was slow. Very educational to imagine how loudly some distant CPUs and GPUs must be screaming on the servers of the AI companies, and how they make it totally invisible to the end users who ask questions on their phones 'for fun', without even questioning the (hidden) cost of it...
Install Alpaca on Linux | Flathub

Chat with AI models

@davidrevoy @jimmac that's the huge point Ed Zitron has been making for some time now: if you expose the true cost of this product to the users and bill them for every stupid prompt, oopsie and hallucination, nobody will want to pay for it.
@davidrevoy @jimmac and it's so obvious, and anyone with a PC can test it locally like you did and see for themselves, and yet nobody cares. It's so frustrating.

@davidrevoy @jimmac Yep. I tried, just to see, whether running a smaller targeted model in something like Ollama would be any more interesting to use with Home Assistant than its built-in parser system (which I've extended with my own automations).

A _slight_ mis-hear of me setting a timer caused it to spew out completely useless garbage, after a significant delay.

Even with HA matches taking priority, it would still screw up interpretations, using the very thing LLMs are _supposed_ to be good at. Ask "is the fan on in the hallway?" and it'd say "I don't know about any fans in Home Assistant hallway" or something, while the fan was very on, and very much in the Hallway room (later, trying the correct HA syntax got exactly the right answer, VERY fast).

I got rid of it. I'd rather be able to ramble at my speaker and say "Pizza pasta put it in your mouth" and it reply with "I'm not aware of any area called 'your mouth'" in 2 seconds or "Sorry, I didn't understand that" if I'm even more unintelligible (or just 'off' with my command), and only have the STT and TTS overheads on my GPU, than have the dice roller fuck up repeatedly.

The only thing it was halfway decent at was me basically tossing it a JSON dump from the weather forecast command and going "Here, make a conversational thing about the next couple of days". It was actually pretty good at that, but not worth it. I rewrote that as my own template that states the exact temperatures and conditions for each of the next 3 days. My brain can track the repeated structure when hearing it.
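That deterministic rewrite is easy to sketch. A minimal example (the forecast structure below is a simplified stand-in, not Home Assistant's real weather schema):

```python
# Deterministic weather phrasing: no LLM, the same input always yields the
# same sentence. The dict shape here is invented for illustration.

def speak_forecast(days):
    """Turn a list of daily forecasts into a fixed, predictable sentence."""
    parts = []
    for day in days[:3]:  # only the next three days, as described above
        parts.append(f"{day['name']}: {day['condition']}, high {day['high']} degrees")
    return ". ".join(parts) + "."

forecast = [
    {"name": "Monday", "condition": "sunny", "high": 21},
    {"name": "Tuesday", "condition": "rainy", "high": 17},
    {"name": "Wednesday", "condition": "cloudy", "high": 19},
]
print(speak_forecast(forecast))
```

Because the output always has the same structure, it's also much easier to parse by ear than free-form LLM prose.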

@davidrevoy Kiwix + Hotspot = profit and speed
On any device, even the weakest one
@davidrevoy did she become so dependent on it that she'd go through all that trouble? Why would you want an alternative, rather than nothing at all?

@ariarhythmic I think she is conflicted about it: she experienced many new possibilities with it (calling spells, discussing with tree spirits), and even if it always backfired on her, she's interested in continuing to experiment. Her motivation (not really stated here) is to get an Avian Intelligence that can rebuild her house without having to pay a premium subscription for her AI Parrot.

You can check the previous episodes in order; I write them to shape a larger story: https://www.peppercarrot.com/en/webcomics/miniFantasyTheater__Avian-Intelligence.html

MiniFantasyTheater - Pepper&Carrot

Official homepage of Pepper&Carrot, a free(libre) and open-source webcomic about Pepper, a young witch and her cat, Carrot. They live in a fantasy universe of potions, magic, and creatures.

Pepper&Carrot

@ariarhythmic @davidrevoy
Well, there are two options for her to go further:
Option 1 is deliberate AI poisoning, i.e. making the AI responses so unbearable that everybody stops using them.
Option 2 is to build an Avian Intelligence lookalike, so you can claim yours is more intelligent, while in truth it's just good old deterministic behavior. 1+1 always equals 2; it never has a different opinion about the result.

The people around her pointed out that Option 1 is pointless to pursue, as nobody seems impressed by the dumbness of the Avian Intelligence. Hence Option 2 is the next logical attempt.

@mohs @ariarhythmic Interesting, I'll see where it goes. I'm particularly conflicted about next week's episode, which lands on April 1st; I have mixed feelings: should I do a sad thing (for change and contrast)? Or something unusual? Or another technique (e.g. ASCII art)? Many possibilities, and many opportunities to have bad taste and be excused for it too ^ ^

@davidrevoy

Monty Python ❤️

@davidrevoy She needs to discover ternary magic 🤔

@davidrevoy here it's not that slow...

... I use a model to filter job postings (as a fallback for when the regexes return unreliable results) which is sooo small that it's unreliable...

... well, every model is unreliable, so it's just more unreliable than usual.
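The regex-first setup described above might look something like this sketch (the keyword lists are invented, and the model call is stubbed out as a hypothetical fallback):

```python
import re

# Regex-first job filter: deterministic rules handle the clear cases, and
# only ambiguous postings would be handed to the small (unreliable) local
# model. Keywords below are placeholders for illustration.

KEEP = re.compile(r"\b(python|rust|linux)\b", re.IGNORECASE)
DROP = re.compile(r"\b(unpaid|crypto)\b", re.IGNORECASE)

def classify(posting):
    """Return 'keep', 'drop', or 'ask_model' for a job posting."""
    if DROP.search(posting):
        return "drop"
    if KEEP.search(posting):
        return "keep"
    return "ask_model"  # fallback: hand it to the small LLM

print(classify("Senior Python developer, remote"))
print(classify("Unpaid crypto internship"))
print(classify("Generalist role"))
```

Keeping the model as a last resort means its unreliability only touches the postings the regexes couldn't decide anyway.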

@davidrevoy Aaaw, a pigeon! Good choice, lady!

@davidrevoy Now, when it comes to actual LLMs and other hype-based "AI" products, I'm not that much more impressed with local ones than the big cloud guys.

But when it comes to *Avian* intelligences? I'm afraid you've done your job as an artist too well. Pigeon bot is wonderful. I love him.

@davidrevoy It's sad seeing her slowly decline into accepting Avian more and more.
@landelare Yes, but necessary for her dramatic curve (and the global story), and for her relationship with it and with the No AI Club she was once part of.
This episode was one of the most problematic for me, but I wanted to cover the 'slow when running locally' aspect to make her interested in walking to the data centers.

@davidrevoy This is so true. At least it taught me to keep backups of everything, because it's a matter of when, not if, it fails a task and cannot backtrack.

I was lucky to get a sufficiently powerful GPU, but even that is slow compared to hosted models.

Oh well, like her, I have an itch to scratch, and it's LLMs. ( ꩜ ᯅ ꩜;)