"lessons learned"
To be clear: this isn't an AI problem, the LLM is doing exactly what it's being told to do.
This is an OpenClaw problem, with the platform itself doing very, very stupid things with the LLM lol
We're hitting the point now where, tbh, LLMs on their own in a glass box feel pretty solid performance-wise. They're still prone to hallucinating, but the addition of the Model Context Protocol for tooling makes them way less prone to it, because they now have the tooling to sanity-check themselves automatically, and/or check first and then tell you what they found.
E.g. an MCP server that searches Wikipedia and reports back with “I found this wiki article on your topic” or whatever.
The new problem now is platforms that “wrap” LLMs having a “garbage in, garbage out” problem: they inject their “bespoke” stuff into the LLM context to “help”, but it actually makes the LLM act stupider.
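A rough sketch of what that kind of tool call looks like under the hood, assuming the JSON-RPC style `tools/call` request shape MCP uses. The `search_wikipedia` function, the in-memory `FAKE_INDEX`, and `handle_request` are all made up for illustration; a real server would actually query Wikipedia and would normally be built on an MCP SDK instead of hand-rolled like this.

```python
import json

# Hypothetical stand-in for a real Wikipedia lookup (no network access here).
FAKE_INDEX = {"model context protocol": "Model Context Protocol"}


def search_wikipedia(query: str) -> str:
    """Return a short report the LLM can relay to the user."""
    title = FAKE_INDEX.get(query.lower())
    if title is None:
        return f"No wiki article found for '{query}'."
    return f"I found this wiki article on your topic: {title}"


# Registry of tools this toy server exposes (names are assumptions).
TOOLS = {"search_wikipedia": search_wikipedia}


def handle_request(raw: str) -> str:
    """Handle one JSON-RPC 2.0 request string and return the response string."""
    req = json.loads(raw)
    base = {"jsonrpc": "2.0", "id": req.get("id")}

    if req.get("method") != "tools/call":
        # Standard JSON-RPC "method not found" error code.
        return json.dumps({**base, "error": {"code": -32601, "message": "Method not found"}})

    params = req.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        # Standard JSON-RPC "invalid params" error code.
        return json.dumps({**base, "error": {"code": -32602, "message": "Unknown tool"}})

    # Call the tool with the arguments the model supplied.
    text = tool(**params.get("arguments", {}))
    result = {"content": [{"type": "text", "text": text}], "isError": False}
    return json.dumps({**base, "result": result})
```

The point is that the model gets a grounded text result back that it can check its answer against, instead of answering from memory alone.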
Random example: GitHub Copilot agents get a “tokens used” notice quietly injected into their context periodically, looks like every ~25k tokens or so.
I dunno what wording they used, but it makes the LLM start hallucinating a “deadline” or “time constraint”, then taking shortcuts and justifying them with stuff like “given time constraints I won't do this job right”.
It's kinda weird how such random stuff that seems innocuous and tries to help can actually make the LLM worse instead of better.
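To make the injection pattern concrete, here's a minimal sketch of how a wrapper could do it. This is not Copilot's actual code: the `ContextWrapper` class, the notice wording, and the 25,000 default are all assumptions for illustration, since the real wording isn't known.

```python
class ContextWrapper:
    """Toy platform wrapper that appends its own notices to the LLM context."""

    def __init__(self, inject_every: int = 25_000) -> None:
        self.inject_every = inject_every    # assumed interval, in tokens
        self.tokens_used = 0
        self.next_notice_at = inject_every  # threshold for the next notice
        self.context: list[str] = []        # what the model actually sees

    def add_turn(self, text: str, token_count: int) -> None:
        """Record a turn, then inject a usage notice if a threshold was crossed."""
        self.context.append(text)
        self.tokens_used += token_count

        if self.tokens_used >= self.next_notice_at:
            # Hypothetical wording. The model never asked for this line,
            # but it will still read it and may treat it as a budget or deadline.
            self.context.append(f"[system] tokens used so far: {self.tokens_used}")
            # Move the threshold past the current usage so only one notice fires.
            while self.next_notice_at <= self.tokens_used:
                self.next_notice_at += self.inject_every
```

From the model's side that extra line is indistinguishable from an instruction that matters, which is why a harmless-looking counter can turn into “given time constraints” behavior.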
I don’t think we’ve overcome the halfglass of wine issue; rather, we’ve papier-mâchéd over some fundamental flaws in precisely what is happening when an LLM creates the appearance of reason. In doing so we’re baking a certain amount of sawdust into the cake. Given that no substantive advances have really been made since maybe the 4, 4.5 days, with most of the “improvements” being seen coming from basically better engineering, it’s clear we’ve hit an asymptote with what these models are, or will be, capable of, and it will never manifest into a full reasoning system that can self-correct.
There is no amount of engineering sandblasting that can overcome issues which are fundamental to the model’s structure. If the rot is in the bones, it’s in the bones.
Nah, there have been huge advancements in the past few months; you are definitely out of touch if you haven't witnessed them.
Recent models have gotten WAY better at “second guessing” themselves, and not acting nearly so confidently wrong.
I don’t think we’ve overcome the halfglass of wine issue
That isn't an LLM issue at all; in fact, it has nothing to do with LLMs. That's a problem with Stable Diffusion, which is an entirely different kind of AI, but yeah, that issue is fundamental to what Stable Diffusion is.
with most of the “improvements” being seen coming from basically better engineering
I mean, that's not much different from any other tech. A LOT of the advanced tech we have today is dozens and dozens of separate bits of engineering all working in tandem to create something more meaningful.
Your smartphone has countless different and distinct advancements on different types of technology that come together to make a useful device, and if you removed any one of those pieces from it, it would be substantially less useful as a tool.
So yeah, I personally will very much count the other pieces of the puzzle advancing as the system as a whole advancing.
LLMs today are a lot better than the ones from a year ago, and the tooling around them has also improved a lot. The proliferation of Model Context Protocol tools is proving to be a massive part of the system as a whole becoming something actually very useful.
Perhaps you didn’t notice the forum you’re posting in. We’re not here because we love hearing slopaganda.
Personally I believe MCP is the new AMP, and I look forward to dancing on its grave.
Personally I believe MCP is the new AMP, and I look forward to dancing on its grave.
Care to elaborate? MCP is a fairly basic concept and just a specific type of a web server, so it's not exactly going away anytime soon, since you are literally posting on a forum right now that uses the same tech, lol
Sorry, are you talking about MCP, or AP? I don’t know why any usage of PieFed (what I’m using) or Lemmy would require MCP.
MCP as a way to make agents appear smart is a smoke screen. We already have APIs to enable different online applications to talk to each other, it’s called REST, or Hypermedia if you want to get real fancy. We don’t need yet another layer on top that obscures web properties and places them behind chatbots benefiting Big Tech megacorps and nobody else.
MCP is a fairly basic concept and just a specific type of a web server,
What part of that did you not understand?
We don’t need yet another layer on top that obscures web properties and places them behind chatbots benefiting Big Tech megacorps and nobody else.
If you think MCP servers benefit “Big Tech megacorps and nobody else”, then all I can conclude is that you are far enough behind technically that you don't even know how to use Docker, and therefore your argument is coming from a place of naivety.
MCP servers are incredibly simple and easy to self-host, and a few self-hostable models are now competent at invoking them.
Tonnes of FOSS self-hostable software supports wiring them up as well.
Which means anyone can leverage MCP servers to enable LLMs to do whatever they want.
I would compare it to advancements in stuff like Zigbee for IoT devices: it's a simple, lightweight spec that's small enough you can even put it on an ESP32 with ease.
And if you don't see how there’s a lot of power in that for private self-hosted users, then you aren't using your imagination enough.
Your attitude towards me and other people in this thread is incredibly distasteful. I know exactly what Docker is. I also know that MCP servers are irrelevant unless we’re talking about LLM agents, a technology funded by Big Tech which is dangerous & destructive (hence the forum you are currently posting in).
This conversation is now over. 👋
If you know how to use Docker and claim that agents are only funded by large corps, then you must really be living under a rock and/or don't know how to Google.
There are tonnes of grassroots agentic FOSS platforms available, and self-hostable models that people all over the world have built to run on them.
You're either extremely out of touch or purposefully spreading disinformation if you think MCP-backed agentic options are limited only to “big tech corporations”.
Go… Google it? I dunno, there are tonnes of options out there now. You are talking like it's still 2024; shit has moved way past that now…