Mastodawn

RE: https://hachyderm.io/@mnl/116715814203059670

one day someone skeptical of #llm #llms will take me up on this. my timeline still feels like i'm living in an alternate universe.

john fink ok!! :goat: 2h ago

@mnl a great many anti-AI people are *very heavily invested* in 2023-2024 modes of thinking where hallucinations/"bullshit" et al. were more of a problem, and I think are reluctant to have their view on this challenged.

mnl mnl mnl mnl mnl 2h ago

@adr which is ironic... and it was ofc still possible to put models to good use back then.

john fink ok!! :goat: 2h ago

@mnl of course, but in my line of work (librarianship) there was a time when inference-only chatbots -- i.e., no RAG, no web search, whatev -- were spewing out false bibliographic references to people, and librarians sort of seized on that as proof of LLMs being useless and many have not progressed from that critique.

mnl mnl mnl mnl mnl 2h ago

@adr ngl i lowkey miss that era because things were a bit more ... fun? for lack of a better word. now models pretty much do exactly what i expect them to do with code at the inference level, and I have a really hard time getting gpt-5.5 not to do things _better_ than i am able to even imagine them. Which is kind of wild seeing that the hallucination era was... 2 years ago?

john fink ok!! :goat: 2h ago

@mnl Oh, I love the mess! I think of early LLM days like Janelle Shane / @janellecshane 's experiments as an idyllic, happy time, when these things were dumb and weird and we weren't worried about them taking our jobs so much.

mnl mnl mnl mnl mnl 2h ago

@adr I signed that "stop model development" petition back then for exactly that reason, and got blasted here for being a tech bro AI pusher (!?) lol. Look where we are at now 😭 If we were still tinkering with gpt-3.5, things I think would be quite different, and I also think the industry at large wouldn't have dived headfirst into that tokenmaxxing agentic craze.

mnl mnl mnl mnl mnl 2h ago

@adr like i don't even want people to use #llm , I have no skin in the game. I'm heavily critical of many uses of #llms, but there can't be a real discourse if there can't even be agreement on simple facts, facts which heavily influence say, the labor aspects of ai criticism. or its effects on software engineering.

john fink ok!! :goat: 2h ago

@mnl I *literally* only use them on a local setup, and my local setup is *trash* -- 32GB unified RAM and the worst AMD iGPU ever -- and even at the 9b-12b-30b w/ MoE level I max out at, there is real utility. I mean there's a lot of *waiting*, but there's utility. And god yeah, I wish people would talk more about labour aspects. I mean they do, of course, but... more.

mnl mnl mnl mnl mnl 2h ago

@adr yeah, even small models are wild, i can definitely do gpt-4 era work on my i7 laptop. there's still a gap that has huge impact in practical use between glm-5.1 (haven't tried out the newest batch with M3 and co) and gpt-5.5, but darn it's close (the practical gap being that I can let gpt-5.5 rip and go for a walk and at worst expect things to be unfinished, vs glm-5.1 getting lost in the weeds and nuking shit when it does)

labor implications for sure, there's no value in handcrafted code. and if the machine _actually_ does a better job for most programming tasks that engineers get paid for, refusing to use that technology just makes you ... a bad engineer. Which is ofc genuinely scary, but IMO it just means we need to seriously sit down and discover what software engineering looks like now that computers can code/architect/maintain.

Daniel Demmel 1h ago

@mnl @adr the gap is incredible and does a huge disservice to debates, especially with comrades here on Masto...

I just tried Claude Fable in Claude Code (the little I could on a company subscription) and had involuntary giggles of delight after throwing a lazily ambiguous and very ambitious task at it and watching it come up with plans that took what I asked and extrapolated my intentions into some really nice solutions.

Meanwhile, I also recently managed to talk my boss into grabbing a refurbished 128GB M3 Max MBP for £3K and it runs MiniMax 2.7 at Q3_K_S with RAM left for actual computer usage and that model shows up in a tough agentic coding benchmark: https://cognition.ai/blog/frontier-code

It runs at a pretty incredible 25 tokens / sec output (only 11B active parameters) and can do so much of the stuff only the biggest frontier models could not even a year ago...

mnl mnl mnl mnl mnl 1h ago

@dain @adr it's really impressive, and I think the real bubble burst will be when a good enough for all intents and purposes model can run on a local notebook, or a mom and pop inference hoster (of which there are already so many!). I already can't really care about opus 4.8, fable, and certainly wouldn't if i had to pay anything close to api costs on them.

and in the meantime opensource communities (i care very much about opensource and people being able to use their computers for their own needs) are putting up walls as a way to... cement the status quo??

Daniel Demmel 1h ago

@mnl @adr I'm totally with you, would never (even ask my company to) pay for the full API token costs, it's more just amusing myself with how much I can push my ambitions on a £90 subscription.

A 3 grand laptop is of course expensive, but it's also not completely outrageous as a business expense you can write off, and MiniMax is already there, even for most of the agentic work a less techy person or business would need.

Yeah, the situation in open source is a bit sad, but in a way I think it's also good that some projects stay "pure" and we finally need to have very explicit debates about governance.