my stupid llm research is absofuckinglutely not going the way i was hoping.

ive spent like a fucking week trying to set up a testing harness to get local models to do the same test 100 times, aperture science style, to test the drift of their results

but 100% of the time, the model does at least one of these:
- emits tool calls incorrectly, so i see them
- ignores instructions
- falls into a loop
- says its gonna do stuff, then .. just doesnt
- intentionally deviates from instructions even when explicitly told not to
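
for reference, the harness idea itself is simple. a minimal sketch, assuming a hypothetical `run_model(prompt)` callable that wraps whatever local model is under test (a stand-in, not any real library's api). the two checks for leaked tool calls and loops are crude made-up heuristics, not a real detector:

```python
import re
from collections import Counter
from typing import Callable


def looks_like_leaked_tool_call(text: str) -> bool:
    # Assumed heuristic: raw tool-call markup showing up in the visible output.
    return bool(re.search(r'<tool_call>|"tool_calls"\s*:', text))


def looks_like_loop(text: str, min_repeats: int = 3) -> bool:
    # Assumed heuristic: the same non-empty line appears min_repeats+ times.
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        return False
    return max(Counter(lines).values()) >= min_repeats


def run_trials(run_model: Callable[[str], str], prompt: str, n: int = 100) -> dict:
    """Run the same prompt n times and summarize how much the output drifts."""
    outputs = Counter()
    flags = Counter()
    for _ in range(n):
        text = run_model(prompt)  # hypothetical wrapper around the local model
        outputs[text.strip()] += 1
        if looks_like_leaked_tool_call(text):
            flags["leaked_tool_call"] += 1
        if looks_like_loop(text):
            flags["loop"] += 1
    top = outputs.most_common(1)
    top_count = top[0][1] if top else 0
    return {
        "trials": n,
        "distinct_outputs": len(outputs),
        # share of runs that produced the single most common output
        "agreement": top_count / n if n else 0.0,
        "flags": dict(flags),
    }
```

`agreement` near 1.0 means the model keeps giving the same answer. lots of distinct outputs or a pile of flags means it is drifting or falling over.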

this is nuts. keeping these things on track is a sisyphean task of pure crystalline futility.

how people think they're gonna lose their jobs to this fucking crap is nuts. the only way to make it work is to put it in trillion dollar datacenters, so that the 1% of output that isnt absolute trash actually makes it out of the system.

@Viss the output doesn't have to be good for people to lose their jobs
@twipped but it has to 'at least work'
@Viss do you really think the executives pushing this shit have actually used it enough to see if it works? Do you think they understand development enough to even know if it actually works?
@twipped it'll be a delayed response. of course they dont, but they'll see their quarterly figures start plummeting, and their staff will start quitting, and their costs will suddenly skyrocket and it wont take long before they finally have to admit its not working
@Viss @twipped "AI makes you faster"

Please say that again after my coworker decided to "rewrite" the data structure using gemini and handed me a mess that has me recursing 5 times because of the inefficient way they built shit. I dare you

@mander @twipped llms seem to be like a hammer...

- good for a very narrow number of things
- using it for something other than its narrow purpose gets ugly and causes big problems really fucking fast.
- heavy, can be a pain in the ass
- will absolutely do damage if you miss
- RIP your thumbs
- hit a nail wrong, it shoots into your eye
- you can just murder people with a hammer if you want

@Viss @twipped hey, now you are being very mean to hammers
@mander @twipped if you weld a flathead screwdriver to a hammer, then a level, then a stud detector, then a 10mm ratchet.. suddenly it stops being a very effective hammer. and the other shit you welded to it wont be very effective either