RE: https://lingo.lol/@thedansimonson/116156498254821672

this is also why LLMs are not "another layer of abstraction". do you need to read the asm output of your compiler every time it runs? no, and having to do so would substantially reduce the value of said compiler

@jcoglan @nicuveo yes, but…

A while ago, I investigated using the SDCC compiler to write code for the Game Boy Color.

I usually read and write Game Boy code in assembly (that’s how all the GB games were written back then). So I know how to do basic things, how to write an idiomatic array enumeration, a handful of clever (and not really mandatory) assembly tricks, etc.

That said, how nice would it be to write C – instead of spending hours on “how do I multiply those numbers”?

Well: it is awful.

@jcoglan @nicuveo for a start, the very-close-to-the-metal constructs don’t map well to the C language. The C SDK is full of “use this macro to do magic side effects”, some areas of memory become locked or unlocked at specific points…

But the main thing is that the generated code is awfully bloated – easily 10× the size of the hand-written assembly version.

@jcoglan @nicuveo All arguments are passed on the stack – no registers are ever used to pass them. The stack handling takes dozens of instructions before calling a function, and dozens again in the function epilogue. A trivial function turns into a bloated mess.

Plus, simply doing “complex” maths (e.g. “x = 64 / 10”) generates a ton of instructions to divide by a non-power-of-two value. When writing assembly, that cost is visible, and you work around it. Not in C.

@jcoglan @nicuveo and all of this bloat matters!

This isn’t even about hot loops; it’s about general bloat. The Game Boy has a very limited number of cycles available each frame – room for around 10 000 instructions. That’s it. So bloating the code by a factor of 10 for no reason makes you hit the frame budget _very_ quickly.

@jcoglan @nicuveo So, from the point of view of an experienced assembly developer on the Game Boy, C is an absolute waste. Bloated, slow, with leaky abstractions.

Maybe useful in a handful of cases though – like a one-shot large function, outside any critical path, with a lot of maths. Someone used it to generate a maze. But of course the actual gameplay code is assembly.

@jcoglan @nicuveo but besides that…

The Game Boy C SDK exists. It is actively used. Homebrew games have been written with it.

Heck, even my own first experience with C was writing a Pokemon-themed ROM for the Game Boy, with scrolling and basic physics. I didn’t know what a pointer was, but still managed to produce something.

@jcoglan @nicuveo the point is:

When we’re used to a paradigm, and know it well, we quickly see the limitations of a new approach. We know immediately what wouldn’t work, what we would lose, what is slow and bloated.

And sometimes those inefficiencies don’t matter that much.

We can’t (seriously) program the Game Boy in C. But for the next generation of machines, it was the way to go. Over time the generated code got a bit less bloated – but mostly it didn’t matter: the CPUs became fast enough.

@jcoglan @nicuveo Now to be clear: fuck GenAI, and fuck the big corps that digest our own work and resell it to us.

I just feel that the argument of “AI-generated code is a mess!” may be weak.

Sure it is – but maybe it will get slightly better; for some use cases it will matter less; and people will use it for non-critical, prototyping, or throw-away work, and it will have some value.

@jcoglan @nicuveo
But we can still focus on the stronger arguments instead: the horrible ethics of it, the environmental cost, the bubble about to pop, etc. etc. :)
@pmorinerie @jcoglan
so, two things on this. On one hand: i agree! arguing against genAI / LLMs in terms of efficiency / productivity is arguing *on their terms*. it's implicitly accepting that those tools are okay to use, despite the fact that even if they were good there would still be a mountain of objections.
on the other, i think you've missed the original point that OP was making. :D
i don't read the original point as "LLMs are a bad layer of abstraction because they generate bad / bloated code"; i read it as "by their very nature, they are *random* processes: you cannot treat them as just another layer of abstraction, because *every output has to be reviewed*". no matter how bad a compiler is, it is predictable: sure, it will generate inefficient ASM, but it will always generate ASM the same way, while an LLM *by design* cannot.
@nicuveo @jcoglan eh, indeed :) The conflict between stochastic GenAIs interfacing with deterministic programming (and operating on data-as-facts) is definitely there. A non-deterministic compiler isn’t very useful.

@pmorinerie @nicuveo @jcoglan Many subtle things can make compilers non-deterministic in practice. Randomness is a red herring; the real advantage of compilers is the guarantees they give about run-time semantics.

I see many mocking LLMs as stochastic parrots, but you know what else is non-deterministic? Humans :)