"where needed"
"WHERE NEEDED"????
For which part of this process do you NEED AI help?
*screams into screaming pillow*
@nina_kali_nina
...the sad thing is, a tool that could look at that binary and point you at the repo of the port would have been a genuinely impressive advancement for search
The technology they're building could actually be useful if they stopped throwing money in the furnace and were more conservative about applying it. Though of course you likely don't need ML for that anyway, and I'm pretty sure the mass buildout is using AI as an excuse for massive resource and land grabs that they already wanted to do sooo
@nina_kali_nina @madengineering @0xabad1dea I’ve seen this myself as well. Reviewed a draft PR for some enhancement for our internal compiler generated by Claude. It’s weird and too complicated and my spidey sense is tingling. I go look for the same module name in the rust compiler. It’s basically the same structure, but simplified.
And I think that’s generally what people are seeing when they see an LLM being “creative”. They’re just unable to comprehend how vast a pool of human work it’s cribbing from.
@0xabad1dea @nina_kali_nina @bytex64 I've improved since college.
So Claude, consider this a challenge. I've upped my game, now up yours.
@0xabad1dea but the blog post announcing the CCC quite literally says that the agents made the code base unmaintainable and cannot fix any more bugs without introducing new ones. So, that's a fail too.
And then looking at it from a practical perspective: if I want a C compiler, I can get one for free, and I have multiple options: clang, gcc, pcc, tcc, chibicc, and probably many more. If for some reason I want to add support for a new platform in them, I can, too. It's been done too many times to count. Why would I want to spend merely 20 grand on building a thing that is, by all sensible benchmarks, at best a toy?
I have an answer, and I don't like it. If I wanted to undermine labour, if I wanted to destroy FOSS, if I wanted to steal human work and resell it, that would've been exactly what I'd do. And I have yet to be shown that there are other real motivations behind such projects.
2/2
@nina_kali_nina @0xabad1dea I have another answer, but it is not satisfying.
They really have no idea what they are doing and they are so privileged they never had to really face reality.
I wonder how it does with the tests that gcc passes but the agent didn't have access to.
@da77a9
I - don't think that's necessarily accurate. (Context: I'm a C programmer and compiler hacker, and not at all a fan of these things.)
But I think that's downplaying the fact that creating a Rust implementation of any given feature is harder than doing the same in C, since it forces you to pay attention to e.g. lifetimes.
Any competent C programmer would be doing that anyway even if the compiler doesn't force them to, but - there's arguments both ways here tbh
"Rustc made sure it can't!" is... not really accurate.
If you write C code with some kinds of bad memory management, it'll compile but segfault.
If you write the equivalent Rust, it won't compile.
So necessarily, in order to compile at all, it must be doing the same memory management that it'd need to do in C. The compiler checking its work probably makes it easier, because it's constant small feedback, which is more likely in the training set, but with unit tests available anyway, the increased complexity of writing Rust over writing C is possibly a wash.
@da77a9 @0xabad1dea @pixx I would guess that the breadcrumbs given by rustc's error messages are more conducive to the LLM landing on working code than runtime crashes using C. Also what you or I find hard may not matter to an LLM that doesn't "understand" like a human does.
I guess my point was that this "from-scratch implementation" is a fiction: an LLM could not have invented the first C compiler any more than it could have invented rustc. It's spitting out a remixed version of code and techniques invented by many, many, extremely ingenious humans. At a cursory glance it has wow factor, but ultimately all LLM vendors are greedy ingrates trying to extract value from the work of others 
@goatcheese @0xabad1dea @pixx
Oh absolutely it can't invent. But
1) rust compile errors prevent some versions of Frankenstein's compiler from even lurching off the table, so Frankenstein has to try again with different body parts (feedback loop)
2) a chain of borrowed fragments of Rust that pass that fitness test and fit together probabilistically, drawn from a sample of Rust code that works (as well as compiles), is more likely to stay on the rails than the same in C.
It is interesting that "correct, but a memory hog" (no ownership conflicts, but lifetime management issues?) is evident from comparing gcc vs ccc execution.
I'm not suggesting any sort of magical properties from rust - just that it removes some degrees of freedom.
@0xabad1dea It’s so diabolically bad I don’t know how you do it. We’re not talking about gcc -O3 here, which does some truly herculean things, we’re talking about GCC with basically every optimization disabled. I don’t understand how the generated code could fail to run within a small constant factor of gcc here, you just have to spit out the dumbest possible assembly for a given input source.
You just know there’s some absolutely horrific workarounds going on in here, because it’s apocalyptically bad in utterly incomprehensible ways.
@regehr @0xabad1dea i know! there’s presumably a whole Source -> AST -> SSA -> Multiple optimization passes -> Assembly pipeline going on here! what on earth is it even doing in there that the output is this embarrassingly bad?!
The output would be quite frankly embarrassing for a single-pass source -> assembly/machine code translator (which you can do for a half-reasonable subset of C in 2kB of C code, see e.g. OTCC), but there’s an entire optimization pipeline in there?!
mov big_offset(%rbp), %reg and back are probably huge and giving the instruction decoder indigestion. And we're talking about the kind of things that tcc can compile. TCC was originally an entry in the International Obfuscated C Code Contest, as a C compiler that fit on one screen and could compile itself (the back-end bit lives on in QEMU as the Tiny Code Generator, which QEMU uses for JITing small fragments of emulated code).
The full version is bigger, but still very small. And it can compile SQLite.
It's pretty naïve. It doesn't do anything more than peephole optimisation. In the worst case, performance is usually around 25% of GCC's (occasionally worse for vectorised hot loops); for some things it's closer to 90%.
TCC is not designed for generating fast code; it was designed to be simple and to generate code quickly. (There was a demo about 20 years ago with tcc embedded in GRUB, compiling the Linux kernel and then booting it. It took 30s to compile the kernel in an x86 emulator on a 1.25GHz PowerPC host.) So if you're generating slower code than TCC, that's really embarrassing.