38 Followers
42 Following
552 Posts
Posts are my own and do not reflect the opinions or policies of any organization or other life forms
Homepage: https://www.cs.purdue.edu/homes/mcrees
Pronouns: they/them
Interests: C programming, systems integration, build automation

Today, thanks to a former PhD student, I learned that LLMs are terrible at generating Whitespace code: https://arxiv.org/pdf/2603.09678

Also today, I was invited (and I am accepting) to join a group on redesigning our undergraduate curriculum, which includes discussions on our choice of first programming language.

I am not saying that these things are related. I just happen to be mentioning them both at once because I saw them both today. You get a lot of characters in one toot after all.

What?

LLM code is the lead pipe of our generation of software engineers. They're going to be replacing it for decades and wondering how we were so foolish.
One of the ways that LLM-authored code improves productivity is by merely SAYING it does things. It's way faster than the whole time-consuming process of actually doing things. This is real code someone sent to me for review.

"It writes boilerplate faster."

Is that all we aspire to now? Not abstraction or templating or better language design, just... output boilerplate faster.

It's depressing.

Someone said something about ingesting libc and this is what popped into my head.
LLMs are expensive and impossible-to-secure malware vectors, part 4000: https://lwn.net/Articles/1061548/
A few years ago I designed a way to detect bit-flips in Firefox crash reports and last year we deployed an actual memory tester that runs on user machines after the browser crashes. Today I was looking at the data that comes out of these tests and now I'm 100% positive that the heuristic is sound and a lot of the crashes we see are from users with bad memory or similarly flaky hardware. Here are a few numbers to give you an idea of how large the problem is. 🧵 1/5

I’ve been thinking about this for days. Incredible stochastic algorithm, gets more reliable the larger your input, incredibly fast, trivial to implement and deterministic on its inputs. It really has so much going for it.

(Via @jonathankoren )

Writers: Generative AI models were built on our stolen works, are deeply unethical, and risk devaluing our entire profession.

Artists: Generative AI models were built on our stolen works, are deeply unethical, and risk devaluing our entire profession.

Developers: Wheeeeeeeeee!