Mastodawn

We used to have working spelling and grammar checkers. Why does everybody in tech pretend you need a whole-ass LLM to check for typos?

Andrew Radev Feb 20

@baldur A friend told me he uses it to do tasks that are easy to validate, like renaming variables. I was like... did you previously not have ways to rename variables? Is this not something that you've done a million times before?

He mumbled some excuses about how he couldn't get some LSP server to work or something. This man has close to 20 years of professional experience, and chatbots have absolutely ruined his brain.

Andrew France

@AndrewRadev I have a bit more sympathy for this type of "instructional" use case. Depending on your language and editor, it can be a right pain to get tooling to work well. I still struggle with Neovim config and getting things working!

I wanted to move an Elixir module yesterday and thought I'd try it with Gemini to save me some drudge work. It worked, found references I probably would have missed on first go. Though not sure it saved me any time in the end with all the "thinking" required.

As much as I dislike these things, it would be non-factual to deny they have some utility.

Andrew Radev Feb 20

@Odaeus Of course they have some utility compared to doing nothing. What is their utility compared to other things? What is the utility of a non-deterministic text generator compared to a deterministic compiler or LSP server? What is the *cost* of doing this if it wasn't subsidized by astronomical losses by these companies?

Andrew France Feb 20

@AndrewRadev I was comparing its utility to the work involved in learning or choosing an editor/IDE and having it perform the same task... self-evidently not comparing it against doing "nothing" or a compiler..! From the perspective of a user/developer, there is benefit there. And sadly, it is reliable for many such tasks.

I do not need to be lectured about the numerous clear externalised costs.

Andrew Radev Feb 20

@Odaeus You mention moving a module in Elixir. I don't have professional experience with Elixir. In my last job, I wrote React with TypeScript.

I always had `tsc --watch` running in a terminal window. When I needed to move some code, refactor components, etc., I would make the change, then follow the tsc errors one by one. This was a fairly straightforward and 100% reliable process. tsc had its issues, but it was deterministic.
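To make "follow the tsc errors one by one" concrete, here is a minimal sketch of turning compiler output into an ordered worklist. It assumes the non-pretty diagnostic format that tsc prints with `--pretty false`, i.e. `file(line,col): error TSxxxx: message`. The names `Diagnostic`, `parseDiagnostics` and `toWorklist` are hypothetical, made up for illustration, and are not part of tsc or any library.

```typescript
// One diagnostic as printed by tsc with pretty output disabled, e.g.
//   src/App.tsx(3,21): error TS2307: Cannot find module './Button' ...
// (Hypothetical type, for illustration only.)
interface Diagnostic {
  file: string;
  line: number;
  column: number;
  code: string;
  message: string;
}

// Matches "<file>(<line>,<col>): error TS<code>: <message>".
const DIAGNOSTIC_PATTERN = /^(.+)\((\d+),(\d+)\): error (TS\d+): (.*)$/;

// Parse raw compiler output into diagnostics. Lines that do not match the
// pattern (summary lines, blank lines) are skipped.
function parseDiagnostics(output: string): Diagnostic[] {
  const result: Diagnostic[] = [];
  for (const rawLine of output.split("\n")) {
    const match = DIAGNOSTIC_PATTERN.exec(rawLine.trim());
    if (match === null) continue;
    const [, file = "", line = "0", column = "0", code = "", message = ""] = match;
    result.push({ file, line: Number(line), column: Number(column), code, message });
  }
  return result;
}

// Sort by file, then line, then column, so the same output always yields
// the same worklist order.
function toWorklist(diagnostics: Diagnostic[]): Diagnostic[] {
  return diagnostics.slice().sort((a, b) => {
    if (a.file !== b.file) return a.file < b.file ? -1 : 1;
    return a.line - b.line || a.column - b.column;
  });
}
```

The point of the sketch is only that the same compiler output always produces the same list: once it is empty, every reference the type checker can see has been fixed.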

This is why I compare an LLM "move this module" task to a compiler. I guess maybe you can ask the LLM to move the module and also run the compiler? But in either case, it's the compiler that makes sure you won't miss anything. Maybe it'll fail and you'll have to spend the same time fixing issues as you would have doing things the direct way. Maybe it will succeed, but it'll make unrelated changes that compile, but introduce issues. I don't see how you could ever know what you're going to get and why this would be a desirable workflow for a professional software developer. I would always prefer a consistent, reliable workflow, rather than a roll of the dice.

In the METR study, developers estimated that using AI made them 20% faster, when they were actually 19% slower. Their estimates were completely off, because it's impossible to reliably know whether you would have, in fact, missed that one reference. I agree that there is *perceived* utility compared to the alternative. I agree there *might* be real utility *sometimes*. I don't believe that most developers have actually designed experiments where they've measured whether it's beneficial for them on average, or not.