When you use LLMs to code something, are you using diff to do line-by-line comparison of the changes the LLM gives you as you work on the code?
Yes, always: 26%
Sometimes: 8%
Almost never: 1.4%
I don't use LLMs for coding: 64.7%
Poll ended.
Edited the text to remove the confusing "vibe coding" wording.
@AlSweigart wow how did you edit this a second time without losing all the votes?
@glyph ¯\_(ツ)_/¯
@AlSweigart @glyph whoa, yeah, is Mastodon removing the bug/feature where editing a poll resets the votes? I know .social runs on the latest build, but I scanned the PRs quickly and didn't see a related change. Maybe I missed it.
@aburka Maybe it's because I didn't edit the poll options, just the poll question?

@AlSweigart I don't use them but by definition you don't look at the diffs when vibe coding

> I "Accept All" always, I don't read the diffs anymore
https://x.com/karpathy/status/1886192184808149383

Andrej Karpathy (@karpathy) on X

There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper

@AlSweigart I got hung up on the definition of "vibe code" and also what "using diff to compare changes" means.

If it's something to go into production, I always review diffs, but often I let the LLM do its thing for a while before reviewing the changes of its "final" product (it's rarely final but often most of the way there).

If it's throwaway code, there are times I've "vibe coded" in the sense of barely looking at the code (if at all). But that's veering into YOLO/danger territory.

@treyhunner @AlSweigart I was a little hung up on what diff meant. I would lump review of a "git diff" or pull request changes (diff view) into the same category unless you're really trying to gauge changes before committing.
@webology @treyhunner I edited the question. I mean diff as in line-by-line comparison like what git diff gives you.
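For anyone unsure what "line-by-line comparison like what git diff gives you" looks like in practice, here is a minimal sketch using Python's standard-library `difflib`. The file labels and the two code snippets are made up for illustration; they stand in for the code before and after an LLM's suggested change.

```python
import difflib


def line_diff(before: str, after: str) -> list[str]:
    """Return unified-diff lines comparing two versions of a source file."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before.py",  # hypothetical label for the original code
            tofile="after.py",     # hypothetical label for the LLM's version
            lineterm="",           # inputs have no trailing newlines
        )
    )


# Made-up example: the code as it was, and what the LLM handed back.
original = "def area(r):\n    return 3.14 * r * r\n"
suggested = "import math\n\ndef area(r):\n    return math.pi * r * r\n"

for line in line_diff(original, suggested):
    print(line)
```

Lines starting with `-` exist only in the original and lines starting with `+` exist only in the suggestion, which is the same convention `git diff` uses.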
@AlSweigart @webology @treyhunner I relish the nuanced ambiguities here. To me, it suggests how green this area is. And how much maturity is still required.

@webology @slott56 very green! Using a "coding agent" became a thing only ~12 months ago.

My confusion with the question is because CLI coding agents prompt for each change by showing a diff. The prompt can be disabled, but the diffs are still shown. Then there's the diff before committing, though the coding agent might be *doing* the commits on its own (depending on the workflow). And then there's the whole PR diff.

@AlSweigart, I understood your question as "do you review the code before running it".

@webology @slott56 @AlSweigart oh and to be clear: even that "do you review a diff before running" question is a bit ambiguous...

If an LLM makes unreviewed changes, the coding agent asks to run the tests, and the answer is "yes", then technically no diff was reviewed before the code was run in *some* context (just not production, and possibly not even on the machine of the human steering the coding agent).

@AlSweigart I'm not sure what you mean by "using diff" in this context.

Every change that an AI makes gets reviewed and a diff is taken in that process by basically every code review tool on the planet, but I don't independently generate a diff and look over that.

Is your question "are you following a specific workflow involving taking a diff," "are you reviewing what the AI does," "are you reviewing the entire codebase each time," or something else?

@AlSweigart if i am using an LLM to code something i will still usually type everything out manually and modify stuff i know is incorrect or inefficient or not in the correct style. to be clear, i never use LLMs integrated into an editor and only ever use an LLM as a last resort. even then, everything goes through me.

@AlSweigart

When I was using Cursor in the early days, I looked more carefully. As I've moved to a CLI tool and believe the AI is getting better, I'm barely glancing at the diffs as they flow past. I mainly look at the code only when specific problems appear.

@AlSweigart Generally no. I generally prefer the CLI experience to the IDE experience for most LLM work. And when that's the case, I'll review the changes separately in the IDE once the LLM has finished them.
I also prefer that they run formatters and make sure linters and type checkers are clean before I actually take a look.
@AlSweigart As for reviewing the LLM code - similar to my own work, I guess?
If it's a throwaway script that I'll hack and never edit I'll just make sure it doesn't do anything evil (weird dependencies or function calls, etc.).
If it's code I care about - I'll review and improve it.
If it's something I'm letting someone else read - I'll make damn sure it's not slop.
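As a rough illustration of the "make sure it doesn't do anything evil" check for a throwaway script, the sketch below lists what a Python script imports and which functions it calls, using the standard-library `ast` module. The helper name `summarize_script` and the sample script are hypothetical, and this is only a quick first pass, not a security audit: it will not catch dynamic imports or obfuscated calls.

```python
import ast


def summarize_script(source: str) -> tuple[set[str], set[str]]:
    """Collect imported module names and called function names from source."""
    imports: set[str] = set()
    calls: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports like "from . import x"
            imports.add(node.module or ".")
        elif isinstance(node, ast.Call):
            # ast.unparse (Python 3.9+) turns the callee back into source text
            calls.add(ast.unparse(node.func))
    return imports, calls


# Made-up script standing in for LLM output that deserves a second look.
script = "import os\nfrom subprocess import run\nrun(['ls'])\nos.system('echo hi')\n"
found_imports, found_calls = summarize_script(script)
print(sorted(found_imports))
print(sorted(found_calls))
```

The script is only parsed, never executed, so it is safe to run this on code you have not reviewed yet.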

@AlSweigart currently my workflow is to let it make a plan, then let it create a test, whose diff I review rigorously.
Then I let it implement the code and run all the tests.
Then I check the diff of the code, but not as rigorously as the test code diff.
The linting tools are integrated into the tests, and the LLM changes the code according to their text output.

Is it perfect? No. Can I do it perfectly? No. But it is a reliable and consistent process for creating testing tools and other little helpers.

@AlSweigart I see it as an engineer's responsibility to sign off on anything they commit/push/deploy, which extends to any tool they use to orchestrate these actions.

An engineer who prompts a language model to commit code is still committing the code. We can't launder liability through a markov chain 😀

@rezmason "We can't launder liability through a markov chain"

Sure we can. Not being able to hold a computer responsible for management decisions is AI's prime selling point.

@AlSweigart I wonder if management will still be legal in 2036 😝
@AlSweigart I use LLMs for coding very sparingly, almost exclusively through "I googled something and some AI summary appears to have gotten it right, so I'll try that". I think this counts but is hard to categorize
@lynndotpy Yeah, I mainly use LLMs as a "tip of the tongue" kind of Google. "What was that show that had the guy who wore that shirt?" kind of queries that keyword searches can't do.

@AlSweigart After I tested an llm with "assuming that pi is 4, square the circle," (math proof with a very incorrect premise), I was not impressed.
It blathered about how "your innovative new geometry" would revolutionize architecture and blah blah blah. At no point did it even attempt a proof.
So I feel like if I did try to vibe code something, I'd get... something. It would compile. It would also have no connection to what I asked for.

I'll do it myself, thank you.

@madengineering Yep. "The text generators generate text" is how I describe LLMs' "intelligence".