Mastodawn

Marcin Krzyzanowski Jun 18, 2025

never stop being funny. this is after 3rd "are you sure"

Marcin Krzyzanowski Jun 18, 2025

if you trust LLM in doing things, you don't use it enough

Marcin Krzyzanowski Jun 18, 2025

clearly I'm tired of its stupidity

Marcin Krzyzanowski Jun 20, 2025

this machine will do anything to make things worse. and then refuse to understand it.

fair enough it was trained like that. Internet is full of garbage.

Marcin Krzyzanowski Jun 20, 2025

I will never delete this thread

Marcin Krzyzanowski Jun 21, 2025

forget prompting engineering. "rejecting" and "questioning" engineering is more important in LLM coding

Marcin Krzyzanowski

I'm venting or prompting. sometimes there's no difference

Marcin Krzyzanowski Jun 23, 2025

> Perfect! Now I can see the issue clearly.

no. you don't

Marcin Krzyzanowski Jun 23, 2025

wdym you missed. am I the Artificial Intelligence, OR YOU?

Marcin Krzyzanowski Jun 29, 2025

here we go again

Marcin Krzyzanowski Jun 29, 2025

babysitting coding assistant is full time job

Marcin Krzyzanowski Jun 29, 2025

let the coding assistant plan and fix a Swift concurrency warning by making a class Sendable, and see how it dissolves into chaos. line by line. one MainActor at a time.

it has no clue what to do.

Marcin Krzyzanowski Jun 29, 2025

thanks for nothing I guess

Marcin Krzyzanowski Jun 29, 2025

Claude Max unlimited with limits

Marcin Krzyzanowski Jul 5, 2025

Asked coding assistants to implement token bucket throttler. Here's what happened:

Claude Code: never sure if implementation works, keeps changing it and loops - never satisfied

Amp: liked Claude's result but improved it, stopped the looping

Result: Implementation still doesn't work. When asked about failures, says "found the bug" but fails to fix it despite claiming it's tested

Don't think it can create a working throttler
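
For reference, the core of a token bucket throttler is small. The sketch below is not the code from the sessions described above (the thread does not show it, or say which language was used); it is a minimal, single-threaded illustration in Python. The names `TokenBucket`, `try_acquire`, and the injectable `clock` parameter are assumptions chosen for this example.

```python
import time
from typing import Callable


class TokenBucket:
    """Holds up to `capacity` tokens, refilled at `rate` tokens per second."""

    def __init__(self, capacity: float, rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if capacity <= 0 or rate <= 0:
            raise ValueError("capacity and rate must be positive")
        self.capacity = capacity
        self.rate = rate
        self._clock = clock          # injectable so tests can control time
        self._tokens = capacity      # the bucket starts full
        self._last = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last call, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)

    def try_acquire(self, cost: float = 1.0) -> bool:
        # Non-blocking: spend `cost` tokens if available, otherwise refuse.
        self._refill()
        if cost <= self._tokens:
            self._tokens -= cost
            return True
        return False
```

Passing a fake clock makes the behaviour checkable without sleeping, which is one way to avoid the "claims it's tested" loop. This sketch is not thread-safe; concurrent callers would need a lock around `try_acquire`.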

Marcin Krzyzanowski Jul 5, 2025

hence my question is: how did it pass the leetcode?

Marcin Krzyzanowski Jul 5, 2025

don't use it to cheat the leetcode coding interview I guess

Marcin Krzyzanowski Nov 25

My objective Opus 4.5 model review

Marcin Krzyzanowski Dec 3

from the backstage: the fix indeed did not fix anything, and that was already the 4th statement that this time this is the fix - each time not fixing a different thing.

Marcin Krzyzanowski Dec 3

LOL. First it implemented a delegate, then offered alternatives. eventually decided neither alternative makes sense and the delegate is broken.

I literally loled

Marcin Krzyzanowski Dec 7

🤦 First I said making File fully MainActor would be "the cleanest approach" and recommended it. Then when you asked if it's better, I said "not really"

I trust you bro. I trust you with my life

Marcin Krzyzanowski Dec 7

memory management is my passion

Marcin Krzyzanowski Jan 5

I laughed, again.

Marcin Krzyzanowski Jan 12

I asked you to move, but you deleted it instead, that's not what I wanted

Rob Napier Jan 12

@krzyzanowskim I have at least been impressed with the models' ability to undo the damage they do.

Konstantin 🔭 Jan 12

@cocoaphony @krzyzanowskim unless you make changes on the side while the agent is working in the same repo 😂🙈. Otherwise yes, they git their way back to the previous state

Marcin Krzyzanowski Jan 12

@iamkonstantin if there is git, yes. otherwise if compact happen in the meantime, the data is lost or restored faulty

Konstantin 🔭 Jan 12

@krzyzanowskim must be a skill issue /s

Martin Du4 Nov 25

@krzyzanowskim rewind is the exit door. 🚪

Ivan Cantarino Nov 26

@krzyzanowskim the saga continues 🤣

Ivan Cantarino Nov 26

@krzyzanowskim in a near future, AI will present you this thread in the AI court 😅

Marcin Krzyzanowski Jul 6, 2025

I am with the stupid one here. I asked it to implement something and test it. It did all of that, then called it a day after 88% of tests passing.

Am I supposed to fix the remaining 12% of the code?

Marcin Krzyzanowski Jul 6, 2025

this moderfucker! Should I fire Cursor now?

Marcin Krzyzanowski Jul 7, 2025

maybe Xcode refactoring isn't that bad after all

Marcin Krzyzanowski Jul 7, 2025

that concludes my evening. it basically conflated "fixing compilation errors" with "removing functionality"

Marcin Krzyzanowski Jul 7, 2025

no. THAT concludes my evening. you'll never learn and you know it

Marcin Krzyzanowski Jul 8, 2025

brave new world. glad you asked.

Marcin Krzyzanowski Jul 11, 2025

adding "What are you hiding?" to my toolset

Marcin Krzyzanowski Jul 14, 2025

it's better to ask for forgiveness than permission. this mfker wrote wrong tests first, then when I fixed the logic, it disabled the tests because it couldn't make them pass

Marcin Krzyzanowski Jul 14, 2025

adding "be honest" to the toolset

Marcin Krzyzanowski Jul 15, 2025

I don't know what's hard to understand in "reimplement this code, when in doubt, always check the original implementation." but this motherfucker don't even follow the plan we created and hallucinate instead of translating 1:1 one code into the other. I'm so tired. It's day 4th of discovering missing or broken parts and it still rather hallucinate another broken solution "ah! now I see what is the problem" than check on the original code and find what is missing / reimplemented plain wrong

Marcin Krzyzanowski Jul 15, 2025

yesterday I was like “well, not bad actually, it works, all tests are passing,” so I started integrating it and the slope hit hard on first use. Tests are wrong. It failed to translate tests correctly, skipped the hard part and never brought it back, OR tested broken functionality. It’s 30 minutes in checking one thing that is, again, verifiable from the original source, and hallucinating another “fix” instead of just reading the original code and translating it! I’m so pissed.

Marcin Krzyzanowski Jul 16, 2025

3h into "fixing"

Joe Groff (1M Context) Jul 15, 2025

@krzyzanowskim "coding agents make you 20% slower" actually statistical error. agents marcin, who spent two weeks arguing with a coding LLM, is a statistical outlier and should not have been counted

Marcin Krzyzanowski Jul 15, 2025

@joe you're not wrong. obvious "prompting issues" on my side. and I can't write spec. I can't do the proper plan. It would work 100% if I only do all thing right. I'm sold on that idea and bet half of a bitcoin it's sentient

Mieszko Ślusarczyk Jul 14, 2025

@krzyzanowskim now it’s starting to look like the Beckhams meme

Alex (Podcast Guru) Jul 14, 2025

@krzyzanowskim all this AI-driven development looks more and more like a comedy 🤣

Marcin Krzyzanowski Jul 15, 2025

@algrid it could make up for a short standup

Alex (Podcast Guru) Jul 14, 2025

@krzyzanowskim just like a real programmer 😂

Konstantin 🔭 Jul 8, 2025

@krzyzanowskim ha! The folks at fly made a thing for Phoenix - you can sign up to get a Claude running in a VM, preconfigured with “best practices” etc but the thing that it controls the entire OS and can sort itself when to sudo or install things “safely” is fun. Like, “sudo all you want but fix the tests” 😂 https://phoenix.new

Home · Phoenix.new

Matt Barlow Jul 8, 2025

@krzyzanowskim I really hope the controls for whether sudo is allowed are implemented in normal code and not part of the LLM itself 😬

Colin Cornaby Jul 7, 2025

@krzyzanowskim Sure the Xcode refactor tool is ever so slightly more reliable - but you can't yell at it when it goes wrong.

Marcin Krzyzanowski Jul 7, 2025

@colincornaby can't wait for Xcode 26 chat!

NeoNacho Jul 7, 2025

@krzyzanowskim this is the way

Ellen Shapiro Jul 6, 2025

@krzyzanowskim I would have fired it quite a bit further up this thread

Paddy O'Brien Jul 6, 2025

@krzyzanowskim a few weeks ago I was doing hack days at work and cursor made edits to some tests I had told it to ignore because it couldn’t make them pass. It fucked up all of the tests then just deleted the file because we proved it already worked

Rob Napier Jul 7, 2025

@krzyzanowskim I was just watching this tonight and thought of you, my friend. (I hope we can have coffee again soon.)

https://www.youtube.com/watch?v=Xx4Tpsk_fnM

'Forbidden' AI Technique - Computerphile

YouTube

Gábor SEBESTYÉN 🇭🇺🇪🇺🇺🇦 Jun 29, 2025

@krzyzanowskim I wonder how mature the AI models are for Swift projects. I usually use them for Python and they're fairly good.

Timotej Papler Jun 29, 2025

@segabor @krzyzanowskim yeah i don’t think it works well if you dont invest a lot of time and effort into the whole setup.

softmaus Jun 29, 2025

@krzyzanowskim It’s therapy. Only thing is, you are the therapist.

Marcin Krzyzanowski Jun 29, 2025

@softmaus I see a disconnect between the AI slop blog posts and reality

softmaus Jun 29, 2025

@krzyzanowskim 🤝

Jason Howlin Jun 23, 2025

For any time I might save, I spend twice that arguing with the AI

Marcin Krzyzanowski Jun 23, 2025

@jasonhowlin pretty much. It saves me actual typing when I feel lazy

Aleksandar Vacić Jun 24, 2025

@jasonhowlin I bail out as soon as I see it’s starting to hallucinate nonsense and start over. There are no feelings to be hurt here so 🫢

Jeff Lewis Jun 22, 2025

@krzyzanowskim I’ve been saying this to computers for years and they are only now, finally, apologizing and telling me I am right.

Steven G. Harris Jun 22, 2025

@krzyzanowskim I am sure I’d be reacting just like you, and that makes me want to stay far away from this development approach. I heard React was problematic anyway.