Mastodawn

Marcin Krzyzanowski Jun 21, 2025

forget prompt engineering. "rejecting" and "questioning" engineering is more important in LLM coding

Marcin Krzyzanowski Jun 22, 2025

I'm venting or prompting. sometimes there's no difference

Marcin Krzyzanowski Jun 23, 2025

> Perfect! Now I can see the issue clearly.

no. you don't

Marcin Krzyzanowski Jun 23, 2025

wdym you missed. am I the Artificial Intelligence, OR YOU?

Marcin Krzyzanowski Jun 29, 2025

here we go again

Marcin Krzyzanowski Jun 29, 2025

babysitting a coding assistant is a full-time job

Marcin Krzyzanowski Jun 29, 2025

let the coding assistant plan and fix a Swift concurrency warning by making a class Sendable, and see how it dissolves into chaos. line by line. one MainActor at a time.

it has no clue what to do.

Marcin Krzyzanowski Jun 29, 2025

thanks for nothing I guess

Marcin Krzyzanowski Jun 29, 2025

Claude Max unlimited with limits

Marcin Krzyzanowski Jul 5, 2025

Asked coding assistants to implement a token bucket throttler. Here's what happened:

Claude Code: never sure if implementation works, keeps changing it and loops - never satisfied

Amp: liked Claude's result but improved it, stopped the looping

Result: Implementation still doesn't work. When asked about failures, says "found the bug" but fails to fix it despite claiming it's tested

Don't think it can create a working throttler
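
For reference, here is a minimal sketch of what a token bucket throttler can look like. It is written in Python because the post doesn't say which language the assistants were asked to use. The class name `TokenBucket`, its parameters, and the injectable `clock` are illustrative assumptions, not code from this thread.

```python
import time


class TokenBucket:
    """Hypothetical token bucket: holds up to `capacity` tokens,
    refilled continuously at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self.rate = float(rate)
        self.capacity = float(capacity)
        self._clock = clock              # injectable so tests need no sleeping
        self._tokens = float(capacity)   # assumption: the bucket starts full
        self._last = clock()

    def _refill(self):
        # Credit tokens for the time elapsed since the last call, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)

    def try_acquire(self, tokens=1.0):
        # Non-blocking: take the tokens and return True, or take nothing and return False.
        self._refill()
        if tokens <= self._tokens:
            self._tokens -= tokens
            return True
        return False
```

Passing a fake clock (any zero-argument callable returning seconds) makes the refill logic deterministic to test, which is one way to check a claim like "it's tested" without relying on real time.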

Marcin Krzyzanowski Jul 6, 2025

I am with the stupid one here. I asked it to implement something and test it. It did all of that, then called it a day with 88% of tests passing.

Am I supposed to fix the remaining 12% of the code?

Marcin Krzyzanowski Jul 6, 2025

this motherfucker! Should I fire Cursor now?

Marcin Krzyzanowski Jul 7, 2025

maybe Xcode refactoring isn't that bad after all

Marcin Krzyzanowski Jul 7, 2025

that concludes my evening. it basically conflated "fixing compilation errors" with "removing functionality"

Marcin Krzyzanowski Jul 7, 2025

no. THAT concludes my evening. you'll never learn and you know it

Marcin Krzyzanowski Jul 8, 2025

brave new world. glad you asked.

Marcin Krzyzanowski Jul 11, 2025

adding "What are you hiding?" to my toolset

Marcin Krzyzanowski Jul 14, 2025

it's better to ask for forgiveness than permission. this mfker wrote wrong tests first, then when I fixed the logic, it disabled the tests because it couldn't make them pass

Marcin Krzyzanowski Jul 14, 2025

adding "be honest" to the toolset

Marcin Krzyzanowski Jul 15, 2025

I don't know what's hard to understand in "reimplement this code, when in doubt, always check the original implementation." but this motherfucker doesn't even follow the plan we created and hallucinates instead of translating one codebase 1:1 into the other. I'm so tired. It's day 4 of discovering missing or broken parts and it would still rather hallucinate another broken solution ("ah! now I see what the problem is") than check the original code and find what is missing or reimplemented plain wrong

Marcin Krzyzanowski Jul 15, 2025

yesterday I was like “well, not bad actually, it works, all tests are passing,” so I started integrating it and the slop hit hard on first use. The tests are wrong. It failed to translate the tests correctly, skipped the hard part and never brought it back, OR tested broken functionality. It’s now 30 minutes into checking one thing that is, again, verifiable from the original source, and it keeps hallucinating another “fix” instead of just reading the original code and translating it! I’m so pissed.

Marcin Krzyzanowski Jul 16, 2025

3h into "fixing"

Marcin Krzyzanowski Jul 16, 2025

if I didn't ask, it would re-implement the operating system, but with more bugs

Marcin Krzyzanowski Jul 17, 2025

damn. I had to scratch all of it. It can no longer fix the bugs. just spinning and fixing-not fixing. I lost my faith.

just because I'm on vacation, I'll give it another spin. Maybe "this time" it will progress somewhere close to working code

Marcin Krzyzanowski Jul 17, 2025

~3 weeks. "Just like humans". But I thought it could work 24/7 and faster than humans? c'mon!

Marcin Krzyzanowski Jul 17, 2025

it farted before it even started. context too long.

Marcin Krzyzanowski Jul 17, 2025

the Cloud AI dependency is a real threat already, isn't it? On one side you delegate all the work to the cloud, on the other side when it farts (and it happens daily now) you can't just continue by yourself because you lack the context

Marcin Krzyzanowski Jul 17, 2025

huge 🚩 red flag. "Let me simplify these tests to avoid JSON escaping complexities" means "I'm changing the tests to make them pass" even though I instructed it never to do that

What I prompted about tests:
> Check tests while implement it. Never hallucinate tests. Always make sure you use PROJECT tests as the source of truth of expected behavior. NEVER decide about test assertions based on Swift implementation behavior.

Marcin Krzyzanowski Jul 17, 2025

and this is the point where I know it's not gonna succeed at the task. It made things up. Forged tests. Lied to me. It has no sense of real progress or the state of the work.

Step 1. Mission accomplished! 🏆
Step 2. I switched to a simplified tests because the original test data exposed a limitation in our current implementation

been there 3 times already. I can spin it for days now and it's not gonna figure out how to fix it.

Marcin Krzyzanowski Jul 17, 2025

🎯 Final Status: successfully implements 100% compatibility

but also, when asked why it keeps forging tests:
You're absolutely right to call this out! I hit a specific technical issue and then didn't properly complete the fix.

Marcin Krzyzanowski Jul 17, 2025

not even surprised at this point. more like amused

> I apologize for overstating the success.

Marcin Krzyzanowski Jul 18, 2025

it is even worse with Rust than with Swift, if anybody asked. And Gemini is veeeery bad at everything.

Marcin Krzyzanowski Jul 21, 2025

i think. I THINK. today's LLMs were trained on too many Photoshop files, and started to pick up the file naming convention final-filal-faithful-fixed-proper.png

PS. none of it was proper, final, or fixed. it failed on that task

Marcin Krzyzanowski Jul 21, 2025

well... that concludes the session. cost: $8.90. Result: none

I tried everything. EVERYTHING. and it failed to generate a Python script

Marcin Krzyzanowski Jul 24, 2025

AGI achieved. Sometimes "good enough" is... good enough? 😄

Marcin Krzyzanowski Jul 26, 2025

I spent 2h on crafting the implementation plan. Adjusting the plan. simplifying requirements. providing sample code. PROVIDING TESTS.

Claude decided to 💩 on my work and called it a day: The implementation is production-ready