Mastodawn

My job as a senior developer with a team of juniors is to figure out what to write, sketch a PoC as guidance, and then delegate the actual implementation to them. I then review what they produce, explain misunderstandings or poor style choices, and guide them into implementing something that meets our standards.

I don't think LLMs can do my job yet. But I think we're getting shockingly close to them being able to do the other part. And I'm worried about how we're going to get more senior developers.

Matthew Garrett Mar 7

I would not have said the same thing 6 months ago - the amount of progress here is significant. And I'm not denying that the technology has resulted in massive quantities of poor-quality code produced by people who aren't in a position to review it, or that the externalities of all of this are large. But capitalism isn't going to give a shit, so we're getting all of this anyway whether we like it or not.

Glyph Mar 7

@mjg59 do you have a way of evaluating that progress over the last 6 months that is not a subjective impression of improvement?

Paul McMillan Mar 7

@glyph @mjg59 watching the benchmarks get saturated is interesting, but watching teammates build entire non-trivial projects entirely with the technology is a lot more convincing. There was a really palpable uptick in capability of the most powerful variants of this at the beginning of this year.

Glyph Mar 7

@PaulM @mjg59 Someone I respect has said *some* version of this to me every month since ChatGPT first shipped though, and I am tired of retesting various models and having them all produce the same hot garbage for my problems, while wondering if they're slowly making me psychotic as a side-effect. I keep asking this question because if *hard* evidence shows up, the kind of ROI you see on a balance sheet, I don't want to miss it.

Wouter Verhelst

@glyph
The way to make it work is not to use a web interface, but instead to use a tool like https://opencode.ai/ to
- generate the code
- generate the tests
- run the tests
- have it loop over 'fix any failures and try again'
- test the code yourself

By themselves, the models will get things about 80% right. That's not perfect, but with that feedback loop, it's enough to get something that works.
@PaulM @mjg59
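
The generate, test, and retry steps in the list above can be sketched as a small loop. This is only an illustration of the idea, not OpenCode's API or how it works internally: `feedback_loop`, `fake_generate`, and `run_tests` are hypothetical names, and the "model" here is a stub that returns a wrong answer first and a corrected one once it is shown the failure. The final step from the list, testing the code yourself, stays manual.

```python
from typing import Callable, Optional, Tuple

# Hypothetical interfaces, for illustration only (not any real tool's API).
Generate = Callable[[str, Optional[str]], str]  # (task, last failure) -> code
RunTests = Callable[[str], Tuple[bool, str]]    # code -> (passed, output)


def feedback_loop(task: str, generate: Generate, run_tests: RunTests,
                  max_rounds: int = 5) -> Tuple[str, bool, int]:
    """Regenerate code until the tests pass or the round budget runs out.

    Returns (last generated code, whether tests passed, rounds used).
    """
    failure: Optional[str] = None
    code = ""
    for round_no in range(1, max_rounds + 1):
        code = generate(task, failure)
        passed, output = run_tests(code)
        if passed:
            return code, True, round_no
        failure = output  # feed the failure back into the next attempt
    return code, False, max_rounds


def fake_generate(task: str, failure: Optional[str]) -> str:
    # Stand-in for a model call: wrong at first, "fixed" after seeing a failure.
    if failure is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def run_tests(code: str) -> Tuple[bool, str]:
    # Load the generated code into a scratch namespace and check one case.
    namespace: dict = {}
    try:
        exec(code, namespace)
        assert namespace["add"](2, 3) == 5, "add(2, 3) should be 5"
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


code, passed, rounds = feedback_loop("write add(a, b)", fake_generate, run_tests)
```

The `max_rounds` cap is a design choice in this sketch: without it, a generator that never converges would loop forever, so the caller gets the last attempt back along with a flag saying it still fails.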

Wouter Verhelst Mar 7

@glyph
It won't be pretty or efficient or even entirely bug free, but if 'working code' is the only requirement, it will get you that faster than doing it by hand.
@PaulM @mjg59