My job as a senior developer with a team of juniors is to figure out what to write, sketch a PoC as guidance, and then delegate the actual implementation to them. I'm going to look at that, explain misunderstandings or poor style choices, and guide them into implementing something that meets our standards.

I don't think LLMs can do my job yet. But I think we're getting shockingly close to them being able to do the other part. And I'm worried about how we're going to get more senior developers.

I would not have said the same thing 6 months ago - the amount of progress here is significant. And I'm not denying that the technology has resulted in massive quantities of poor-quality code produced by people who aren't in a position to review it, or that the externalities of all of this are large. But capitalism isn't going to give a shit, so we're getting all of this anyway whether we like it or not.
@mjg59 do you have some way of evaluating that progress in the last 6 months in some way that is not the subjective impression of improvement?
@glyph @mjg59 watching the benchmarks get saturated is interesting, but watching teammates build entire non-trivial projects entirely with the technology is a lot more convincing. There was a really palpable uptick in capability of the most powerful variants of this at the beginning of this year.
@PaulM @mjg59 Someone I respect has said *some* version of this to me every month since ChatGPT first shipped though, and I am tired of retesting various models and having them all produce the same hot garbage for my problems, while wondering if they're slowly making me psychotic as a side-effect. I keep asking this question because if *hard* evidence shows up, the kind of ROI you see on a balance sheet, I don't want to miss it.

@glyph @mjg59
that's entirely fair, and they have been getting better, but what constitutes "worth using" is pretty individual. I'm curious if you have any examples of something you'd quantify that way.

Maybe some relatively complex feature or bugfix you already wrote that you'd like to use as a benchmark for capability? Alternatively, a couple of trivial features you'd like in a personal project but haven't gotten around to building?

At a more mundane level, I suspect they could reliably alleviate a significant amount of the drudgery associated with maintaining OSS - fixing tests when dependencies are updated, etc. Nothing you can't trivially do yourself, but also in my experience painful to try to get the ADHD brain to pay attention to.

@PaulM @mjg59 At this point I am too nervous about the risks to actually touch one for anything non-trivial, and I think everyone should refrain from their use for ethical and safety reasons. One pretty robust argument in that discussion is "they're most likely actually an economic drain, even if they seem useful". But this is a tenuous argument that might become false at any moment, and if I'm not using them I won't know when that moment is.

@glyph

Production of all hardware, building and operation of all data centres are huge environmental issues, and while human activity was certainly extremely polluting even before that, the whole content generating stuff comes on top of that.
This idiotically might not concern companies like your employer.

But content generation models shift power to those who own them.
This might also not concern your employer, but if it's a SW corp they're externalising their core product.

@PaulM @mjg59

@ari I'm pretty sure you meant to headline my username and not glyph's in that response?

@PaulM it's more a direct reply to Glyph than to you, so I think I want to headline Glyph? But I don't have a firm enough grasp on the workings of Mastodon to be certain either way.

Email / Reddit style discussion trees have their merit, I'd rather have that (and no character cap, but I digress)

@ari well then, I'm not sure why you're talking like that to glyph. He's an independent open source developer who does consulting and has a patreon and doesn't use LLMs and has contributed to a bunch of fundamental internet software you undoubtedly use.

So you kinda sound like an asshole here.