I audited the source for the two projects I built today with Codex in Xcode and it's hard to find anything I could classify as 'slop'. It perhaps speaks to how well Cocoa development is structured, how consistent everything is, but it all just looks… normal. I would have no problem whatsoever building upon and maintaining by hand what I've got here: there are no weird hacks, nothing that makes me scratch my head, nothing I wouldn't have done myself. The horror stories may just be horror stories.
@stroughtonsmith I've done a similar audit on an internal tool I had it write, and what I'd say is that it writes 'vanilla' code: very straightforward and idiomatic. Comparing it to my own code, I'd say mine is more of a mix, with some 'clever' code and some 'lazy' code, both of which it lacks. Codex seems to aim straight down the middle, which is very reasonable and in some ways potentially easier for me to refine.
@stroughtonsmith how much do you attribute these apps to you “knowing what to ask and how to ask it”? That is, are you just asking for behavior and UI in generic terms, like a PM would? Or are you using Swift terminology, asking for particular APIs, etc., based on your existing knowledge of the platform? And are you directing it on how to set up classes and the project hierarchy, or are you treating Xcode like a black box?
@joelion it requires deep platform knowledge and the ability to debug/investigate complex layout problems

@stroughtonsmith For me, it's much less about slop and much more about the risk of societal, cognitive, and environmental damage (not to mention the basis of theft) that prevents me from getting excited about any of it 😔

I'd much rather see more tools invented that make it easier to get started on new ideas and _don't_ use LLMs… surely we as a whole could build towards that kind of future instead 😅

@dimitribouniol @stroughtonsmith hmm. You could look into GSD: just don't let it write code, only specs and examples, and let it split up the work.

https://github.com/glittercowboy/get-shit-done

GitHub - glittercowboy/get-shit-done: A light-weight and powerful meta-prompting, context engineering and spec-driven development system for Claude Code and OpenCode.

@dimitribouniol @stroughtonsmith OK, it's an LLM, but it beats the hell out of looking up documentation and APIs, and it saves you from analysis paralysis.

@stroughtonsmith
I think it’s also interesting that the well-structured Xcode and general corpus that was likely used in post-training can be exploited by current model search / exploration as well as by LLM generation.

My guess is that the corpus is largely code that works and makes good use of idioms that you, along with many others, might use yourself. The probability of generating slop is lower. This may make code generation a sweet spot for current LLM tech.

@stroughtonsmith have you tried using it to actually finish an app completely, though? I feel like it’s amazing at prototypes but sucks at actually implementing anything close to a finished app. And to me the first 70% of a project is the fun part. Why would I hand that off?