⭐️ New blog post: A Month With OpenAI's Codex
https://highcaffeinecontent.com/blog/20260301-A-Month-With-OpenAIs-Codex
It's been literal *years* since I last posted anything, so you know this is a big deal for me 😜
@stroughtonsmith Great post. I'm not at all surprised that you get it.
“It didn’t just blow away my expectations, it showed me the world has changed: we’ve just undergone a permanent, irreversible abstraction level shift.”
“This story is unfinished; this feels like a first foray into what software development will look like for the rest of my life.”
Totally agree on both counts. 🚀
@stroughtonsmith > something like Codex can chew through and rewrite a thousand lines of code in a second. Eventually, I just trusted it.
Jia Tan’s mistake was being too careful and wasting too much time on the social engineering. The next attacker will be far lazier than that, all they need is to poison the datasets (which is trivial, even by the vendors’ admission) and soon thousands of developers will be happily shipping unvetted malicious code which will compromise everyone beyond repair.
@stroughtonsmith does this level of “trust” work with applications which have actual genuine real-world use, not pet projects?
Entire banking or financial systems?
Intercontinental missile systems?
@SeanMacGabhann of course not, I wouldn't even trust it with a spreadsheet. That would be silly.
I also don't work on entire banking or financial systems, or ICBMs.
I trust it to write my code, not everybody else's
Thanks for the reply
But to me that’s where the disconnect/confusion lies
The sheer enthusiasm/belief in what you are doing versus what you wouldn’t trust it with
I think the general public, politicians, and media don’t get the nuance
@stroughtonsmith https://x.com/petergostev/status/2026396167345459292?s=46&t=rFvA0C-h5tnMppUfb0s9JQ
It looks pretty bad for codex 5.3 imo.
@boxed @stroughtonsmith benchmarks of AI models are not everything (I'm not saying they're unimportant). How agents manage context, system prompts, errors, etc. is just as important.
_random_agent_ + Opus 4.6 can be much worse than _great_agent_ + Opus 4.6