Mastodawn

The post from Cursor.
https://x.com/i/status/2056415413077233983

Cursor (@cursor_ai) on X

Introducing Composer 2.5, our most powerful model yet. It's more intelligent, better at sustained work on long-running tasks, and more reliable at following complex instructions. For the next week, we’re doubling the included usage of the model.

X (formerly Twitter)

Kol Tregaskes 48m ago

They're training the next model from scratch using 10x more compute, backed by Colossus 2's million H100-equivalent capacity. Reports say they're using tens of thousands of xAI GPUs for this new model.

Cursor's harness already makes the Claude and GPT models perform better. Imagine what they could achieve with so much more compute.
https://cursor.com/blog/composer-2-5

Introducing Composer 2.5 · Cursor

Intelligence and behavior have improved substantially over Composer 2, especially on long-horizon agentic tasks.

Cursor

Kol Tregaskes 48m ago

Cursor just released Composer 2.5 and announced they're training a much larger model from scratch with xAI on Colossus 2. This could be huge.

Composer 2.5 improvements:
• Better at sustained work on long-running tasks
• More reliable with complex instructions
• Improved communication style and effort calibration
• Built on Moonshot's Kimi K2.5 with additional RL training

The bigger story is what comes next.

Kol Tregaskes 3h ago

Peter's post.
https://x.com/petergyang/status/2056019573938565534?s=20

Peter Yang (@petergyang) on X

Here's my new episode with @alexalbert__, who shared an inside look at how Anthropic is building the next Claude. We talked about how the research team: → Plans for the model and harness together → Uses Claude to turn user feedback into evals → Trains Claude's character &

X (formerly Twitter)

Kol Tregaskes 3h ago

What really got me is how heavily they’re using Claude to analyse user feedback. It clusters themes, generates synthetic versions of recurring issues, and converts them into evals. The model helps them improve the model.

Right at the end, the consciousness question came up. Alex called it a “big question” they’re working on. No detail, but the fact it’s even on their roadmap tells you something about where they think this is going.
https://youtu.be/T4ieZPIEmd8?si=3KXce86LjNOC2R__

Inside How Anthropic Is Building the Next Claude | Alex Albert

YouTube

Kol Tregaskes 3h ago

The interesting part is they’ve got strong intuitions based on architecture choices, but they still don’t truly know what the model will be good at until it’s deep in training.

They’re also working on memory in a way that feels more human. When the agent isn’t running tasks, it revisits memories in the background: spotting contradictions, pruning, tidying, cleaning up. Alex called it “dreaming”. If it works, that’s genuinely smart.
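
A toy sketch of a background memory pass like the one described above, assuming memories are simple key/value facts with timestamps. The episode gave no implementation detail, so the `Memory` type, the `consolidate` function, and the "newest value wins" rule are all hypothetical illustrations, not Anthropic's method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    key: str        # what the memory is about, e.g. "user.editor"
    value: str      # the remembered fact
    timestamp: int  # when it was recorded (larger = newer)


def consolidate(memories, max_age, now):
    """Return (kept, contradictions) after one background pass.

    Assumed rules, for illustration only:
    - exact duplicates are dropped (tidying),
    - memories older than max_age are dropped (pruning),
    - if one key has several different values, the newest wins and the
      key is reported (contradiction spotting).
    """
    fresh = [m for m in set(memories) if now - m.timestamp <= max_age]
    newest = {}
    contradictions = set()
    # Walk oldest to newest so later memories overwrite earlier ones.
    for m in sorted(fresh, key=lambda m: (m.timestamp, m.value)):
        previous = newest.get(m.key)
        if previous is not None and previous.value != m.value:
            contradictions.add(m.key)
        newest[m.key] = m
    kept = sorted(newest.values(), key=lambda m: m.key)
    return kept, sorted(contradictions)
```

Reporting the conflicting keys rather than silently overwriting them leaves room for a smarter resolution step, which is where a model would presumably come in.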

Kol Tregaskes 3h ago

Anthropic shared how they’re building the next Claude model. A few things stood out.

They treat every model like a product launch. Before training even starts, they write a spec for the capabilities they want (coding, knowledge work, spreadsheets), then track, step by step, whether training is actually delivering on those targets.
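
The spec-then-track idea above could be sketched roughly like this, assuming each capability in the spec has a target score and eval results arrive per training step. The capability names come from the post; the scores, the scoring scale, and the `behind_target` helper are made up for illustration.

```python
def behind_target(spec, history):
    """Return capabilities whose latest eval score is below the spec target.

    spec:    {capability: target_score}
    history: list of (step, {capability: score}), in any order
    """
    if not history:
        return sorted(spec)  # nothing measured yet, so every target is open
    # Pick the checkpoint with the highest training step.
    _, latest = max(history, key=lambda item: item[0])
    return sorted(
        cap for cap, target in spec.items()
        if latest.get(cap, 0.0) < target  # unmeasured counts as behind
    )


# Hypothetical numbers, purely illustrative.
spec = {"coding": 0.8, "knowledge work": 0.7, "spreadsheets": 0.6}
history = [
    (1000, {"coding": 0.5, "knowledge work": 0.4, "spreadsheets": 0.3}),
    (2000, {"coding": 0.85, "knowledge work": 0.65, "spreadsheets": 0.6}),
]
lagging = behind_target(spec, history)
```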

Kol Tregaskes 6h ago

https://x.com/ajambrosino/status/2055451468900213074

Andrew Ambrosino (@ajambrosino) on X

Thanks for the feedback on Codex in the ChatGPT mobile app. While it’s in preview, we’re working to improve it fast. What you can expect next: push notifications, /fork, ability to restore after revoking, better reconnects, fixing the ability to control other devices, fewer

X (formerly Twitter)

Kol Tregaskes 6h ago

The real issue: when credits reset, the agent should just continue automatically. That's what an autonomous agent should do. Instead, you have to manually intervene every time. At minimum, there needs to be a "force continue" button to get things moving again.

I want Codex to work. The developers clearly care and they're shipping regularly. But these workflow blockers keep dragging the experience down, especially on Windows where the performance gap is noticeable.
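
The auto-continue behaviour asked for above might look something like this minimal sketch. It is not how Codex works internally: `run_goal`, `OutOfCredits`, and the `wait_for_reset` callback are hypothetical names used only to illustrate resuming from the interrupted step instead of starting the goal over.

```python
class OutOfCredits(Exception):
    """Raised by a (hypothetical) step function when credits run out."""


def run_goal(steps, wait_for_reset, max_resets=3):
    """Run steps in order, resuming from the failed step after a credit reset.

    steps:          list of zero-argument callables
    wait_for_reset: callable that blocks until credits are available again
    """
    completed = []
    resets = 0
    index = 0
    while index < len(steps):
        try:
            completed.append(steps[index]())
            index += 1
        except OutOfCredits:
            if resets >= max_resets:
                raise  # give up instead of waiting forever
            resets += 1
            wait_for_reset()  # progress so far is kept; retry the same step
    return completed
```

The key design choice is that completed steps are never discarded, so a credit reset costs one retry rather than a fresh goal.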

Kol Tregaskes 6h ago

The /goal feature is working (thanks to adjusting a config file), which is promising. But when you run out of credits mid-task, everything halts. Goals get stuck "thinking" with no clean way to reset or push them forward. I'm having to set goals from scratch again just to get unstuck.