1. We are measuring the wrong things.

TPS (Tokens Per Second) is a vanity metric from the era of "chatting." In the Agentic Age, speed is irrelevant if the model takes ten iterations to reach the right result.

The only unit that actually matters is NTE—Number of Tokens to Execute.

#AIAgents #LLM #Tokenization #TechStrategy #Efficiency

2. When reasoning is tokenized, every unnecessary word is a tax on intent. A model that needs a 2,000-token internal monologue to trigger one API call isn’t sophisticated—it’s inefficient.

The most valuable models won't be the ones that talk the fastest. They will be the ones that reach an impactful action via the shortest possible linguistic path.

#AIAgents #LLM #Tokenization #TechStrategy #Efficiency