Bubble's gonna burst!

OpenAI generated US$4.3 billion in revenue in the first half of 2025, according to financial disclosures to shareholders.

The artificial intelligence firm reported a net loss of US$13.5 billion during the same period, with more than half attributed to the remeasurement of convertible interest rights.

Research and development expenses were its largest cost, totaling US$6.7 billion in the first half.

(OpenAI current valuation: $500Bn.)

https://www.techinasia.com/news/openais-revenue-rises-16-to-4-3b-in-h1-2025

@cstross that's a lot more revenue than I would have expected, frankly.
@laird A lot of it is probably churn by big corporate customers like MSFT, AAPL, and META who want to convince their shareholders they're all in on AI. Once the bubble bursts they'll have no reason to keep throwing money at OpenAI.

@cstross @laird

For those who haven't been watching:

A load of AI companies are actually just OpenAI resellers. OpenAI is among the biggest investors in these companies, but NVIDIA is also big. The way it works is something a bit like this:

  • NVIDIA invests $10M in some AI company.
  • That company takes $2M in revenue for effectively reselling OpenAI models at well below cost.
  • That company buys $12M of OpenAI credits (from their revenue plus investor money).
  • OpenAI reports $12M of revenue and uses this to justify raising $100M more investor money.
  • OpenAI buys $100M of GPUs from NVIDIA.
  • NVIDIA reports $100M of revenue and an asset worth at least $10M.
  • Everyone wins! Money from thin air!
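For concreteness, the loop above can be traced as toy arithmetic (a sketch using only the made-up figures from the bullets; nothing here is real accounting). The punchline: the three companies report $114M of combined revenue, while only $110M of outside money entered the loop, all of it investor capital.

```python
# Toy arithmetic for the reseller loop, using the thread's
# hypothetical figures (none of these are real accounts).

nvidia_investment = 10_000_000   # NVIDIA's stake in the AI reseller
reseller_revenue  = 2_000_000    # what the reseller genuinely earns
openai_credits    = 12_000_000   # what the reseller spends on OpenAI credits

# The reseller spends all its revenue plus the investor money on credits:
assert openai_credits == reseller_revenue + nvidia_investment

# OpenAI books the spend as revenue, raises against it, and the new
# money flows straight back to NVIDIA as GPU orders:
openai_reported_revenue = openai_credits
new_investor_money = 100_000_000
nvidia_gpu_revenue = new_investor_money

# Revenue reported across the three companies:
total_reported = reseller_revenue + openai_reported_revenue + nvidia_gpu_revenue

# Money that actually entered the loop from outside (all investor capital):
outside_money = nvidia_investment + new_investor_money

assert total_reported == 114_000_000
assert outside_money == 110_000_000
```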

    There's a more fun variant that is, roughly:

  • Investment firm has a $100M stake in NVIDIA.
  • Investment firm borrows $50M against this stake and uses it to invest in a load of AI startups (maybe even OpenAI).
  • Demand for GPUs goes up!
  • NVIDIA's valuation increases significantly, and now the firm's NVIDIA stake is worth $1B.
  • The firm slowly sells 20% of its stake (slowly, so it doesn't spook the market, but it's a fairly small fraction of the market cap, so that's fine).
  • The firm writes off the $50M investment in startups.
  • The firm now has $150M in cash (the $200M from the sale, minus repaying the $50M loan) and an NVIDIA stake valued at $800M that it can borrow against.
  • If anyone except Trump were in the White House, I'd expect the end result of a bunch of SEC investigations to result in a lot of folks going to prison.
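The numbers in this variant reconcile the same way (again, just the thread's hypothetical figures, not real accounts):

```python
# Toy arithmetic for the leveraged-stake variant.

stake_initial = 100_000_000
loan = 50_000_000                 # borrowed against the stake, sunk into AI startups

stake_inflated = 1_000_000_000    # the bubble inflates the stake to $1B

cash_from_sale = stake_inflated * 20 // 100        # sell 20% slowly: $200M
stake_remaining = stake_inflated - cash_from_sale  # $800M left to borrow against

startup_writeoff = loan           # the $50M in startups goes to zero
cash_after = cash_from_sale - loan                 # repay the loan: $150M in cash

assert cash_after == 150_000_000
assert stake_remaining == 800_000_000
```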

    @david_chisnall @cstross @laird I've been thinking for a while that NVIDIA's operating model has become:

    1. Identify compute-intensive scam (crypto, LLM)
    2. Tweak GPU product pipeline to enable scam
    3. Push scam incessantly to get customers to buy tweaked products
    4. Identify next scam before previous one burns out.

    Also, there is a special place in Hell reserved for whoever decided to label smallish matrices as "tensors" in LLM-speak. They aren't *wrong*, just wildly misleading.

    @oddhack @cstross @laird

    Early NVIDIA was a lot better. They identified that you could make a graphics accelerator for 10% of the cost of the ones in high-end graphics workstations (which cost tens of thousands of dollars), with maybe half the performance, and sell it to at least a thousand times as many people. They were the first to do hardware-accelerated geometry transforms and lighting. They were the first to make parts, then more parts, of the graphics pipeline programmable shaders instead of fixed-function units.

    They rode (and were some of the major drivers of) the wave of 3D acceleration becoming mainstream. A computer sold in 2000 often had a dumb frame buffer and might have had a 2D accelerator (though often didn’t, because CPUs were getting to be fast enough that these rarely helped). By 2005, even laptops had 3D accelerators. NVIDIA was in a market that grew from a tiny niche to 200M units per year. The NVIDIA stock price quadrupled between 1999 and 2005.

    The problem with that kind of staggering growth is that investors expect it to continue. But once you’ve saturated the market, how do you keep growing? You need new markets.

    They’ve made mobile chips, but that’s a market with a lot of established players. They’ve made server chips, and they bought a SmartNIC vendor and so were in a good position to displace AMD as the cloud SoC vendor of choice. Unfortunately, best case, that would have doubled their profits. The market demands more.

    As far as I can tell, their current stock price is assuming so much growth that it assumes everyone in the world will own at least three NVIDIA high-margin GPUs in a few years. It’s well past the point where the market is behaving rationally and deep into greater-idiot syndrome.

    @david_chisnall @oddhack @laird You know about Sutherland's wheel of reincarnation in graphics, yes? Circa 1990 I worked at one of NVIDIA's ill-fated predecessors—Real World Graphics, used gangs of cheap RISC processors in place of custom silicon to do "cheap" GPU stuff for PCs. The mass market gamer PC graphics cards only matched their performance about a decade later.

    Alas, they were targeting industrial/workstation niches and very under-capitalized.

    @cstross @oddhack @laird

    Wow, it’s been a long time since I read that paper. The i860 was designed as a general-purpose processor and, as the name hints, as a successor to the x86 line. I remember seeing a load of ads for it in BYTE as the next big thing. It didn’t succeed there, but did end up in a load of graphics accelerators.

    I think that undersells NVIDIA somewhat, though. It turns out that 2D graphics has quite a lot in common with other common compute tasks. A lot of the performance-critical bit is simply memcpy (or BitBlt, as Dan Ingalls called it), maybe with some blending, but the maximum number of sprite pixels you’ll want to composite is a linear function of the number of destination pixels (you don’t need to bother drawing things that are completely occluded, and calculating occlusion for 2D scenes is not hard). You can make a 2D accelerator that’s faster than a generic RISC chip. Most 2D vector graphics operations on bitmaps or geometry (skew, scale, rotate, and so on) can be encoded as 3x1 vectors multiplied by a 3x3 matrix, but specialising a chip for 3x1-by-3x3 matrix multiplication is a load of work. And your colour blending is RGBA, so that needs other dedicated hardware. At the same time, you want fast scalar floating point on your RISC chip for other things anyway, and with it you can easily keep up with display resolutions when rendering PostScript-style graphics.
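A minimal sketch of that 3x3 encoding, using NumPy as a stand-in (the exact pipeline on those chips obviously differed): represent 2D points as (x, y, 1), and skew, scale, rotate *and* translate all collapse into a single matrix multiply.

```python
import numpy as np

def affine_2d(angle_rad, sx, sy, tx, ty):
    """Build a 3x3 homogeneous transform: scale, then rotate, then translate."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    scale = np.array([[sx, 0, 0],
                      [0, sy, 0],
                      [0,  0, 1]], dtype=float)
    rotate = np.array([[c, -s, 0],
                       [s,  c, 0],
                       [0,  0, 1]])
    translate = np.array([[1, 0, tx],
                          [0, 1, ty],
                          [0, 0, 1]], dtype=float)
    return translate @ rotate @ scale

# One point (1, 0): scale by 2 -> (2, 0), rotate 90 degrees -> (0, 2),
# then translate by (10, 0) -> (10, 2).
p = np.array([1.0, 0.0, 1.0])
M = affine_2d(np.pi / 2, 2.0, 2.0, 10.0, 0.0)
result = M @ p
```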

    In the ‘90s, ‘Windows Accelerators’ were common. They let GUIs offload window drawing (lines, rectangles, simple sprite blitting) to the graphics card. They were mostly a speedup because PCs often didn’t have FPUs back then.

    A lot of RISC chips were, in spite of the hype, not actually very good. The i860, on paper, was around double the performance of a 486 (and used fewer transistors!), but actually compiling code to target it was very hard, and its real-world performance was typically slower. Bolting a decent FPU on the side of a RISC chip was quite easy (especially when it didn’t need to handle baroque things like x87’s 80-bit floating-point representation and the bizarre ‘do this operation as binary floating point but then apply a correction so that the error is what it would have been if you’d done it as binary-coded decimal’ instructions that x86 chips needed).

    It wasn’t that RISC chips were good at graphics, so much as RISC chips were cheap and the places where they sacrificed performance didn’t matter for 2D graphics and so they were a lot cheaper per unit performance than doing the same work on the main CPU (even when that CPU was another RISC chip that had learned a few more lessons and had a more balanced performance profile).

    3D is different in a bunch of ways. The data in a 3D scene is either XYZW vectors for geometry or RGBA vectors for colour spaces. And a lot of the primitive operations you do on colours are the same as the ones you do on vertexes. You can have a common data representation for both colour and geometry, which means you can spend more effort on vector operations in this data type.

    CPUs also got 4-way vector units at around this time. AMD called theirs 3DNow! because they expected it to enable fast software 3D. Quake 2 had a mode to use 3DNow! and it was a bit faster than the default software renderer. And it still looked much worse than the OpenGL version.

    Because the second thing about 3D graphics is that it’s embarrassingly parallel. You want to do the same thing to every vertex in the scene. And the same thing to every texel on a triangle. This is also true of 2D, but with 2D you hit diminishing returns way earlier. Humans need around 25fps to perceive smooth motion. For dynamically rendered scenes there will be a little bit of jitter from slight variations in the underlying motion, and so 60fps is a nice place to aim for. If you can render a 2D scene in 10ms, you’re well ahead of what you need. Adding parallelism to render it in 1ms provides no benefit. But with 3D, the data is much bigger. Even the first NVIDIA cards handled scenes with millions of triangles. You simply wouldn’t build a 2D scene that big. And, even then, you’d see pixelation if you walked up to a wall, because it was a low-resolution texture that looked fine at a distance, but putting a texture in video memory that was as big as the screen (so it looked good when you got close to it) was completely infeasible to do for every surface in a 3D scene. Texture compression made this somewhat possible, but more modern 3D accelerators have thousands of times as much texture memory as they need for a single frame buffer.
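To make "the same thing to every vertex" concrete, here's a sketch with NumPy standing in for the GPU's parallel lanes (a hypothetical transform, not any real pipeline): one 4x4 matrix applied to a million XYZW vertices as a single data-parallel operation.

```python
import numpy as np

rng = np.random.default_rng(42)

# A million vertices in homogeneous XYZW form (w = 1).
xyz = rng.standard_normal((1_000_000, 3))
vertices = np.concatenate([xyz, np.ones((1_000_000, 1))], axis=1)

# One translation matrix, applied to every vertex at once.
transform = np.eye(4)
transform[:3, 3] = [1.0, 2.0, 3.0]

moved = vertices @ transform.T   # no per-vertex loop anywhere

# Every vertex shifted by the same (1, 2, 3) offset.
assert np.allclose(moved[:, :3], xyz + [1.0, 2.0, 3.0])
```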

    And that increase in memory brings me to the third thing. CPUs are optimised around the idea that workloads exhibit a lot of locality of reference in both spatial and temporal dimensions. If you access some data, you are likely to access it again soon. You are also likely to access nearby data soon. At the same time, data access patterns are hard to predict. For graphics, this is far less true than for a lot of other workloads. Most memory tends to be touched once per scene, so caches don’t help, and a lot of memory is read to render each scene, so you need to be able to stream a lot more memory through the compute units than elsewhere. At the same time, most memory accesses are predictable, so you can program the access pattern (which may be something complex like recursive Z shapes) into the memory controller and stream data at DRAM line rates. All of this ends up with a very different design to most CPUs, and trying either to build it out of cheap RISC cores or to build something that works well as both a GPU and CPU will involve a lot of compromises that hurt either or both workloads.
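Those "recursive Z shapes" are a Morton (Z-order) curve: interleave the bits of x and y, and nearby texels land at nearby addresses, which is what lets the memory controller stream tiles predictably. A toy encoder (a sketch of the idea, not any specific hardware's layout):

```python
def morton_2d(x: int, y: int) -> int:
    """Interleave the bits of x and y (x in even positions, y in odd)."""
    result = 0
    for i in range(16):          # enough for 16-bit coordinates
        result |= ((x >> i) & 1) << (2 * i)
        result |= ((y >> i) & 1) << (2 * i + 1)
    return result

# Walking addresses 0..15 traces little Z shapes over a 4x4 tile:
# (0,0) (1,0) (0,1) (1,1) (2,0) (3,0) (2,1) (3,1) ...
order = sorted(((x, y) for x in range(4) for y in range(4)),
               key=lambda p: morton_2d(*p))
```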

    I touched on a lot of this a few years ago in my CACM article No Such Thing as a General-Purpose Processor.


    @david_chisnall @oddhack @laird Yup. RWG's Reality ISA bus board back in 1990 threw four i860s at rendering in parallel, along with a 34020 to do 2D overlays: it was mainly marketed at simulators. (They had a high-end VME-bus system with up to 96 i860s, aimed at the vehicle/flight simulator market.)

    @cstross @oddhack @laird

    A lot of them also ended up in printers, for the same reason, before largely being displaced by MIPS. There was a weird period around then where printers came with a CPU that, for some workloads, was faster than the host and a bunch of people did the distant ancestor of early GPGPU work by encoding things as print jobs.

    What kind of simulations were people running? It sounds like they’d be better than most contemporary alternatives for finite element and fluid dynamics things.

    I vaguely remember there was something exciting in the i860’s concept of bus locking for multiprocessor systems, but not the details.

    @david_chisnall @oddhack @laird Here's a throwback to a more elegant age, then!
    https://github.com/ading2210/linuxpdf
    GitHub - ading2210/linuxpdf: Linux running inside a PDF file via a RISC-V emulator


    @david_chisnall @cstross @oddhack @laird

    I think it was simulators (to train pilots), not physics simulation like FEM or CFD. So game-style 3D graphics avant la lettre, with a huge budget for individual hardware but a small total market.

    @david_chisnall @cstross @oddhack @laird

    > 2D graphics is… embarrassingly parallel

    [...]

    > This is also true of 2D,

    (?)

    @lproven @cstross @oddhack @laird

    Fixed. In my defence, someone put the 2 and 3 buttons right next to each other on the keyboard. Clearly not a good design.

    @david_chisnall @cstross @laird we used i860s as the transform & lighting pipeline processors on UNC's Pixel-Planes 5. Good FP performance but the exposed pipeline made programming at low level very painful. Unfortunately the amount of state required to save/restore the FP pipeline was so enormous (like 1000 cycles in the interrupt handler IIRC) that we ended up simply locking out interrupts during FP-intensive code.
    @oddhack @david_chisnall @cstross @laird not sure Nvidia look for scams as such. They just look for anything that eats vast amounts of compute that fits their product line, it just happens to be scams and gaming. No different to oil companies pushing AI because it drives consumption of their product.
    @etchedpixels @david_chisnall @cstross @laird I was running the OpenGL Architecture Review Board from 1997-2006 so got to watch as the traditional workstation vendors were eaten from beneath by NVIDIA and ATI. I occasionally comforted myself by thinking that, although NVIDIA won the market, at least SGI's customer base was mostly intelligent adults doing science & engineering, not teenage boys blowing shit up.
    @oddhack @david_chisnall @cstross @laird What drives markets is always about volume and price. We did so much Linux network optimization for the porn industry !
    @etchedpixels @david_chisnall @cstross @laird when I joined SGI in the late 90s they had a commanding lead in the porn server market. Unfortunately the marketing people were reluctant to capitalize on this clear evidence of technical superiority (and really, Origins were pretty sweet systems).
    @oddhack @etchedpixels @david_chisnall @cstross @laird too many companies and projects won’t use porn company references. It’s sad as they’re often cutting edge.

    @oddhack
    If only it were limited to Nvidia. 😢

    I mean: *so* many companies are driven by this logic, and the investor expectations behind it, that it's endemic.
    😞
