These comments from people in/near railroads/shipping explaining why something like the hazardous Norfolk Southern train derailment was inevitable (and, in fact, there have been three derailments in Ohio alone in the past 5 months) read like the comments from Intel folks leading up to the rash of serious CPU bugs starting ~10y ago (https://danluu.com/cpu-bugs/)

This kind of thing seems expected under weakly regulated capitalism since the company avoids most of the cost of negative externalities, but

I have the same question that Andrew Gelman has on grade inflation (https://statmodeling.stat.columbia.edu/2011/07/27/12383/ / https://statmodeling.stat.columbia.edu/2023/01/29/grade-inflation-why-hasnt-italready-reached-its-terminal-stage/): why isn't there even more corner cutting?

E.g., why did Intel take verification so seriously for so long in the wake of FDIV? AMD was buggier than Intel during the FDIV era, was much much buggier in the K7 era, and is buggier today. Someone who wants a non-buggy CPU doesn't really have a better alternative, so, from a shareholder viewpoint, Intel isn't cutting enough corners.

My non-confident guess, based on how I've seen things play out (at companies I've worked at and elsewhere), is that a tiny percentage of people want to do the right thing despite the incentive structure strongly pushing for the wrong thing.

Since we're in an era that increasingly celebrates some variant of "greed is good", the ability of that minority to force the right thing to happen is decreasing, but sometimes events like the FDIV bug give people leverage to make the right thing happen.

@danluu Reputation is worth a lot, even if there is no alternative at that very moment. I'm not sure how much the extra-stringent verification costs Intel, but it could be a small percentage and entirely rational in the long term.
@danluu you're probably not wrong.
But it could also be a point of differentiation for Intel. "We cost more and our chips are slower, but they are more correct."
That's routine market differentiation.
Further, to maintain that differentiation, an internal culture develops with aligned values. The people in the company get a sense of self-worth by providing that "higher quality".
@danluu That’s my assumption as well - everywhere I’ve worked, there’s always been some elements just doing their own thing, incentives be damned. I’m sure I must’ve read some systems analyses that have noted this…
@danluu must also be because you would stand out if you suddenly introduced a lot of bugs/gave all students As
@danluu My guess is that this is prevented by the fear of escalation. In my experience, when you signal to everybody that you don't care (i.e., give everybody A's), the system is practically dead within a few cycles (a few years in school terms). People subconsciously know this because of the short timeframe and because it's easier to see.
@danluu When this is happening slowly instead, the escalation is happening too indirectly for people to actively fear. They may not even see it as an escalation.

@danluu
Some example thoughts:
"If I give everybody an A, what is the purpose of tests?",
"If everybody always succeeds in the same way, why do we need to give homework?",
or maybe in Intel's case: "If we do not have a less buggy CPU, maybe our CPUs are not better?"

The current profit of the company is not the only priority, as the profit is the result of many things.

@danluu anecdata from grad school: in philosophy, the professors said the second-year grad students, who were TAing for the first time, tended to be too aggressive and often had to be nudged to give higher grades. Note that this isn’t “all A’s” (we were a solid but not flagship state school); the grad students had to be nudged to stick to a bell curve where the median grade was a B.

@danluu I question your assumption that it's cheaper to ignore these kinds of issues. If a company screws up they do have to pay a lot of money (product recall, compensation, etc).

The relevant cost comparison is the additional outlay to do things right vs. paying for screwups; the problem you'd run into is that screwing up constantly would quickly get very expensive.

At best you could argue current employees may not have sufficient incentive, and take short term profits at long term expense.