Last boosts: since this trend has only accelerated, I figured I'd re-share.

I was reviewing some older notes of mine from the event. This one stood out at the time and still does:
Meghan Wiessner and Nathan Ensmenger talked about FORPLAN, a large linear programming model the US Forest Service used to generate forest use plans. They both noted its complexity and its shortcomings, how it did not take account of local knowledge and otherwise oversimplified forestry, and how it was divisive.
If you've never come across FORPLAN I recommend looking it up (this is good if you're OK with technical reports). It went into use in late 1979 and was controversial from the beginning. It relied on (then) large-scale linear programming methods to determine how to manage the US's forests. Like so many efforts before and since, it set aside expert and/or local knowledge of the domain, made horrendous miscalculations, yet was treated as if it were making divine proclamations that must be followed. One of its early critics started a libertarian blog called the Antiplanner to argue against government land-use planning.

#FORPLAN #planning #ForestManagement #AI #LinearProgramming
I proposed two talks for that event. The one that was not accepted (excerpt below) still feels interesting to me and I might someday develop this more, although by now this argument is fairly well-trodden and possibly no longer timely or interesting to make. I obviously don't have the philosophical chops to make an argument at that level, but I'm fascinated by how this technology is so fervently pushed even though it fails on its own technical terms. You don't have to stare too long to recognize there is something non-technical driving this train. "The technologist with well-curated data points knocks chips of error off an AI model to reveal the perfect text generator latent within" is a pretty accurate description and is why I jokingly suggested someone should register the galate.ai domain the other day. If you're not familiar with the Pygmalion myth (in Ovid), check out the company Replika and then Pygmalion to see what I'm getting at. pygmal.io is also available!

Anyway:
ChatGPT and related applications are presented as inevitable and unquestionably good. However, Herbert Simon’s bounded rationality, especially in its more modern guise of ecological rationality, stresses the prevalence of “less is more” phenomena, while scholars like Arvind Narayanan (How to Recognize AI Snake Oil) speak directly to AI itself. Briefly, there are times when simpler models, trained on less data, constitute demonstrably better systems than complex models trained on large data sets. Narayanan, following Joseph Weizenbaum, argues that tasks involving human judgment have this quality. If creating useful tools for such tasks were truly the intended goal, one would reject complex models like GPT and their massive data sets, preferring simpler, less data-intensive, and better-performing alternatives. In fact one would reject GPT on the same grounds that less well-trained versions of GPT are rejected in favor of better-trained ones during the training of GPT itself.

How then do we explain the push to use GPT in producing art, making health care decisions, or advising the legal system, all areas requiring sensitive human judgment? One wonders whether models like GPT were never meant to be optimal in the technical sense after all, but rather in a metaphysical sense. In this view an optimized AI model is not a tool but a Platonic ideal that messy human data only approximates during optimization. As a sculptor with well-aimed chisel blows knocks chips off a marble block to reveal the statuesque human form hidden within, so the technologist with well-curated data points knocks chips of error off an AI model to reveal the perfect text generator latent within. Recent news reporting that OpenAI requires more text data than currently exists to perfect its GPT models adds weight to the claim that generative AI practitioners seek the ideal, not the real.
#AI #GenAI #GenerativeAI #GPT #ChatGPT #OpenAI #Galatea #Pygmalion
Regarding the ideological nature of what's at play, it's well worth looking more into ecological rationality and its neighbors. There is a pretty significant body of evidence at this point that in a wide variety of cases of interest, simple, small-data methods demonstrably outperform complex, big-data ones. Benchmarking is a tricky subject, and there are specific (and well-chosen, I'd say) benchmarks on which models like LLMs perform better than alternatives. Nevertheless, "less is more" phenomena are well-documented, and conversations about when to apply simple/small methods and when to use complex/large ones are conspicuously absent. Also absent are conversations about what Leonard Savage--the guy who arguably ushered in the rise of Bayesian inference, which makes up the guts of a lot of modern AI--referred to as "small" versus "large" worlds, and how absurd it is to apply statistical techniques to large worlds. I'd argue that the vast majority of horrors we hear LLMs implicated in involve large worlds in Savage's sense, including applications to government or judicial decision-making and "companion" bots. "Self-driving" cars that are not car-skinned trains are another (the word "self" in that name is a tell). This means in particular that applying LLMs to large-world problems directly contradicts the mathematical foundations on which their efficacy is (supposedly) grounded.

Therefore, if we were having a technical conversation about large language models and their use, we'd be addressing these and related concerns. But I don't think that's what the conversation's been about, neither in the public sphere nor in the technical one.

All this goes beyond AI. Henry Brighton (I think?) coined the phrase "the bias bias" to refer to a tendency where, when applying a model to a problem, people respond to inadequate outcomes by adding complexity to the model. This goes for mathematical models as much as computational models. The rationale seems to be that the more "true to life" the model is, the more likely it is to succeed (whatever that may mean for them). People are often surprised to learn that this is not always the case: models can and sometimes do become less likely to succeed the more "true to life" they're made. The bias bias can lead to even worse outcomes in such cases, triggering the tendency again and resulting in a feedback loop. The end result can be enormously complex models and concomitant extreme surveillance to acquire data to feed the models. I look at FORPLAN or ChatGPT, and this is what I see.
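
For anyone who wants to poke at the "less is more" claim directly, here is a toy sketch, not a result from the literature. Everything in it is an assumption I picked for illustration: the curved "world" (`sin(3x)`), the noise level, ten training points, and the two stand-in models. The simple model is a straight line, which cannot represent the curve at all. The complex model is a polynomial that passes through every training point, so it is as "true to" the data as a model can be. The function names (`fit_line`, `fit_interpolant`, `compare`) are mine.

```python
import math
import random


def fit_line(xs, ys):
    """Simple model: ordinary least-squares straight line (2 parameters)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Complex model: Lagrange polynomial through every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict


def mse(model, xs, ys):
    """Mean squared prediction error of a model on (xs, ys)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def compare(trials=200, n_train=10, n_test=100, noise=0.5, seed=0):
    """Average train/test error of both models over many noisy samples."""
    rng = random.Random(seed)
    truth = lambda x: math.sin(3.0 * x)  # assumed toy "world", not linear
    train_x = [i / (n_train - 1) for i in range(n_train)]  # evenly spaced in [0, 1]
    totals = {
        "simple_train_mse": 0.0, "simple_test_mse": 0.0,
        "complex_train_mse": 0.0, "complex_test_mse": 0.0,
    }
    for _ in range(trials):
        train_y = [truth(x) + rng.gauss(0.0, noise) for x in train_x]
        test_x = [rng.random() for _ in range(n_test)]
        test_y = [truth(x) + rng.gauss(0.0, noise) for x in test_x]
        models = {
            "simple": fit_line(train_x, train_y),
            "complex": fit_interpolant(train_x, train_y),
        }
        for name, model in models.items():
            totals[name + "_train_mse"] += mse(model, train_x, train_y)
            totals[name + "_test_mse"] += mse(model, test_x, test_y)
    return {key: value / trials for key, value in totals.items()}


if __name__ == "__main__":
    for key, value in sorted(compare().items()):
        print(f"{key}: {value:.3f}")
```

Under these assumptions the complex model fits the data it has seen perfectly and predicts new data worse than the line does, because it chases the noise. The tempting response to that bad outcome is to add still more flexibility or go collect more data, which is the loop described above. A toy like this says nothing about any particular real system, of course; it only shows the phenomenon is easy to produce.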

#AI #GenAI #GenerativeAI #LLM #GPT #ChatGPT #LatentDiffusion #BigData #EcologicalRationality #LessIsMore #Bias #BiasBias