Let's talk about "#AI", and about #LLM and #MachineLearning, which I don't put in quotes.
First, I am not anti-science; I am anti-JUNKscience and anti-MARKETING, and there is a difference.
Why can I, someone not in the field, discuss it with some knowledge? I spent over 30 years in #geophysics, #SignalProcessing, and in #geology, #hydrology, and #hydrogeology modelling. People doing this kind of work (along with #meteorology and #Climatology) are the progenitors of the current science. 1/
First, there is no such thing as "#ArtificialIntelligence": it's not artificial, and it isn't intelligence. As a pursuit it is quite real, but what it produces is not intelligence; it is detailed models built from huge amounts of related archived data. The important word here is "model". So every time you see output from #Gemini or #ChatGPT, keep in mind that it is a MODEL of a paragraph on a topic, a MODEL of a picture of a sparrow on a pear, a MODEL of #python code. 2/
Which brings us to an important concept: "All models are wrong, some are useful." The aphorism is decades old (usually attributed to the statistician George Box), but what does it mean? The gist is another aphorism: "The map is not the territory."
A model is a snapshot, based on current information, current assumptions/algorithms, and current allocation of processing resources. "Reality", which is what the models try to represent, is constantly and unpredictably changing. Our data on reality also changes, which in turn changes the modelling. 3/
Large dataset modelling has come a long way since I first walked into the #LANDSAT/EROS facility in Sioux Falls, SD, back in the early '80s. Mouth open, eyes wide. Big tape-drive computers whirring and clicking. The operators describing how they process signal returns into forest models, carbon densities, urban densities, surface temperatures, etc. I was just a student, but fascinated by what they were able to do. Slight historical digression. 4/
Back to topic.
Modelling has changed and improved a lot, mostly due to more powerful computers allowing more brute-force analysis; the math tools really haven't changed that much. Early commercial modelling programs like #Surfer used #Kriging (#Gaussian process regression) on geostatistical data decades ago. Most current geospatial modelling still uses this technique, often enhanced with #BayesianDataAnalysis (read the book by that name). So now the table is mostly set. 5/
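To make that concrete, here is a minimal sketch of Gaussian process regression (the math behind kriging), using scikit-learn's GaussianProcessRegressor. The sample locations and values are invented for illustration, not from any real survey or production workflow.

```python
# Minimal Gaussian process regression (kriging) sketch -- invented data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Hypothetical sample locations (think well positions) and measured values
X_obs = np.array([[0.0], [1.0], [3.0], [4.0], [7.0]])
y_obs = np.array([2.1, 1.9, 3.4, 3.6, 2.8])

# The RBF kernel plays the role the variogram model plays in classical kriging
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-2)
gp.fit(X_obs, y_obs)

# Interpolate at unsampled locations; the GP returns an uncertainty, not just a value
X_new = np.linspace(0.0, 10.0, 50).reshape(-1, 1)
y_pred, y_std = gp.predict(X_new, return_std=True)
# y_std grows as you move away from the data: the model knows where it knows less
```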
The problems historically are:
1. Data heterogeneity - this is why people often preprocess data with Bayesian analysis to determine the reliability (a technical term) and the confidence or credibility intervals for the model. An example: your string of data is 5, 5, 5, 5, 5, 5, 5. How credible is a guess that the next number is 5? Pretty credible, right? Now your string is 2, 87, 36, 11, 7, 43, 24, 9, 75. How credible is any guess for the next number in the string? This can be quantified, as in the sketch below. 6/
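One simple way to quantify it: assume a normal model with a flat prior, so the posterior predictive for the next value is a Student-t distribution. This sketch (the function name is mine) runs the two strings above through that calculation.

```python
# Credible interval for the "next value" in a string -- a minimal sketch.
import numpy as np
from scipy import stats

def predictive_interval(data, cred=0.95):
    """Credible interval for the next value, assuming a normal model with a
    flat prior (the posterior predictive is then a Student-t)."""
    data = np.asarray(data, dtype=float)
    n, mean, s = len(data), data.mean(), data.std(ddof=1)
    if s == 0.0:                           # identical values: no spread at all
        return (mean, mean)
    scale = s * np.sqrt(1.0 + 1.0 / n)     # predictive scale, wider than the data's
    return stats.t.interval(cred, df=n - 1, loc=mean, scale=scale)

print(predictive_interval([5, 5, 5, 5, 5, 5, 5]))              # (5.0, 5.0): guessing 5 is maximally credible
print(predictive_interval([2, 87, 36, 11, 7, 43, 24, 9, 75]))  # roughly (-42, 107): any guess is shaky
```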
Data heterogeneity is why bad data can really screw a model, especially when it doesn't have a lot of homogeneous data for that specific model: that paragraph, code snippet, or face. Bad data can overwhelm "good" data, skewing the model. This is often the problem with #LLM output & it can't be overcome with more processing or better algorithms; it is a math problem. The real issue is how it is marketed. They KNOW this is an issue & could easily provide you, the user, with confidence values. They don't. 7/
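A toy illustration with invented numbers: one corrupted record is enough to drag both the estimate and its apparent spread far off target.

```python
# One bad record overwhelming six good ones -- invented numbers.
import numpy as np

good = np.array([5.0, 5.1, 4.9, 5.0, 5.2, 4.8])  # consistent, "good" data
bad = np.append(good, 500.0)                      # plus a single garbage value

print(good.mean(), good.std(ddof=1))  # mean ~5.0, tiny spread
print(bad.mean(), bad.std(ddof=1))    # mean ~75.7, spread ~187: the bad point dominates
```

And that skew is exactly the kind of thing a reported confidence value would expose.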
Historical problems
2. Edge cases, boundary conditions, data gaps - the further you get from existing data, the less reliable the model output. Every model has a limit. What you include or exclude from the model assumptions and data set affects the outcome. A face is never deep purple? An ocean wave is never higher than 100ft? 200ft? A certain noun is never preceded by a certain adjective? All can be true, until they aren't. 8/
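A sketch of that failure mode, with invented numbers: fit wave height against wind speed on data that tops out around 30 ft, then ask the model about conditions it has never seen. It answers anyway, with no warning that it has left the territory its map covers.

```python
# Extrapolation past the data -- invented wind/wave numbers.
import numpy as np

rng = np.random.default_rng(0)
wind = rng.uniform(5.0, 40.0, 200)             # wind speed, knots
wave = 0.7 * wind + rng.normal(0.0, 2.0, 200)  # wave height, ft (tops out ~30)

slope, intercept = np.polyfit(wind, wave, 1)   # ordinary least-squares fit

# Inside the data range: plausible. Far outside: a confident-looking number
# with nothing behind it.
for w in (20.0, 35.0, 150.0):
    print(w, round(slope * w + intercept, 1))
```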
Final historical problem
3. Overall data reliability and relevance - this is the most important problem. Models must assume the data they use is both reliable and relevant. All the big players are scrambling to gather more data to address the heterogeneity and boundary-condition problems, but the primary problem remains reliability and relevance. All of these data sets are gathered from our online society. How reliable is that? How relevant? 9/
Every "#AI" or #LLM will put out #racist and/or #bigotted models. From facial recognition to societal predictions. This is predictable, because our society itself is racist and bigotted, it literally can't produce models which defy the data. This is why we can't rely on it for policy, for health care decisions, for law enforcement, benefit determinations, informing legislation, or any other use related to society. Fundamentally unusable. 10/

@Okanogen Doesn’t this imply that our current system is also racist and bigoted and can’t be trusted to make decisions for society?

I think you’re comparing these systems to a platonic ideal of a government which can’t exist in the real world.

Seems more useful to compare these hypothetical systems to our current system. I realize that’s far more difficult, but it’s also far more relevant.

@dogzilla
That is one of the points here. These systems, which we know produce racist and bigoted models, prove what people already know: our society is racist and bigoted.
Changing that is hopefully a societal goal that can't be achieved relying on AI or LLM input.

@Okanogen Note that I’m not advocating for AIs here, but I *am* advocating for logical rigor.

If AI government is simply a reflection of society, why exactly wouldn’t we trust it at least as much as we do human government? Conversely, why would one be preferable to the other?

Seems if anything it’d be easier to implement change across an ai-run government, assuming instant government-wide application of a single viewpoint is even a desirable goal. That’d suck with Trump.

@dogzilla
I don't think my point is getting across. It is impossible for any computer model to innovate or change society. The only positive change we will see will come from decisions made by people.
Even worse, so-called AI will only give us racist and bigoted decisions, because it is trained on a racist and bigoted data set, and that CANNOT be fixed without fixing society itself. There is no curating it.

@Okanogen Couple things:
- you’re saying an AI will be bigoted because it’s trained on data from our society, which itself is bigoted. Fair enough. But the alternative is a government made up of bigoted people, so how would AI be worse exactly?

- AIs can’t innovate (yet), but innovation isn’t the goal in, say, the judiciary, or in carrying out many executive or legislative functions. In these cases, uniformity in applying laws is the goal. Seems AI would be better than people there.

@dogzilla
The difference is that racists don't believe our society is bigoted and racist; they claim AI is infallible and use it to further oppress. Which makes AI even more racist, in a spiral.
https://www.theguardian.com/technology/2024/mar/16/ai-racism-chatgpt-gemini-bias
"As AI tools get smarter, they're growing more covertly racist, experts find" (The Guardian): ChatGPT and Gemini discriminate against those who speak African American Vernacular English, report shows.

@Okanogen You just literally described exactly what’s currently happening to the US government.

So I challenge you to describe one way in which AI government would be worse than our actual, real-world government is.

@dogzilla
I challenge you to defend "AI"-driven "predictive policing" as "uniformity in applying laws" rather than a racism accelerator. Example:
https://www.ohchr.org/en/stories/2024/07/racism-and-ai-bias-past-leads-bias-future
See also, via PubMed, "The Algorithmic Divide: A Systematic Review on AI-Driven Racial Disparities in Healthcare": "To address racial disparities in healthcare outcomes, enhanced ethical considerations and regulatory frameworks are needed in AI healthcare applications. Comprehensive bias detection tools and mitigation strategies, coupled with active supervision by physicians, are essential to ensure AI becomes a …"
@Okanogen The question here is “how much better/worse is this than the previous system and why?” I simply don’t believe that all AI systems would have this outcome no matter what, as if this kind of failure were inherent in these systems. That’s just not how they work. I think it’s quite achievable to iterate on this system to dramatically improve outcomes, just like with the system it replaced. The difference is an AI-based system is far more malleable than a non-AI system.