@atpfm take a simpler example: computers have been superhuman at chess for decades now.
If you get the current generation of LLMs to play a game of chess against each other (or indeed against a human player), they will spit out illegal moves, conjure pieces out of thin air, & generally descend into hallucination in relatively short order.
Again, this is a solved problem for computers, yet the LLMs perform terribly. Now imagine the same failure mode in a field you know nothing about.
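The contrast is easy to demonstrate: move legality is trivially cheap to check mechanically. A minimal sketch (knight moves on an otherwise empty board; the function and setup are my own invention, purely to illustrate the kind of validation an LLM doesn't do):

```python
def knight_moves(square: str) -> list[str]:
    """Legal knight destinations from `square` (e.g. "g1") on an empty board."""
    f, r = ord(square[0]) - ord("a"), int(square[1]) - 1
    jumps = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
    return sorted(
        chr(f + df + ord("a")) + str(r + dr + 1)
        for df, dr in jumps
        if 0 <= f + df < 8 and 0 <= r + dr < 8
    )

print(knight_moves("g1"))          # ['e2', 'f3', 'h3']
print("g3" in knight_moves("g1"))  # False: a hallucinated move is rejected instantly
```

A rules engine gives an instant, unambiguous verdict on every proposed move; a text predictor has no such check built in.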
4/4
@atpfm ... because the time horizons for finding out it doesn't work are much longer & the negative outcomes when you are wrong are much more severe than "my program didn't compile".
So we don't have a good way to train AI to improve quickly in fields with the same dynamics as medicine/finance without causing serious harm along the way.
Medicine & finance are so highly regulated because bad advice can sound good to the untrained ear (see Trump suggesting injecting bleach might fight COVID). 3/
@atpfm ChatGPT doesn't have a way to determine what is true. It has a way to determine what is most likely to come next in the sequence. And in the non-programming corpus, a lot of what was on the internet was bad information (that's why experts are useful).
For programming, we had large corpora of good-quality data (GitHub et al.), plus the advantage of being able to run experiments in silico (i.e. does it compile & work?).
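That in-silico check is essentially free to automate. A toy sketch (using Python's built-in `compile`; real pipelines also run tests and linters, but the feedback loop is the same idea, and the function name is mine):

```python
def candidate_compiles(src: str) -> bool:
    """Cheap, instant verification: does generated code even parse?"""
    try:
        compile(src, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

print(candidate_compiles("def add(a, b): return a + b"))  # True
print(candidate_compiles("def add(a, b: return a + b"))   # False -> reject & regenerate
```

There is no equivalent one-line oracle for "is this medical advice safe" or "is this financial advice sound".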
We can't run experiments like that in finance or medicine. 2/
@atpfm Re: the assertion that AIs are experts.
Fundamentally, the issue is that AI needs data, and good-quality data at that. We have trained the current crop of AI on essentially the whole internet, along with some additional bespoke sources. They are text predictors, not fact predictors.
Pre-AI, a large share of the medical and financial advice online was simply wrong. Medical sources said everything was cancer; financial sources gave incredibly bad advice on dealing with debt.
1/
@stroughtonsmith ... but I don't think it's there yet for a number of use cases (e.g. financial reporting, secure software, etc.).
Not because I don't think it can get there, but because the hallucination issue is a real problem, on top of the fact that the AI has been trained on code that contains bugs, many of which are hard to spot.
These two things combined mean that for something like coding an actuarial projection model for regulatory reporting, it's not reliable enough yet. 5/5
@stroughtonsmith ... it will contain more of that community, but it won't be exclusive to that community. So you get some leakage both ways, which harms your sampling of the population you wanted to poll. (For example, I saw this because someone I follow in the Apple community boosted it; even though I code professionally, I don't code in the Apple space. I work in the actuarial and stats space.)
I would also note I'm not an AI detractor, I think it will be a useful tool ... 4/
@stroughtonsmith ... AI professionally, because their work requires them to, yet be vocally opposed to its use and critical of the value it brings.
This is a similar situation for your other categories.
Essentially, the question you asked isn't the question your conclusions require you to have asked (and that's genuinely difficult to manage in a single-question poll).
Then you have the bias from the skew of your reach for this poll. Your following skews towards the Apple community, which will make it ... 3/
@stroughtonsmith ... representative sample of all people who have opinions on AI on Mastodon, nor of all (Apple) coders.
There are a couple of sources of bias that mean it doesn't support your conclusion, chief of which is the assumption that only people in the last category are vocally opposed to AI. Certainly a decent chunk will come from that group, but the other groups will contain detractors too.
For example, someone may use ... 2/
@stroughtonsmith your poll has a decently large number of votes, but in order to take it as a representative sample of the population you have to satisfy a bunch of criteria:
- the poll needs to be randomly distributed among the population you are trying to interrogate
- the question needs to be designed to not introduce bias
- you need to get a sufficient number of responses
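The third criterion is the only one that more votes can fix. A quick sketch of why (standard normal-approximation margin of error for a proportion; the numbers are purely illustrative): sampling error shrinks with n, but the bias from a skewed reach does not.

```python
import math

def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """95% CI half-width for a proportion -- valid only for a *random* sample."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

print(round(margin_of_error(0.6, 1000), 3))  # shrinks as n grows
# But if your reach over-represents one community, observed p = true p + bias,
# and that bias term is untouched by n: 10x the votes, same skew.
```

That's why a big vote count on its own can't rescue a poll whose distribution and question design introduce bias.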
Your poll is certainly a valuable insight into the population of people who you can reach, but it's not a ... 1/