I've seen folks arguing that good and accurate info can come out of "AI" too, so we can't dismiss it as garbage.
This misses the point entirely.
Even if "AI" "says" something 100% accurate, the provenance is still garbage. It's like a broken clock. It's like waiting for the nazi to say something non-offensive and saying "wow they're at least right about some things".
So how do we move forward? We can't entirely put this shit back in the shitter. The models are large, but still small enough for bad actors to keep and continue using even if we somehow banned them.
But there's a lot we can do...
@dalias this is a decent point about LLMs and AI, but it’s going to be solved within the year by the research labs, then probably take another 6 months to get rolled into the FOSS/commercial AI tools
There’s already been decent work on figuring out where LLMs got their info from; the next step is understanding why the model used those sources, then training it to discern which sources to value
Who's got the time for all that, though? And what about the fact that the well of information future AIs draw from is forever polluted by the previous generations?
More importantly, why wasn't the lack of sourcing seen as an issue before the fact, rather than afterward? Every authoritative source in history had footnotes, references, etc. In the digital realm, even Wikipedia has references. So why did the big brains developing AI not take provenance into account?
@darrelplant @dalias because researchers didn’t know LLMs would be able to chat. This was an emergent capability. They weren’t trying to build a chat bot, they were trying to build special-purpose sentiment-analysis, grammar, and translation tools, and chatting took everyone by surprise. LLMs were essentially an accident
Now that they know LLMs can do zero-shot and one-shot learning, they’re working very hard on the provenance/explainability/alignment questions
Pretty sure they knew they would be able to chat before they released products with names like "ChatGPT".
I've been watching attempts at chatbots develop since the late 70s. If the people building tools to write text based on language data had no inkling that their tools could fake holding a conversation, then they are very, very stupid people.
@darrelplant @dalias OpenAI releasing ChatGPT was hugely controversial (and still is) among the people who actually discovered LLMs. OpenAI didn’t invent the underlying research, they just commercialized it. But once something is published research, anyone can use it
Looking at how industries & governments reacted, I don’t think anything would have stopped someone from commercializing LLMs before they were ready. The best we can do now is pressure/regulate new entrepreneurs into not repeating that
I don't know, after all the science-fiction I read and watched, I'm really kind of surprised at how bad they are. It's 2023! Where's my jetpack?
@darrelplant @Techronic9876 I mean if you know how they work it's not surprising.
It's also why sci-fi authors never envisioned "AI" as LLMs - because they're such a ridiculously dumb, obviously "fake" way to do AI, with no intelligence whatsoever.
The programming and data storage details of the positronic brains of my youth were never really specified. I mean, we were still working with punch cards. It was assumed it was going to be something more sophisticated than punch cards and glossed over. "Big, dumb database search" wasn't a thing.
The provenance issue isn't even related to chatbots specifically, though. A traditional search engine basically provides its reference data by way of links to those references. It doesn't verify their validity, but they are provided, just like a freshman college student writing a paper.
Even if an LLM was incapable of chat, and its function was to summarize a topic, or to generate articles to put human writers out of work, IT NEEDS TO PROVIDE REFERENCES.
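To make "provide references" concrete: even a crude attribution pass, scoring candidate source documents by word overlap with a generated claim and attaching the best match as a citation, shows the shape of what's being asked for. This is only a sketch; the documents, the claim, and the Jaccard-overlap scoring are all invented for illustration, and real attribution research uses far stronger methods than word overlap.

```python
def jaccard(a: str, b: str) -> float:
    """Fraction of shared words between two texts (0..1)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# Hypothetical source corpus and a generated claim to attribute.
sources = {
    "doc1": "the eiffel tower is in paris france",
    "doc2": "mount everest is the tallest mountain on earth",
}
claim = "the eiffel tower is located in paris"

# Cite the source with the highest lexical overlap with the claim.
best = max(sources, key=lambda d: jaccard(sources[d], claim))
print(f"{claim!r} [source: {best}]")
```

Even this toy version gives a reader something to check, which is the whole point of a reference.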
@alsothings @dalias there’s already really good work on arxiv on identifying which documents an LLM output comes from, and other work on letting LLMs know the probability of tokens explicitly, and then other work on the output being a system of agent LLMs
If you put all this together you have an AI that can explain itself and explain other things, down to the sources & other possibilities
I’ll be surprised if someone doesn’t have a working demo of this by fall, & an OSS project by next spring
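The "letting LLMs know the probability of tokens explicitly" thread mentioned above boils down to exposing the softmax over the model's raw scores (logits). A toy sketch, where the candidate tokens and logit values are invented and a real model would rank tens of thousands of tokens:

```python
import math

def softmax(logits):
    """Turn raw model scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for four candidate next tokens.
candidates = ["Paris", "London", "Rome", "Berlin"]
logits = [5.0, 2.0, 1.5, 1.0]

probs = softmax(logits)
for token, p in zip(candidates, probs):
    flag = "  <- low confidence" if p < 0.05 else ""
    print(f"{token}: {p:.3f}{flag}")
```

Surfacing these per-token probabilities is one way to give the user a confidence signal instead of a flat, authoritative-sounding answer.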
@Techronic9876 @alsothings "An AI that can explain itself and explain other things, down to the sources & other possibilities"?
No, you do not. You have a probability model that tells you, for a particular word soup, which sources and explanations are most likely, within that model, to have some correlation with the word soup that can plausibly be interpreted by a reader as agreeing with it.
@dalias @alsothings it’s not a word soup, otherwise this would have worked two decades ago with the n-gram models
It’s a multi-thousand-dimensional vector space where the model regresses each dimension into some concept, then maps each token to the intersection of all the concepts it represents
This means the model can infer new concepts by interpolating between points, or extrapolating to new points in the space
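The interpolation/extrapolation claim is easiest to see with the classic word-analogy arithmetic from embedding models (word2vec-style). These 3-dimensional vectors are made up for illustration; real embedding spaces have thousands of dimensions, which is exactly what separates them from simple n-gram counting:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Toy "embeddings": each dimension loosely encodes a concept
# (royalty, maleness, femaleness).
emb = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

# Extrapolate to a new point in the space: king - man + woman ...
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]

# ... and the nearest existing token to that point is "queen".
best = max(emb, key=lambda word: cosine(emb[word], target))
print(best)  # queen
```

Whether that geometric regularity amounts to "inferring new concepts" is, of course, the point of contention in this thread.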
@alsothings @dalias a single LLM would not, but that’s where the systems previously mentioned that are currently being researched will come in
Next time you’re in public and eavesdropping on people’s casual conversations, tell me it doesn’t sound like two ChatGPT agents 99% of the time; people under-appreciate what a really good, precise, interactive word soup can actually do
@dalias @alsothings it’s a functionalist perspective
But the power of a belief system is in its predictive ability. You predict LLMs will stagnate and continue to have poor explainability and reasoning
I predict in about a year and a half, consumer AI will have mostly solved the explainability problem and continue to get better beyond that
I hope whoever is wrong then will update their belief system to reflect what actually happened
@Techronic9876 @alsothings It's *not* a functionalist perspective unless the function you have in mind is convincing people to believe something. Which is generally a malicious function.
In your example of overhearing a conversation, the difference is that it corresponds to a vast network of consistent facts unknown to you but that could become known later, and the GPT garbage doesn't. Having these appear similar to you is a problem not an achievement to celebrate.
@dalias @alsothings the technical aspect can be solved, humans will always fight to use something for good or evil
good people need to keep using and building good AI
@dalias @alsothings I think I have a very clear grasp of the mathematical reality, having done a hundred hours of course work on the topic, thousands of hours of reading, and several days of conference presentations
Saying it’s just a “word soup” demonstrates being out of touch with the mathematical reality