Microsoft paid money for this. A lot of money. And they gave it to us for free.

I'm looking at a demo of this paper right now, which is kind of interesting - https://arxiv.org/pdf/2005.11401.pdf - but... it relies, the same way most AI models do, on a tectonic amount of human curation effort that's gone on behind the scenes to make it work.

I mean, it's nice I guess, and there are some nice features in a low-K-threshold, high-quality-training-data situation, but it sure looks like this will all fall apart if you point it at large, unvetted, or adversarial data sets.
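
To make that worry concrete, here is a toy sketch of top-K retrieval, the kind of lookup step a retrieval-augmented setup leans on. It is not the paper's code: the bag-of-words scoring, the `top_k` helper, and the example documents are all assumptions invented for illustration. It only shows how one keyword-stuffed document can outrank a curated one when K is small.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query, docs, k=1):
    """Return the k documents most similar to the query (hypothetical helper)."""
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

# A small, vetted corpus: retrieval picks the right passage.
curated = [
    "the capital of france is paris",
    "the capital of japan is tokyo",
]
# A made-up adversarial document that repeats the query terms and adds a wrong answer.
stuffed = "capital france capital france lyon"

query = "capital of france"
clean_hit = top_k(query, curated, k=1)
poisoned_hit = top_k(query, curated + [stuffed], k=1)
print(clean_hit)
print(poisoned_hit)
```

With only the curated documents the Paris passage ranks first; once the stuffed document is in the pool it takes the single K=1 slot, and whatever generates the answer only ever sees the wrong text.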

@mhoye I'm curious whether the problem is not the AI, but the expectation of "scaling"... that is, the way we'd need to train AIs is roughly the same way we need to train baby humans: "Here honey, this is a good book, read this one." "I liked this article but I'm not sure how I feel about X." "No no, don't lick the wall socket."

@mhoye also... it seems like most AI people have given up on...

1. Letting the AI ask questions to test its understanding (toddler)
2. Accepting corrections as input (elementary school)
3. Being able to research & cite sources (high school)
4. Being able to say "here's what I don't know" (college)

@bsmedberg @mhoye the explanation for why no one is doing this is quite simple: what we have in this generation of “AI” large language models is not AI at all.

It cannot learn. It cannot know. It cannot understand. It cannot cite sources because it does not know what a source is. It would not gain value from those kinds of questions.

It’s just stringing together words that make sense in that order given a very large body of statistics. That’s it. It is not anything resembling intelligent.
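
As a deliberately crude illustration of "stringing together words from statistics", here is a bigram sketch. Real LLMs are neural networks over tokens with far longer context, so this is a toy version of the claim, not a description of how they are built. The corpus and the `train_bigrams` and `generate` names are made up for the example.

```python
import random
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Sample up to `length` words, each drawn from the followers of the previous word."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = counts.get(out[-1])
        if not followers:
            break  # this word was never followed by anything in training
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights)[0])
    return " ".join(out)

corpus = "the cat sat on the mat the cat ate"
model = train_bigrams(corpus)
sentence = generate(model, "the", 6)
print(sentence)
```

Every adjacent pair in the output appeared somewhere in the training text. Nothing in the table records whether a sentence is true or where it came from.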

@trisweb @bsmedberg I agree with your premise, but I don't think that's the general question at hand. It's very possible to build tools and tool chains that are to some degree stochastically self-improving, for values of "self" that belong in waggly-fingers quotes; there's a path from "can you fashion a crude lathe" to modern precision machining. That increased-precision specialization isn't what current-AI types are after, though; they're looking for universal generalization (and getting mud.)

@mhoye @bsmedberg right, yeah. What we don’t really know yet, and what will be interesting to find out, is whether the very premise of the current round of “AI” LLMs is fundamentally incompatible with that kind of development, or whether they could actually be a path to more generalized intelligence and human-like characteristics.

It’ll still be more and more useful the more “extensions” we can add to the language, and maybe we’ll get close. Just hard to say right now.

@trisweb @mhoye do they need to be tied together? I’d love more traditional ML systems with iterative training, less certainty, and any kind of explanation pattern or feedback loop with the underlying features.

@trisweb @mhoye @bsmedberg I can see how LLMs could be an engine in an AGI. You set it up in a feedback loop that takes in external inputs and can output info from its loop as it pleases. You have sub-steps where you feed in the last N tokens and summarize the context, analogous to short-term working memory, and you have a database system for long-term memory that the AI can read from and write to each cycle, to bring long-term memory into short-term memory. Some people believe human-level intelligence arose from language and that consciousness is the feedback loop of us thinking about our own thoughts, so it's not the worst place to start. Once you can get something like that to start doing logical reasoning in a way that's meaningfully better than what the base LLM is spitting out, you're probably most of the way there.
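
A minimal sketch of that loop, under heavy assumptions. The `LoopAgent` class and the `echo_llm` stub are hypothetical, a plain dict stands in for the long-term database, and the "summarize the last N tokens" step is swapped for simply keeping the last N entries. It shows the plumbing only and says nothing about whether reasoning would emerge from it.

```python
from collections import deque

def echo_llm(prompt):
    """Hypothetical stand-in for a real model call; it just reacts to the last prompt line."""
    return "thought about: " + prompt.splitlines()[-1]

class LoopAgent:
    def __init__(self, llm, window=4):
        self.llm = llm
        self.short_term = deque(maxlen=window)  # last N entries, standing in for a summary
        self.long_term = {}                     # dict standing in for a database

    def step(self, external_input):
        # Read: pull long-term memories keyed by any word in the new input.
        recalled = [self.long_term[w] for w in external_input.split()
                    if w in self.long_term]
        # Build the prompt from recalled memories, working memory, and the new input.
        prompt = "\n".join(recalled + list(self.short_term) + [external_input])
        output = self.llm(prompt)
        # Feed the loop's own output back into working memory.
        self.short_term.append(external_input)
        self.short_term.append(output)
        # Write: store the output under the first word of the input.
        self.long_term[external_input.split()[0]] = output
        return output

agent = LoopAgent(echo_llm, window=2)
first = agent.step("socket is dangerous")
second = agent.step("socket again")
print(first)
print(second)
```

On the second step the stored "socket" memory is read back into the prompt before the model is called, which is the long-term-into-short-term move described above.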

@trisweb @bsmedberg @mhoye indeed, LLMs are just really good pattern matchers that got really, really good all of a sudden, and people's imaginations are running wild. Making it into something more is gonna take work.