I think a thing most people do not understand about chatbots is HOW SIMPLE THE UNDERLYING MODELS are.

It's really just statistics of how close words are to each other in a huge ass amount of written text, with a sprinkle of classification and labeling (done by humans).

And then it autocompletes your prompt. That's it, that's really all there is to it.
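The "word statistics plus autocomplete" framing can be sketched in a few lines. This is a toy bigram model, not how modern LLMs actually work (they use transformers over subword tokens), but it illustrates the bare idea of "predict the next word from counts":

```python
from collections import Counter, defaultdict

# Toy illustration of the "statistics of nearby words" idea:
# count which word follows which in a corpus, then "autocomplete"
# by always picking the most frequent follower.
corpus = "the cat sat on the mat the cat ate the fish".split()

follower_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follower_counts[prev][nxt] += 1

def autocomplete(word, steps=3):
    out = [word]
    for _ in range(steps):
        if word not in follower_counts:
            break
        # greedily take the most frequent next word
        word = follower_counts[word].most_common(1)[0][0]
        out.append(word)
    return " ".join(out)

print(autocomplete("the"))  # "the cat sat on"
```

A real model replaces the count table with billions of learned parameters and a much longer context, which is where the disagreement in this thread starts.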

@thomasfuchs to the extent that computers are just large collections of on/off switches, yes. But in both cases those descriptions hide the tremendous tapestry of ideas and power that comes from the mastery of how things connect with each other.
@Migueldeicaza I mean it’s essentially a very powerful search engine, with limitations due to the statistical nature of it.

@thomasfuchs yes it has limitations, but even the proximity is governed these days by very interesting and sophisticated systems - it is now very far from “what’s the most likely word given the words behind me”

Like video games, which are just piles of hacks combined but still manage to deliver an immersive experience: these piles of hacks amount to very useful tools.

@Migueldeicaza I guess what I’m saying is that just because it creates complex results doesn’t mean it is at all complex itself; it’s very easy to trick yourself into attributing more to it (e.g. sentience) than is actually there.

Which makes chatbots into perhaps useful but also highly dangerous tools.

@thomasfuchs I see what you mean - absolutely, some folks are really falling for it.

@Migueldeicaza @thomasfuchs

Yeah, there are tons of things to complain about the AI industry for, but it’s been a long time since LLMs operated as simply as that. There is a lot going on there.

And regardless of what’s happening behind the scenes, the capabilities and usefulness for specific work has gone up exponentially.

I get having serious issues with LLMs, but this specific type of criticism always strikes me as a reflection of unfamiliarity with the current state of the harnesses and models.

@scottwillsey @Migueldeicaza @thomasfuchs An unfamiliarity that is hard to remedy because, unless you are working in this field, you see very few in-depth discussions of what the current state is all about. Lots of noise, very little signal.

@sandorspruit @scottwillsey @Migueldeicaza How exactly is it not inferring the next token based on the previous tokens by using a statistical model?

I’d like to add that I have not made any statement about the usefulness, veracity or applicability of LLMs in my OP.

@sandorspruit @Migueldeicaza @thomasfuchs

Just using the tools will tell you people are speaking of that which they know not. 😄🤷

It’s not unknowable at all.

@thomasfuchs @Migueldeicaza yes, but the search index is lossy-compressed. It pulled the factual pixels into more layers of search. It encoded ‘the world’ with 2000-era MP3s stolen from LimeWire.

@onyxraven @Migueldeicaza For a lot of applications it doesn't matter if it's lossy or wrong sometimes.

For example you could use it as a web index, but before showing results to the user you cross check with a relational database of links to filter out any made-up links.

I really want someone to do this because it would be immensely powerful.
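The cross-checking idea above can be sketched with a small relational table of known-good links. The table name, schema, and URLs here are made up for illustration; a real system would populate the table from a crawler:

```python
import sqlite3

# Hypothetical sketch: keep a relational table of links known to
# exist, and filter any model-suggested URLs against it before
# showing results, so made-up links never reach the user.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE known_links (url TEXT PRIMARY KEY)")
db.executemany(
    "INSERT INTO known_links VALUES (?)",
    [("https://example.com/a",), ("https://example.com/b",)],
)

def filter_made_up(candidate_urls):
    """Keep only URLs that actually exist in the known-links table."""
    return [
        u for u in candidate_urls
        if db.execute("SELECT 1 FROM known_links WHERE url = ?", (u,)).fetchone()
    ]

suggested = ["https://example.com/a", "https://example.com/made-up"]
print(filter_made_up(suggested))  # ['https://example.com/a']
```

The point is that the generative step proposes and the relational step verifies, so the lossiness of the model stops mattering for link accuracy.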

@thomasfuchs @Migueldeicaza agree - these are some of the amazing uses of vector encodings and transformers - expanding actual relevancy in the search index, related terms, etc - a better paradigm than just tf-idf alone. The problem is that the attention is on using JUST the compressed data :/
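A tiny sketch of why vector encodings beat exact-term matching for relevancy: the hand-made "embeddings" below are assumptions purely for illustration (real ones come from a trained model), but they show how related terms can match without sharing any words, which tf-idf alone cannot do:

```python
import math

# Made-up toy embeddings; only their relative geometry matters here.
emb = {
    "car":    [0.90, 0.10, 0.00],
    "auto":   [0.85, 0.15, 0.00],
    "banana": [0.00, 0.10, 0.90],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# "car" and "auto" share no terms a tf-idf match could hit on,
# yet their vectors are nearly parallel; "banana" is far from both.
print(round(cosine(emb["car"], emb["auto"]), 3))    # close to 1.0
print(round(cosine(emb["car"], emb["banana"]), 3))  # close to 0.0
```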
@onyxraven @Migueldeicaza I’m just begging for anything better than current search engines 😭

@thomasfuchs @Migueldeicaza oof. Yeah. It felt like they were getting so good.

I’ve dealt with two kinda difficult search domains in my career. I kinda want to try again.

First was helping at Photobucket - we were trying to blaze a new path with a distributed compute index. Had Solr atop HBase and did some pretty cool stuff. There was a lot more we could have done there as those domains really matured.

Now we have a tough domain in consumer products at Ibotta - we are absolutely leaning in on the vector similarity stuff. Inference/imputing on sparse input data is really interesting.