One of the things I find so interesting about large language models like GPT-3 and ChatGPT is that they're pretty much the world's most impressive party trick.

All they do is predict the next word based on previous context. It turns out that when you scale the model above a certain size, it can give the false impression of "intelligence", but that's a total fraud.
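To make "predict the next word based on previous context" concrete, here's a deliberately tiny sketch: a bigram frequency model that guesses the next word from the single previous word. This is an illustrative toy, not how GPT-3 works (real LLMs use transformers over long contexts), but the core loop of "given context, emit the most likely next token" is the same idea.

```python
# Toy next-word predictor: count which word follows which, then predict
# the most frequent follower. A vastly simplified stand-in for an LLM.
from collections import Counter, defaultdict

def train_bigram(text):
    words = text.split()
    model = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        model[prev][nxt] += 1
    return model

def predict_next(model, word):
    counts = model.get(word)
    if not counts:
        return None
    # Pick the word that most often followed `word` in training data.
    return counts.most_common(1)[0][0]

corpus = "the cat sat on the mat the cat ate the fish"
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" follows "the" most often here
```

Scale that idea up by many orders of magnitude (parameters, data, context length) and you get the party trick in question.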

It's all smoke and mirrors! The intriguing challenge is finding useful tasks you can apply them to in spite of the many, many footguns.

And in case this post wasn't clear: I'm all-in on large language models. They confidently pass my personal test for whether a piece of technology is worth learning:

"Does this let me build things that I could not have built without it?"

What I find interesting is that - on the surface - they look like they solve a lot more problems than they actually do, partly thanks to the confidence with which they present themselves.

Figuring out what they're genuinely good for is a very interesting challenge.

@simon

The way I look at it, machine learning in general (including these large language models) is great when your problem meets the following criteria:

#1: You need to build a pattern matcher.
#2: You don't know what to look for.
#3: Once the pattern matcher is built, you don't care to know what it actually looks for.
#4: The results are allowed to be hilariously, insanely wrong some % of the time.

And there are actually a lot of things that match those criteria.

@ncweaver @simon I think there's a lot of gaming and entertainment applications specifically for experiences that are hard to script. Imagine an offline game where you can bargain or reason with NPCs in natural language, and sometimes they say something dead stupid, but that's part of the charm.