A lot of the current hype around LLMs revolves around one core idea, which I blame on Star Trek:

Wouldn't it be cool if we could use natural language to control things?

The problem is that this is, at the fundamental level, a terrible idea.

There's a reason that mathematics doesn't use English. There's a reason that every professional field comes with its own flavour of jargon. There's a reason that contracts are written in legalese, not plain natural language. Natural language is really bad at being unambiguous.

When I was a small child, I thought that a mature civilisation would evolve two languages. A language of poetry, that was rich in metaphor and delighted in ambiguity, and a language of science that required more detail and actively avoided ambiguity. The latter would have no homophones, no homonyms, unambiguous grammar, and so on.

Programming languages, including the ad-hoc programming languages that we refer to as 'user interfaces', are all attempts to build languages of the latter kind. They allow the user to unambiguously express intent so that it can be carried out. Natural languages are not designed, and end up being examples of the former.

When I interact with a tool, I want it to do what I tell it. If I am willing to restrict my use of natural language to a clear and unambiguous subset, I have defined a language that is easy for deterministic parsers to understand with a fraction of the energy requirement of a language model. If I am not, then I am expressing myself ambiguously and no amount of processing can possibly remove the ambiguity that is intrinsic in the source, except a complete, fully synchronised, model of my own mind that knows what I meant (and not what some other person saying the same thing at the same time might have meant).
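
To make this concrete: here is a minimal sketch, in C, of the kind of deterministic parser I mean, for an invented two-command subset (the command names are made up for illustration):

```c
#include <stdio.h>
#include <string.h>

/* A deterministic parser for a tiny, unambiguous command subset.
 * Grammar (invented for illustration):
 *   command := "stop" | "play " <name>
 * Every valid input has exactly one meaning; everything else is an error. */
static void handle(const char *input)
{
    if (strcmp(input, "stop") == 0) {
        printf("stopping playback\n");
    } else if (strncmp(input, "play ", 5) == 0 && input[5] != '\0') {
        printf("playing: %s\n", input + 5);
    } else {
        /* No guessing: unknown input is rejected, never 'interpreted'. */
        printf("error: unrecognised command '%s'\n", input);
    }
}

int main(void)
{
    handle("play Coltrane");    /* playing: Coltrane */
    handle("stop");             /* stopping playback */
    handle("maybe some jazz?"); /* error: no guess is made */
    return 0;
}
```

A few string comparisons, no model weights, and the behaviour is identical on every run.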

The hard part of programming is not writing things in some language's syntax, it's expressing the problem in a way that lacks ambiguity. LLMs don't help here: they pick an arbitrary, nondeterministic option for the ambiguous cases. In C, compilers do this for undefined behaviour, and it is widely regarded as a disaster. LLMs are built entirely out of undefined behaviour.
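
To see why the analogy bites, here is the classic signed-overflow example (a minimal sketch; the exact outcome depends on compiler and flags, which is the point):

```c
#include <limits.h>
#include <stdio.h>

/* Signed integer overflow is undefined behaviour in C. The compiler may
 * assume it never happens, so it is free to 'optimise' the check away. */
int will_overflow(int x)
{
    /* Intended meaning: "does x + 1 wrap around?". Because x + 1 cannot
     * overflow in a valid program, many compilers fold this to 0. */
    return x + 1 < x;
}

int main(void)
{
    /* At -O2, GCC and Clang typically print 0 here, even though a wrapping
     * machine addition would make INT_MAX + 1 negative. */
    printf("%d\n", will_overflow(INT_MAX));
    return 0;
}
```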

There are use cases where getting it wrong is fine. Choosing a radio station or album to listen to while driving, for example. It is far better to sometimes listen to the wrong thing than to take your attention away from the road and interact with a richer UI for ten seconds. In situations where your hands are unavailable (for example, controlling non-critical equipment while performing surgery, or cooking), a natural-language interface is better than no interface. It's rarely, if ever, the best.

@david_chisnall ...that's also one of its strengths; language is a completely different "beast" than math, and comparing them is useless. Language fulfills different functions than math, but ones just as important for human beings.

@ErikJonker @david_chisnall

That's the point. Giving directions to a machine requires math. Using the term "language" to refer to machine instructions has led people down the wrong path over and over again, producing monstrosities like COBOL and Perl and "Congratulations, you have decided to clean the elevator!"

@resuna @david_chisnall COBOL is not a monstrosity as a programming language, it's of course legacy
@resuna @david_chisnall …a natural-language interface for a computer can have enormous benefits. A good example is an educational context: you can interact, ask questions, etc. in a way not possible before

@ErikJonker @david_chisnall

Even communicating with other humans in natural language leads to confusion, and humans are much much better at dealing with ambiguity than any computer.

@resuna @david_chisnall of course, but current AI models can provide a level of education that scales easily; they will supplement humans in their roles and sometimes replace them. Current models can perfectly help students with high school math, even with some ambiguity

@ErikJonker @david_chisnall

The software that people refer to as "AI" is nothing more than a parody generator, and is really really bad at dealing with ambiguity. It's a joke. If you actually think that it is capable of understanding, then it has been gaslighting you.

@resuna @david_chisnall I actually know how these models work; it's not about intelligence and understanding. They are just tools, but very good ones in my own experience
@resuna @david_chisnall ...if you've tried GPT-4o or a tool like NotebookLM then you know they are more than parody generators. It doesn't help to deny the capabilities of these technologies, especially because there are real risks/dangers with regard to their use
@ErikJonker @resuna @david_chisnall now take your comments and substitute like this: "I find English and German very useful for work. It doesn't help to deny the capabilities of *natural languages*, especially because there are real risks/dangers with regard to their use". At times language appears as outer thought, but do not mistake it for decisive thought. As a centralized source for inquiry and digestion, LLMs are far more dangerously illusory than the natural languages used by billions.

@ErikJonker @david_chisnall

They purely operate on text patterns, they do not reason, they do not build models, they just glue tokens together. There is nothing in their design to do any more than that. This is an inherent feature of any example of this class of programs.

@ErikJonker @resuna @david_chisnall

Huh? Perfectly?

There have been multiple instances of people showing LLMs getting answers wrong to the most basic arithmetic problems. That's not a bug, it's an inherent feature of the model, which draws meaning from language only and has no concept of maths.

That incorrectness can only get more likely as math problems get more complex. And the more complex it gets, the harder it is for humans to detect the errors.

How is that perfect for education?

@naught101 @resuna @david_chisnall as a support tool during homework, where it can give additional explanation, I see a bright future for the current best models (for high-school-level assignments); for text-based tasks they are even better (not strange for LLMs). Of course people have to learn to check and not fully trust; at the same time there is a lot of added value. It's my personal/micro observation, but I see it confirmed in various papers

@ErikJonker @naught101 @resuna @david_chisnall

> Of course people have to learn to check and not fully trust

This is what makes them particularly ill-suited for educational tasks. A large part of education on a subject is developing the ability to check, to have an intuition for what is plausible.

@RAOF @naught101 @resuna @david_chisnall true, but you can adapt and fine-tune models for that purpose

@ErikJonker @naught101 @resuna @david_chisnall can you? How? Is there an example of this that you have in mind, or is this more a “surely things will improve” belief?

What is the mechanism that bridges “output the token statistically most-likely to follow the preceding tokens” and “output the answer to the student's question”?
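
To spell the first mechanism out: a toy greedy decoder looks roughly like this (the vocabulary and probability table are invented; a real LLM computes the probabilities with a network, but the shape of the loop is the same):

```c
#include <stdio.h>

/* Toy greedy decoder over a 4-'token' vocabulary. prob[i][j] is the
 * (made-up) probability that token j follows token i. The loop just picks
 * the most likely continuation, appends it, and repeats. */
#define VOCAB 4
static const char *tokens[VOCAB] = { "the", "cat", "sat", "." };
static const double prob[VOCAB][VOCAB] = {
    { 0.05, 0.80, 0.10, 0.05 },  /* after "the" */
    { 0.05, 0.05, 0.80, 0.10 },  /* after "cat" */
    { 0.10, 0.05, 0.05, 0.80 },  /* after "sat" */
    { 0.70, 0.10, 0.10, 0.10 },  /* after "."   */
};

int main(void)
{
    int cur = 0; /* start from "the" */
    printf("%s", tokens[cur]);
    for (int step = 0; step < 3; step++) {
        int best = 0;
        for (int j = 1; j < VOCAB; j++)
            if (prob[cur][j] > prob[cur][best])
                best = j;
        cur = best; /* greedy: the statistically most-likely token wins */
        printf(" %s", tokens[cur]);
    }
    printf("\n"); /* prints: the cat sat . */
    return 0;
}
```

Nothing in that loop represents whether the emitted text is *true*; "the answer to the student's question" is not a concept the mechanism has.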

@ErikJonker @naught101 @resuna @david_chisnall also, isn't the task you're suggesting is possible just equivalent to “make an LLM you don't need to check the results of”?

@ErikJonker @naught101 @resuna @david_chisnall

The support students need for their work is A HUMAN BEING WHO IS GOOD AT UNDERSTANDING AND EXPLAINING THINGS.

A good teacher/tutor/sibling/etc can break down an explanation and present it in different ways tailored to the student's understanding. They can look at a student's work and even if it's incorrect, see what the student's train of thought was and understand what they were trying to do.

Our society already drastically undervalues that crucial, mind-accelerating work-- arguably the most important of all human endeavors, as everything else relies on it.

Glorified stochastic parrots spewing botshit are no damned substitute.

@ErikJonker @david_chisnall

My first job was programming in COBOL before it was legacy. It is terrible. It always was terrible. It's not a natural language, it's not ambiguous, but trying to make it look like a natural language was an unmitigated disaster at every level. The same is true of Perl's "linguistic" design. Even just pretending to be a natural language spawns monstrosities.

Edit: see also AppleScript.

@resuna @david_chisnall it is extremely stable and durable for sure, ask any financial institution 😃
@ErikJonker @resuna @david_chisnall OK, but that's not because of COBOL. You could write something durable and stable in any programming language. Financial software is written in COBOL because that was the language of the mainframe at the time. The fact that it's still largely in COBOL is because it's expensive to rewrite, the returns on a rewrite are hard to quantify, and the risks are huge.

@ErikJonker @david_chisnall

Have you ever written any code in COBOL? Everything in COBOL takes longer to write, the fundamental operations are simplistic and verbose, the program structure is stilted and restrictive, the way you define data structures is horribly antiquated, and a huge number of the problems that make writing COBOL so slow and painful are due to its mistaken "language like" design.

@david_chisnall

Maybe in the Star Trek world they use an unambiguous logical language, easy to translate into any unambiguous language, but we lose that in the English translation.

That would go a long way towards explaining the universal translator ;)

@Aedius @david_chisnall The Star Trek voice computers always seemed to react like very advanced #InteractiveFiction parsers. They would respond to a wide variety of queries and commands, but most actors spoke in a very constrained way and often the computers would prompt further or give error messages.

So you'd get back "please specify which of the 3271 bulkheads you wish to open" or "Unable to comply" and you'd have to re-phrase your order. There's a famous scene where Lwaxana Troi is frustratedly pulling a sausage out of a frozen margarita and saying "Oh I am just TERRIBLE with computers...", suggesting that interaction with them was still a skill that needed training or practice.

@Aedius @david_chisnall And it made sense in that world that a crew of effectively Space Navy officers and enlisted spacers had been taught to be precise in their language, as so many of the tasks in a crew setting like that are performative speech. You need to give unambiguous commands to shipmates AND equipment, and need to quickly give updates like "standing by" or "aye, sir" or "probe launched: contact in 634 seconds...".

It was so well covered that the ambiguities we see are always jarring. When someone is standing in a crowd of 10 people, and a mix of crew and locals are coming up and staying down, the call is always "four to beam up" and *somehow* the transporter operators know which four. How? Well, that's just what filmmakers call "shoe leather": the kinds of details and procedure that would take so long that they distract from the story without even giving any useful flavour.

@spacehobo @Aedius @david_chisnall

> When someone is standing in a crowd of 10 people, and a mix of crew and locals are coming up and staying down, the call is always "four to beam up" and *somehow* the transporter operators know which four. How?

I'd also point out the kind of domain-specific machine-learning pattern matching that I've used in the real world to great effect -- in 99% of those scenes I can remember, the people who are to be beamed up are standing still and looking slightly upward, such that if you saw only a screenshot of the episode you'd probably pretty easily pick out the four people who were supposed to be beamed up.

It would not surprise me to find that (in the fictional world, natch) the computer underlying the transport operator's interface is highlighting a selection of people in a given target zone who are standing in the "beam me up" posture.

Contrariwise, most emergency beam-ups seem to involve either much more specific commands or "all crew in the area, only".

@spacehobo @Aedius @david_chisnall the best Star Trek computer user interface is the one Scotty uses in the fourth feature film...
@Pionir @Aedius @david_chisnall The best of everything in Star Trek was in the fourth film.
@Aedius @david_chisnall look up Lojban. Also, while I think the idea of NLP as programming isn't tapped out, it's also a bad plan to do this for humans. Humans are good at communicating gaps in logic. I feel like that's important.

@david_chisnall I think the unfortunate challenge is that some people do not want tools, they want servants or slaves. LLM holds that prospect for them.

What happens when a lowly servant misinterprets the intent of a command? They are blamed for it irrespective of ambiguity.

@tim @david_chisnall
“some people do not want tools, they want servants or slaves. LLM holds that prospect for them.” - YEP.

… and that’s why businesses are “ecstatically (re)placing” Human Intelligence (HI, adaptability) with unethically sourced and environment-destroying Artificial Intelligence (AI).
Related - https://mastodon.social/@dahukanna/113741679088044261

@david_chisnall I'm not so sure. I often express myself in natural language to ask people to do things, and that usually works out pretty well.

So it's possible in principle, it's just not something that computers can do yet. Maybe one day they will.

@jarkman When I ask people to do things in natural language, it often fails miserably (the people do the wrong thing). You must be interacting with very capable people or your natural language is very precise and unambiguous.

@rspfau @jarkman
Or, third option, you and your audience share sufficient context to resolve the ambiguities or at least make the right interpretation easy to infer.

This is a feature that no LLM is ever likely to possess, as that would require its training set to mirror your own experience and training, and also requires the kind of generalized, informed judgement that humans routinely do but LLMs purely suck at. Statistical next-word prediction is no substitute for a mental model of actual meaning.

That LLMs routinely produce utterly confident, wrong results is one of their core dangers. Humans at least know (mostly) to say, "I'm really not sure about this," before speculating.

@n1xnx @rspfau Sure, LLMs are terrible at many things right now. Maybe they'll never get better, maybe future machines will come to learn the world more richly and deeply.
@jarkman @rspfau
I'm sure they *will* get better, but absent some quantum leap in computational and storage capabilities, and some truly horrendous amounts of energy input, I don't think they can ever become good enough to be trustworthy. (Just enough to be a real-world example of "a little knowledge is a dangerous thing.")
@jarkman @david_chisnall
Because people actually do have a synchronized model of how other humans' brains work, and those who know you have a model that matches your particular brain. I think it's called "mirror neurons".

@jarkman @david_chisnall

You can only ask them to do one of a few different things. Imagine actually picking randomly from the space of commands. Humans expect, and know to perform, only a few items from a very restricted set of possible things to do.

My experience, shared with many neurodivergents, is that neurotypicals and even other neurodivergents very often misunderstand us, and vice versa, and the misunderstandings are occasionally very hard to recover from, becoming another source of discrimination against minorities. AFAICT, minds that work in one way build thoughts in ways that don't carry over very well to minds that work in other ways, especially when there isn't awareness of and tolerance for the differences. I've known people who can understand and "translate" expressions of thoughts in ways that enable people with different mind structures to communicate more effectively. It's an amazing skill. I wonder whether LLMs extend the experience of facing frequent misunderstandings to a majority of people, or whether they could help people translate between different mind structures and different perceptions of context, and avoid triggers.
@david_chisnall LLMs play directly into business guys' IT wet dreams. If you've ever developed software for business guys, you realize that the biggest problem is figuring out what the business guy wants. Even on those rare occasions when the business guy knows what he wants, his ability to express it in a manner that can be converted into software is limited at best. For these guys, the idea of LLMs magically interpreting their wants and needs is wondrous.
Just like the no/low-code solutions they were peddled in the previous promise cycle.

@glennsills @david_chisnall

You know... When you put it like that, I'm suddenly way more in favour of LLMs.

@david_chisnall Although I get your point, you seem to miss a very important thing: there are lots and lots and lots of people who simply *can't* go beyond natural language. I know many people with little to no math knowledge who were first exposed to computers and modern technologies in their fifties. Would they be able to formulate proper requests, as you suggest? Maybe, after an extremely steep learning curve, lots of tears, sometimes panic attacks and such. Do we want this, say, for our parents? Probably not. At least, I don't. Being a developer myself, I wish my mother had an interface best suited to her needs.
@menelion
"Language" doesn't have to mean spoken language. Look at old UI designs - most of the key elements carry an obvious meaning, often conveyed through visual means as much as words. This, too, is language. The crucial thing here is that it's unambiguous. Knowing the design elements enables the user to predict what an action is supposed to do and a developer to understand what their users would want to happen. It's become rather fashionable to break that pattern, though...
@david_chisnall

@DL1JPH @menelion @david_chisnall I once had a translation job where it looked like the marketing department had got their hands on the UI strings and decided that the thing had to sound jovial and human, so that the users could "relate" to it.

No screenshots, no context, no access to the development version of the program.

Translating that garbage was tough, lots of creative guessing about what they might actually mean.

@menelion @david_chisnall
Which itself is also an issue; should this natural interface merely execute a best guess, or should it educate its users as to what their choices are, what they mean, and what the consequences of them are?

If the latter, then it may be effective, but in the former case it would be a disaster, too.

@phil @menelion @david_chisnall

This is a great point, and I think a good LLM would probably ask clarifying questions if needed, which would help it locate the question within a specific context, before answering.

The fact that current LLMs don't ever do this says a lot about how they interpret language, or at least about how they always assume one interpretation even when more than one might be available.

@naught101 @menelion @david_chisnall I have difficulty believing that they even have a notion of what interpretation is; to them it's just bits and bytes, rather than a transformative process.

@phil @menelion @david_chisnall

Sure, feel free to interpret my use of those words metaphorically ;)

@menelion @david_chisnall
> I wish my mother had an interface best suited to her needs

Yeah, that could be designed; it's unlikely to involve an LLM, and if it does, it'll be for some narrow functionality

@david_chisnall Early morning and pre-coffee, and my mind went to Luxembourgish, which is primarily a Germanic language that added in French to deal with legal and administrative issues. I don't speak it, but I think this combination is possibly the best way to obfuscate meaning. Otherwise I agree with your premise and can only add that language exists in a context. LLMs sorely lack the context for language to be completely understood.

@david_chisnall
> (and not what some other person saying the same thing at the same time might have meant)

see "Pierre Menard, Author of the Quixote" by Borges

@david_chisnall

Arthur C. Clarke had an example of the over-engineering trap. Solve this problem: allow a farmer to direct a draft-horse to turn left or right on command.

1/

@david_chisnall

Solution 1: genetically engineer the horse to enhance its intelligence. Teach the horse a college level of English-language comprehension, so that the horse can understand commands like "It is time to vector your course in the widdershins direction", "go thataway, stupid!", or whatever phrasing suits the passing fancy of the farmer at that moment. The problem will be solved after a few decades of over-engineering, and each draft-horse will cost a quarter of a billion dollars.

2/

@david_chisnall

Solution 2: take an off-the-shelf draft-horse. Teach it that "Haw" means turn left and "Gee" means turn right. Teach the *farmer* to employ this user interface when directing the horse. Problem solved.

3/