Specific line of thought to illustrate my general point:
Consider an LLM that helps manage email correspondence. It writes emails! It summarizes emails! Less reading! Less typing! More messages faster! Productivity boost!! Except:
- You have to babysit the LLM, guide it and check it to make sure it’s accurately preserving human intent (which is, after all, the whole point of communication…right??). That’s new work, and likely cancels out the slim time savings of reduced reading and typing.
2/
- But it's an LLM, so it’s still often wildly, convincingly incorrect. Miscommunication increases. Miscommunication has costs. Miscommunication generates new work. Which now gets done faster! And generates yet more work!
- IT staff has to administer the LLM, support the LLM, evaluate vendors, yada yada.
- People have to maintain the LLM itself, and the infra that supports it. Those costs are •large•.
3/
And if by some magic all of this actually spins up and gets working, then (1) the barrier to communication decreases (why not just send another email if it’s automated?), (2) individual communication load increases (because you can answer emails at a faster rate), and (3) the net efficiency of communication decreases (because of everything in the previous two posts).
Sound and fury, signifying nothing.
4/
I severely doubt many real orgs measure actual desired large-scale outcomes well enough to spot that net efficiency decrease. All this is going to look like increased productivity. Will •be• increased productivity in the ways that most folks actually measure it.
But here, with the bird’s-eye view of a hypothetical, it’s clear: the total amount of work happening to achieve the same ends has •increased•.
5/
I said “reduce total workload.” What are some things that accomplish •that•?
“Do we really have the problem we think we have?”
“There’s a simpler way.”
“Work from home!”
“Hmm, I’m going to think about my reader, and edit for clarity and emotional impact before sending this email.”
“We’re willing to pay for experience / expertise.”
“Things are going well. Head home for the day!”
“Maybe we don’t need to do this thing anymore. We can just choose not to have this problem.”
6/
A lot of things that get billed as a productivity boost sound suspiciously to me like recipes for reducing operational slack and thus “going solid”: https://en.wikipedia.org/wiki/Richard_Cook_(safety_researcher)#%22Going_solid%22
7/
As both a software developer and a teacher, I’m increasingly interested in figuring out which costly things are avoidable, or can be simplified, or •just don’t matter•…and then doing less of them.
Breathing room can be a form of efficiency too. And it’s a more humane one.
Less about tools that boost productivity, more about tools that reduce total workload.
/end
@inthehands So much this! 💯
These LLM tools just lack so much context!
- What is actually important for the person who receives the email?
- What is actually important in this wall of text in the current context?
I've actually just done an experiment that shows this (with Chat LMSys):
Some details in the next posts...
1/3 (wow, a thread within a thread🤯)
@inthehands I've given it the following prompt:
"Please summarize the following text in max 4 sentences:"
and then I've given it the pure text of the following blog post:
https://blog.rust-lang.org/2024/05/17/enabling-rust-lld-on-linux.html
There is a summary at the end of the actual blog post (that's what makes this experiment so interesting!), _which is not part of the prompt_.
Please see the image below:
2/3
@inthehands ...and now have a look at the #AI summary below by GPT-4o and Gemini 1.5.
While the summary is accurate (this time!), the most crucial bit, how to disable this new linker, is missing from it (see image below).
This is why context and details matter, which #LLMs will always miss!
Writing requires #empathy - an #LLM lacks it.
3/3
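The experiment in this sub-thread could be sketched roughly as follows. This is only an illustration, not what was actually run: the prompt wording is taken from the post, but `build_prompt`, `missing_details`, the keyword list, and the sample summaries are all hypothetical, and a plain keyword check is a crude stand-in for a human judging whether the detail the reader needs survived.

```python
# Prompt wording quoted from the post; everything else here is an assumption.
PROMPT_PREFIX = "Please summarize the following text in max 4 sentences:"


def build_prompt(post_text: str) -> str:
    """Combine the instruction with the plain text of the blog post."""
    return f"{PROMPT_PREFIX}\n\n{post_text}"


def missing_details(summary: str, required_terms: list[str]) -> list[str]:
    """Return the required terms that do not appear in the summary.

    Case-insensitive substring match only: it cannot tell whether the
    summary actually explains the detail, just whether it is mentioned.
    """
    lowered = summary.lower()
    return [term for term in required_terms if term.lower() not in lowered]


if __name__ == "__main__":
    # Hypothetical summaries, written for this example (not real model output).
    accurate_but_incomplete = "The new linker makes linking faster on nightly."
    with_the_crucial_bit = "The new linker is faster; you can disable it if needed."

    for summary in (accurate_but_incomplete, with_the_crucial_bit):
        print(missing_details(summary, ["disable"]))
```

Which terms count as "required" is exactly the context the thread says the tool lacks: someone who knows the reader still has to supply that list.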