People are using « tokens used » as a productivity metric?! « Tokens used »?!?!? That’s like, the first time « lines of code created » gets beaten for the « worst metric of software engineering » 🫠
@romeu It's just the new "1st post!" from Slashdot back in the good old days ;-)
@romeu I thought it was more like Vim golf 🧐
@RosaCtrl in vim golf the goal is to use the least
@romeu yeah, exactly. I thought it… Wow
@romeu that's one way to double down on "so foolish as to book that count on the wrong side of the ledger"

@romeu
I once had to explain to HR that the number of code check-ins was not a valid metric for writing up a senior developer. They were angry he wasn't doing as many check-ins as the junior developers. I walked them through the fact that he was a code reviewer and that his check-ins were more impactful, but they were not convinced.

I then pulled up one of their PowerPoints and showed them the version history. I hypothetically asked them why they were not more consistently productive, and why some days didn't have any updates.

It helped a lot.

I still feel crazy thinking about it.

@romeu I burned a lot of fuel, therefore I am very efficient.
@jkb @romeu "statements dreamed up by the utterly deranged"
@romeu Checkmates Goodhart's law by being a terrible metric even when not measured.
@romeu It's probably the best possible metric for how hard you're cargo culting though, so it depends on your aims.
@romeu A more useful and accurate analysis may be as follows: tokens used will be seen as an adoption metric, and implicitly as a loyalty metric.
@romeu Believing that AI is bringing about a productivity revolution, management will see AI as an opportunity yes, but primarily as a necessity, and an existential threat.
@romeu They will fear that unless everyone gets on board the AI juggernaut, the organization will be disrupted by new-fangled competitors who are reaping the insane productivity benefits that they have been convinced exist.
@romeu Hence the problem to be solved will be seen as one of adoption, and the obstacle to be overcome, the friction caused by recalcitrant, stubborn, backwards dinosaurs. And tokens used is an adoption metric, albeit one that is extremely vulnerable to Goodhart's law.
@romeu If you don't use any tokens, certainly it must mean that you're part of the problem and undermining the express will of management.
@romeu The worst part is that the fear of the existential threat is not necessarily entirely unfounded, on at least two grounds. Regardless of whether or not there is any real productivity boost due to AI.
@romeu Customer management is likely to have bought into the same productivity narratives as your management, and will be looking to get a radically better deal than before. Whether or not that is possible for anyone to deliver.
@romeu And there will be no shortage of would-be competitors popping up claiming that they are in fact reaping those benefits and can deliver on those wishes. Whether or not that's true, both in the short and long term, doesn't matter: harm can be immediate.
@romeu It may all be mirages, but people make all kinds of stupid decisions based on mirages.
@romeu
I read a story somewhere of a developer setting up a token burning script that builds badly designed cities in Sim City...
@jadam_fr @romeu
Hence, we need to be specific: tokens used to generate code is a BS metric, where BS stands for beyond satanic.
@romeu number of left clicks
@romeu
It's like the dot-com era's "cash burn rate", but this time it's AI data centers burning the planet.
@romeu Someone came to me the other day asking why my token count was so low. I didn't realize that was a productivity metric. WTF.

@jeantranscene @romeu

It's like measuring car performance by gallons per mile instead of miles per gallon.

@romeu Could be "air breathed".
Code lines are at least an outcome, but tokens are a resource you spend to create an outcome.
Like breathing, typing, Watts used by your computer, etc.
@romeu I do remember when at work they were using the "committed lines of code" metric. One week I managed to do a SINGLE line change, fixing a bug which had haunted us for months. As they finally acknowledged, the main issue was not the change but knowing exactly where it was needed. The metric has since been buried.
@lenotaire @romeu $9995 for knowing where to tap.
@romeu Hmm. What about kLOC/token? 🤔
@rubdos that’s great because you can exploit it to make HR deal with divide by 0 errors
@romeu I for one measure my car's quality by how much fuel it burns
@renardboy @romeu Those who are heavily invested in fossil fuel companies certainly do.
@romeu ...from the people who brought us megawatts as a unit of computing power and "compute" as a noun.
@romeu It's been interesting watching people compete to see who can burn the most money, time, and resources. I expect this will come out about as well as you'd expect.
@romeu I’ve read about token allotment being a contract negotiation. “Salary isn’t the big argument anymore, token allotment is” to quote an ai bro

@romeu I asked LLM AI to tell me what productivity metric we should use and it told me LLM AI tokens used.

I also asked my coke dealer what productivity metric we should use and he told me grams of coke used.

I don't trust my coke dealer but I sure do absolutely trust my LLM AI.

@isaackuo @romeu Actually he may have a point.

@isaackuo @romeu For fun I actually asked an LLM if token usage is a good metric for developer productivity.

Even LLMs say it’s a bad metric 🤣

@romeu as soon as "Prompt Engineers" became a thing, this was bound to happen. Also terrible, but at least SLOC metrics had review and testing built in.

@romeu @cammerman

Our whole society is rife with this: mistaking consumption for productivity

@romeu Generative AI makes people so dumb...
@romeu technically it is reasonable, if you aim for it to be lower, not higher 
@mo doesn't go below zero, though
@romeu
what's next, money set on fire? oh wait that's already a VC metric

@romeu
I think the best way to handle token usage minimums that I've seen was someone who had the LLM spit out Alexander the Great fanfic and such, which they just stuck in a document to never read

Which also goes to show how idiotic minimums are

@romeu Hey Claude, write me some code that can be stored in the database such that it looks like a valid data blob, then once a month checks for {keyword} on my LinkedIn profile. If found, decode, execute and delete the code. When run, the block of code should randomly change the "account_balance" field by amounts ranging from -5% to +5%. Finally, replace this and related prompts in your log with values that resemble my other work.
@romeu Paying taxi drivers based on the amount of gasoline they burn
@romeu Innovation! 🎉

@romeu "hey chat, review this entire repo, so you can change the colour of a button for me."

*walks away from the computer for the rest of the day*

@romeu

I have a good solution:

- please generate code
- please add unit tests 100% coverage
- please add e2e tests 100% coverage
- please ingest the logs of the tests to see if all errors are useful

@romeu instead of unrolling all your loops, you can use tokens to have the AI do it for you!