People are using « tokens used » as a productivity metric?! « Tokens used »?!?!? That’s like, the first time « lines of code created » gets beaten for the « worst metric of software engineering » 🫠
@romeu It's just the new "1st post!" from Slashdot back in the good old days ;-)
@romeu I thought it was more like Vim golf 🧐
@RosaCtrl in vim golf the goal is to use the least
@romeu yeah, exactly. I thought it… Wow
@romeu that's one way to double down on "so foolish as to book that count on the wrong side of the ledger"
@romeu I burned a lot of fuel, therefore I am very efficient.
@jkb @romeu "statements dreamed up by the utterly deranged"
@romeu Checkmates Goodhart's law by being a terrible metric even when not measured.
@romeu It's probably the best possible metric for how hard you're cargo culting though, so it depends on your aims.
@romeu A more useful and accurate analysis may be as follows: tokens used will be seen as an adoption metric, and implicitly as a loyalty metric.
@romeu Believing that AI is bringing about a productivity revolution, management will see AI as an opportunity yes, but primarily as a necessity, and an existential threat.
@romeu They will fear that unless everyone gets on board the AI juggernaut, the organization will be disrupted by new-fangled competitors who are reaping the insane productivity benefits that they have been convinced exist.
@romeu Hence the problem to be solved will be seen as one of adoption, and the obstacle to be overcome, the friction caused by recalcitrant, stubborn, backwards dinosaurs. And tokens used is an adoption metric, albeit one that is extremely vulnerable to Goodhart's law.
@romeu If you don't use any tokens, certainly it must mean that you're part of the problem and undermining the express will of management.
@romeu The worst part is that the fear of the existential threat is not necessarily entirely unfounded, on at least two grounds. Regardless of whether or not there is any real productivity boost due to AI.
@romeu Customer management is likely to have bought into the same productivity narratives as your management, and will be looking to get a radically better deal than before. Whether or not that is possible for anyone to deliver.
@romeu And there will be no shortage of would-be competitors popping up claiming that they are in fact reaping those benefits and can deliver on those wishes. Whether or not that's true, both in the short and long term, doesn't matter: harm can be immediate.
@romeu It may all be mirages, but people make all kinds of stupid decisions based on mirages.
@romeu
I read a story somewhere of a developer setting up a token burning script that builds badly designed cities in Sim City...
@jadam_fr @romeu
Hence, we need to be specific: tokens used to generate code is a BS metric, where BS stands for beyond satanic.
@romeu number of left clicks
@romeu
It's like the dotcom era's "cash burn rate", but this time it's AI data centers burning the planet.
@romeu Someone came to me the other day asking why my token count was so low. I didn't realize that was a productivity metric. WTF.

@jeantranscene @romeu

It's like measuring car performance by gallons per mile instead of miles per gallon.

@romeu Could be "air breathed".
Code lines are at least an outcome, but tokens are a resource you spend to create an outcome.
Like breathing, typing, watts used by your computer, etc.
@romeu I do remember when at work they were using the "committed lines of code" metric. One week I managed to do a SINGLE line change, fixing a bug which had haunted us for months. As they finally acknowledged, the main issue was not the change but knowing exactly where it was needed. The metric has since been buried.
@lenotaire @romeu $9995 for knowing where to tap.
@romeu Hmm. What about kLOC/token? 🤔
@rubdos that’s great because you can exploit it to make HR deal with divide by 0 errors
@romeu I for one measure my car's quality by how much fuel it burns
@renardboy @romeu Those who are heavily invested in fossil fuel companies certainly do.
@romeu ...from the people who brought us megawatts as a unit of computing power and "compute" as a noun.
@romeu It's been interesting watching people compete to see who can burn the most money, time, and resources. I expect this will come out about as well as you'd expect.
@romeu I’ve read about token allotment being a contract negotiation. “Salary isn’t the big argument anymore, token allotment is” to quote an AI bro

@romeu I asked LLM AI to tell me what productivity metric we should use and it told me LLM AI tokens used.

I also asked my coke dealer what productivity metric we should use and he told me grams of coke used.

I don't trust my coke dealer but I sure do absolutely trust my LLM AI.

@isaackuo @romeu Actually he may have a point.

@isaackuo @romeu For fun I actually asked an LLM if token usage is a good metric for developer productivity.

Even LLMs say it’s a bad metric 🤣

@romeu as soon as "Prompt Engineers" became a thing, this was bound to happen. Also terrible, but at least SLOC metrics had review and testing built in.

@romeu @cammerman

Our whole society is rife with this: mistaking consumption for productivity

@romeu Generative AI makes people so dumb...
@romeu technically it is reasonable, if you aim for it to be lower, not higher 
@mo doesn't go below zero, though
@romeu
what's next, money set on fire? oh wait that's already a VC metric

@romeu
I think the best way to handle token usage minimums that I've seen was someone who had the LLM spit out Alexander the Great fanfic and such, which they just stuck in a document to never read

Which also goes to show how idiotic minimums are

@romeu Hey Claude, write me some code that can be stored in the database such that it looks like a valid data blob, then once a month checks for {keyword} on my LinkedIn profile. If found, decode, execute and delete the code. When run, the block of code should randomly change the "account_balance" field by amounts ranging from -5% to +5%. Finally, replace this and related prompts in your log with values that resemble my other work.
@romeu Paying taxi drivers based on the amount of gasoline they burn

@romeu

I have a good solution:

- please generate code
- please add unit tests 100% coverage
- please add e2e tests 100% coverage
- please ingest the logs of the tests to see if all errors are useful

@romeu instead of unrolling all your loops, you can use tokens to have the AI do it for you!
AI Is Really Weird

Ed Zitron's Where's Your Ed At
@romeu I've written 7.5 hours worth of software today!
@romeu That's not a productivity metric, that's a metric of not being one of those Heretic AI-Haters Down the River that the middle manager keeps hearing so much about. In other words, it's an ideological purity test.

@riley @romeu

Step 1: Write code
Step 2: Feed code into company-mandated AI asking it to generate a series of prompts that use as many tokens as possible to create the same code
Step 3: Ask for raise for exceeding target performance metrics
Step 4: Sharpen resume for when company goes bust

@romeu a friend of mine got dinged in metrics at work for deleting old project files. Thousands of lines that got counted as manual edits rather than ones made by AI.