Does anyone know how much energy a single ChatGPT question uses?

I need to show this to my boss and colleagues to convince them AI is shit.

#AI #LLM #ChatGPT #Environment #Energy #Climate

@arutaz Let’s assume it runs on air-cooled DGX #H100 systems with 8 NVIDIA H100s each, which deliver 25.6 petaFLOPS at 10.2 kW.
Due to its mixture-of-experts architecture activating only two of its 16 experts, #GPT-4 inference supposedly needs only 560 teraFLOPs per generated token in its forward pass.
So we’re at 25.6×10^15 FLOP/s ÷ 560×10^12 FLOPs/token ≈ 45.7 tokens per second.
Dividing by the 10.2 kW power draw gives 4.48 tokens per kW·s, or about 16,128 tokens per kWh.
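A quick sanity check of that arithmetic (all input figures are the assumptions from this thread, not measurements):

```python
# Back-of-the-envelope estimate with the assumptions used above:
# - one DGX H100 system: 25.6 petaFLOPS at 10.2 kW (spec-sheet figures)
# - GPT-4 forward pass: 560 teraFLOPs per generated token (rumored MoE figure)

dgx_flops = 25.6e15        # FLOP/s delivered by one DGX H100 system
dgx_power_kw = 10.2        # power draw in kW
flops_per_token = 560e12   # FLOPs per generated token (assumed)

tokens_per_second = dgx_flops / flops_per_token        # ~45.7 tokens/s
tokens_per_kws = tokens_per_second / dgx_power_kw      # ~4.48 tokens per kW*s
tokens_per_kwh = tokens_per_kws * 3600                 # ~16,100 tokens per kWh

print(f"{tokens_per_second:.1f} tokens/s")
print(f"{tokens_per_kwh:.0f} tokens per kWh")
```

Note the rounding: 4.48 × 3600 gives the 16,128 quoted above; carrying full precision lands a few tokens higher.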
@arutaz It seems the question then becomes: Can I match the quality of a GPT-4 response if I write at 20 words per minute? For me that may be true for some tasks, but not others.
@faz Thanks for your answer 😊

@arutaz We finally have good numbers on this, from actual experiments:
https://arxiv.org/pdf/2310.03003

tl;dr: 3-4 joules per token with LLaMA 65B

so my 16,128 tokens would actually use less than 0.018 kWh. My earlier estimate was off by a factor of roughly 60 (!!!), almost two orders of magnitude.

So the ten hours on your 100 W laptop become roughly 10 minutes. If you can write a 12,000-word essay in 10 minutes, you're more efficient than LLaMA 65B.
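The revised numbers work out like this (4 J/token is the upper end of the measured range from the linked paper; the 16,128-token budget and 100 W laptop are the assumptions from earlier in the thread):

```python
# Revised estimate using the empirical figure from the linked paper:
# ~3-4 joules per token for LLaMA 65B inference.

joules_per_token = 4.0   # upper end of the measured 3-4 J/token range
tokens = 16128           # token budget from the earlier DGX estimate

energy_kwh = tokens * joules_per_token / 3.6e6   # 1 kWh = 3.6 MJ
print(f"{energy_kwh:.4f} kWh")                   # just under 0.018 kWh

# Equivalent runtime on a 100 W laptop:
minutes = energy_kwh * 1000 / 100 * 60
print(f"{minutes:.1f} minutes at 100 W")         # roughly 10-11 minutes
```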