Is there any website like Geekbench that shows tokens / second an LLM can generate, by machine, model, context, quantization etc?
@schlu @martinhoeller
Good question. I have seen https://kamilstanuch.github.io/LLM-token-generation-simulator, but I'm not sure about its accuracy.

@ping13 @martinhoeller Thanks for that link. It looks good, though a bit thin on detail; I do appreciate it.
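
In the meantime, tokens/second is easy enough to measure yourself. A minimal sketch, assuming your local LLM exposes a streaming API that yields tokens one at a time (the `fake_model` generator below is a hypothetical stand-in for such an API):

```python
import time

def tokens_per_second(generate, prompt):
    """Time a streaming generation call and return (token_count, tokens/sec)."""
    start = time.perf_counter()
    count = 0
    for _ in generate(prompt):
        count += 1
    elapsed = time.perf_counter() - start
    return count, count / elapsed

# Hypothetical stand-in for a real model's token stream.
def fake_model(prompt):
    for word in ("hello", "world", "from", "a", "stub"):
        time.sleep(0.01)  # simulate per-token latency
        yield word

n, tps = tokens_per_second(fake_model, "test prompt")
print(f"{n} tokens at {tps:.1f} tok/s")
```

Running the same loop against different models, context lengths, and quantizations on your own machine gives you the numbers a site like that would need to collect.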