@simon breaking down #chatgpt/LLM stuff again, making it more understandable--an excellent and accessible read.

https://simonwillison.net/2023/Jun/8/gpt-tokenizers/

Understanding GPT tokenizers

Large language models such as GPT-3/4, LLaMA and PaLM work in terms of tokens. They take text, convert it into tokens (integers), then predict which tokens should come next. Playing …
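As a toy illustration of that text → token IDs → text round trip, here is a minimal word-level sketch. This is not how GPT tokenizers actually work (they use byte pair encoding over bytes, which Simon's article explains, and which OpenAI exposes via its tiktoken library); it just shows the idea of mapping text to the integers a model operates on and back:

```python
# Toy word-level tokenizer: assigns each distinct word an integer ID.
# Real GPT tokenizers use byte pair encoding, not whole words.

def build_vocab(corpus: str) -> dict[str, int]:
    """Assign an integer ID to each distinct word, in order of first appearance."""
    vocab: dict[str, int] = {}
    for word in corpus.split():
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab

def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Convert text into the list of integer token IDs a model would see."""
    return [vocab[w] for w in text.split()]

def decode(tokens: list[int], vocab: dict[str, int]) -> str:
    """Map token IDs back to text."""
    inverse = {i: w for w, i in vocab.items()}
    return " ".join(inverse[t] for t in tokens)

corpus = "the quick brown fox jumps over the lazy dog"
vocab = build_vocab(corpus)
tokens = encode("the lazy fox", vocab)
print(tokens)                 # the integers the model would predict over
print(decode(tokens, vocab))  # round-trips back to "the lazy fox"
```

A model like GPT-4 never sees the characters themselves, only sequences of IDs like these; its job is to predict which ID comes next.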