@JamesWidman So, there's not really much in the way of technical trickery. The math is all pretty well documented and standard. It's mostly a bunch of matrix multiplications and activation functions.
Between the advent of GPUs with massive parallel processing capabilities, and huge high bandwidth memory, plus the Transformer architecture which actually allows for it to learn how to take into account context, they're finally able to train models that can generate language in ways that some folks find useful.
How they train it is a bit more of where the trickery lies. As discussed, they start by training it as just autocomplete, then they do fine tuning, where they train it with the conversational turns and to follow instructions.
Then they do various kinds of reinforcement learning. One of those where a lot of problems have come in is reinforcement learning with human feedback, where they use human feedback (those up and down thumbs) to train the model to produce text that people prefer. Of course, that causes some people to reward it for saying what they want to hear, so the model becomes extremely sycophantic. One things machine learning is really good at is optimizing for exactly what it is you're training for at the expense of all else. This has caused all kinds of problems; the labs have since backed off on how much they weight RLHF, so they still use some but not as much.
Then there's reinforcement learning with verifiable rewards (RLVR), where they do reinforcement learning of rollouts where they have it try to solve problems, like math problems or programming problems, where it's easy to verify the solution.
Of course, again, machine learning is very good at optimizing for a particular goal. So sometimes the models will reward hack; they'll find ways of getting the reward without actually doing what you wanted. If the goal was to pass unit tests, they might just delete the failing tests. Or if it was to fix a bug, and you accidentally gave them a git repo that has the bug fix in a different branch, they'll look at the git history and find the fix.
Then the trickery is in figuring out what the model is and isn't actually good at. And this is really hard and subtle.