Mapping the Mind of a Large Language Model
https://lemmy.world/post/15661170

I often see a lot of people with an outdated understanding of modern LLMs. This is
probably the best interpretability research to date, by the leading
interpretability research team. It's worth a read if you want a peek behind the
curtain on modern models.
I think the most interesting thing in this article is the fact that some concepts central to semantics (analogy, connotation) or psychology (bias) emerge naturally in multi-layered neural networks of sufficient size. Also that the model can sound like different personalities (overconfident, secretive, delusional) if you manipulate the strength or proximity of features. I'd like to see the same kind of study done on Midjourney…
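
For anyone wondering what "manipulating features" means mechanically: the article describes clamping a learned feature's activation up or down mid-forward-pass and watching the model's behavior shift (that's where the Golden Gate Bridge demo comes from). Here's a minimal sketch of that steering idea in PyTorch — the toy model and `feature_direction` are stand-ins I made up, not Anthropic's actual setup, which derives directions from a trained sparse autoencoder:

```python
import torch
import torch.nn as nn

# Toy stand-in for one transformer layer's residual stream (d_model = 16).
model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 4))

# Hypothetical decoder direction for a single learned feature; in the paper
# this would come from a sparse autoencoder, not from randn.
feature_direction = torch.randn(16)
feature_direction = feature_direction / feature_direction.norm()

steering_strength = 5.0  # positive amplifies the concept, negative suppresses it

def steer(module, inputs, output):
    # A forward hook that returns a value replaces the layer's output,
    # so this adds the scaled feature direction into the activations.
    return output + steering_strength * feature_direction

# Hook the first layer; in a real model you'd target a middle residual layer.
handle = model[0].register_forward_hook(steer)

x = torch.randn(1, 16)
print(model(x))  # output is now biased along the steered feature
handle.remove()
```

The paper's actual intervention clamps a feature to a fixed multiple of its maximum activation rather than adding a vector, but the additive version above captures the same "turn the knob on one concept" idea.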