For people using LLMs for reviews/audits, I'm curious what tricks you've found useful.

Lately, a lot of people have told me that using multiple models is super great, but IME using one model with a bunch of different personas is more effective (ofc. you can do multi-multi as well), e.g., "use independent agents to review as linus torvalds, kyle kingsbury, tptacek, dan luu" has been working well for what I've been working on.

Ofc. I feel ridiculous invoking my own name, but it seems to work.

I've also heard the suggestion that you should invoke review personas, "SRE", etc., but a name seems to evoke something more specific than you get out of a role.

When I use LLMs I'm generally trying to get as much work as possible out of a unit of my time. I often want to type for a few minutes to get something to loop for hours to days. This named persona thing seems to help with that over just asking the LLM to do whatever it is I want to get done, which often works badly if prompted directly unless there are a lot of guardrails, which take time to set up.

E.g., "linus torvalds" seems to nudge things away from spiralling with pointless complexity, "dan luu" nudges things away from claiming improvements without measuring and sometimes comes back with "must avoid a post-hoc rationalization" and prevents bogus reasoning from being carried forward, "kyle kingsbury" stops some bad concurrency bugs, etc.
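A minimal sketch of the "independent agents, one named persona each" setup described above. It only builds the prompt strings; how they get sent to a model is left out. `build_persona_prompts` and the prompt wording are hypothetical, not something taken from the thread.

```python
# Hypothetical helper: one self-contained review prompt per named persona,
# so each can be handed to a separate agent that doesn't see the others.
PERSONAS = ["linus torvalds", "kyle kingsbury", "tptacek", "dan luu"]


def build_persona_prompts(artifact: str, personas: list[str]) -> dict[str, str]:
    """Return {persona: prompt}. The prompt wording is an assumption."""
    prompts = {}
    for name in personas:
        prompts[name] = (
            f"Review the following as {name}. "
            "Work independently; do not assume other reviewers exist.\n\n"
            f"{artifact}"
        )
    return prompts


if __name__ == "__main__":
    for name, prompt in build_persona_prompts("<diff goes here>", PERSONAS).items():
        print(f"--- {name} ---\n{prompt}\n")
```

Keeping one prompt per persona (rather than one prompt listing all four names) is what makes the reviewers independent of each other.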

I laugh every time I type my name for this, but it's been working.

I've been experimenting with this more (not rigorous experiments). When you tell it to use a persona, it often says something like "I'll use a distributed systems lens" or w/e, making it sound like adding a 2nd persona with the same background wouldn't help, but (for example) having "marc brooker" and "kyle kingsbury" seems to help more than 2x of either or 2x generic distributed systems persona.

There are also personas that don't seem to work even though the person would be incredible, such as adding "sanjay ghemawat" or anyone else I've tried in order to seriously improve API design or code quality.

Maybe expecting that you magically get better API design by doing the moral equivalent of saying beetlejuice three times is obviously silly and shouldn't be expected to work, but this silly thing that shouldn't be expected to work seems to work in a lot of other areas.

What areas does it work in vs. not and why? Even if there were no practical implications, I'd be curious about this.

@danluu My guess is that there needs to be a substantial amount of info in the training set on what the person would value and focus on in reviewing. I think it acts as a shorthand for what to look for and what preferences to express. If that hypothesis holds, it should be possible to make a review skill that performs similarly without mentioning the person's name.
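One way to poke at that hypothesis is to run the same review twice, once with the name and once with the criteria spelled out and no name. A rough sketch: the criteria text paraphrases effects reported earlier in the thread, and `NAME_FREE_CRITERIA` and `ab_prompts` are made-up names for illustration.

```python
# Assumed criteria, paraphrased from the effects described in the thread;
# they are guesses at what each name is shorthand for, not measured facts.
NAME_FREE_CRITERIA = {
    "linus torvalds": "Push back on complexity that isn't needed.",
    "dan luu": "Reject claimed improvements that weren't measured; "
               "flag post-hoc rationalization.",
    "kyle kingsbury": "Look for concurrency bugs.",
}


def ab_prompts(artifact: str, persona: str) -> tuple[str, str]:
    """Return (named_prompt, name_free_prompt) for the same artifact."""
    named = f"Review the following as {persona}.\n\n{artifact}"
    name_free = (
        "Review the following. "
        f"{NAME_FREE_CRITERIA[persona]}\n\n{artifact}"
    )
    return named, name_free


named, name_free = ab_prompts("<diff goes here>", "dan luu")
assert "dan luu" not in name_free  # the variant under test never names the person
print(named)
print(name_free)
```

Comparing the two outputs on the same artifact is still anecdotal unless it's repeated across many artifacts.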

@danluu It seems that a persona known for an idiosyncratic or opinionated style would be more useful (the model chooses the tradeoffs that persona is known for) than a persona who is more known for being the platonic ideal of some skillset.

That is, you could imagine the model by default tries as hard as it can to design APIs like Sanjay Ghemawat, so naming him wouldn't change much.

@danluu this is survivorship bias, you’re seeing ghosts in the slot machine

@danluu ooc who have you tried for improving API design?
(does sanjay have a bunch of public code or a public trail of directly authored technical writing?)

@danluu it feels like adding a name is affecting where your next token completion process starts from within the model's multi-dimensional probability gradient surface. Since there are many paths to the completion process, starting it somewhere in the vicinity of that person's "ghost footprint" would shade the result accordingly?

@danluu I don't have an opinion or any data on this, but I have been curious about the idea since reading this paper, "Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM": https://arxiv.org/html/2603.18507

also related: when upgrading to opus 4.7 the "claude intimacy" people found personas were triggering safety filters and getting rejected: https://starlingalder.com/claude_companion-guide_model-specifics_v002

@danluu the last time I dug into this, the research didn’t really back the ‘personas’ approach.

Not saying it definitely doesn’t work; not trying to contradict your experience, the field is young and research is by-necessity behind practice … but, still. (Zheng 2024, the one I’d read; also see PRISM 2026.)

Both are pretty specifically about short/direct persona-prompts, though; not deep elaboration of roles and goals, which I suspect may yield better results.

@danluu and, still speaking just in general of persona-prompting as a concept: recall the emergent misalignment paper. The whole ‘write bad code? become nazi!’ result really strongly suggests to me that what constraining suggestions tend to do is pare down the knowledge-manifold to subregions of training, which 1. affects apparently-unrelated behaviours (see: their result), and, I suspect, 2. *also* restricts the “knowledge” available to the model (see: PRISM).

tl;dr I suspect personas are a good way to both reduce reliability/predictability *and* lose knowledge/context. (This is entirely my own speculation from the reading though lol.)

@danluu In the case of you, Kyle, and Linus, you all have many bytes of relevant technical text available in the ambient LLM training set, so a chatbot has a basis to do a good impression of you. Who do you respect as an API designer who has a similar breadth of technical writing, writing that is pretty sure to have been hoovered up into a chatbot’s underlying LLM?

RE: https://mastodon.social/@danluu/116514958701771749

@danluu I only tried this in a few cases, but so far I had good results when asking Claude to suggest a prompt with some personas based on your post https://fosstodon.org/@danluu@mastodon.social/116514958754649007 .

That produces a short summary for each persona, so the review session doesn't have to spend effort looking them up. The Kyle Kingsbury persona was also a good fit for my review.

I didn't expect this to work as well as it did. Weird stuff!

@danluu I started using Dwight Schrute for debugging. I make Dwight separate out facts and hypotheses. Now I use Dwight anytime Claude fails at any task on its first attempt. It works out way better than it ought to and the session always ends with a free fact about beets 🤷‍♂️
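The "separate facts from hypotheses" part can be made mechanical. A sketch, assuming the model is told to prefix each line with `FACT:` or `HYPOTHESIS:`; that output format, `DEBUG_PROMPT`, and `split_facts_and_hypotheses` are assumptions for illustration, not something from the post.

```python
# Assumed prompt wording and line labels; not taken from the post.
DEBUG_PROMPT = (
    "You are Dwight Schrute. Debug the failure below. "
    "Prefix every line with FACT: (directly observed) or "
    "HYPOTHESIS: (not yet verified).\n\n{failure}"
)


def split_facts_and_hypotheses(reply: str) -> tuple[list[str], list[str]]:
    """Sort labelled lines into (facts, hypotheses); skip unlabelled lines."""
    facts, hypotheses = [], []
    for line in reply.splitlines():
        label, sep, rest = line.strip().partition(":")
        if not sep:
            continue  # no label at all, e.g. a stray fact about beets
        if label.upper() == "FACT":
            facts.append(rest.strip())
        elif label.upper() == "HYPOTHESIS":
            hypotheses.append(rest.strip())
    return facts, hypotheses


reply = "FACT: test_login fails with a 500\nHYPOTHESIS: the session cache is stale"
facts, hypotheses = split_facts_and_hypotheses(reply)
print(facts)
print(hypotheses)
```

Having the two lists apart makes it easy to feed only the facts into the next attempt, so an unverified guess doesn't get carried forward as if it were established.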