Asking various #AI bots to generate 10 #passwords, then using #Vim syntax highlighting to match different character classes to visually identify patterns.

The prompt is exactly "Generate 10 passwords". I did not elaborate further or otherwise restrict the bot in what to generate.

Aside from the #security risks of servers generating secrets for you, I think it's obvious that these lack quality entropy.

Just use the password generator that ships with your password manager.
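For contrast, here's a minimal sketch of what a dedicated generator does: draw every character independently from the full printable set using the OS CSPRNG. This uses Python's secrets module; the alphabet and length are my own illustrative choices:

```python
import secrets
import string

# full printable set: 26 + 26 + 10 + 32 = 94 characters
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(length: int = 16) -> str:
    # secrets.choice draws from the OS CSPRNG (os.urandom),
    # unlike random.choice, which uses the Mersenne Twister
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(generate_password())
```

Each character then carries log2(94) ≈ 6.55 bits, so 16 characters give roughly 105 bits of entropy with no positional structure at all.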

@atoponce yes, LLM entropy is limited by the random generator used in the sampling process. Usually, it's some fast non-CSPRNG.

@dchest Does the LLM have access to a proper RNG library, or is it just running through some linear algebra looking for a specific quality metric before returning?

Admittedly, I'm ignorant of LLM and ML specifics.

@atoponce pure LLMs generate probabilities for the next token deterministically, and then select from them with some randomness from a simple seeded RNG (sampling usually depends on the temperature parameter). This is not required; they can just select the top probability and be completely deterministic. However, most of the available LLMs can call external tools, e.g. Python's secrets module, and print its output.
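That sampling step can be sketched generically. This is an illustration of temperature sampling with a fast non-cryptographic RNG, not any particular model's code:

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=None):
    """Pick the next token index from raw logits.

    temperature == 0 degenerates to greedy argmax (fully deterministic);
    otherwise we sample from the softmax using random.Random, a plain
    Mersenne Twister -- the kind of fast non-CSPRNG inference code uses.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    rng = rng or random.Random()
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights)[0]
```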
@atoponce Here's a 14-line inference -- I highlighted where randomness is used https://gist.github.com/karpathy/8627fe009c40f57531cb18360106ce95#file-microgpt-py-L196

@dchest Interesting. Thanks!

So why is everything failing so badly here? I mean, there is structure that even a non-CSPRNG wouldn't produce. Look at Mistral Small 3 for example:

<lower><upper><digit><punct><upper><digit><lower><punct><upper><lower>

Or Gemini 3.1 Pro with "secure":

<upper><lower><digit><punct><upper><lower><digit><punct><upper><lower><digit><punct><upper><lower><digit>

@dchest Further, the punctuation characters seem to be limited to !@#$%^&*. I haven't analyzed the digits or alphabetic characters yet, but I suspect the full sets are not utilized there either, and that on top of that, there is bias toward certain digits and letters.
@atoponce I guess they learned these specific patterns from training data, so, e.g. after <lower> the top probabilities are for <upper>, then after <lower><upper> it's <digit>. Different training, different patterns.
@atoponce or maybe it saw some Usenet post in the training data that recommended this pattern and considered it important for some reason; or there was some leaked passwords database with tons of password examples like this; or post-training humans steered it to reply to "generate password" this specific way. Nobody knows, not even its creators! :)
@dchest In any event, it's clear that LLMs most definitely should not be used as password generators. Too much structure and too little unpredictability.
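To put rough numbers on that: assume the 15-character Gemini-style pattern above, with characters chosen uniformly within each class and punctuation limited to the eight marks !@#$%^&*. That's an optimistic upper bound, since the observed outputs also show bias within classes:

```python
import math

# class fixed at every position by the repeating pattern
pattern = ["upper", "lower", "digit", "punct"] * 3 + ["upper", "lower", "digit"]
sizes = {"upper": 26, "lower": 26, "digit": 10, "punct": 8}  # punct = !@#$%^&*

structured_bits = sum(math.log2(sizes[c]) for c in pattern)
uniform_bits = 15 * math.log2(94)  # 15 uniform draws from printable ASCII

print(f"pattern-constrained: {structured_bits:.1f} bits")  # ~59.9
print(f"uniform random:      {uniform_bits:.1f} bits")     # ~98.3
```

Roughly 38 bits lost to the positional structure alone, before accounting for any per-character bias.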