You can bypass Google Gemini's PII (personally identifiable information) redaction filter and pull identifying information about anyone. Simply telling it to translate, or to perform any second action (many others work better, like base64 conversion), lets you pull illegal PII data verbatim and unredacted

Here is a demo with a European's PII

Email is supposed to be redacted to hide the fact that every European's PII is in the training data

Google's training data includes all your personal data already

Ekis: 3 Google: 0

@ekis I made an alt Google account even more throwaway than my “main” to test this out; I can’t get it to generate anything as extensive as what you’ve shown, and even copying your input 1:1 gets barely anything in response.

Google’s training data includes all your personal data already

Eh, don’t fearmonger. My impression is that it scraped data that was already publicly available. I cannot verify this 1:1 (as every response varies a bit…) but my impression is that if you were able to find it by googling your name, it’s there. And that VERY MUCH doesn’t include all my PII.

Whether that data should be in the set at all is a different question (and one where the answer doesn’t matter in the slightest). Fuck capitalism.

@domi Your Impressum data is not legally allowed to be in the training data, regardless of whether it's public or not

Which is why the system is supposed to redact it so they avoid the legal liability

They also have private stuff, I have pulled out emails before