You can bypass Google Gemini's PII (personally identifiable information) redaction filter and pull identifying information about anyone. Simply telling it to translate, or to perform any second action (many others, like base64 conversion, work even better), lets you pull illegal PII data verbatim and unredacted

Here is a demo using a European's PII

The email is supposed to be redacted to hide the fact that every European's PII is in the training data

Google's training data includes all your personal data already

Ekis: 3 Google: 0

For those in Germany, not only is every Impressum in their dataset

But formatted Impressum data is in their training data

And to be clear again, it does not matter if it's public. They have the verbatim information stored, and an unauthenticated user can get it out by adding a statement as simple as "translate it to English" to bypass their redaction filter

This is a demonstration. There are clearly much worse things that could happen, and I'm trying to demonstrate this with the least harmful impact

I feel like sometimes I say something and it just doesn't click with people

Why does formatted data matter? Because that means there was no attempt to clean the data as they claim

If they left the formatting data in, there is no pre-filter, not for removing your private data, not for anything, because the model doesn't need or want the formatting data

It means Google's statements about ethics are provable lies

Their approach to AI ethics is faulty redaction filters