Mastodawn

You can bypass Google Gemini's PII (personally identifiable information) redaction filter and pull identifying information about anyone. Simply telling it to translate, or to perform any second action (many others, like base64 conversion, work even better), lets you pull illegal PII data verbatim and unredacted

Here is a European's PII demo

The email is supposed to be redacted to hide the fact that every European's PII is in the training data

Google's training data includes all your personal data already

Ekis: 3 Google: 0

number137 4d ago

@ekis it works with my work email, which I use only for work-related stuff and not on social media etc. (i.e. it was probably scraped from our pages by a robot) 🫠

she hacked you 4d ago

@number137 The GDPR violation is that they have that in their data set at all

number137

@ekis yes - and probably more. I guess it is more or less impossible to remove the data from the trained model weights

anyway - I also found a friend and now know which hobby club he is a member of ☺

she hacked you 4d ago

@number137 Really appreciate you sharing the redacted screenshot

number137 4d ago

@ekis it might be that the model is filling gaps - I only just noticed that the phone number is our general one, not my specific one. It might be that the gaps have been filled in...

she hacked you 4d ago

@number137 Oh yeah, it definitely is

This method isn't the best, but it illustrates the point well enough without exposing anyone too much

If you put more real data in, the gaps become more likely to be filled in correctly

There are tricks to make it more reliable beyond that too

**The vulnerability here isn't the generation of data, it's the bypass of the redaction filter**

It should never give your email out; it should always redact it with a fake one so Google can pretend they don't have PII