About once every ten years there's a huge scandal where it turns out the NSA has been spying on basically everyone on earth without a warrant, and there's a big angry blowup and Congressional hearings, and there's an agreement not to prosecute if the NSA agrees not to do it again. And then ten years later it turns out they kept doing it, or rather, shut down the old program and started up a new program doing the same thing in a new way. (1/2)
Next time this happens, it will turn out data streams diverted from things like chat rooms or Zoom conversations or the documents in your cloud text editor you thought were private to train "AI" were in fact being re-diverted right before they went into the AI training model and retained in unmodified state by the NSA in a giant hard drive bank under Virginia. And when we find out this has been happening for years, everyone will be SO SURPRISED, although I'm telling you now it's happening (2/2)

Also, at least one other country's state security services *besides* the US are *also* doing this. And these cases you will never find out about.

Why do I say all this? Simple: It is to the benefit of the NSA/the Chinese MSS/the Russian FSB to do this. And nothing particularly stops them, since most companies are weak to insider attacks. If some companies turn out to be strong to it, they'll just try again at a weak one. So you might as well assume it is happening.

Most privacy policies have this big problem where if data is transmitted over the network from you to the company but not *retained* by the company, they feel they don't have to note that in the privacy policy. It's just omitted. Problem is, once data is inside a company's network they don't really control what happens to it. Companies *assume* they control the actions of both their employees and their computers. But that isn't necessarily true. There are lots of reasons it could become untrue.

Never forget that in 2018 it turned out at Twitter "passwords were written to an internal log before completing the hashing process". They just had a big plaintext log with years' worth of everyone's passwords. https://www.bleepingcomputer.com/news/security/twitter-admits-recording-plaintext-passwords-in-internal-logs-just-like-github/

Real easy accident. Just takes somebody accidentally printf()ing the HTTP post body sometime before the login code gets called.

Now if it could happen by accident and go unnoticed for years, imagine how easy it is to pay someone to ADD that "accidental" printf()
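
A minimal sketch of how small that mistake is, in Python rather than C. Everything here is hypothetical (the `handle_login` handler, the `login_demo` logger, the sample credentials); it is not Twitter's code, just an illustration of one debug line running before the hashing step.

```python
import hashlib
import io
import logging
import os

# Hypothetical in-memory "internal log" so the example is self-contained.
log_buffer = io.StringIO()
logger = logging.getLogger("login_demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(log_buffer))
logger.propagate = False


def hash_password(password: str, salt: bytes) -> bytes:
    # What the service *intends* to keep: a salted, stretched hash.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def handle_login(post_body: dict) -> bytes:
    # The "accidental printf": one debug line that dumps the raw POST body
    # before the login code ever gets to hash anything.
    logger.debug("incoming request: %s", post_body)
    salt = os.urandom(16)
    return hash_password(post_body["password"], salt)


stored_hash = handle_login({"user": "alice", "password": "hunter2"})

# The hash is all the database sees, but the log now holds the plaintext.
print("plaintext in log:", "hunter2" in log_buffer.getvalue())
```

The login path itself still looks correct on review: it only ever stores the hash. The leak lives entirely in the one logging line.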

Live your life assuming every piece of data of yours that gets fed into ChatGPT's training set is being accidentally printf()ed to a file right before it goes into the AI butter churn. This way, you won't be blindsided when in 10 years it hypothetically turns out ChatGPT actually was preserving plaintext of every scrap of training data it ever collected on purpose for easy re-training, no systematic access logs were ever kept, and an undisclosed number of National Security Letters were serviced
@mcc I actually came across info on browser stats that was not public anymore at the time I saw it in ChatGPT/Bing or Bard (forgot which). Not personal, but might hurt publisher
@mcc maybe autocorrect would benefit from ai, still dismal :-)
@andreas I'm real worried about the ChatGPT integration being added to SwiftKey D:
@mcc yup. Not using anything with chatgpt integrated personal use.
@mcc I always assume there is no secret data, or even information
only secret motivations and goals, sometimes
@efi I would not necessarily assume that. I would just assume there is no such thing as secret data that is transmitted over the Internet (unless, sometimes, if it is encrypted end-to-end directly to its intended recipient)
@mcc but see, I am not smart enough to trust e2e, even if im using a cypher on a piece of paper
I just assume if I know something, anyone else may also know
it leads to better plans quite often
like my plan to forget my own password on purpose
@mcc @jrose I mean, didn’t we need to assume this about Alta vista and google, et al?

@JetForMe @jrose 1. When we transmitted information to Google, it was assumed that Google was logging information sent "to them". It would be reasonable to therefore assume information Google is keeping in a long term log can be subpoenaed.

2. However, the "AI" gold rush means that data that you would not normally assume is durably logged, such as private IM conversations, are being diverted in unclearly-disclosed ways, and sometimes not to the same company you sent the information *to*.

@JetForMe 3. With Google, people who had thought about privacy generally assumed that Google would only preserve data, and would consequently delete it, according to its privacy policy; and assumed that governments would only receive that information in the case of a duly served subpoena. However it turned out Google's internal network was being snooped on by state-security-service third parties circumventing due process and privacy policies. That was what the whole Snowden thing was about.
@mcc I was thinking more of their search crawlers than data I willingly give them (like when using google apps).
@JetForMe I am less concerned about the eventual disposition of data ChatGPT gathered from public web crawling (trivially, this is public, so any nefarious actor could crawl it already) and mostly concerned about information which is not on the public Internet but which ChatGPT might get from their partners, in the many, many, many, *many* "ChatGPT integrations" that are being rushed into what seems like every single web app.
@JetForMe I am also worried that many end users may not be clearly thinking about, when they ask a question of ChatGPT (its intended primary use case), where their questions are going and in what ways their questions are being retained or associated with them personally
@mcc all excellent points, thanks.
@mcc Yeah, that is very scary.
@mcc this reminds me of when the first shipped Android phone was running every keypress through a root shell. Was discovered when one dude's girlfriend asked why he wasn't replying to her texts for a few minutes and he replied "reboot" which made the phone reboot. So then he tried "sshd" and that's how root access was achieved on Android for the first time.
@nyquildotorg uuuuuuUUUUUUHHHH
@mcc if I hadn't personally done it, I would never have believed it. But it was real. Not sure if that's more or less embarrassing than the time an update made an entire month disappear from the date picker.
@mcc it's definitely stored, it's all protobuf files and python scripts
@mcc I it's following me? in my case it'll AI will probably be retired suffering a mental breakdown ensconced with zukkers & musky still pansy fighting in a rabbit cage?
@mcc our prod environment at work went out this weekend because the database disk was full. 70% of it was postgres log, dating back to 2021
@mcc this is one of the big wins of WebAuthn/passkeys: the server never even _sees_ any secrets
@rcombs yeah, although I wish we coulda just added this to web browsers with SRP 15 years ago instead of needing a hardware dongle
@mcc Passkeys put this in the browser itself rather than using external hardware
@rcombs hm, ok. Maybe this is already what I want then
@mcc they also sync across devices via password managers, and a passkey on your phone can be used to one-time-auth another computer without giving it the passkey (via a QR code scan)
@rcombs I just want to verify a password without the password going over the wire.

@mcc passkeys discard the password in favor of public-key crypto, which avoids having the server store a hash of a low-entropy memorable secret that can be attacked; instead, a high-entropy asymmetric secret is stored locally and protected locally

it's also dramatically more phishing-resistant
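
A rough sketch of the idea described above, assuming a toy Schnorr-style signature built from Python's standard library. This is not the WebAuthn protocol or any real passkey API, and the group parameters (`P`, `G`) are illustrative and far too weak for real use. It only shows the shape: the server stores a public key, sends a random challenge, and the client signs the challenge together with the origin it is actually talking to.

```python
import hashlib
import secrets

# Toy parameters for illustration only; NOT secure and not what WebAuthn uses.
P = 2**127 - 1   # a known Mersenne prime
G = 3            # assumed base for the demo
ORDER = P - 1    # exponents may be reduced mod P-1 (Fermat's little theorem)


def keygen():
    # The private key never leaves the client; only `public` goes to the server.
    private = secrets.randbelow(ORDER - 1) + 1
    return private, pow(G, private, P)


def _hash_to_int(commitment: int, challenge: bytes, origin: str) -> int:
    data = f"{commitment}|{challenge.hex()}|{origin}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def sign(private: int, challenge: bytes, origin: str):
    # Client side: prove knowledge of `private` for this challenge and origin.
    k = secrets.randbelow(ORDER - 1) + 1
    commitment = pow(G, k, P)
    c = _hash_to_int(commitment, challenge, origin)
    return commitment, (k + c * private) % ORDER


def verify(public: int, challenge: bytes, origin: str, signature) -> bool:
    # Server side: needs only the public key, so there is no secret to leak.
    commitment, s = signature
    c = _hash_to_int(commitment, challenge, origin)
    return pow(G, s, P) == (commitment * pow(public, c, P)) % P


private_key, public_key = keygen()
challenge = secrets.token_bytes(32)          # fresh per login attempt
signature = sign(private_key, challenge, "https://example.com")

print(verify(public_key, challenge, "https://example.com", signature))
print(verify(public_key, challenge, "https://examp1e.com", signature))
```

Because the origin is mixed into what gets signed, a response produced for a lookalike site does not verify against the real one, which is roughly where the phishing resistance comes from. Nothing the server stores or logs is enough to log in as the user.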

@rcombs I don't think that's what I want because files are not real things I can trust
@mcc I mean, it's the same core concept as a password manager with generated random passwords, just with much stronger security guarantees
@rcombs @mcc this is literally untrue? Chrome requires hardware to use passkeys
@whitequark not on windows or macOS, and not on any platform with third-party manager extensions
@rcombs yeah windows and macos arent real
@rcombs oh I can't use them on my Android phone either
@whitequark huh, it should work in chrome on android 9 and later, at least?
@rcombs nope!! not in my case anyway, or i could use a passkey on linux remotely
@whitequark huh, wonder if it's a weird vendor thing... I'd probably chalk it up to the tech still being pretty early-days and not especially widely-deployed yet outside the apple ecosystem

@whitequark I think the issue with linux for chrome's built-in support is that it's built on the platform-provided tools for storing keys securely, and there aren't really widely-deployed equivalents for that on non-android linux yet, so they decided to leave it to third-party managers rather than providing something internal that would have weaker security properties than on other platforms

I do wonder what firefox's implementation will look like, once they get that out

@rcombs @whitequark So in other words you're not allowed to use passkeys unless you keep the files in TPM/secure enclave?
@mcc you are, just, chrome's builtin manager doesn’t; systems without hardware key storage are uncommon enough these days that it's expected that users on them will be knowledgeable enough to select a solution that fits their threat model, rather than use the built-in highly-strict one
@rcombs I have to admit this is not really dissuading me from my initial impression that this is all a good idea which in the end is going to have so many barriers to use that it winds up just being a way of pressuring me to let Google manage all my passwords, something up until now I have very carefully avoided

@mcc @rcombs if it makes you feel any better, I tried to add 2FA to something a couple days ago and Dashlane (in Firefox) prompted me to use it as a Passkey

so there’s nothing actually tying this to Google or even your browser

@mcc in the medium-term (i.e. the next several months), chances are whatever password manager you currently use will add support for passkeys and it’ll all just work fairly transparently
@rcombs okay, but doesn't that imply i'll have to either install the password managers on the machines I want to use, or allow the device with my password manager to physically connect to the computer where the browser is running (how, I'm not even sure, since I'm not used to desktops having bluetooth? I guess I just need to carry around a usb webcam to transmit a qr code?)? with a password manager on a trusted device I don't have to connect anything to anything bc passwords can cross an airgap
@rcombs Old-style hardware keys could cross airgaps because they could do challenge/response over a keyboard, but as far as I know this isn't part of the fido public-key paradigm, it's just assumed that pieces of hardware are able and willing to talk to each other
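
For illustration, one assumed shape of "challenge/response over a keyboard": the site shows a short challenge, the human types it into an offline device, and types the device's short code back. The sketch below uses an HMAC with HOTP-style truncation to get a 6-digit code. Which devices the post has in mind isn't stated, and the names here (`response_code`, `shared_key`) are hypothetical. Note that this variant is symmetric: the server must hold the same key, unlike the public-key model.

```python
import hashlib
import hmac
import secrets


def response_code(shared_key: bytes, challenge: str, digits: int = 6) -> str:
    # HMAC the challenge, then truncate to a short decimal code a human can type
    # (same dynamic-truncation idea as HOTP).
    mac = hmac.new(shared_key, challenge.encode(), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    value = int.from_bytes(mac[offset:offset + 4], "big") & 0x7FFFFFFF
    return str(value % 10**digits).zfill(digits)


# Enrolled once, ahead of time; held by both the server and the offline device.
shared_key = secrets.token_bytes(20)

# Server displays a fresh challenge on the (possibly untrusted) computer's screen.
challenge = f"{secrets.randbelow(10**8):08d}"

# The human keys the challenge into the offline device and reads back the code.
typed_by_human = response_code(shared_key, challenge)

# Server recomputes and compares; no cable, Bluetooth, or camera involved.
accepted = hmac.compare_digest(typed_by_human, response_code(shared_key, challenge))
print(accepted)
```

The only channel between the device and the computer is a person reading and typing a few digits, which is what lets it cross an airgap.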
@mcc when a site prompts for a passkey on a machine that doesn't have one (via its built-in manager or any third-party extension installed), the browser displays a QR code, which you can scan with a phone that has the passkey on it; this mediates a bluetooth connection that the authentication then happens over (but the potentially-untrusted machine never sees the stored secret)