One phenomenon I've run up against in talking about #Threads and #Bluesky the last few weeks is that people generally don't understand how, or underestimate how much, companies like #Meta use the data they collect to strip away any privacy we might reserve for ourselves.

So here's a thread. Feel free to mute me for an hour. For shorthand, DATA COLLECTORS will mean companies whose business model hinges on gathering as much data as possible about users, and COMSOC refers to commercial social media.

Important fact #1: Most ComSoc platforms are owned and operated by data collectors.

It's misleading to think of these companies as facilitating communication first, and collecting user data only secondarily. They'll make certain concessions (e.g. moderation) to keep users plugged in, but to see their services from the POV of the ComSoc business model, it's necessary to see them first and foremost as elaborate systems for capturing data and pinning it back to an actual person out in the world.

Understand: I'm not asking you to be cynical when I suggest that you understand ComSoc platforms as data capture systems. Just look at it structurally: Big, international platforms are expensive to run. As long as they can get by on venture capital, they can concentrate on building for community. But as the VC money dwindles, sustaining the platform requires (re)designing it to support some business model. And selling targeted ads and/or data is the ComSoc business model par excellence.
Understanding roughly how that model works is important if you want to navigate it. This is a basic primer on how online ad auctions work: https://themarkup.org/privacy/2023/06/23/how-your-attention-is-auctioned-off-to-advertisers Note how it talks about "data." Data is how they decide whether it's worth paying to show a particular ad to you specifically. Where does the data come from? The website provides some, but if it's displaying Facebook ads, then that data gets supplemented with what Facebook already knows about you from elsewhere.
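For the code-minded, here's a toy sketch of why extra data changes who wins the auction. Everything in it is an assumption made up for illustration: the advertiser names, the interests, the bid amounts, and the simple highest-bid-wins rule. Real ad exchanges are far more involved.

```python
from dataclasses import dataclass


@dataclass
class Advertiser:
    """Hypothetical bidder; all names and prices are invented."""
    name: str
    wanted_interests: set       # interests this advertiser pays extra to reach
    base_bid_cents: int         # bid for a viewer it knows nothing about
    bonus_per_match_cents: int  # extra bid for each matching interest

    def bid(self, profile):
        # More overlap between the profile and the target audience
        # means the impression is worth more to this advertiser.
        matches = len(self.wanted_interests & profile)
        return self.base_bid_cents + self.bonus_per_match_cents * matches


def run_auction(advertisers, profile):
    """Return (winner_name, winning_bid_cents); the highest bid wins."""
    bids = [(a.name, a.bid(profile)) for a in advertisers]
    return max(bids, key=lambda pair: pair[1])


advertisers = [
    Advertiser("skate_shop", {"hockey", "skating"}, 10, 50),
    Advertiser("generic_bank", set(), 30, 0),
]

# What the website alone knows about the visitor.
site_only_profile = {"news"}
# The same visitor once the platform adds what it knows from elsewhere.
enriched_profile = site_only_profile | {"hockey", "skating"}

without_platform_data = run_auction(advertisers, site_only_profile)
with_platform_data = run_auction(advertisers, enriched_profile)
print(without_platform_data, with_platform_data)
```

With only the site's data, the generic advertiser wins at 30 cents. Once the profile is enriched, the niche advertiser outbids it at 110 cents, which is the whole commercial point of knowing more about you.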

Some of that data comes from Meta's ComSoc platforms, but non-Meta websites pass data to ComSoc companies even when they're not serving targeted ads. Those embedded "Like" and "Share" buttons you see everywhere? They may just seem like a convenient way to reduce the friction of sharing links on social media, but ComSoc companies deploy them as little data beacons for tracking people across the web. https://www.consumerreports.org/privacy/how-facebook-tracks-you-even-when-youre-not-on-facebook-a7977954071/
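A rough model of the beacon idea, assuming the simplest possible mechanism: each page that embeds the button causes the browser to contact the button's home server, passing along an identifying cookie and the address of the page. The `Tracker` class, the cookie ID, and the example URLs are all hypothetical, and real trackers draw on more signals than this.

```python
from collections import defaultdict


class Tracker:
    """Stand-in for the server behind an embedded button (hypothetical)."""

    def __init__(self):
        # cookie id -> list of pages where the button was loaded
        self.history = defaultdict(list)

    def serve_button(self, cookie_id, page_url):
        # Serving the button doubles as logging the visit.
        self.history[cookie_id].append(page_url)


def visit(tracker, cookie_id, page_url, has_button):
    """Simulate loading a page; only pages with the button report back."""
    if has_button:
        tracker.serve_button(cookie_id, page_url)


tracker = Tracker()
visit(tracker, "cookie-123", "https://news.example/politics", True)
visit(tracker, "cookie-123", "https://clinic.example/appointments", True)
visit(tracker, "cookie-123", "https://blog.example/no-buttons", False)

# The visitor never clicked anything, yet two unrelated sites are now
# tied to the same identifier.
print(tracker.history["cookie-123"])
```

Note that nobody had to press "Like" for the visits to be recorded. In this model, loading the page is enough.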
Meta Pixel, a tracking snippet that site owners embed in their pages, is an extremely voracious consumer of data. The U.S. Congress recently took Meta and several tax prep companies to task when it was discovered that Pixel was passing tax prep information back to Meta, including first and last names, income, filing status, and refund amounts: https://themarkup.org/pixel-hunt/2023/07/12/congressional-report-finds-meta-and-tax-prep-companies-recklessly-shared-taxpayers-data Some browser add-ons can disable ComSoc beacons, and I highly recommend that you install and activate one ASAP.

So important fact #2 is: Their platforms are not the only place ComSoc companies serve ads, nor are they the only source data collectors draw on for data about you.

Websites are constantly passing data to advertisers so that they can match ads to users, and the ad company on the receiving end keeps that data and uses it to build out your profile. In fact, Meta has profiles on people who've never signed up for a Meta service: https://www.vox.com/2018/4/20/17254312/facebook-shadow-profiles-data-collection-non-users-mark-zuckerberg

On a certain level, we all intuit the interconnectedness of data-driven online advertising, because we've all had those uncanny moments where a website serves us an ad for something that we briefly mentioned someplace else. But let me emphasize: Staying logged out of a ComSoc platform isn't enough to keep a data collector from tracking you across the web, but that's no reason for defeatism. It just means you have to adjust your model for what's involved in data security.
That said, ComSoc platforms are still data collectors' most effective tool. Otherwise, they'd stop launching new ones and stop trying to get people to join them. One reason is that they're so much more effective for collecting data than the open web. If you've downloaded the Threads app, you've probably seen this list of data types the app collects. That's a much more granular view of a person than you could get from a beacon on a website.

Which leads us to important fact #3: Data security is not just about data collected, but also the inferences that can be made from that data.

Pulling in data from a variety of sources increases the size of the profile a data collector can keep on you, but some of the most alarming privacy violations come from juxtaposing different data points in order to draw reasonable conclusions about things the person might not have volunteered to a data collector otherwise. Say… an unannounced pregnancy.

In fact, retailers have already been using inferential data techniques to "predict" pregnancy for years, in order to get ahead of the competition in advertising maternity and baby products to consumers. As a statistician for Target put it, "We knew that if we could identify them in their second trimester, there’s a good chance we could capture them for years." https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html That article made the rounds in 2012 to near universal mutters of "This is creepy…"
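To make "inferential" concrete, here's a minimal sketch of a scoring approach, assuming the simplest version of the idea: individually innocuous signals each carry a weight, and the total is compared to a cutoff. The signals, weights, and threshold are invented for illustration and are not taken from any retailer's actual model.

```python
# Hypothetical signals and weights, invented for illustration only.
SIGNAL_WEIGHTS = {
    "unscented_lotion": 2,
    "vitamin_supplements": 3,
    "large_tote_bag": 1,
    "cotton_balls": 2,
}
THRESHOLD = 6  # hypothetical cutoff for acting on the inference


def inference_score(purchases):
    """Sum the weights of the distinct signals seen in a purchase history."""
    return sum(SIGNAL_WEIGHTS.get(item, 0) for item in set(purchases))


def flagged(purchases):
    """True when the combined signals cross the cutoff."""
    return inference_score(purchases) >= THRESHOLD


one_signal = ["unscented_lotion", "bread", "batteries"]
several_signals = ["unscented_lotion", "vitamin_supplements", "cotton_balls"]

print(inference_score(one_signal), flagged(one_signal))
print(inference_score(several_signals), flagged(several_signals))
```

No single purchase crosses the cutoff here. It's the combination that produces a conclusion the shopper never volunteered.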
It shouldn't be too difficult to see how that sort of predictive analysis could be put to much more ominous ends, as in the case of the pregnancy prediction AI that Microsoft provided to a government with a history of targeting indigenous groups for population control: https://www.wired.com/story/argentina-algorithms-pregnancy-prediction/ But maybe governments don't necessarily need to launch big internal programs for data prediction. Maybe they can just get it straight from data collectors.
And, in fact, ComSoc companies have shown themselves willing to share collected data with governments. Even before SCOTUS struck down federal abortion protections, Facebook helped investigators prosecute a mother and daughter for abortion: https://www.theverge.com/2023/7/11/23790923/facebook-meta-woman-daughter-guilty-abortion-nebraska-messenger-encryption-privacy That was a relatively late-term abortion, but the Dobbs decision is likely to open the floodgates on requests for data on any abortions. And while that case was about chat records, inferential data could complicate things immensely.

Which is why important fact #4 is maybe one of the most important: Your data is not just about YOUR privacy.

One of the most basic forms of data ComSoc companies have about you is your set of connections to other people. The "social graph" builds that right into social media. And those connections allow them to use your data to infer things about the people in your social graph. Some of those are basic associational guesses: If you like hockey, your friends might, too. Some are not so casual.
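A toy version of that associational guess, assuming a bare-bones rule: if at least half of someone's connections share an interest, attribute it to them as well. The people, the interests, the `infer_interests` helper, and the 50% cutoff are all hypothetical.

```python
from collections import Counter


def infer_interests(graph, known_interests, person, min_share=0.5):
    """Guess interests for `person` from what their connections have revealed."""
    friends = graph.get(person, [])
    if not friends:
        return set()
    counts = Counter()
    for friend in friends:
        counts.update(known_interests.get(friend, set()))
    already_known = known_interests.get(person, set())
    return {
        interest
        for interest, n in counts.items()
        if n / len(friends) >= min_share and interest not in already_known
    }


# Dana has shared nothing, but Dana's connections have.
graph = {"dana": ["alex", "blair", "casey"]}
known_interests = {
    "alex": {"hockey", "knitting"},
    "blair": {"hockey"},
    "casey": {"chess"},
}

guessed = infer_interests(graph, known_interests, "dana")
print(guessed)
```

Dana ends up tagged with "hockey" without ever having said a word about it. Swap in a more sensitive attribute and the same arithmetic stops looking casual.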

@lrhodes Most people I know are very well aware of this, but the daily practical joy of being in touch with their friends, or of memes or information, trumps the more long-term, theoretical, or abstract fear of having their data used against them.