One phenomenon I've run up against in talking about #Threads and #Bluesky the last few weeks is that people generally don't understand how, or underestimate how much, companies like #Meta use the data they collect to strip away any privacy we might reserve for ourselves.

So here's a thread. Feel free to mute me for an hour. For shorthand, DATA COLLECTORS will mean companies whose business model hinges on gathering as much data as possible about users, and COMSOC refers to commercial social media.

Important fact #1: Most ComSoc platforms are owned and operated by data collectors.

It's misleading to think of these companies as facilitating communication first, and collecting user data only secondarily. They'll make certain concessions (e.g. moderation) to keep users plugged in, but to see their services from the POV of the ComSoc business model, it's necessary to see them first and foremost as elaborate systems for capturing data and pinning it back to an actual person out in the world.

Understand: I'm not asking you to be cynical when I suggest that you understand ComSoc platforms as data capture systems. Just look at it structurally: Big, international platforms are expensive to run. As long as they can get by on venture capital, they can concentrate on building for community. But as the VC money dwindles, sustaining the platform requires (re)designing it to support some business model. And selling targeted ads and/or data is the ComSoc business model par excellence.
Understanding roughly how that model works is important if you want to navigate it. This is a basic primer on how online ad auctions work: https://themarkup.org/privacy/2023/06/23/how-your-attention-is-auctioned-off-to-advertisers Note how it talks about "data." Data is how they decide whether it's worth paying to show a particular ad to you specifically. Where does the data come from? The website provides some, but if it's displaying Facebook ads, then that data gets supplemented with what Facebook already knows about you from elsewhere.
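To make the role of "data" in those auctions concrete, here's a toy sketch in Python. Everything in it is an assumption for illustration: the advertiser names, the traits, the bid rule, and the "highest bid wins" format are invented, and real ad exchanges are far more complicated. The only point is that a richer profile changes who bids what for your attention.

```python
from dataclasses import dataclass


@dataclass
class Advertiser:
    name: str
    base_bid: float          # what they'd pay for an anonymous impression
    wanted_traits: set[str]  # traits they pay extra to reach (hypothetical)


def bid_for(advertiser: Advertiser, profile: set[str]) -> float:
    """Toy rule: every matching trait in the profile raises the bid."""
    matches = len(advertiser.wanted_traits & profile)
    return advertiser.base_bid * (1 + matches)


def run_auction(advertisers: list[Advertiser], profile: set[str]) -> tuple[str, float]:
    """Highest bid wins the ad slot (a simplification of real auctions)."""
    bids = {a.name: bid_for(a, profile) for a in advertisers}
    winner = max(bids, key=bids.get)
    return winner, bids[winner]


advertisers = [
    Advertiser("skate_shop", 0.10, {"likes_hockey", "age_18_34"}),
    Advertiser("generic_bank", 0.25, set()),
]

# What the website alone knows, versus what a data collector can add from elsewhere.
site_only = {"age_18_34"}
supplemented = site_only | {"likes_hockey"}

print(run_auction(advertisers, site_only)[0])     # the untargeted advertiser wins
print(run_auction(advertisers, supplemented)[0])  # the targeted advertiser now outbids it
```

In this sketch the extra trait came from outside the website, which is the supplementing step described above.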

Some of that data comes from Meta's ComSoc platforms, but non-Meta websites pass data to ComSoc companies even when they're not serving targeted ads. Those embedded "Like" and "Share" buttons you see everywhere? They may just seem like a convenient way to reduce the friction of sharing links on social media, but ComSoc companies deploy them as little data beacons for tracking people across the web. https://www.consumerreports.org/privacy/how-facebook-tracks-you-even-when-youre-not-on-facebook-a7977954071/
Meta Pixel is an extremely voracious consumer of data. U.S. Congress recently took Meta and several tax prep companies to task when it was discovered that Pixel was passing tax prep information back to Meta—including first and last names, income, filing status, and refund amounts: https://themarkup.org/pixel-hunt/2023/07/12/congressional-report-finds-meta-and-tax-prep-companies-recklessly-shared-taxpayers-data Some browser add-ins can disable ComSoc beacons, and I highly recommend that you install and activate one ASAP.
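For a rough idea of how a beacon like that can leak information, here's a minimal Python sketch of the general tracking-pixel technique: the page asks the browser to load a tiny image, and the details ride along in the image's URL. The endpoint, parameter names, and values below are all hypothetical. I'm not claiming these are the parameters Meta Pixel actually sends. A real request to a third-party domain can also carry that domain's cookies, depending on browser settings, which is how the data gets pinned to a person.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical collector endpoint, not any real company's URL.
BEACON_ENDPOINT = "https://tracker.example/pixel.gif"


def build_beacon_url(tracker_id: str, page_url: str, event: str, extra: dict[str, str]) -> str:
    """Return the URL a browser would request to load the invisible image."""
    params = {"id": tracker_id, "url": page_url, "ev": event, **extra}
    return f"{BEACON_ENDPOINT}?{urlencode(params)}"


def what_the_collector_sees(beacon_url: str) -> dict[str, str]:
    """Parse the request back out, the way the receiving server could."""
    query = parse_qs(urlsplit(beacon_url).query)
    return {key: values[0] for key, values in query.items()}


url = build_beacon_url(
    tracker_id="site-123",                               # hypothetical site identifier
    page_url="https://taxprep.example/refund-summary",   # hypothetical page
    event="PageView",
    extra={"filing_status": "single"},                   # the kind of field that should never leave the page
)
print(url)
print(what_the_collector_sees(url))
```

Blocking the request to the collector's domain, which is roughly what those browser add-ins do, means none of this gets sent.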

So important fact #2 is: Their platforms are not the only place ComSoc companies serve ads, nor are they the only source data collectors draw on for data about you.

Websites are constantly passing data to advertisers so that they can match ads to users, and data collectors keep that data and use it to build out your profile. In fact, Meta has profiles on people who've never signed up for a Meta service: https://www.vox.com/2018/4/20/17254312/facebook-shadow-profiles-data-collection-non-users-mark-zuckerberg

On a certain level, we all intuit the interconnectedness of data-driven online advertising, because we've all had those uncanny moments where a website serves us an ad for something that we briefly mentioned someplace else. But let me emphasize: Yes, a data collector can track you across the web even when you're not logged into a ComSoc platform. That's no reason for defeatism. It just means you have to adjust your model for what's involved in data security.
That said, ComSoc platforms are still data collectors' most effective tool. Otherwise, they'd stop launching new ones and stop trying to get people to join them. One reason is that they're so much more effective for collecting data than the open web. If you've downloaded the Threads app, you've probably seen this list of data types the app collects. That's a much more granular view of a person than you could get from a beacon on a website.

Which leads us to important fact #3: Data security is not just about data collected, but also the inferences that can be made from that data.

Pulling in data from a variety of sources increases the size of the profile a data collector can keep on you, but some of the most alarming privacy violations come from juxtaposing different data points in order to draw reasonable conclusions about things the person might not have volunteered to a data collector otherwise. Say… an unannounced pregnancy.

In fact, retailers have already been using inferential data techniques to "predict" pregnancy for years, in order to get ahead of the competition in advertising maternity and baby products to consumers. As a statistician for Target put it, "We knew that if we could identify them in their second trimester, there’s a good chance we could capture them for years." https://www.nytimes.com/2012/02/19/magazine/shopping-habits.html That article made the rounds in 2012 to near universal mutters of "This is creepy…"
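Here's a deliberately crude Python sketch of what inference by juxtaposition can look like. It is not Target's model or anyone else's: the item names are illustrative, and the weights and threshold are numbers I made up. The point is only that no single purchase says much, while the combination crosses a line the shopper never volunteered to cross.

```python
# Invented weights: how strongly each purchase nudges the guess.
SIGNAL_WEIGHTS = {
    "unscented_lotion": 3,
    "prenatal_vitamins": 5,
    "cotton_balls": 1,
    "large_tote_bag": 1,
}
THRESHOLD = 6  # also invented: the score at which the profile gets flagged


def inference_score(purchases: set[str]) -> int:
    """Add up the weights of every signal item found in the purchase history."""
    return sum(weight for item, weight in SIGNAL_WEIGHTS.items() if item in purchases)


def flagged(purchases: set[str]) -> bool:
    """True when the combined signals cross the (made-up) threshold."""
    return inference_score(purchases) >= THRESHOLD


shopper_a = {"unscented_lotion", "dish_soap"}
shopper_b = {"unscented_lotion", "prenatal_vitamins", "cotton_balls"}

print(inference_score(shopper_a), flagged(shopper_a))
print(inference_score(shopper_b), flagged(shopper_b))
```

Neither shopper told anyone anything. The second one just bought three ordinary things in the same stretch of time.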
It shouldn't be too difficult to see how that sort of predictive analysis could be put to much more ominous ends, as in the case of the pregnancy prediction AI that Microsoft provided to a government with a history of targeting indigenous groups for population control: https://www.wired.com/story/argentina-algorithms-pregnancy-prediction/ But maybe governments don't necessarily need to launch big internal programs for data prediction. Maybe they can just get it straight from data collectors.
And, in fact, ComSoc companies have shown themselves willing to share collected data with governments. Even before SCOTUS struck down federal abortion protections, Facebook helped investigators prosecute a mother and daughter for abortion: https://www.theverge.com/2023/7/11/23790923/facebook-meta-woman-daughter-guilty-abortion-nebraska-messenger-encryption-privacy That was a relatively late-term abortion, but the Dobbs decision is likely to open the floodgates on requests for data on any abortions. And while that case was about chat records, inferential data could complicate things immensely.

Which is why important fact #4 is maybe one of the most important: Your data is not just about YOUR privacy.

One of the most basic forms of data ComSoc companies have about you is your connections to other people. The "social graph" builds that right into social media. And those connections allow them to use your data to infer things about the people in your social graph. Some of those are basic associational guesses: If you like hockey, your friends might, too. Some are not so casual.
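A minimal Python sketch of that kind of associational guess, assuming a tiny made-up social graph: the names, the interests, and the 50% cutoff are all hypothetical, and real systems are surely more sophisticated. Notice that the person being profiled here has shared nothing at all.

```python
from collections import defaultdict

# Hypothetical friendships and self-declared interests. "ana" has declared nothing.
friendships = [("ana", "ben"), ("ana", "cy"), ("ana", "dee"), ("ben", "cy")]
declared = {"ben": {"hockey"}, "cy": {"hockey"}, "dee": {"knitting"}}


def build_graph(edges: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Turn a list of friendships into a person -> friends lookup."""
    graph: dict[str, set[str]] = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def inferred_interests(person: str, graph: dict[str, set[str]],
                       declared: dict[str, set[str]], min_share: float = 0.5) -> dict[str, float]:
    """Guess a person's interests from the share of their friends who declared each one."""
    friends = graph.get(person, set())
    if not friends:
        return {}
    counts: dict[str, int] = defaultdict(int)
    for friend in friends:
        for interest in declared.get(friend, set()):
            counts[interest] += 1
    return {
        interest: count / len(friends)
        for interest, count in counts.items()
        if count / len(friends) >= min_share
    }


graph = build_graph(friendships)
print(inferred_interests("ana", graph, declared))  # only interests shared by at least half of ana's friends
```

Swap "hockey" for a health condition or a location and the same arithmetic stops being casual.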

Consider the case of a person living in a state where abortion is outlawed, traveling to a state where it's legal. Some states are working to outlaw cross-border abortion, too. Hypothetically, they could subpoena the patient's ComSoc records and find out that they made arrangements to meet up with a friend. So they subpoena the friend's ComSoc records, too, and use their location data to show that their next stop was at a clinic that provides abortions. What might the state infer from that?
Would inferential data of that sort be admissible in court? I would hope not. It's highly circumstantial. And we all have really high regard for the caliber and impartiality of U.S. judges right now, don't we? But even if it never made it to court, law enforcement and government officials could use inferential data to pursue investigations against people, which can often be just as damaging as a judicial proceeding.
The stakes don't have to be as high as the prospect of criminal prosecution. How we interact around ComSoc platforms makes it possible for data collectors to infer about other people all sorts of things that they might not have volunteered on their own. Facebook discovers who your mom is when she tags you in a photo. Its app has access to the health info on her phone, so now Meta knows you have an inherited risk of heart disease. Did you want them to know that? Would you have volunteered it?

Alright. So: To be clear, here are a few things I DON'T mean by all of this:

1. That you're flatly unethical if you choose to (or feel that you have to) use a ComSoc platform.

2. That you should chastise people who do.

3. That there's a binary choice between data privacy and living in a panopticon of data collection.

4. That the fediverse is The Ideal Alternative to ComSoc platforms.

But I DO think that ethical behavior in this space requires a nuanced understanding of how the beast operates.

Nearly everything we do creates potential data, so my general recommendation is that you incorporate two principles into your decisions around social media and the internet in general:

DATA MINIMALISM: Take steps to avoid making more data than is required for YOUR intended purposes—as opposed to the purposes of data collectors.

and

DATA SECURITY: Be aware of the data generated by your actions and take steps to avoid exposing it to any entity you don't know and trust.

@lrhodes Excellent thread. Just adding that although the fedi is not a commercial platform, the data harvesters are here anyway and people should be equally cautious about sharing identifying info.
@lrhodes That's the sort of location data that aided in the conviction of the person on the Serial podcast. It was much more complicated, of course, but it was admissible back then.
@lrhodes Most people I know are very well aware of this but the daily practical joy of being in touch with their friends or of memes or information trumps the more longterm and theoretical or abstract fear of having their data used against them.
@lrhodes I think about this a lot when it comes to DNA sharing. I don't want to share my DNA with companies, but since most of my immediate relatives already have, it's kinda moot at this point

@lrhodes This alone should be reason enough not to trust #NSAbook or any other #GAFAM|s with anything one wouldn't publicly post...

JFC did the people learn nothing out of the #EncroChat & #ANØM / #OperationIronside / #OperationTrojanShield debacles?

Do I have to drag out that old #PRISM slide again??
https://mstdn.social/@kkarhan/110644134995666721
