sigh. There are a lot of posts going around right now claiming that #bluesky is feeding data to the various data sniffers that the US government uses for surveillance and that #Mastodon is not.

This. Is. Crap.

Anything you post on Mastodon is nearly as easy to vacuum up as things you post on Bluesky. You should treat *all* social media platforms larger than a Signal group of your college friends + Pete Hegseth as "assumed public." It's slightly more work to slurp up almost all of Mastodon, but really not that much more.

@dave_andersen People are on edge, I get that, but misinformation like that is so harmful.
@dave_andersen Good news everyone, the "decentralized" network is currently down (at least for me) so it's not feeding data to anyone!
@ricci @dave_andersen Very interesting 😎
@mackuba @ricci The Bluesky-hosted PDSes are down, but systems like Bridgy still work because they run their own PDS, IIRC.

@dave_andersen

I am just going to encrypt all my posts on Mastodon from now on.

@dave_andersen

5UoNvzaSNBexUZZ1SyQ54J0nqvj7Df2QcDoLooQ0ObGPonOMZsZZuHs2GnrGHrDyx0r0SMaRBYg2Bh6CVaSWNdoCr2WRr

@darryl_ramm @dave_andersen

Ah I see but what did you do after that?

@darryl_ramm @dave_andersen Ả ĂẢẠÁ ĀA̰AĀ, ȦÄĀ ẢĀ'Ã ÂÅĀ AÃ ÁA̮A̮ÁA̧ĀẢÀÁ AÃ ÄÃẢÂA̋ A ȦAÃÁA̱, ȂÁA̧ÅA̋ÂẢȺAȦĂÁ ÁÂA̧ȂA̦A̯ĀẢÅÂ ĀA̰AĀ ÅÂĂA̦ ȦAÃÁA̱ AÂA̱ ÃÁA̽A̦ A̯ÁÅA̯ĂÁ (ĂẢẠÁ A̦ÅÄ, A̱ÁAȂ ȂÁAA̱ÁȂ) ȀÅÄĂA̱ ÄÂA̱ÁȂÃĀAÂA̱
@jnk @darryl_ramm @dave_andersen nicely encrypted message there. What encryption are u using
@darryl_ramm @dave_andersen SWYgeW91IGNvdWxkIGtlZXAgdGhpcyBiYXNlNjQgZW5jb2RlZCB1bmxlc3MgeW91IHdhbnQgdG8gcG9zdCB5b3VyIHByaXZhdGUga2V5IHRoYXQgd291bGQgYmUgZ3JlYXQhICBUaGFuayB5b3Uh
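
The string in the post above is Base64, which is an encoding rather than encryption: anyone can reverse it without a key. A minimal Python sketch, using only the standard library, that decodes it (the names `ENCODED` and `decode_post` are just illustrative):

```python
import base64

# The Base64 string from the post above, copied verbatim.
ENCODED = "SWYgeW91IGNvdWxkIGtlZXAgdGhpcyBiYXNlNjQgZW5jb2RlZCB1bmxlc3MgeW91IHdhbnQgdG8gcG9zdCB5b3VyIHByaXZhdGUga2V5IHRoYXQgd291bGQgYmUgZ3JlYXQhICBUaGFuayB5b3Uh"


def decode_post(encoded: str) -> str:
    """Reverse the Base64 encoding; no key or secret is involved."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    print(decode_post(ENCODED))
```

Anything posted this way is exactly as public as plain text to anyone who recognizes the encoding.
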
@dave_andersen you had me until we put Whiskey Pete into the chat. Are we at least denying him chat admin privs?
@cascheranno The first rule of Houthi small group chat is never to let Whiskey Pete be able to invite more members.

@dave_andersen Absolutely.

I chose to move much of my presence to the Fediverse because of its decentralized and independent model. Not because I consider anything I post here to be secure or of limited access.

If your data is on systems that allow public access, anyone can access that data.

I'm not sure how one can be more clear about that.

@dave_andersen mostly agree.

Bluesky gives the Firehose away for free though, and large swathes of the posting history are free to download as Hugging Face datasets.

Mastodon servers and accounts are each various degrees of accessible, though the public timeline on the average Mastodon server is easily scraped.

And everything is ultimately stored in plaintext.
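
To make "easily scraped" concrete, here is a rough sketch of paging through a server's public timeline. It assumes the server leaves Mastodon's `/api/v1/timelines/public` endpoint open to unauthenticated requests (many do by default, but admins can lock it down). The instance name, function names, and page limit are placeholders, not anything from this thread.

```python
import json
from typing import Callable, Iterator, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def public_timeline_url(instance: str, max_id: Optional[str] = None, limit: int = 40) -> str:
    """Build the URL for one page of a server's local public timeline."""
    params = {"local": "true", "limit": str(limit)}
    if max_id is not None:
        # max_id asks for statuses older than the given status ID.
        params["max_id"] = max_id
    return f"https://{instance}/api/v1/timelines/public?{urlencode(params)}"


def fetch_json(url: str) -> list:
    """Fetch one page over HTTPS and parse the JSON array of statuses."""
    with urlopen(url) as resp:
        return json.load(resp)


def walk_public_timeline(
    instance: str,
    fetch: Callable[[str], list] = fetch_json,
    max_pages: int = 3,
) -> Iterator[dict]:
    """Yield statuses newest-first, following max_id pagination."""
    max_id = None
    for _ in range(max_pages):
        page = fetch(public_timeline_url(instance, max_id))
        if not page:
            return
        yield from page
        # The last status on a page is the oldest; continue from there.
        max_id = page[-1]["id"]
```

Something like `for status in walk_public_timeline("example.social"): ...` would iterate over recent public posts. A real archiver would also need rate limiting and error handling, and would get nothing from servers that hide their public timeline from non-members.
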

@davey_cakes @dave_andersen

If I search for my name and a random topic I have posted a comment about on Mastodon I can find a link to that post using Google.

So my assumption is if Google search has it, basically everyone has it.

@helvick @davey_cakes @dave_andersen not sure if that's a good indicator, you can just untick "Include profile page in search engines" in settings/privacy

@31113 @davey_cakes @dave_andersen

I might not be fully representative then as I have my profile page searchable but these search results for me link directly to specific posts when the search query is about the post content.

It doesn’t work for anything recent: it won’t find yesterday’s posts, but stuff from two months ago, for sure. I’m pretty sure this data is there because Google is slurping up the public feed.

@davey_cakes it's easier to archive bluesky but it's still basically trivial for a reasonable programmer to do the same with fedi. The difference is one of taking an hour versus taking a couple days, but that's not really any difference in the big picture.

@dave_andersen @davey_cakes

Except that Bluesky also has data tied to my real name, e.g. my mobile phone number, which can be linked to my identity much more easily.

The question is when Bluesky passes data on, and which data. Maybe not all of it is passed on, only what matches a filter on certain attributes?

I think that's another major difference. A preselection may already have been made.

@dave_andersen @davey_cakes Also, there is tracking data that is not visible to the public: when you are online and for how long, private messages, all your contacts, etc.

Will this data perhaps also be passed on?

@dave_andersen @davey_cakes I don't have my own Mastodon server now, and theoretically the admins of my instance could also pass on more of my data. But norden.social is now a German registered association (eingetragener Verein).

I somehow have more trust in that than in an American startup.

@davey_cakes @dave_andersen create a server, an account on it that asks people to boost to federate your poor solo server, then drop silent and just archive your federated timeline.

Pretty sure that should be enough. Sure, people who post "only to followers" will be out, unless you follow everyone (are you Nicole??), but you should get the vast majority of the content.

@tshirtman @dave_andersen

It's not really enough, which is why people make followbots. And then some mod on a server with follower approval on notices the request from the obvious follow bot, checks the one-person server, suspends it, and lets others know.

For example, a server which doesn't reveal its public timeline to non-members, has approval on registrations, and uses authorised fetch is going to be a lot harder to scrape than a default-settings server.

@tshirtman @dave_andersen generally, Mastodon and other Fedi software let you be awkward as fuck if you want to.

If it gets to the level that you have to do social engineering to get at particular servers - allowlist only servers, for example - there's really no comparison to the public Firehose on Bluesky.

@davey_cakes @dave_andersen it's a bit more work, but you can have a multiuser server with open registrations, moderate it responsibly, and let it grow organically large enough to capture most of the traffic. A motivated individual with a modest bit of money can do it (and with benevolent mods, no less), and it's trivial for any intel agency.

I do advise treating things posted here as recorded forever, and as something that will be used to take you down if you have a difficult relationship with your state.

@tshirtman @dave_andersen

"I do advise treating things posted here as recorded forever, and as something that will be used to take you down if you have a difficult relationship with your state."

Yeah, 100%. And with the best mods in the world, your hosting company isn't going to jump on a grenade for you if the cops come knocking.

@dave_andersen even @signalapp has to comply with #CloudAct.

  • And we can be very sure they did, simply because it's a statistical inevitability given the sheer number of users they have…

Only real #E2EE (= #SelfHosting-capable with #SelfCustody of all the keys) can be considered safe.

@dave_andersen
I would argue though that there is a significant, if nominal, difference between Mastodon and the greater Fediverse being scrapeable, and Bluesky affirmatively going to the effort to deliver the data.
@DopeGhoti not when it comes to a question of whether ICE can see your data. It can in both cases.

@DopeGhoti @dave_andersen not only that, you can't know my email or IP unless you're my admin. Private posts exist, even tho I wouldn't trust them with my life. And, unless I missed the option, you can't see who I blocked or silenced, only guess.

"Data" means all data, not just the public posts you can even read from an RSS client.
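
As a rough illustration of what the public side looks like, here is a sketch that pulls the fields out of an RSS 2.0 feed of posts (Mastodon serves one per public profile by appending `.rss` to the profile URL). The sample feed and the `public_fields` helper are made up for the example; the point is that the feed carries links, timestamps, and post bodies, and nothing like an email or IP address.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample in the general shape of a profile RSS feed.
SAMPLE_FEED = """<rss version="2.0"><channel><title>alice</title>
<item>
  <link>https://example.social/@alice/1</link>
  <pubDate>Mon, 01 Jan 2024 00:00:00 +0000</pubDate>
  <description>&lt;p&gt;hello&lt;/p&gt;</description>
</item>
</channel></rss>"""


def public_fields(rss_xml: str) -> list:
    """Return the link, timestamp, and HTML body of each item in the feed."""
    root = ET.fromstring(rss_xml)
    return [
        {
            "link": item.findtext("link"),
            "published": item.findtext("pubDate"),
            "html": item.findtext("description"),
        }
        for item in root.iter("item")
    ]


if __name__ == "__main__":
    for post in public_fields(SAMPLE_FEED):
        print(post["published"], post["link"])
```
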

@dave_andersen Well, there is some validity in that Bluesky would be more likely of the two to actually help collate and organize said data, sending it more conveniently and willingly in a useful manner as opposed to Mastodon requiring building a system to, as you say, siphon it up. But, yeah, Mastodon is pretty much wide open (and even limiting post visibility has limits of its own) and should be treated as such.
@nazokiyoubinbou they don't need to. ShadowDragon is very happy to do it for them for both bsky and fedi.

@dave_andersen Yeah, ok, fair enough.

I guess I can only limit it to "one is more evil and willing to do it, whereas the other can't really do anything to prevent it."

In the end there really isn't much difference in regards to what actually happens to the data.

@dave_andersen agreed! let's focus on telling people about the actual merits of the Fediverse over bluesky, like decentralization and being censorship-resistant (not -proof, but certainly better than bluesky)
@dave_andersen There's a difference between being able to scrape data and actively feeding it to nefarious organizations. Especially since a lot more meta data could possibly go along with the posts.
@soviut what evidence is there that Bluesky is doing anything active other than providing the firehose that's public in the same way other data is scrape-able?
@dave_andersen @soviut This is a weird response because your OP didn't address the evidence they were facilitating it. (Which is what "is feeding data" implies.) And (IMO) that's what people really care about more than whether or not there's some way to get the data. It's the decisions the organizations make. (I'm not commenting on whether bs has made that or not. Just pointing out that's the issue, not the result, which is what your OP addressed.)
@dave_andersen bluesky is one entity. Mastodon is, checks notes, a metric fuckton. It is far easier to get one entity to buy into that kind of thing and willingly give you access, than it is to gather up that data by brute force. It may be "nearly as easy" but certainly not as simple as just getting direct access.
@calsnoboarder from a threat perspective it's the same. Any competent programmer could start to scrape 90+% of fedi with a week or a few weeks' work and a small amount of money for expenses. (Storing it all with images would begin to get more expensive, but not terribly so for an enterprise budget.) I mean, it's already been done multiple times.
@dave_andersen any motivated entity could do that and it would take weeks to get everything of value from Mastodon. Or you could just waltz in and get handed admin access to the live Bluesky network and make a copy. One is certainly easier than the other.

@calsnoboarder Yes. But they're both easy from the perspective of a company selling surveillance data to a government. If the thing you're worried about is contractors selling to the US government having your posts, the threat is the same on both platforms: they will. That's really the only thing I'm trying to get across. Anything you post here should be treated as if it's available to any government or newspaper in the world. Because it likely is. The threat model is the same if you're posting on bsky or reddit.

(Also, what evidence is there that bsky gives out more than just the firehose?)

@dave_andersen There isn't evidence... my point (and yours is absolutely valid) is that Bluesky, because it is owned by one entity, is much more likely to just give it up (whether it's by cloning the servers or by giving the government or a foreign entity direct access) than the same process over a distributed network controlled and operated by hundreds of users. Yes, all of it is "easily" accessible by motivated actors... that is true of your bank accounts and private communication too, not just social media activity. But it would be far easier to get those things from you by just stealing them or pressuring you into giving them up. The NSA knows everything about you and me, whether we post on social media or not. That ship sailed when personal computers shrunk to the size of a small room.
@dave_andersen It's bizarre to me that anyone posting on social media thinks their post is anything but public.

@dave_andersen Exactly this. You should assume that *everything* you post on Mastodon/ActivityPub is crawl-able and that it will be fed into FB-style network mapping and all manner of AI-driven analysis tools (like for sentiment analysis).

This is something that I knew before I started posting/boosting. I'm happy with that simply because I think that networks like this are simply more resilient and much more conducive to having actual conversations.

@dave_andersen I will never understand how people can think that information that they post that is available to anyone with a *web* *browser* will somehow not be sniffed by anyone who wants it.

@dave_andersen

yeah the whole

"i am going to post this in public

but i want my posts to be private"

is a kind of insanity

that being said there's other factors. backend access to hidden account details for example

i'm not going to cast accusations at #bluesky i have no evidence of. but i will ask people to ask themselves which model they trust more: bluesky or #mastodon / #fediverse

@benroyce @dave_andersen Bluesky is run by cryptobrodudes. That's not a demographic that exactly engenders trust, is it?

I'd be willing to accept almost any accusation directed at cryptobrodudes.

@benroyce @dave_andersen read up on Moscow Rules. Layers of security, everyone! Signal on your phone isn’t very secure if they can get the phone provider to replace your keyboard input method with spyware.
@aizuchi @benroyce @dave_andersen your phone provider doesn't control the input app?
@dave_andersen

Greater reliance on having to exploit network effects, plus having to actively pull data from multiple sources versus having it fed from a single source isn't exactly nothing (if your goal is truly to gather "everything").

That said, anything you put out that isn't strongly encrypted you should probably consider as being as private as a postcard sent through the mail.

@dave_andersen I keep the privacy policy of my single-user instance very simple: "Privacy promises are precluded; presume none on this public platform."

Not that I have to remind myself that a public forum is public... but there was a field to fill out in the server admin section.