Bluesky To Sell Your Content To AI Data Miners

So it begins. Hidden in Jay Graber's recent charm offensive is this innocuously framed initiative: Bluesky is weighing a proposal that gives users consent over how their data is used for AI (https://techcrunch.com/2025/03/10/bluesky-is-weighing-a-proposal-that-gives-users-consent-over-how-their-data-is-used-for-ai/)

Not so fast.

1) Shows they are planning on doing content deals with AI companies.
2) Seems like it is Opt-out vs. Opt-in (see below).
3) It is just a voluntary robots.txt file.

h/t @Lydie https://tech.lgbt/@Lydie/114149023344861046

#Bluesky

Bluesky is weighing a proposal that gives users consent over how their data is used for AI | TechCrunch

Speaking at the SXSW conference in Austin on Monday, Bluesky CEO Jay Graber said the social network has been working on a framework for user consent over

@mastodonmigration @Lydie Nah, sorry, you got that completely backwards here… This proposal is likely at least partially a response to the occasional dramas when someone takes the public data from the network (which anyone can technically do) and does something with it that some users don't like, since it's unclear exactly what you're allowed to do with it and what you aren't. So this would be a way to let users specify their intention of how this openly accessible data is allowed to be used.

@mackuba @Lydie

So Kuba, you are saying Bluesky will not sell your content to data scrapers? Can you show us where it says that?

@mastodonmigration @Lydie https://bsky.app/profile/bsky.app/post/3layuzbto2c2x

None of this is about what *Bluesky* will be able to do, it's about what *anyone* is allowed to do with your public data. So if you check "AI = enabled" for example, this means you allow *anyone* to read your posts from the firehose and use them to train models. Which means Bluesky can't sell it, because it would already be available for free, so nobody would pay extra for it.

Bluesky (@bsky.app)

A number of artists and creators have made their home on Bluesky, and we hear their concerns with other platforms training on their data. We do not use any of your content to train generative AI, and have no intention of doing so.

@mastodonmigration @Lydie They generally don't have many options here around selling data that's all already public from the start, even if they wanted to.

@mackuba @Lydie

Simply not true. The data is not public. It is published under the explicit privacy policy terms. The distinction between what people can do and what they are allowed to do legally matters.

Again, are you saying that Bluesky will not sell user content to data scrapers, and where do they assure their users of this? Simple question.

@mastodonmigration @mackuba @Lydie the literal first point of the Bluesky privacy policy is, and I quote, "Profiles and posts are public". I agree that we should be watching them, but this whole thread feels like a hit piece that interprets their work under the worst possible intentions.

https://bsky.social/about/support/privacy-policy#profile-posts-public

@McNeely @mackuba @Lydie

Would be curious what other intentions you would ascribe to a change in the software to give specific authority to have your content scraped by AI data miners?

When Twitter/X did this last year it triggered outrage.

@mastodonmigration @mackuba @Lydie right now there's no authority granted one way or the other. What is being proposed is a method to grant or deny that authority.

I think a useful analogy is a code base with no declared license. The code is technically under the publisher's copyright, but being publicly available on the internet, it's dependent on others respecting that implied right (& we know it won't be). I think this is a proposal to create a very simplified licensing scheme.

@McNeely @mackuba @Lydie

Hmmm... Not exactly following you. Are you saying that there is clamor among Bluesky users to explicitly grant consent for their content to be scraped by AI data miners and this change in the software is to respond to this desire? Does that actually make any sense to you?

@mastodonmigration @McNeely @Lydie More like, there is a desire to make it easier to express explicit denial of consent, which this proposal would help with. And on the other hand, people building or using things like bridges (e.g. Bridgy) wish that it would be easier for users who want to opt in to things like that to do so. And having some subset of users who explicitly opted in to e.g. using content for AI models would kind of make it more clear that the rest didn't.

@mackuba @McNeely @Lydie

This is a canard. There would be no need to express explicit denial of consent individually if consent were generally explicitly denied.

And you are postulating that, with all the other things Bluesky has on its development road map, catering to some small bunch of users who want their data used for AI training is what is being prioritized. Does that really make sense to you?

@mastodonmigration @McNeely @Lydie They're trying to build a more general protocol, and they're adding various things to the protocol that they feel are needed for it to be more useful.

@mastodonmigration @mackuba @Lydie I'm saying that the clamor to have control and options is being interpreted repeatedly as proof that something nefarious is afoot. No one wants to attach the GPL to their skeets; they just want control.

@McNeely @mackuba @Lydie

What people want is for their content to not be used for AI training. Full stop. It would be easy for Bluesky to clearly assert this general prohibition. No one is clamoring to let AI data miners mine their content.

@mastodonmigration @McNeely @Lydie Some folks on Bluesky are pointing out that there have been various conversations before on the Fediverse too about having some kind of way to express consent on how you allow your content to be used, and that various misunderstandings come from the fact that there is no such way to express consent formally yet. So this is what Bluesky is trying to add in ATProto.

@mackuba @McNeely @Lydie

Hmmmm... not buying it. Why not just explain that then? Why lead at a high-profile symposium with AI scraping? Where are all the Bluesky users clamoring for AI data scraping?

@mastodonmigration @McNeely @Lydie Twitter just unilaterally changed the rules so they can use the content for various things. Bluesky is thinking about adding a way to express if you allow or don't allow your content to be used for A, B, C, D. These are not the same thing.
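
To make the "allow or don't allow for A, B, C, D" idea concrete, here is a minimal Python sketch of what a per-category preference could look like from a data consumer's side. Everything in it is an assumption for illustration: the category names, the `UsagePreferences` and `Consent` types, and the `may_use` helper are hypothetical and are not taken from the actual ATProto proposal.

```python
from dataclasses import dataclass, field
from enum import Enum


class Consent(Enum):
    ALLOW = "allow"
    DENY = "deny"
    UNSPECIFIED = "unspecified"  # the user has said nothing either way


@dataclass
class UsagePreferences:
    # Hypothetical categories; a real scheme may name or group them differently.
    choices: dict[str, Consent] = field(default_factory=dict)

    def get(self, category: str) -> Consent:
        # Anything the user has not set explicitly counts as unspecified.
        return self.choices.get(category, Consent.UNSPECIFIED)


def may_use(prefs: UsagePreferences, category: str) -> bool:
    """A cautious consumer proceeds only on an explicit ALLOW."""
    return prefs.get(category) is Consent.ALLOW


prefs = UsagePreferences({"ai_training": Consent.DENY, "bridging": Consent.ALLOW})
print(may_use(prefs, "ai_training"))  # False: explicitly denied
print(may_use(prefs, "bridging"))     # True: explicitly allowed
print(may_use(prefs, "archiving"))    # False: unspecified is treated as "no"
```

Treating "unspecified" as "no" is a design choice of this sketch, not something the proposal is known to require. And as with robots.txt, nothing here enforces the user's choice; it only has an effect if whoever reads the data actually checks it.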