Bluesky To Sell Your Content To AI Data Miners

So it begins. Hidden in Jay Graber's recent charm offensive is this innocuously framed initiative: Bluesky is weighing a proposal that gives users consent over how their data is used for AI (https://techcrunch.com/2025/03/10/bluesky-is-weighing-a-proposal-that-gives-users-consent-over-how-their-data-is-used-for-ai/)

Not so fast.

1) It shows they are planning content deals with AI companies.
2) It looks like it is opt-out rather than opt-in (see below).
3) It is just a voluntary robots.txt-style file.

h/t @Lydie https://tech.lgbt/@Lydie/114149023344861046


#Bluesky

Bluesky is weighing a proposal that gives users consent over how their data is used for AI | TechCrunch

Speaking at the SXSW conference in Austin on Monday, Bluesky CEO Jay Graber said the social network has been working on a framework for user consent over

TechCrunch
@mastodonmigration @Lydie Nah, sorry, you got that completely backwards here… This proposal is likely at least partially a response to the occasional dramas when someone takes public data from the network (which anyone can technically do) and does something with it that some users don't like. Right now it's unclear exactly what you're allowed to do with that data and what you aren't. So this would be a way to let users specify how they intend this openly accessible data to be used.

@mackuba @Lydie

So Kuba, you are saying Bluesky will not sell your content to data scrapers? Can you show us where it says that?

@mastodonmigration @Lydie https://bsky.app/profile/bsky.app/post/3layuzbto2c2x

None of this is about what *Bluesky* will be able to do, it's about what *anyone* is allowed to do with your public data. So if you check "AI = enabled" for example, this means you allow *anyone* to read your posts from the firehose and use them to train models. Which means Bluesky can't sell it: it's already available for free at that point, so nobody would pay extra for it.

Bluesky (@bsky.app)

A number of artists and creators have made their home on Bluesky, and we hear their concerns with other platforms training on their data. We do not use any of your content to train generative AI, and have no intention of doing so.

Bluesky Social
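
As a rough illustration of the idea above, here is a minimal sketch of how a tool reading public posts might honor a per-user preference like this. The `allow_ai_training` field and the record shape are hypothetical, made up for this example; the proposal's actual format is not described in this thread.

```python
from typing import Optional, TypedDict


class Post(TypedDict):
    author: str
    text: str
    # Hypothetical per-user preference (not a real Bluesky field):
    # True = allowed, False = disallowed, None = nothing stated.
    allow_ai_training: Optional[bool]


def usable_for_training(posts: list[Post], default: bool = False) -> list[Post]:
    """Keep only posts whose author permits AI training.

    `default` is this project's own policy for users who stated nothing.
    """
    kept = []
    for post in posts:
        pref = post["allow_ai_training"]
        allowed = default if pref is None else pref
        if allowed:
            kept.append(post)
    return kept


# Made-up sample data for illustration only.
sample: list[Post] = [
    {"author": "alice", "text": "hello", "allow_ai_training": True},
    {"author": "bob", "text": "my art", "allow_ai_training": False},
    {"author": "carol", "text": "lunch", "allow_ai_training": None},
]

print([p["author"] for p in usable_for_training(sample)])
```

Nothing in a sketch like this is enforced by the network: the filter only works if whoever reads the data chooses to run it.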

@mackuba @Lydie

Thank you for this clarification. But actually it just muddies the water more.

"In that situation, downstream projects will need to make their own policy decisions around whether content re-use is acceptable"

Who are these "downstream projects" that she is teeing up?

@mastodonmigration @Lydie Meaning anyone who builds something using the API, which is permissionless, so anyone can connect to it at any moment and start saving some data. People building various apps, tools, and services that somehow make use of the data. They do this not by making some kind of deal with Bluesky PBC, but by just opening a code editor and writing some code that makes requests to api.bsky.app or bsky.network or PDS servers, downloads some JSON, and does stuff with it.
@mastodonmigration @Lydie So for example my website https://blue.mackuba.eu/stats/, where I download all posts and then once a day run a query "select count(*) from posts where ..." and save the result as another row and then draw a chart from that, is an example of a "downstream project".
Bluesky Stats

Bluesky daily/weekly activity statistics charts
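
For a sense of scale, the once-a-day counting step described above could look roughly like this. It is a minimal sketch, not the site's actual code: the table names, columns, and sample rows are assumptions made up for illustration, and it uses an in-memory SQLite database instead of a real archive of downloaded posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per saved post, one row per daily result.
conn.execute("CREATE TABLE posts (uri TEXT PRIMARY KEY, created_at TEXT)")
conn.execute("CREATE TABLE daily_stats (day TEXT PRIMARY KEY, post_count INTEGER)")

# Made-up sample rows standing in for posts saved from the public API.
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [
        ("at://example/1", "2025-03-10T09:15:00Z"),
        ("at://example/2", "2025-03-10T18:40:00Z"),
        ("at://example/3", "2025-03-11T07:05:00Z"),
    ],
)


def record_daily_count(conn: sqlite3.Connection, day: str) -> int:
    """Count posts created on `day` (YYYY-MM-DD) and save the result as a row."""
    (count,) = conn.execute(
        "SELECT count(*) FROM posts WHERE substr(created_at, 1, 10) = ?",
        (day,),
    ).fetchone()
    conn.execute(
        "INSERT OR REPLACE INTO daily_stats VALUES (?, ?)", (day, count)
    )
    return count


record_daily_count(conn, "2025-03-10")
record_daily_count(conn, "2025-03-11")
# The rows in daily_stats are what a chart would be drawn from.
print(conn.execute("SELECT * FROM daily_stats ORDER BY day").fetchall())
```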

@mackuba @Lydie

Understand that there are desirable uses, and documents written this way love to give innocuous examples. The problem is that this type of presentation is misleading, because the policy change also permits the undesirable uses.

Again, simple question: are you saying that these changes do not presage a plan by Bluesky to sell or otherwise profit off sharing user content with AI data scrapers? And can you show us where they assure their users of this?

@mastodonmigration @Lydie Yes, I'm sure this proposal has nothing to do with what Bluesky is able to do, just with what anyone using the API is allowed to do. And the proposed default in this doc is what we have right now.

I can't find any place where they explicitly say they will not sell data to other companies, but their general stance has been that they don't intend to do stuff like that. And like I said, all the data being publicly accessible kinda makes it not a very valuable resource to sell.

@mackuba @Lydie

Well, it would certainly be nice if they would come out and affirm this in no uncertain terms. And it would seem that the moment you announce a fairly big prospective change to user permissions is the time to do so. Let's stay tuned.

@mastodonmigration @Lydie Btw, here's how a Bluesky protocol dev announced this proposal:

@mackuba @Lydie

Thanks. Good to know. Don't see how it changes anything, but helpful to see how it is being presented.