Mastodawn

Jamie is a friendly nut Jun 19

🧵 So to recap something that happened yesterday, as I see it:

Yesterday Mastodon gGmbH sent out an email to mastodon.social and .online users announcing a new TOS. The TOS would take effect July 1, and would ship in the next version of the Mastodon server software as a suggested "template" TOS instances could adopt.

This TOS had multiple problems, and several people said so loudly. This morning, Mastodon announced they're backing off and taking additional legal advice:

https://mastodon.social/@Mastodon/114709820512537821

The problems I saw raised were:

Raised by me: The new intellectual property clause was permanent/irrevocable; Mastodon retained rights to your posts even after they were deleted.
https://github.com/mastodon/mastodon/issues/35086

Raised by Cory Doctorow: The new TOS forced users to give up the right to sue in court, & forced use of arbitration instead. 😬
https://mamot.fr/@pluralistic/114706885462760813

Raised by Sarah Jamie Lewis and others: It was ambiguous if *other federated servers* counted as "users" under the TOS.
https://mastodon.social/@sarahjamielewis/114699476927561899

When they announced this, Mastodon didn't seem to realize their new terms would be controversial. They also seemed to believe there had been adequate community review time because a version of the TOS had been posted in their git repo for a year. I raise my eyebrow at both these things (didn't think arbitration would be controversial, really?)… but Mastodon *is* a nonprofit that was working with a pro-bono lawyer, and their rapid backoff once the community started engaging shows good faith.

This is not over. Mastodon is, I hope? getting better legal advice now— I know Doctorow reached out offering to get the EFF to help— and that will probably by itself fix the ambiguity about who is a "user", and the ambiguity some people complained about whether German or US law was controlling.

However, the new TOS is still coming, and TOSes by nature balance the admins' and the user's interests, so that balance will need to be negotiated. Arbitration, for example, very well might come back.

The big problem, as I see it, is that Mastodon is now opening the Pandora's box of making the content licensing situation of the Fediverse explicit, which in a federated environment will be *very* hard to do without crushing someone's toes, somewhere. (Personally, I really *liked* that all this time the licensing situation of Mastodon has been implicit; I was happy just letting laches and deletion notices do all the work. But I understand if Mastodon gGmbH doesn't feel they can do that forever.)

What we need, for this to work out better next time, is actual community engagement in the TOS process. (This is especially the case if Mastodon gGmbH intends the TOS to be used not just by them but by all the downstream users of their software!) We need more than two weeks' notice (or rather, notices posted somewhere the user community will see them). Again, I think Mastodon gGmbH is showing good faith in how they're engaging with the community now, so we'll see what happens.

Postscript: Cripes, the 500 character limit really starts to chafe for this kind of writing. Maybe I'll stand up that GtS server after all lol

Mark Prior Jun 20

@mcc I wish there was a platform that didn't think that implementing a feature to randomise nodeinfo usage stats was a good idea.

@markrprior What does this mean?

Mark Prior Jun 20

@mcc if you use the nodeinfo API to query a gotosocial instance for information, you might get random user and post counts. There is a configuration option called "fiddle" that, rather than returning valid numbers, 0, or null, generates garbage. It's claimed this is a reaction to bots that don't obey the robots file (which defaults to disallow), but they don't check whether the robots file has been changed.

Mark Prior Jun 20

@mcc for example
look at https://gunbark.dev/nodeinfo/2.1

'usage' => {
    'users' => {
        'total' => 690105,
        'activeMonth' => 220493
    },
    'localPosts' => 1591429
}
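
For readers who want to see where those numbers sit in the response, here is a minimal Python sketch of pulling the usage block out of a nodeinfo document. The field names (usage, users.total, users.activeMonth, localPosts) are the ones shown in the post above; the function names and the sample document are hypothetical, and nothing here can tell a real count from a fiddled one. Null or missing values are passed through as None rather than replaced with a guess.

```python
import json
from typing import Optional, TypedDict


class Usage(TypedDict):
    total_users: Optional[int]
    active_month: Optional[int]
    local_posts: Optional[int]


def _count(value: object) -> Optional[int]:
    """Return value if it is a non-negative integer, otherwise None."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return value
    return None


def extract_usage(nodeinfo: dict) -> Usage:
    """Read the usage counts from an already-decoded nodeinfo document.

    Missing or null fields come back as None (treated as "not published").
    """
    usage = nodeinfo.get("usage") or {}
    users = usage.get("users") or {}
    return {
        "total_users": _count(users.get("total")),
        "active_month": _count(users.get("activeMonth")),
        "local_posts": _count(usage.get("localPosts")),
    }


# Sample document using the values quoted in the post (illustrative only).
SAMPLE = json.loads(
    '{"usage": {"users": {"total": 690105, "activeMonth": 220493},'
    ' "localPosts": 1591429}}'
)

if __name__ == "__main__":
    print(extract_usage(SAMPLE))
```

A collector that wants to handle suspicious values differently (for example, recording a floor of 1 user) would do that in a separate step, since that is a policy choice and not something the document itself reveals.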

Irenes (many) Jun 20

@markrprior @mcc more specifically, it was in reaction to a specific database that, for ideological reasons, openly sought to gather aggregate user data across the fediverse, including from sites that attempted to prevent such use

Irenes (many) Jun 20

@markrprior @mcc note that the fiddle option is fully under the control of the instance admin

Irenes (many) Jun 20

@markrprior @mcc the guy running the database responded by saying, essentially, "fine, then we just won't count GTS data," but he tried pretty hard to make that sound as if it was punishing GTS in some way instead of exactly what everyone asked him to do in the first place

Mark Prior Jun 20

@ireneista @mcc really depends on whether you want the instance to be discoverable or not which I assume was their original motivation.

Mark Prior Jun 20

@ireneista @mcc except it is up to the instance admin to know and ensure that their robots.txt file is consistent with their configuration, as the software won't check. The instance I referenced (gunbark.dev) is an example where the two are inconsistent. I too operate an account that counts instances and users in the Fediverse; in this instance's case I report it, but I set its user count to 1 as the data is clearly bogus. In contrast, wildebeest returns null for data it considers private.

Irenes (many) Jun 20

@markrprior @mcc we're not sure which point of ours that's intended as a rebuttal of?

Mark Prior Jun 20

@ireneista @mcc that the fiddle option is safe if you respect the robots.txt file.

Irenes (many) Jun 20

@markrprior @mcc oh, we weren't trying to say it's safe. your point seems fair.

Irenes (many) Jun 20

@markrprior @mcc we're kind of against measuring people without affirmative, individual consent, anyway, just as a general science ethics thing

Mark Prior Jun 20

@ireneista @mcc well, assuming it's a single-user instance, then wildebeest's approach works. If the instance is open to more than one user then it is tricky, but it still comes down to the instance admin rather than the API user as to whether they make a count available or not. People are interested in these counts, so a realistic underestimate is better than something completely without foundation, which is why I started counting, as the existing one seemed wildly optimistic.

@markrprior @ireneista @mcc you're very wrong, by default GTS manages the robots.txt file, see https://gts.tbh.gay/robots.txt for example

Anyways, how is this at all relevant other than you just wanting to complain that a software has options to hurt scrapers

Mark Prior Jun 20

@Ember @ireneista @mcc "by default" is key. GTS assumes that the file exists and does what it expects, but it doesn't check, which leads to the gunbark.dev case.

@markrprior and? Have you considered just... Not scraping information?

Mark Prior Jun 20

@Ember and what about those instances and users that are happy to make basic statistics available?

pull my strings Jun 20

@markrprior @Ember @ireneista @mcc so fucking weird that the instance I'm on is apparently your use case for "they're fediversing wrong"

especially in this day and age, I'm pretty comfortable using and preferring software that doesn't accurately reply usage stats to strangers

Mark Prior Jun 20

@cold @Ember @ireneista @mcc I'm not saying you are Fediversing wrong. I am saying that it was explained to me that the fiddle configuration was safe because if a bot respected the robots.txt file then it wouldn't query a server for statistics if fiddle was enabled. Your configuration, whether intentional or not, illustrates that that is not the case. I had suggested that checking the robots.txt file before enabling fiddle would be a good idea but that was too hard.
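
As a rough illustration of the "bot respects robots.txt" case described above, here is a minimal Python sketch using the standard library's urllib.robotparser. The function name, the user agent string, and the sample robots.txt rules are all hypothetical; the /nodeinfo/2.1 path is the one from the gunbark.dev link earlier in the thread. The sketch only covers the crawler's side: it cannot tell whether a server that allows the request is returning real or fiddled numbers.

```python
from urllib.robotparser import RobotFileParser


def may_query_nodeinfo(robots_txt: str, base_url: str, user_agent: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch /nodeinfo/2.1.

    robots_txt is the already-downloaded text of the instance's robots.txt,
    so this function itself performs no network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, base_url.rstrip("/") + "/nodeinfo/2.1")


# Illustrative rules only; not copied from any real instance.
BLOCKING_RULES = "User-agent: *\nDisallow: /nodeinfo/\n"
PERMISSIVE_RULES = "User-agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    for rules in (BLOCKING_RULES, PERMISSIVE_RULES):
        allowed = may_query_nodeinfo(rules, "https://example.org", "stats-bot")
        print("query allowed" if allowed else "skip this instance")
```

The second rule set is the situation Mark describes: a well-behaved bot sees no rule against the nodeinfo path, queries it, and has no way to know from robots.txt alone that the counts may be randomised.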

Irenes (many) Jun 20

@markrprior @cold @Ember @mcc we want to reiterate that we have at no point made any claims about anything being "safe", nor do we see "safe" as an important distinction in this situation.

Irenes (many) Jun 20

@markrprior @cold @Ember @mcc we also don't know who said that to you about the scenario where a bot respects robots.txt, but it wasn't anyone in this thread

Mark Prior Jun 20

@ireneista @cold @Ember @mcc maybe "safe" is the wrong word. In the case of an API perhaps "the data is reliable/accurate" is better.

You are correct that the advice didn't come from this thread; it came from an issue I opened on the gotosocial source repository.

Irenes (many) Jun 20

@markrprior @cold @Ember @mcc oh, fair. we don't necessarily endorse (or know about) the views of the GTS people, we just like and use their work. at any rate it's nice to have that cleared up, thank you.

Mark Prior Jun 20

@ireneista @cold @Ember @mcc from my point of view an API is a source of truth. If I'm an app and I want to know how many characters I can include in the post I ask the API and use that number. If I get back 2000 but the instance barfs when I try to send a message with that number then the user will probably blame the app. If I ask what languages are in use and only EN is returned I might not offer translation. If NULL is returned I might have defaults but bogus data is unhelpful.

Mark Prior Jun 20

@mcc also regarding the 500 character limit you might be interested in investigating infosec.exchange as it is a large Mastodon instance with an 11,000 character limit. 500 is the standard limit but a number of instances have changed it and I assume apps will query the API to find out the limit. Finding instances with larger limits was the reason why I started looking at the API.
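
To make the "ask the API, fall back to a default" idea concrete, here is a minimal Python sketch. It assumes the instance document follows the Mastodon instance API, where the limit is published under configuration.statuses.max_characters; the function names and sample documents are hypothetical, and len() is only an approximation of how a server counts characters (links and mentions may be counted differently).

```python
DEFAULT_MAX_CHARACTERS = 500  # the standard limit mentioned in the thread


def max_characters(instance: dict, default: int = DEFAULT_MAX_CHARACTERS) -> int:
    """Return the post length limit an instance advertises.

    instance is the decoded JSON from the instance endpoint. If the field
    is missing, null, or not a positive integer, fall back to default.
    """
    configuration = instance.get("configuration") or {}
    statuses = configuration.get("statuses") or {}
    limit = statuses.get("max_characters")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(limit, int) and not isinstance(limit, bool) and limit > 0:
        return limit
    return default


def fits(text: str, instance: dict) -> bool:
    """Approximate check that text is within the advertised limit."""
    return len(text) <= max_characters(instance)


# Illustrative documents only; not fetched from any real server.
LARGE_LIMIT = {"configuration": {"statuses": {"max_characters": 11000}}}
NO_LIMIT_PUBLISHED = {"configuration": {"statuses": {"max_characters": None}}}

if __name__ == "__main__":
    draft = "x" * 2000
    print(fits(draft, LARGE_LIMIT))         # within an 11,000 limit
    print(fits(draft, NO_LIMIT_PUBLISHED))  # falls back to the 500 default
```

This is the behaviour described above: a null value lets the app use a sensible default, whereas a made-up number would be trusted and only fail when the post is submitted.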

Andy Piper Jun 19

@mcc I (and we) appreciate the detailed feedback and will work with the folks involved to see how we can make this better. Thank you.

Alpha Male Martha Stewart 🍉🌈 Jun 19

@mcc the TOS should be handled like the creation of the GPL V3. There was a looooong public process where everyone was allowed to comment on every single line. They even created software just to handle this.

Furbland's Very Cool Mastodon™ Jun 19

@mcc I appreciate that @Mastodon is backing off and reviewing these changes after people voiced their opinions, that's something I've never seen a big company do. Definitely a step in the right direction, let's hope they keep it up!

Alyssa Coghlan Jun 19

@mcc Thank you for spending the time needed to engage with this (and for explaining it along the way).

marxist-luiginist Jun 19

@mcc Coming from NGO organizing, I get the feeling Mastodon would benefit a lot from including much more of the community in its internal structures so these kinds of ideas would be discovered and countered before they were officially released. Thanks for a great summary!

⸸ cmdr ░ nova ⸸ :~$ 🏳️‍⚧️ Jun 19

@mcc In this entire discussion I've seen zero people go "It is at least good that one of the biggest instances is trying to do something about AI content scraping and exploitation" because that's what it's about. Seeing people throw a fit over this makes me feel like, maybe not enough people are taking the threat of AI seriously

@daemon_nova Well, first off, since I haven't permitted use of my content for AI scraping, then as far as I know AI scrapers have no permission to use it, and if they're already breaking one law to scrape the server, I don't know why they wouldn't break two. Second off, the terms never mention AI, what they do is ban scraping for any purpose, and although I suspect that's what most users would want, it's not clear to me scraping for archival is something I oppose.

⸸ cmdr ░ nova ⸸ :~$ 🏳️‍⚧️ Jun 19

@mcc it was explicitly stated in the e-mail about the ToS change that this is about LLM training

⸸ cmdr ░ nova ⸸ :~$ 🏳️‍⚧️ Jun 19

@mcc Second of all, personally holding the idea that your content isn't allowed to be used in AI training is as useful as someone posting on Facebook "Meta doesn't have the right to use my posts for ads!"

@daemon_nova But it isn't. Allowed to be used for AI training, I mean. They don't have a license to use it, and copyright is reserved.

⸸ cmdr ░ nova ⸸ :~$ 🏳️‍⚧️ Jun 19

@mcc Furthermore, a large instance like mastodon.social actually making _some kind of statement_ against AI is good. Especially due to .social's soft stance on federating with Threads. I'm personally tired of watching all the smaller instances take real action against threats and then mastodon.social is just like "we're the floodgate, anything goes"

@daemon_nova Yes, the email is not legally binding, the TOS is

Irenes (many) Jun 20

@mcc that was where we were with it, too. we thought not saying anything was the best protection against central control.

Florian Gilcher Jun 19

@mcc I would voice a strong objection to EFF. US legal advice has a color that is very problematic to German companies.

Like, it starts at simple problems like where I needed to explain to a lawyer that Germany does not have case law _at all_.

Florian Gilcher Jun 19

@mcc We have the FSFE, and Eclipse's law firms are mostly in Germany. We've got the people.

@skade My OSS project was looking into the Eclipse Foundation and they seemed to mainly work with "corporate" open source projects— IE, places with sponsors and money. They didn't seem to have the kind of ideological drive that would lead to giving Mastodon legal advice. But what do I know.

FSFE's something, but FSF's approach to law is… weird, to me. How independent, if at all, is the FSFE from the original FSF?

Florian Gilcher Jun 19

@mcc FSFE is not FSF. They are very much independent organisations. (I literally asked them that question very strongly with the last Stallman shitshow and while this was a private conversation I cannot share - I joined their membership after that)

On the Eclipse: I would not use them as a structure for such a thing, but their lawyers are awesome.

Followed your recommendation and pinged Andy on the thread. Happy to form a few connections.

Florian Gilcher Jun 19

@mcc The FSFE is a strong lobby organisation to the EU, so they care a lot about not taking part in fights too hard. In that light, this statement basically is a _very strong_ statement against the FSF.

https://fsfe.org/news/2021/news-20210324-01.en.html

That said, Stallman dudes are everywhere around Free Software.

@skade I'm glad to hear that, about the FSFE.

@skade Well, I'm not the person to raise that objection to. I notice Andy Piper who appears to work with Mastodon gGmbH is in this thread.

I don't know about the EFF, which I believe was started in the US, but I know Creative Commons, which is a cousin organization, is very intent on writing "legal code" which works in every jurisdiction possible.

@mcc Whenever I read about arbitration clauses I laugh because it shows REALLY bad faith.

So, if arbitration comes to Mastodon, that's a kill switch right there and a goodbye from me.

It's a shame, why can't we have nice things?

@Johns_priv I hope they will not keep the arbitration clause.