Someone should set up a third-party service that Mastodon/fediverse users can report abuse to. It should be staffed by people who are skilled at evaluating this stuff and who are paid to do it.

Recommendations/instructions to block instances/individuals could then be sent out to instance admins or even individual users (in an automated way), either as a collective or by subscription.

Or, rather, is anyone doing this already? #federatemoderation

Obviously, this is "just" RBLs and services like Block Together, but we have an opportunity to build better infrastructure to support this. It's not sustainable that individual instance operators are having to painstakingly share and maintain blocklists.
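
A minimal sketch of what a subscribable block-recommendation feed from such a service might look like — every field name here is invented for illustration, not an existing format:

```python
# Hypothetical sketch: one possible shape for a block-recommendation feed
# that instance admins (or individual users) could subscribe to and apply
# automatically. All field names are invented for illustration.
example_feed = {
    "issuer": "https://moderation.example",   # the third-party reporting body
    "published": "2022-11-22T00:00:00Z",
    "recommendations": [
        {"target": "spamfarm.example", "scope": "instance",
         "action": "suspend", "reason": "coordinated harassment"},
        {"target": "@abuser@smallsite.example", "scope": "account",
         "action": "silence", "reason": "repeated targeted abuse"},
    ],
}
```
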
@blaine An account reputation score would be technically possible (even in a distributed system), but there are many potential negative impacts from such scoring systems.
@autiomaa @blaine are there reputation systems that leverage circles of trust in the graph to at least filter new/naughty actors for moderation?
@gwilymgj @autiomaa @blaine This feels like the only sustainable approach -- transitive trust. The further away a person is from you, the harder they have to push through the relational friction in order to reach you.
@pauldaoust @autiomaa @blaine 👍 I haven’t heard it called that before. And maybe similar for blocking, the closer someone is to you them blocking someone carries more weight
@gwilymgj @autiomaa @blaine yeah, exactly. Influence diminishes the further out you get.
@gwilymgj @autiomaa @blaine There will still probably be centralisation around 'trust bootstrappers' -- I see these accounts as highly influential people or professional moderators.
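
One way to read "relational friction" in code — a rough sketch, assuming a simple follow graph, where a recommender's influence decays with every hop of social distance:

```python
# Rough sketch of distance-decayed influence: the further a recommender is
# from me in the follow graph, the less their block recommendation counts.
from collections import deque

def social_distance(graph, me, peer, max_hops=4):
    """BFS over a follow graph shaped like {account: set_of_followed_accounts}."""
    seen, queue = {me}, deque([(me, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == peer:
            return hops
        if hops == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None  # unreachable within max_hops

def recommendation_weight(graph, me, recommender, decay=0.5):
    """Each extra hop halves the weight; unreachable recommenders count for ~0."""
    hops = social_distance(graph, me, recommender)
    return 0.0 if hops is None else decay ** hops
```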

@autiomaa It sounds to me like the digital version of the National Social Credit System.

@blaine

@youronlyone In some ways, yes. But we need to remember that people almost always have some level of reputation. What makes the difference is that reputation isn't a single linear numeric value, but a wide set of different scales (per topic etc.).

The problem with digital identity reputation is that it gets more difficult over time as the number of accounts grows. Even more so in a federated system where people can set up new instances & servers in a few hours.

@blaine As I dive deeper into the technical aspects powering the fediverse and think about the small admins myself, I think it’s absolutely needed. Some service that an instance can use or not. Maybe different services. Also it would be nice to administer several instances at once. Is something like that possible? Federated Administration, I would call it.
@blaine As long as there is notification to the offending party and some sort of appeals process.

@Robotbeat yup! To a point. In some ways, this could be construed as inventing a whole new set of de-facto laws around engagement.

On the other hand, individuals don't owe anyone anything. I could block everyone who uses the word "orange" and the only one that would suffer is me. 😅

On Facebook and Twitter, the ability to appeal mattered a lot more, because there were no other communities to go to. That's not true here.

@blaine Absolutely! But creating a massive blocklist that everyone uses kind of recreates the de facto law thing.

@Robotbeat 💯

Yeah, a giant single blocklist is a terrible idea.

<looks sideways at twitter>

@Robotbeat

If ordinary users are going to be “rated” by various organisations and instance moderators, then there should also be “reputation scores” for those organisations and instance moderators (and their instance itself).

It should go both ways.

Is the server, and/or organisation, abiding by The Santa Clara Principles or not?

Users can then make a more informed choice about which server they want to join. Same with admins trusting third-party “signals” from these organisations.

@blaine

@blaine an outside moderation tool will, I think, always have less granularity than individual instances' preferences. As in, you could have a vendor provide a few specialized views into the fedi, but any shared blocklist is going to have some amount of misalignment with any specific instance's preferences.

I wonder if there's a way to share moderation decisions by weighting the influence of neighboring nodes in an affinity graph. (this isn't the same as the federation graph, ragelove might be willing to federate with fosstodon but have no trust in their moderation decisions.) If aleph high-trusts ragelove, and ragelove defeds naziparty, how can aleph benefit from that decision automatically? Maybe if multiple high-trust peers send along the same moderation recommendation, or several medium-trust peers, etc.

(This sounds kinda like "whuffie" in an early Cory Doctorow novel. I may be cyberpunkpoisoned by reading it during an impressionable period.)
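
A minimal sketch of that idea, assuming each instance keeps its own local trust score per peer and only auto-applies a recommendation once enough trusted weight agrees (names and thresholds are made up):

```python
# Hypothetical sketch: aggregate defederation recommendations from peer
# instances, weighted by how much this admin trusts each peer's moderation.
peer_trust = {
    "ragelove.example": 0.9,   # high trust in their moderation decisions
    "fosstodon.example": 0.2,  # federated with, but low moderation trust
}

def auto_apply_targets(recommendations, threshold=1.0):
    """recommendations: iterable of (recommending_peer, target_domain) pairs.
    Returns the targets whose combined trusted weight clears the threshold;
    everything else stays in the human moderation queue."""
    scores = {}
    for peer, target in recommendations:
        scores[target] = scores.get(target, 0.0) + peer_trust.get(peer, 0.0)
    return {target for target, total in scores.items() if total >= threshold}
```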

@eqe @blaine it’s also analogous to how security companies share threat intel. Some security companies have higher/lower “credibility” for certain intel types (malicious domains, AV signatures), and the other vendors incorporate shared information into their ML models with the appropriate grain of salt, mixing it with their own local data.

@eqe @blaine

If those representations of moderation decisions and alignments were also inspectable and comparable by everyone, then deciding which instance to join could also become easier.

A reputational map of the participating fediverse would also be an interesting byproduct.

@blaine very intrigued by this! Would love to help any way I can.
@blaine it might be helpful to begin to categorize and tag complaints. Hate speech; misinformation; banal reply… This would allow for a couple of interesting improvements, including more accurate moderation instructions and better self-moderation. Maybe you want to moderate misinformation but have no tolerance for banal replies. And, of course, moderators could define each category.
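
As a sketch of that, assuming a small fixed taxonomy (the categories and actions below are just placeholders), each instance could map categories to its own handling policy:

```python
# Hypothetical sketch: per-category moderation policy, so an admin can act
# on misinformation but ignore banal replies (or vice versa).
from enum import Enum

class ReportCategory(Enum):
    HATE_SPEECH = "hate_speech"
    MISINFORMATION = "misinformation"
    BANAL_REPLY = "banal_reply"

# One instance's local policy; another instance could choose differently.
local_policy = {
    ReportCategory.HATE_SPEECH: "suspend",
    ReportCategory.MISINFORMATION: "limit",
    ReportCategory.BANAL_REPLY: "ignore",
}
```
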
@blaine this is a super interesting idea 🤔
@blaine @leigh sounds a lot like what we do at Meedan with mis/disinformation. We could set up a “tip line” for moderators who could annotate the Toots in question to allow for admins or users to decide what to do with it. I don’t think we need to be overly prescriptive since the spirit of the Fediverse is openness and local decision making.
@huslage @leigh 💯 I love the idea of annotating things. It'd be straightforward to e.g. build a content-addressed version of a post and use that to attach content warnings, etc. from verified folks in an accessible place.
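
A rough sketch of the content-addressing part, assuming annotations get keyed by a hash of the post's stable fields (the field names are illustrative):

```python
# Hypothetical sketch: derive a stable content address for a post so that
# third-party annotations (content warnings, fact-check labels) attach to
# the content itself rather than to a mutable URL.
import hashlib
import json

def content_address(post: dict) -> str:
    """Hash a canonical JSON form of the post's stable fields."""
    canonical = json.dumps(
        {"author": post["author"], "published": post["published"],
         "content": post["content"]},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

annotations = {}  # content address -> annotations from verified annotators

def annotate(post, label, source):
    annotations.setdefault(content_address(post), []).append(
        {"label": label, "source": source})
```
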
@blaine @huslage @leigh seems like a money problem more than a technical one. Also a policy writing one.
@blaine @huslage @leigh there’s established taxonomies for annotating abusive content - https://aclanthology.org/2021.acl-long.247.pdf see Table 2 etc -
@blaine @huslage @leigh I should also note a caution: While there are models (Jigsaw’s Perspective API/service) that can robustly score abusive language & hate speech, they can’t be relied upon to detect salonfähig ("socially acceptable") hate speech & context always matters. Some social media (Reddit, probably Twitter) scale up moderation by leaning heavily on model scoring, & we’ve seen that circumvented by bad actors.
@blaine @leigh @PennyOaken I don’t think there are AI models that work very well in this context. TikTok tries to do it too and it’s a nightmare. Maybe I’m wrong though.
@blaine @leigh @PennyOaken the Taxonomies are a cool place to start from
@huslage @blaine @leigh My impression from throwing my own gathered data at Perspective is that it’s about 80/20 on toxic & hateful speech, but the 20% it fails at properly IDing means a ridic high specificity problem. Kids are clever at circumventing & so are abusive stalkers.
@blaine reports are already forwarded to the original instance though, how would this help?

@gsora e.g. if the hosting instance (e.g. p o a . x t) doesn't block the user, then either the network needs to defederate from the server, OR communicate that the user should be blocked everywhere.

But actually this speaks to the need for formal organizations to help with this. I'm not an expert, and don't really know the answers here. It's important that we delegate and help each other where we can.

@blaine @gsora

That is already happening and has been abused before, and is continually being abused today.

Bad actors create fake accounts, pretend to be real accounts, then launch their server-disinformation campaign. "Server-disinformation", meaning, they will post content most server admins hate, then they will report those accounts to various blocklists and servers, with claims such as, "this server is so and so".

Server admins block the entire server w/o investigation because of 1 user.

@youronlyone @gsora absolutely. This is a really good example of why we need sophisticated folks who can evaluate these sorts of attacks and make the right call.

It's unreasonable to expect admins of small instances to be able to do this themselves.

@gsora @blaine There are tons of Bad Actor instances already that exist to support awful abusers. As the size of the Fediverse swells, it will be an attractive target for automated abuse, particularly as the friction to setting up a new instance specifically for hosting abuse accounts approaches zero.
@blaine idk if it's fully what you want but i'm keeping an eye on http://rapidblock.org/
(Link preview: The RapidBlock Project — Home of the RapidBlock Project for sharing blocklists of bad actors on the Fediverse.)
@blaine Well not just blocking but also de-federation. I suspect a common attack vector by organized bad people would be to set up a quickie instance and unleash a flood of shit before word gets around. Federate-to-anyone-by-default is essential I guess, but…
@timbray for sure – that's where I was going with blocking instances. It may be that setting up a new instance requires some kind of sign-off from some already-trusted folks. Doing that for new users at scale wouldn't be tenable, I think, but for instances a "reputation score" or web of trust seems entirely appropriate. DKIM/RBL is only ever going to get us so far.
@blaine @timbray I think it’s safe to let admins be admins to some degree.
@blaine Slight variation on that - setting up an instance is still free, but there's a site that sanity-checks the first, say, 48hrs of traffic, and you don't federate to new sites until the sanity-checking site emits an OK signal. Or something like that.
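
Sketched very roughly, with the vetting service and its OK signal assumed rather than real:

```python
# Hypothetical sketch: hold off federating with a brand-new instance until it
# has been observed for ~48 hours and a (hypothetical) vetting service has
# emitted its OK signal for that domain.
from datetime import datetime, timedelta, timezone

PROBATION = timedelta(hours=48)

def may_federate(first_seen: datetime, vetting_ok: bool) -> bool:
    """first_seen must be timezone-aware (UTC)."""
    out_of_probation = datetime.now(timezone.utc) - first_seen >= PROBATION
    return out_of_probation and vetting_ok
```
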
@timbray @blaine
Sort of an #RBL for the #Fediverse?
@GenghisKen @timbray @blaine a starting point could be a list of all instances with a public blocklist, ranked by registered users and other criteria, based on the public (#mastodon) API. #fediblock
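
A rough sketch of gathering that raw material — this leans on Mastodon's public instance endpoints and only works where an admin has chosen to expose the blocklist; treat the exact endpoints as an assumption to verify:

```python
# Hypothetical sketch: collect public blocklists and user counts from known
# instances as inputs for ranking. Only instances that publish their domain
# blocks will return anything useful.
import requests

def fetch_public_blocklist(domain):
    """Returns [] if the instance hides its blocklist or the call fails."""
    try:
        r = requests.get(f"https://{domain}/api/v1/instance/domain_blocks",
                         timeout=10)
        return r.json() if r.ok else []
    except requests.RequestException:
        return []

def registered_users(domain):
    try:
        r = requests.get(f"https://{domain}/api/v1/instance", timeout=10)
        return r.json().get("stats", {}).get("user_count", 0) if r.ok else 0
    except requests.RequestException:
        return 0
```
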
@timbray @blaine this all feels wrong to me somehow, like we're missing the point. If I set up an e-mail server, I can send and receive e-mail. In order to end up on a block list, I have to provably do something bad with my e-mail server -- and not just once or twice, it has to be a repeated and obvious transgression. About the most generic thing e-mail blocklists do is block based on IP address, and usually just people's home internet connections -- which is fair given the way SMTP works. 1/2
@timbray @blaine SMTP allows (by default) relaying messages for others. Mastodon is a lot more complex to set up than an SMTP server, and it uses a lot more compute and bandwidth. I think it makes it more difficult for malicious actors to hide. It should be a lot simpler to identify and block malicious instances, so being overly proactive might be overkill and run the risk of defederating innocent users/admins.
@peepstein @timbray long-term, I don't think we can rely on the cost/complexity factor. ActivityPub is going to get really cheap to host, very quickly. Otoh, we have user-level verification, which is something SMTP has always lacked, even with DKIM.
@blaine @timbray If you’re including bots or servers that speak ActivityPub directly and don’t actually have a real site behind them, that’s certainly a concern. I wonder if there isn’t a mechanism to identify them in a more straightforward manner but not entirely certain how to do that without giving it a bit more thought.

@peepstein @timbray @blaine I agree. That we're even discussing mandatory centralized authority structures and imposing costs just to participate indicates that we're missing the point. The value of ActivityPub is in its permissionless and decentralized nature. We should resist the impulse to try to build authority in as a default.

I'm all for secondary services to allow people to pool resources around categorization. But participation shouldn't be contingent on submitting to such a service.

@justin @peepstein @blaine

Um, have a look at this thread, it may help explain why we're sounding obsessive about our defensive alignment: https://twitter.com/rahaeli/status/1594724708309553152

(Link preview — rahaeli on Twitter: “Okay, look, I can't sleep so you all get the cranky rant: I have seen multiple people responding to my criticism of Cohost and Mastodon, and Shep's criticism of Cohost and Hive, saying that we're just "haters" who are trying to destroy alternatives to Big Tech.”)

@timbray @peepstein @blaine Thanks for that; it does help contextualize the conversation for me.

I don't disagree with any of that. If you run a service for the public, you need controls in place to minimize abuse.

@timbray @justin @blaine absolutely. I am very confident that there will be some larger players who will Hoover up a lot of the fediverse userbase and will be able to manage these requirements. The other end of the spectrum is probably single user instances. I’d guess the middle will get squeezed and squeezed until there’s not many of them left.

@timbray @blaine

I just started admining a small server and one thing that was super annoying was trying to copy over block lists from other instances that I trust.

Along those lines, maybe the reason for blocking an instance becomes more important... community standards vs. hate/etc.

Is it just me or is the user experience around managing that terrible?

@blaine i agree with this. And it should be a moderation worker cooperative.
@schock @blaine i really appreciate you always beating the drum for worker co-ops :3