Is #mastodon becoming an echo chamber? This post from @carnage4life has me questioning our community. The Mastodon team is finally getting some traction, the pace of product improvements is picking up, the #UX is improving, yet people who post on multiple platforms are making comments like this. It's confusing.

I *know* people here don't want this to be a classic social-media clone, but we'd *like* journalists to be here, right? They aren't going to come with examples like this!

OK, this is going even MORE sideways so I need to make a few things clear:
1. I took a complex point and made it poorly
2. My goal was to ask for more inclusiveness
3. I am sickened by what happened to BlackTwitter and I don't want it to recur
4. But I can't speak for BlackTwitter, nor should I
5. I apologize to Black Mastodon users for making such a poor comparison
6. I'm not endorsing "AI slop"; it was a foil to make my point
7. I'm certainly NOT trying to compare AI bros to Black Twitter (but, as I said, I can see how people made that connection. I'm trying to correct that here)
@scottjenson Respectfully, Mastodon is less of an echo chamber than any social media on earth presently. That's exactly why people bounce off of it. There is absolutely nothing built into the system to shepherd you into a "good" experience. It's just humans. Love 'em or leave 'em, but if you stay it's on you to curate your experience. A lotta people don't like that, and that's OK. All Mastodon users ever asked for was the option to exist in a space they owned. Not for Threads, Bluesky, and Twitter to all shut down and everyone to be on Mastodon.

@tael Respectfully, we're looking at very different feeds. I assume you've heard of the reply guys getting into people's mentions, telling them off for not adding alt text or content warnings? THAT is why good people are bouncing, and that is exactly why it *is* an echo chamber. When someone doesn't conform to these rules, instead of, you know, just not following them, or blocking them, or hell, defederating their server (all of which I would be fine with), people feel so privileged that they get in these newcomers' faces and tell them "you're doing it wrong".

We should be encouraging communities that we don't agree with. This isn't the "nazi bar" story. There ISN'T a single bar! It's the "you can't come into MY bar" story. That's what federation was built for. Why is that so hard for people to live with?

@scottjenson
Not using ATs and CWs is “using it wrong”, if you know what those are for.

Newcomers don't have that knowledge, and it's not always part of the written rules. Even when it is, we can't assume every newcomer has read them.

I'm kind of baffled that you, as an advisor on Mastodon's product strategy, are fazed by people correcting this instead of advocating for ways to alleviate it.

@tael

@dzwiedziu @scottjenson @tael

In fact, here are some free ideas that could decrease this friction:

1. Include an explanation of *what signal missing alt text sends* in the UI warning about missing alt text: "Many people can't access this post without alt text. Posting without alt text signals a lack of respect to those communities. Use #/alt4me to ask for help with alt text." This would reduce posts that lack alt text out of ignorance, and for those who persist, it explains ahead of time why they get people reminding them about the issue.
2. (More technically challenging) For posts without alt text, maintain a per-user counter and display (in the alt text slot) "this is this user's Nth post without alt text", capping out at 1000 or something. Make sure to subtract when alt text gets edited in. This helps those who would reply with a reminder understand whether it's someone new who just doesn't know or understand, or someone who doesn't care. For those who don't care, you'd see much less falloff.
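A minimal sketch of idea 2, assuming a hypothetical `AltTextCounter` class (the names and the label wording are made up for illustration; this is not any real Mastodon API):

```python
from collections import defaultdict


class AltTextCounter:
    """Per-user tally of posts lacking alt text, as described above:
    increment on a post without alt text, subtract when alt text is
    later edited in, and cap the displayed number at 1000."""

    CAP = 1000

    def __init__(self):
        self._counts = defaultdict(int)

    def record_post(self, user_id, has_alt_text):
        # Only posts missing alt text raise the tally.
        if not has_alt_text:
            self._counts[user_id] += 1

    def record_alt_text_edit(self, user_id):
        # Alt text was added after the fact: subtract from the tally.
        if self._counts[user_id] > 0:
            self._counts[user_id] -= 1

    def label(self, user_id):
        # Text shown in the alt text slot, capped at CAP.
        n = min(self._counts[user_id], self.CAP)
        suffix = "+" if self._counts[user_id] > self.CAP else ""
        return f"this is this user's post #{n}{suffix} without alt text"
```

The cap and the decrement-on-edit are the two details from the idea above; everything else (storage, display) would live in the server and client.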

It's harder to do something similar with CWs, as it's not possible to know automatically when one was warranted. Then again, I've very rarely seen direct CW-nagging.

By the way, do you know who I *do* see on the fediverse? Blind users, and people dealing with various traumas. If you had a button that instantly silenced CW and alt-text nagging, guess who would leave? I half-understand that @scottjenson is unconsciously trolling at this point, fixed in his opinions and unwilling to listen to any of the criticism here, so he's trying to form arguments that defend his own ego. But every change-to-others'-social-behaviors that he's asking for here would directly make some portion of fedi users less comfortable, and he hasn't been willing to face the racism issue head-on, which is almost certainly a *much* bigger reason for leaving than anything he's mentioned.

The counter-point about echo chambers is spot-on here.

@tiotasram @dzwiedziu @scottjenson @tael

An image-analyzing LLM that provides alt text would be eminently useful all across the internet.

Is an #LLM still bad when it's used for #disability and #assistivetechnology?

@crankylinuxuser
An LLM can't analyse an image by itself.

That's computer vision and neural networks.
LLMs might then be used to rewrite the CV/NN output.

@tiotasram @scottjenson @tael

@dzwiedziu @tiotasram @scottjenson @tael

Yes, it can.

If you're using SafeTensors, the model can absolutely include the tensors required for image analysis. The output gets converted to visual tokens in the language space. Basically, the image-analyzer tensors act as an 'expert' in an MoE-like system.

If you're using GGUF with llama.cpp, the mmproj- files are the separated visual tensors. This allows pairing a bf16 image encoder with a heavily reduced LLM, like an int4 model.
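To make the "visual tokens in the language space" idea concrete, here's a toy numpy sketch. All dimensions and weights are made up for illustration; a real mmproj is a learned projection, not random numbers:

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up dimensions: a vision encoder emits 576 patch features of
# size 1024; the language model embeds its tokens in a 4096-dim space.
N_PATCHES, VISION_DIM, LM_DIM = 576, 1024, 4096

# Stand-in for the learned projection weights (conceptually, what the
# separate mmproj- file holds in the GGUF split described above).
W_proj = rng.standard_normal((VISION_DIM, LM_DIM)) * 0.01


def project_to_visual_tokens(patch_features):
    """Map vision-encoder patch features into the LM embedding space.

    The LM then treats each projected row as if it were the embedding
    of an ordinary text token -- a "visual token".
    """
    return patch_features @ W_proj


patch_features = rng.standard_normal((N_PATCHES, VISION_DIM))
visual_tokens = project_to_visual_tokens(patch_features)
assert visual_tokens.shape == (N_PATCHES, LM_DIM)
```

The point of the sketch: the image side and the text side only need to agree on one embedding space, which is why the visual tensors can be shipped (and quantized) separately from the language model.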

I'm also still learning a ton of this on the fly, by running and administering LLM operations on my network. As a systems engineer, this is the state of the art, and I absolutely will learn to run it for myself and any company I end up at.

@crankylinuxuser @dzwiedziu @scottjenson @tael

First: such systems do exist. Many Blind users use them regularly. They are quite sophisticated, but current architectures have inherent limits including inescapable biases and predictable patterns of failure.

There's a fine line here: I'm not opposed to use by Blind users for their own purposes, even if I'll sometimes feel warranted in warning about the shortcomings that the boosters always minimize. They've got enough to deal with in this world, and it's absolutely not my place to criticize their choices on any grounds (I've seen Blind users on here who do have ethical objections they raise with other Blind users, but that's not my conversation to jump into).

But I am opposed to their use by those who can write alt text themselves: you're effectively offloading the systemic risks onto the disabled people you're ostensibly trying to serve, without giving them a say in the matter and often without any warning.

Some would say: isn't some alt text better than none even if it's poor quality?

The answer is: no, it's not. Imagine the following scenario: a user posts a picture of a group of Black people at a concert, with the comment "Having fun at the club." They use AI captions, and they were too tired to double-check this time. The AI-generated caption reads "A group of gorillas dancing in a club." (Mislabeling Black people as gorillas actually happened with an image-labeling system at one point, so this is absolutely possible.) Now a Blind user who can only read the caption thinks this is a joke. They reply, "Haha, how did they train those gorillas to dance?"

So your choice to offload your alt text work to an AI ends up creating a racist incident *and* it makes the Blind user seem like a completely aggressive and unapologetic racist, since sighted users mostly won't see the alt text. Because we know ahead of time that the AI will make exactly these kinds of mistakes (and that we can't possibly be vigilant enough to catch them all), it's irresponsible to use it in this way, on top of all the orthogonal reasons that it's ethically wrong to use most modern LLM systems.

@tiotasram @[email protected] @dzwiedziu @scottjenson @tael
If the alt text is mechanically generated and not checked, it should contain statements, or at least markings, of both those facts.

@crankylinuxuser @tiotasram @dzwiedziu @scottjenson @tael

Listing things not in an image https://mastodon.social/@urlyman/116373375986994242

Speculating about things not in an image https://mastodon.social/@urlyman/116373334109155559 in the context of a ridiculously verbose description.

Generated by a machine that one cannot have a discussion with about fidelity, because there is no mind or embodied sense of reality there, yet which nonetheless deploys the first person. Seems like a bad idea.