@C. I have two major issues with the Mastodon HOA.

One, they try hard to force "Mastodon standards", Mastodon culture and Mastodon's unwritten rules upon the whole Fediverse. Including places that not only aren't Mastodon, but that are very much not Mastodon. Simply because they can't see where a message is from. In fact, many of them are still fully convinced that the Fediverse is only Mastodon.

And so you have members of the Mastodon HOA yelling at someone who is allegedly "doing Mastodon wrong", but that someone is actually on Friendica and has been since as early as 2011. As in about five years longer than Mastodon has even existed. And seriously, the only places in the Fediverse that are even more different and farther away from Mastodon than Friendica (without specialising in something that Mastodon absolutely can't do) are Friendica's own descendants: Hubzilla, (streams), Forte.

The Mastodon HOA probably don't know that Friendica exists. They definitely don't know that any of the other three exist. They definitely don't know that any of the four differs from Mastodon in any significant way. And frankly, they don't care a bit. If it appears on any Mastodon timeline, it's Mastodon to them, and it has to adapt to Mastodon's culture and follow Mastodon's rules.

Two, they don't coordinate anything among each other. They're just a bunch of lone wolves. Everyone has got their own standards, but everyone thinks their personal standards are the one and only Mastodon/Fediverse gold standards, and everyone enforces their own standards. And, of course, everyone thinks their standards can and must apply always, including in the most obscure edge-cases.

For example, they've got standards for describing real-life photos on Mastodon with a character limit of 500. And they try to enforce these standards always and everywhere. However, these standards don't necessarily work perfectly when I post a rendering from a super-obscure 3-D virtual world on (streams) with a character limit of over 24 million where I've got loads of room to write an additional long image description and put it into the post text.

The Mastodon HOA, or at least some of their members, appear to be constantly raising their minimum quality requirements for image descriptions. They must be absolutely accurate, and they must be sufficiently detailed that nobody will ever have to ask for a more detailed description. Oh, and they must explain whatever the audience may not know about the image or the description. (At this point, it's fair to mention that explanations must never go into the alt-text.)

Sure, I can do that. I have done so in the past. But I can't do that within Mastodon's alt-text character limit of 1,500 (Mastodon truncates longer alt-texts from outside). I can do that even less within Misskey's alt-text character limit of only 512 (Misskey and the Forkeys should truncate longer alt-texts, but due to a bug, they delete them entirely instead, giving the impression that you haven't written an alt-text at all). I can only do that in the additional long description in the post text.

If the Mastodon HOA demand I transcribe literally any and all text within the borders of an image, I can do that, too. In fact, I have done so in the past. I can transcribe bits of text verbatim which the Mastodon HOA can't even read. Which the Mastodon HOA couldn't even find in the image because they're so tiny. But there's no way that I can squeeze 20+ individual text transcripts into 1,500 characters of alt-text along with the rest of the visual description, much less into only 512 characters. The text transcripts will have to go into the long description in the post text, whether the Mastodon HOA want it or not.

This means that the post will exceed the holy limit of 500 characters by orders of magnitude. This, in turn, means that when I've satisfied one Mastodon HOA member, another one comes along and sanctions me for exceeding the holy 500-character limit. In fact, chances are it's the same Mastodon HOA member both times.

In other words, if the content of an image is obscure enough and requires enough description, the only winning move when I want to post such an image is to not post it at all.

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #CharacterLimit #CharacterLimits #CharacterLimitMeta #CWCharacterLimitMeta #500Characters #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #MastodonCulture #MastodonHOA
friendica – A Decentralized Social Network

@Woochancho @Diego Martínez (Kaeza) 🇺🇾 @🅰🅻🅸🅲🅴  (🌈🦄) Especially whenever humans have advantages over LLMs.

When I describe my own original images, I have two advantages.

One, I know much more about the contents of the image than any AI. That's because my original images always show something from extremely obscure 3-D virtual worlds. On top of that, I may add some extra insider knowledge or explain pop-cultural references in the long description in the post if it helps the audience understand the image and its descriptions.

Two, the LLM can only look at the image with its limited resolution. That's all it has. In contrast, when I describe my images, I don't just look at the images. I look at the real deal in-world with a nearly infinite resolution.

For example, an LLM can only generate a description from a picture of a virtual building. But when I describe it, my avatar is in-world, standing right in front of the building whose picture I'm describing. I can move the avatar around, I can move the camera around, I can zoom in on anything. I can correctly identify that four-pixel blob as a strawberry cocktail, whereas the LLM doesn't even notice it's there.

I've actually done two tests using LLaVA: I fed it two images I had previously described myself to see what would happen. The results were abysmal. LLaVA hallucinated, misinterpreted things and so forth; and even when prompted to write a detailed description, its output wasn't nearly as detailed as mine.

In one image, there's an OpenSimWorld beacon placed rather prominently in the scenery. LLaVA completely ignored it. I described what it looks like in about 1,000 characters, and then I explained what it is, what OpenSimWorld is and how it works in another 4,000 characters or so.

It's an illusion that AI will soon catch up with any of this.

Oh, by the way: How is an AI supposed to pinpoint exactly where an image was made if the image shows a place of which multiple absolutely identical copies exist? Or if the image has a neutral background that doesn't even hint at where it was made? I can do that with no problem because I remember where I've made the image.

#Long #LongPost #CWLong #CWLongPost #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #AI #LLaVA #AIVsHuman #HumanVsAI
Netzgemeinde/Hubzilla

@Pino Carafa Well, my problem is not the alt-text.

I used to limit my alt-texts to 1,500 characters because Mastodon and its forks truncate longer alt-texts at the 1,500-character mark. In the future, I will limit them to 512 characters because Misskey and its forks should truncate them at that mark if they're longer, but instead, they delete them.

But in addition to my alt-texts, I describe my original images once more (= twice altogether). The other description is what I call the "long description", and it goes directly into the post text (as opposed to the alt-text). There, I don't have a character limit to worry about (mine is over 16.7 million characters), so I can do what's outright unimaginable from a Mastodon point of view.

It's this long description that's causing trouble.

That is, I wouldn't be surprised if the Mastodon HOA were to sanction me for my alt-text not being detailed enough once I limit it to 512 characters. In fact, I wouldn't be surprised if they were to sanction me because a 1,500-character alt-text of mine is lacking important elements (descriptions of certain details, transcripts of all text within the borders of the image etc.).

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #CharacterLimit #CharacterLimits #CharacterLimitMeta #CWCharacterLimitMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #MastodonHOA

@Pino Carafa An additional advantage of this would be that I could first ask just how detailed a description they need. Like whether they really want me to spend two full days, morning to evening, writing something that'll take their screen reader three hours to read out loud.

The problem, however, is that the virtual worlds that I frequent change a lot. Everything is built by users. A place that I've shown in an image may change mere days or hours after I've been there, so when I go back to take a closer look for a detailed description, it no longer looks like it does in the image.

Or that place may be gone entirely. For example, I could post some images from an in-world event, from places specifically built for this event. Then, two months later, someone asks for a more detailed description. But I can't write a more detailed description because I can't go back to these places, simply because these places were closed and shut down a few days after I had posted the images.

Lastly, my impression of Mastodon is still that a significant number of users do not want to ask. Whatever information they may need, they expect it all to come with the post immediately. Having to ask for a more detailed description or for an explanation appears to be considered about as bad style as having to ask for a description in the first place.

I've literally seen Mastodon toots in which people say that if they don't understand a post or an image in a post, they want an explanation to come with the post.

I've also seen a Mastodon toot in which someone said that it isn't sufficient to just say what's in an image, but you also have to describe what it looks like. Right away. And in my case, this is actually absolutely justified.

It's a catch-22: If I don't describe my images sufficiently, I risk being sanctioned by the Mastodon HOA for not describing my images sufficiently. But if I do, I risk being sanctioned by the Mastodon HOA for exceeding 500 characters in one post.

Oh, and if I chop my image descriptions into tiny chunks of no more than 500 characters, it's disturbing for my own ilk, the users of Friendica, Hubzilla, (streams) and Forte, who are used to not having any character limits and everything being in one message, no matter how long it is. Besides, how many Mastodon users are willing to read a thread of 120 or more posts and find that more convenient than one post with 60,000 characters?

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #CharacterLimit #CharacterLimits #CharacterLimitMeta #CWCharacterLimitMeta #500Characters #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #MastodonHOA

@Pino Carafa @ᴮᵉⁿ ᴿᵒʸᶜᵉVOTE IN THE PRIMARIES @🅰🅻🅸🅲🅴  (🌈🦄) It does depend on the image, yes.

Most of the images in the Fediverse that aren't just text are real-life photographs. Real life is something that people know, that people are familiar with. For one, it isn't that exciting, and besides, even blind folks have at least got a rough idea about what stuff looks like.

When I post memes, my visual descriptions are limited to what's important in the context, and I only write one visual description per image, which goes into the alt-text. However, I do add a full explanation in the post text because it appears to me that a sizeable amount of Mastodon users expect explanations for things they don't understand to be delivered to them immediately, without them having to ask.

But my original images aren't real-life photos. They aren't screencaps from anything familiar either. They're renderings from extremely obscure 3-D virtual worlds.

On the one hand, I can't expect anyone to have an idea of what anything in my images looks like. If anyone sighted doubts this, I ask them to check what an avatar in Meta Horizon looks like, what an avatar in Roblox looks like and what a modern avatar in Second Life looks like (Flickr and Primfeed are good sources for the latter).

On the other hand, people may be super curious about these worlds beyond what matters in the context of a post, even or especially if they aren't fully sighted. Or the post itself is about the image, as in about the whole image as opposed to something specific in the image.

This means that I have to describe the entire image with every detail in it. And I don't describe the image by looking at the image with its limited resolution. I describe it by looking at the real thing, in-world, where the resolution is near-infinite.

My sighted audience sees a little white square with six pixels in a row that are ever so slightly less bright. They may not even notice it. I see a sign with two lines of text on it, I describe it all the way to the typeface, and I transcribe the text verbatim. This is how I sometimes end up with over 20 individual bits of text in one image that need to be transcribed.

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta

@Mastodon Migration Basically, telling other people how they should be using Mastodon is not cool unless they are violating some instance rule.
As, by the way, is telling Fediverse users who are not on Mastodon to use whatever they use the way Mastodon users are expected to use Mastodon.

Please don't be a Mastodon HOA enforcer.
Especially since the alt-text police of the Mastodon HOA have much higher alt-text and image description minimum standards than blind or visually-impaired people. And they seem to be raising their standards further and further.

I always try my best to be way ahead of anyone's image description minimum standards, also in order to demonstrate to the Mastodon HOA that I'm not a lazy bum, and that I do try hard to describe my images properly. For my own original images, this means that I have to describe each one of them twice, with a fairly short description in the alt-text and a much longer one in the post itself.

This, however, clashes with the Mastodon HOA, too, because they also enforce Mastodon's default 500-character limit Fediverse-wide, generously blocking everyone whom they catch exceeding it on the first strike.

CC: @🅰🅻🅸🅲🅴  (🌈🦄)

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #CharacterLimit #CharacterLimits #CharacterLimitMeta #CWCharacterLimitMeta #500Characters #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #AltTextPolice #MastodonHOA

@Jourei At least once, I have simply not posted something because describing the image in any meaningful way would've been quite a task.
Though nowadays I'll post it anyway; perhaps I should call in helpers in those cases in the future.
I could post lots of pictures that I've made myself, maybe several a week. Instead, I haven't posted any since mid-2024.

I, too, refuse to post any image that doesn't have descriptions (two of them for each one of my original images) which are up to my own constantly rising standards. But these standards mean that even a fairly simple image may require several hours to one or two days to describe and explain.

In fact, I have been working on the descriptions of a series of avatar portraits for about a year and a half. So far, only the common preamble for four images and the individual long description for one of them are written. Distilling an optimal alt-text from them will be difficult because recent discoveries had me lower my personal alt-text character limit from 1,500 to 512, and I haven't even managed to put one together that doesn't exceed 1,500 characters.

And that's for avatar portraits with a neutral white background. Imagine the effort necessary for a landscape or a cityscape or something like that. So much for it being done in 10 seconds.

I recall finding a beautifully built, highly detailed harbour scene. It didn't even have anything in it that'd trigger anyone, I guess, so I deemed it safe enough to post. But I found it outright impossible to properly describe within a reasonable amount of time and with a reasonable effort. I ended up choosing a very different scene, which still took me two full days to describe in the long description, plus the morning of the third day to write an alt-text.

To be honest, I avoid having certain elements in my pictures now, such as vehicles and buildings unless they're very simple.

CC: @🅰🅻🅸🅲🅴  (🌈🦄)

#Long #LongPost #CWLong #CWLongPost #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta

@xylophilist

#ImageDescription #ALTtext
We are standing in a field, gazing out into the night.
The Northern Lights have begun to appear in the sky above the horizon. The lower section is a yellowish green, fading to purple towards the top.
Behind the Northern Lights, the stars can be seen twinkling.

On the ground, the lights from houses and traffic can be seen here and there.

thanks to
@no_brainer 😉
@🔥Cassandra🔥 When I describe a meme-based image macro, it's rather concise. But I've got very good reasons to describe my original images in greater detail in the alt-text and then once more, in vastly greater detail, in the post itself. It depends on what people may not know about the image but want to know.

I hope to limit my future alt-texts to a maximum of 512 characters, though. That's Misskey's limit, but Misskey currently has a bug which makes it delete longer alt-texts instead of truncating them.

That said, AI also hallucinates and guesses too much without actually knowing. The more obscure the topic of an image is, and the less the AI knows about it, the more this happens.

So if you want your image description to be accurate, you'll have to write it entirely yourself.

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta

NVDA: AI image descriptions progress, discussion #19807 on GitHub
https://github.com/nvaccess/nvda/discussions/19807

The latest good NVDA installer of AI image descriptions can be found here:
https://download.nvaccess.org/snapshots/try/try-image-desc/

After early alpha testing feedback, on-device AI image descriptions were removed from 2026.1. This is due to these main reasons:

- low quality of descriptions
- lag when enabling the feature
- lag while the feature is running
- no option for a higher-quality NPU/GPU description
- no VQA (Visual Question Answering), e.g. asking follow-up questions about the image
- no translations: descriptions are only in English

To reintroduce the feature in alpha, we want to fix the following things first:

- have a simple, basic but accurate describer that can run on mid-range CPUs. So far we have done this by recently improving the model slightly.
- minimize lag when enabling and running the feature: ensure NVDA remains responsive
- ideally, a way to tell the confidence of the description

The next biggest priorities are:

- add a more compute-intensive model that runs on NPUs and GPUs and offers VQA
- add models to translate the output

After that:

- We would like to offer a wider range of models via a model manager.
- A big technical challenge here is the lag that importing numpy introduces, which the Python onnxruntime package requires.
- We investigated creating a C++ layer, but the implementation is still experimental and not working for ARM64EC: microsoft/onnxruntime#15403
- We could consider offloading onnxruntime, numpy and the describer to a separate process, similar to the 32bit shim.
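The offloading idea mentioned above, moving the heavy imports (numpy, onnxruntime) and the describer into a child process so the main process never pays their import cost and stays responsive, can be sketched in Python roughly like this. This is only an illustration of the general pattern, not NVDA's actual code: the class name is invented, and the worker's "inference" is a placeholder string standing in for a real ONNX model call.

```python
# Sketch: isolate heavy imports in a worker process so the parent
# process starts fast and stays responsive. Hypothetical names.
import multiprocessing as mp


def _describer_worker(requests, results):
    # Heavy imports would happen HERE, in the child process only,
    # e.g.: import numpy; import onnxruntime
    while True:
        image_id = requests.get()
        if image_id is None:  # sentinel: shut the worker down
            break
        # Placeholder "inference"; a real worker would run the model.
        results.put((image_id, f"description of image {image_id}"))


class OutOfProcessDescriber:
    """Runs the describer in a child process via two queues."""

    def __init__(self):
        self._requests = mp.Queue()
        self._results = mp.Queue()
        self._proc = mp.Process(
            target=_describer_worker,
            args=(self._requests, self._results),
            daemon=True,
        )
        self._proc.start()

    def describe(self, image_id):
        # Send a request and block until the worker answers.
        self._requests.put(image_id)
        got_id, text = self._results.get()
        assert got_id == image_id
        return text

    def close(self):
        self._requests.put(None)  # sentinel
        self._proc.join()


if __name__ == "__main__":
    describer = OutOfProcessDescriber()
    print(describer.describe(1))
    describer.close()
```

The same pattern would also let the parent kill a hung or crashed model process without going down itself, which an in-process describer cannot offer.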

#nvda #screenReader #imageDescription #Blind #llm #AI #openSource
