@xylophilist

#ImageDescription #ALTtext
We are standing in a field, gazing out into the night.
The Northern Lights have begun to appear in the sky above the horizon. The lower section is a yellowish green, fading to purple towards the top.
Behind the Northern Lights, the stars can be seen twinkling.

On the ground, the lights from houses and traffic can be seen here and there.

thanks to
@no_brainer 😉
@🔥Cassandra🔥 When I describe a meme-based image macro, it's rather concise. But I've got very good reasons to describe my original images at greater detail in alt-text and then once more at vastly greater detail in the post itself. It depends on what people may not know about the image but want to know.

I hope to limit my future alt-texts to a maximum of 512 characters, though. That's Misskey's limit, but Misskey currently has a bug which makes it delete longer alt-texts instead of truncating them.
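Since Misskey drops over-long alt-texts outright instead of truncating them, one hypothetical safeguard is to trim on your own side before posting. This is only a sketch: the 512-character figure is Misskey's limit as mentioned above, but the function name and word-boundary behaviour are made up for illustration.

```python
MISSKEY_ALT_LIMIT = 512  # Misskey's alt-text character limit

def fit_alt_text(text: str, limit: int = MISSKEY_ALT_LIMIT) -> str:
    """Trim alt-text to the limit at a word boundary, marking the cut with an ellipsis."""
    if len(text) <= limit:
        return text
    # Cut one short of the limit to leave room for the ellipsis character.
    cut = text[: limit - 1].rsplit(" ", 1)[0].rstrip()
    return cut + "…"
```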

That said, AI also hallucinates and guesses too much without actually knowing. The more obscure the topic of an image is, and the less the AI knows about it, the more often this happens.

So if you want your image description to be accurate, you'll have to write it entirely yourself.

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta

NVDA: AI image descriptions progress, discussion #19807 on GitHub
https://github.com/nvaccess/nvda/discussions/19807

The latest good NVDA installer with AI image descriptions can be found here:
https://download.nvaccess.org/snapshots/try/try-image-desc/

After early alpha testing feedback, on-device AI image descriptions were removed from 2026.1.
This is due to these main reasons:

  • low quality of descriptions
  • lag when enabling the feature
  • lag while the feature is running
  • no option for higher-quality NPU/GPU descriptions
  • no VQA (Visual Question Answering), e.g. asking follow-up questions about the image
  • no translations: descriptions are only in English
To reintroduce the feature in alpha, we want to fix the following things first:

  • have a simple, basic but accurate describer that can run on mid-range CPUs. So far we have done this by recently improving the model slightly.
  • minimize lag when enabling and running the feature: ensure NVDA remains responsive
  • ideally a way to tell the confidence of the description
The next biggest priorities are:

  • Add a more compute-intensive model that runs on NPUs and GPUs and offers VQA
  • Add models to translate output
After:

  • We would like to offer a wider range of models via a model manager
  • A big technical challenge here is the lag introduced by importing numpy, which the Python onnxruntime requires.
  • We investigated creating a C++ layer, but the implementation is still experimental and not working for ARM64EC: microsoft/onnxruntime#15403
  • We could consider offloading onnxruntime, numpy and the describer to a separate process, similar to the 32-bit shim.
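For what it's worth, the separate-process idea in the last point can be sketched in plain Python: the heavy imports happen only inside a child process, so the main process never blocks on them. This is purely illustrative and not NVDA's actual code; the worker function and its placeholder "describer" are hypothetical.

```python
import multiprocessing as mp

def describe_worker(tasks, results):
    # Heavy modules would be imported here, inside the child process,
    # so the main (screen reader) process never pays their import lag.
    # import numpy, onnxruntime  # hypothetical: only loaded in the worker
    for image_path in iter(tasks.get, None):  # None is the shutdown signal
        results.put(f"description of {image_path}")  # placeholder describer

if __name__ == "__main__":
    tasks, results = mp.Queue(), mp.Queue()
    worker = mp.Process(target=describe_worker, args=(tasks, results), daemon=True)
    worker.start()
    tasks.put("screenshot.png")
    print(results.get())  # in a real app the main process would stay responsive
    tasks.put(None)       # ask the worker to shut down
    worker.join()
```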

#nvda #screenReader #imageDescription #Blind #llm #AI #openSource

@choomba #BiBesch / #ImageDescription

Supposed X post by klara_sjo: "There will be no WW3. They've abandoned numbered releases and switched to a live service model with seasonal events."

@Halla Rempt Since you wanted criticism and advice, here it is. I'll add links to the corresponding pages in my wiki if there are any. (In case you're unaware: Parts of the Fediverse can do embedded links without a URL in plain sight. So if it has a different colour from the rest of the comment here, it's a link even if it isn't a URL.)

First of all: Don't start alt-text with "Photo of". Do mention any medium that isn't a digital photograph. But do not say if something is a digital photograph. It's generally considered the default medium on the Web, so mentioning it is redundant and needlessly inflates your alt-text.

Next: Don't use line breaks in alt-text. Yes, they make your alt-texts look prettier. But those who rely on alt-text can't see them anyway.

Besides, most screen readers expect alt-texts to be only one paragraph. They generally start reading out alt-text with something like, "Graphic." If there are multiple paragraphs, they'll take each paragraph for a separate alt-text and start reading out each one with, "Graphic."

Don't use the quotation marks on your keyboard in alt-text. Again, yes, they make your alt-texts look prettier. But, again, those who rely on alt-text can't see them anyway.

Besides, these quotes are not generally accepted as standard elements in alt-text. Hence, many frontends don't support them, not even in the Fediverse. Mastodon does.

But Hubzilla, for example, which is actually older than Mastodon (and which I'm commenting from right now), doesn't. Hubzilla keeps these quotes in alt-text as their HTML entity: &⁠quot;.

So, for example, you have this alt-text:
Photo of three books. These are manuals with grey covers, entitled "Owners Manual", "BASIC Users Manual" and "DOS Users Manual" The background is once again my painting table.

Hubzilla renders it as:
Photo of three books. These are manuals with grey covers, entitled &⁠quot;Owners Manual&⁠quot;, &⁠quot;BASIC Users Manual&⁠quot; and &⁠quot;DOS Users Manual&⁠quot; The background is once again my painting table.


Now, let's assume someone blind uses Hubzilla with a screen reader. The screen reader will read your alt-text out loud like this:
Photo of three books. These are manuals with grey covers, entitled and quot, Owners Manual and quot, and quot, BASIC Users Manual and quot, and and quot, DOS Users Manual and quot, the background is once again my painting table.
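The escaping shown above is ordinary HTML entity encoding. As a rough illustration (this uses Python's generic html.escape, not Hubzilla's actual code), typewriter quotes turn into &quot; like this:

```python
import html

alt_text = 'Manuals entitled "Owners Manual" and "DOS Users Manual".'
escaped = html.escape(alt_text)  # " becomes &quot;, & becomes &amp;, etc.
print(escaped)
# Manuals entitled &quot;Owners Manual&quot; and &quot;DOS Users Manual&quot;.
```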

And then there are (streams) and Forte. The former is a fork of a fork of three forks of a fork (of a fork?) of Hubzilla by Hubzilla's own creator, the latter is a fork of (streams) by the same guy. These two internally use this very same quotation mark as an alt-text delimiter. This means that once they hit a quotation mark in an alt-text, they assume it marks the end of the alt-text.

Hence, they render your above alt-text like this:
Photo of three books. These are manuals with grey covers, entitled
And they continue right after the end of your alt-text while completely ignoring the rest of your alt-text.

While we're at it: If you really want to make sure that screen readers pronounce your acronyms correctly, write them in a way that ensures just that.

Let's take "BASIC" as an example. Some screen readers may pronounce it, "basic" because they recognise it as a word. Some screen readers may pronounce it, "bee ay ess eye see" because they spell everything in all caps out.

If you want all of them to read it out as a word, don't write it in all caps.

In contrast, if you wanted all of them to spell it out, you'd have to insert full stops like so: "B.A.S.I.C."

These are two exceptions to the rule that text must always be transcribed 100% verbatim.
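If you deal with many acronyms, the full-stop trick is mechanical enough to automate. A hypothetical one-liner (the function name is made up):

```python
def spell_out(acronym: str) -> str:
    # Interleave full stops so screen readers read the letters individually.
    return ".".join(acronym) + "."

print(spell_out("BASIC"))  # B.A.S.I.C.
```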

Don't explain things in alt-text. Explanations must always go into the post text where everyone can access them.

Why? Because there are people who cannot access alt-text, often due to physical disabilities. Accessing alt-text requires at least one sufficiently working hand, and there are more than enough cases in which people don't have sufficiently working hands at all.

Also, GNU/Linux users who run graphical browsers on minimalist window managers such as i3wm that are entirely controlled by keyboard can't access alt-text either. They'd have to move a mouse cursor either to the little "ALT" button or until it hovers above the image. But they don't have mice or other pointing devices. They control their machines entirely by keyboard.

But if all these people can't access your alt-text, they can't read your explanations. At all. They're lost to these people.

Lastly: Keep your alt-texts and image descriptions strictly neutral. Alt-text is no place for personal opinions.

CC: @Alt Text Hall of Fame

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta
Jupiter Rowland - [email protected]

@Halla Rempt I'm working on a very extensive wiki about how to describe images and write alt-texts in and for the Fediverse. It's still very incomplete, though. If you're interested anyway, here is the link.

I could still take a look at your alt-texts.

CC: @Alt Text Hall of Fame

#FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta
Jupiter Rowland - [email protected]

@abovearth Makes me wonder where I'd end up:

  • 1,400 characters in alt-text describe the image, but with no explanations and with no text transcripts
  • Another 100 characters notify the reader of a long, detailed description in the post itself
  • 60,000+ characters (you've read that right, over sixty thousand) in the post text describe the image at full detail, even more than full detail because they cover details that aren't even visible at the image's resolution, complete with extensive explanations and 20+ individual text transcripts
  • Especially the latter description tries hard to adhere to as many image description rules and guidelines that I've read in the past as possible
  • When I learn about another image description rule/guideline, I promptly declare both the alt-text and the long description obsolete

Guess that'd be lawful evil or something.

#Long #LongPost #CWLong #CWLongPost #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta
Netzgemeinde/Hubzilla

I don't want to discuss alt-texts and image descriptions to try and weasel myself out of having to write them.

I want to discuss alt-texts and image descriptions to get them right myself. To improve and optimise them both for all those who need them in some way and for those who enforce them.

I want to know the alt-text activists' quality standards. So I can exceed them.

#AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta
Netzgemeinde/Hubzilla

What if I switched from describing my original images twice...

..."short" (still fairly long) alt-text + fully detailed long description in the post text...

...to describing them three times...

...same as above, but I'd keep these two descriptions to myself, just in case, plus an actually short alt-text (512 characters or fewer for Misskey compatibility or even 200 characters or fewer) that'd be the only description that I'd publish right away?

There would be no excessively long alt-text (at least not right away). There would be no tens of thousands of characters of long description in the post (at least not right away) although I couldn't guarantee that the post won't exceed 500 characters.

At the same time, this would give me "notes" that I could source if someone asked me to describe some detail. And if some Mastodon alt-text activist came and complained that my description is lacking all over, I could replace the short alt-text with the already existing long alt-text and add the long description to the post text right away.

Granted, my workload would increase some more. Most of it might end up for nothing most of the time. And nobody would get text transcripts unless they'd ask for them because I couldn't possibly fit them all into just a few hundred characters.

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #A11y #Accessibility
Netzgemeinde/Hubzilla

@Author-ized L.J. That's the problem: Whatever I do, I'll lose either way.

On the one hand, I feel a great pressure to describe and explain everything in advance. That way, nobody would ever have to ask me to describe a detail or explain something. And nobody, not even the most die-hard Mastodon alt-text activists, could say that I'm careless and that I only do the very bare minimum or not even that. There are people out there who are eager to block everyone who doesn't describe their images enough or lecture them or attack them for being lazy.

The last time I described an image for Hubzilla, I refused to write detailed descriptions for the images within that image. That would have escalated and cost me weeks to describe them all because I'd also have had to describe dozens of images within these images and even more images within those. Mind you, someone who travels to the place I've described couldn't actually see what I'd have described because the images in my image themselves have a limited resolution. But I genuinely felt bad for not describing these images.

Besides, if I only described my original images once, namely in the alt-text, and then briefly and concisely, and if someone came and asked me to describe certain elements at greater detail, I couldn't always do that. Sometimes I couldn't go back to the place shown in the image and take a closer look and write a more detailed description because that place simply doesn't exist anymore, or it has been modified, and it doesn't look like the image anymore. The details that I'd have to take a closer look at are gone.

On the other hand, my experience is also that posting more than 500 characters at once reduces my reach on Mastodon tremendously. I think I must have over 700 or 800 followers, but my reach on Mastodon is similar to that of someone with not even a dozen followers. And I don't think that's because what I post is so uninteresting or because of my rather controversial thoughts about the Fediverse, accessibility in the Fediverse, image descriptions etc.

Basically, I can't possibly post images without risking being sanctioned by anyone.

I've briefly considered putting my long descriptions into separate HTML documents and linking to them. In theory, that would reduce the length of my image posts greatly. However, this is entirely untested. I don't know if it'd work at all, i.e. open the HTML document in someone's browser rather than downloading it to their device as a file. I don't know either if a plain HTML document with no style sheet would be accessible to screen reader users.

What I do know, though, is that Mastodon hates external links with a flaming passion. That's also because the vast majority of Mastodon users are always on phones, using dedicated Mastodon apps. They hate their browser popping open when they tap a link all the same. Also, they tend to distrust external links because the linked documents or pages may not be sufficiently accessible.

Everything would be a whole lot easier if there were Fediverse-wide standards for image descriptions that take the requirements of blind or visually-impaired people into consideration as well as Mastodon's unique culture. If these standards were known to everyone both on Mastodon and in the non-Mastodon Fediverse. If everyone from blind or visually-impaired users to neurodivergent users to fully sighted alt-text activists agreed upon these standards all the same. And if these standards covered extreme edge-cases like mine as well. If there were a generally agreed-upon consensus on a whole lot of questions like:
  • Is it okay to have to ask for detailed descriptions of certain details in an image that don't matter within the context of the post?
    Or do they have to be described right away if there's a chance that someone might be curious about them? What if nothing specific in the image matters more within the context than everything else?
  • Is it okay to have to ask for explanations if you don't understand the topic of an image?
    Or do images about very obscure niche topics have to come with enough explanations for everyone to understand them right away (not counting technical or jargon terms which always have to be either avoided or explained)?
  • So there's the rule that all text within an image must be transcribed verbatim. How far does this rule go?
    Let's suppose I have a few dozen individual bits of text within an image. Most or all of them are so small that they're unreadable. Some are so tiny that they're actually invisible at the image's resolution. Still, technically speaking, they're there. And: I can read them. Instead of reading them in the image, I can read them at the source. So I can transcribe them all.
    What is the rule then?
    Do I have to transcribe them although they're unreadable because the rule says all text has to be transcribed?
    Do I have to transcribe them although they're unreadable because not doing so and writing that they're unreadable with no transcript is or may be considered lazy?
    Do I have to transcribe them because they're unreadable, and even fully sighted people need a transcript to know what's written there?
    Mustn't I transcribe them because they don't show themselves as text in the image at the image's resolution (if they actually don't)?
    Mustn't I transcribe them because I must only describe what's visible in the image at the image's resolution to the naked eye?
    Do I have to transcribe them in my special edge-case in spite of the two above lines because this might be my last and only chance to transcribe them, for they may be gone tomorrow, and I would no longer be able to transcribe them if someone asked for a transcript? Or must I remember to keep personal transcripts of all the texts I come across in my images, just in case someone asks for a transcript of a bit of text that no longer exists?
  • Must all text transcripts always be in the alt-text as opposed to an extra long image description in the post? Even if I have 20+ individual text transcripts to squeeze into Mastodon's limit of 1,500 characters or Misskey's limit of 512 characters?
    Or is it okay to
    • transcribe them in a separate long description in the post text
    • not put these transcripts into the alt-text
    • mention in the alt-text that there is a long image description in the post, that all the texts in the image are transcribed there, and how exactly to find that long image description?
  • If any of the above requires a separate long image description because the image description won't fit within the alt-text character limits, is it preferred for the long description to be in a linked document that will open in the browser (given one has the means to write and host such a document, and users on Hubzilla, (streams) and Forte do have these means)?
    Or must the long description be where the image is at all costs? Must it be in the post itself for the convenience of app users even if it inflates the post to a hyper-massive length to the inconvenience of Mastodon users?
Unfortunately, this would require some very extensive discussions on Mastodon, involving mostly Mastodon users. But Mastodon isn't fit for this kind of discussion or debate at all.

Worse yet: I've recently found out that none of the things above may be discussed on Mastodon. Ever. You must not discuss that stuff. You must do it. But you must do it right off the bat. For whichever individual definition of "right".

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #CharacterLimit #CharacterLimits #CharacterLimitMeta #CWCharacterLimitMeta #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #Transcript #Transcripts #A11y #Accessibility
Netzgemeinde/Hubzilla