I built possibly the earliest same-tab tool for getting AI image descriptions, in Aug 2022, using a then newly updated Microsoft AI API. I listened to Disabled folks about it. I've paid close attention to the subject since.

My criticisms of @mozilla plans for Firefox's built-in AI alt text generation aren't knee-jerk AI hatred or ignorance. They come from an understanding of this subject with a breadth and depth that are extremely rare, and I want acknowledgement of major issues that are being ignored 🧵…

Boost? 💜

This issue is complex in several ways, and that complexity contributes to the conflict here. Those saying objectors must not understand the technology are themselves not understanding the roots of the objections.

The text of @mozilla announcements on social media talked only about AI-generated descriptions accessible via screen reader. This is commonly asked for by those needing alt text, and the tool above was built for exactly that reason: it can be useful to fetch an AI description of an image when you are unable to see it.

🧵…

The image in the post, however, showed a UI for use when writing alt text, and the linked blog post confirmed that this was on the roadmap.

Usage patterns by those writing alt text can be readily predicted, because such tools already exist and are fairly widely used. Current usage patterns often show a lack of any care to edit, or even review, generated descriptions before posting images with them as alt text to social media.

Objections from alt-text-dependent folks calling out these descriptions as poor have been ignored.

🧵…

The detail that this is specifically about social media posts is important, and it is not widely understood that the description needs and process there are substantially different from those for other page types. Based on existing image-posting patterns, I think Firefox's AI description-writing feature would be used primarily for social media.

There, post context is crucial. The goal is equal access to a post, and that context informs what in the image is important to describe. Sometimes those aspects are abstract.

🧵…

Images posted to social media are very diverse in a number of ways. The images used to test @mozilla's description AI were, as far as I could tell, entirely stock art, which is evidence that this diversity is not being accounted for. Test images are often ones unlikely to be posted to social media as-is.

The descriptions being generated are also assessed based on the needs of posting images in places other than social media.

The result: assessments of the project's progress do not consider its likely usage.

🧵…

Criticism of this @mozilla image description AI, and of others, is frequently met with impressive demos, but this repeats a common fallacy among AI advocates: AI is often impressive, but the objection is that it is unreliable in many surprising ways. That concern cannot be addressed with just a few examples.

Other remote AIs offering more advanced description produce outputs that are substantially more detailed, but they fail in both shared and unique ways. They also chug power and water.

🧵…

There are still more details, but I'll leave it here.

I believe tools for those writing alt text that include image description AI could contribute to web accessibility.

To actually contribute, they need to include, and deeply understand, the needs of Visually Impaired people relying on alt text. Current tools seem to largely prioritize the time and effort of those writing alt text over equal access to the web.

Questions or comments? I am happy to chat here or via email: [email protected]

💜

@hannah The "it doesn't understand context" problem is also why translators are saying that AI can't do translation either, for any purpose beyond "give me the gist of this text". As with alt text, it's all about the content producer, not the reader. Provided it's fast, "cheap" (we ignore the cost to the planet, right?), and looks like it means something, that's all that matters.