NVDA: AI image descriptions progress, discussion #19807 on GitHub:
https://github.com/nvaccess/nvda/discussions/19807
The latest good NVDA installer with AI image descriptions can be found here:
https://download.nvaccess.org/snapshots/try/try-image-desc/
After early alpha testing feedback, on-device AI image descriptions were removed from 2026.1.
This was mainly due to the following reasons:
low quality of descriptions
lag when enabling the feature
lag while the feature is running
no option for higher-quality NPU/GPU descriptions
no VQA (Visual Question Answering), e.g. asking follow-up questions about the image
no translations: descriptions are only in English
To reintroduce the feature in alpha, we want to fix the following things first:
have a simple, basic but accurate describer that can run on mid-range CPUs.
We have made progress on this by recently improving the model slightly.
minimize lag when enabling and running the feature: ensure NVDA remains responsive
ideally, a way to convey the confidence of a description
The next biggest priorities are:
Add a more intensive model that runs on NPUs and GPUs and offers VQA
Add models to translate output
After that:
We would like to offer a wider range of models via a model manager
A big technical challenge here is the lag introduced by importing numpy, which the Python onnxruntime package requires.
We investigated creating a C++ layer, but the implementation is still experimental and not working for ARM64EC: microsoft/onnxruntime#15403
We could consider offloading onnxruntime, numpy and the describer to a separate process, similar to the 32-bit shim.
#nvda #screenReader #imageDescription #Blind #llm #AI #openSource
