Fediverse folks:

Using cutesy extended font tricks like ₜₕᵢₛ basically triggers a denial of service attack on folks who use screen readers.

Instead of saying "this" it says "Mathematical Sans-Serif Italic Small T Mathematical Sans-Serif Italic Small H Mathematical Sans-Serif Italic Small I Mathematical Sans-Serif Italic Small S"
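
A rough sketch of why the readout looks like that, assuming the screen reader simply falls back to each character's name in the Unicode database (real screen readers use their own pronunciation tables, so this is an illustration and not how any particular product is implemented). The helper names `to_sans_italic` and `spoken_names` are made up for this example.

```python
import unicodedata

# U+1D622 is MATHEMATICAL SANS-SERIF ITALIC SMALL A; the 26 small letters
# of that style are contiguous, so we can build the styled word by offset.
SANS_ITALIC_SMALL_A = 0x1D622


def to_sans_italic(word: str) -> str:
    """Map lowercase ASCII letters to their Mathematical Sans-Serif Italic look-alikes."""
    return "".join(
        chr(SANS_ITALIC_SMALL_A + ord(ch) - ord("a")) if "a" <= ch <= "z" else ch
        for ch in word
    )


def spoken_names(text: str) -> list[str]:
    """Return the Unicode character name of every character, title-cased."""
    return [unicodedata.name(ch, "unknown character").title() for ch in text]


if __name__ == "__main__":
    # One long name per letter, which is what gets read aloud.
    print(" ".join(spoken_names(to_sans_italic("this"))))
```

To the Unicode database the styled word is four unrelated symbols, not the word "this", which is why nothing gets pronounced as a word.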

Let's be cool to the folks who have to use screen readers to participate in our community and not do that.

@troublewithwords oh jeez that seems nightmarish. I wonder if it's possible to develop a screen reader that can mitigate the issues with that sorta edge case. Personally not super familiar with development of accessibility technologies but this seems like an area for the tools to improve just in case there are still people using those extended font tricks despite your warning.
@troublewithwords isn't this slightly on the screen reader software for not normalizing this sort of text down to something it can read? because people who use screen readers must encounter this all the time.
@joshbuddy
Yes, but it will take time.
@troublewithwords
@MisterMadge @troublewithwords Totally. I guess what I'm a little surprised about though, is, people have been abusing unicode in weird ways for many years, so it's not exactly a new issue. Just not a high priority then? Sort of makes you wonder if a browser extension might be a simpler solution.

@MisterMadge @troublewithwords sorry, y'all kind of made me dig deep on this now.

so, this is covered in unicode normalization forms https://www.unicode.org/reports/tr15/. it actually looks like this can be done by a browser extension, soo, i'll make one for firefox i guess
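
A minimal sketch of the normalization idea from UAX #15, using Python's standard `unicodedata` module instead of the extension's own code (which is not shown in this thread). It assumes compatibility normalization (NFKC) is the form that matters here; the function name `normalize_for_speech` is hypothetical.

```python
import unicodedata

# "this" written in Mathematical Sans-Serif Italic, spelled out as escapes
# so the example does not depend on how the characters were pasted.
styled = "\U0001D635\U0001D629\U0001D62A\U0001D634"


def normalize_for_speech(text: str) -> str:
    """Fold compatibility characters (styled letters, ligatures, ...) to plain ones."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    print(normalize_for_speech(styled))  # this
    # Canonical normalization alone (NFC) leaves the styled letters untouched.
    print(unicodedata.normalize("NFC", styled) == styled)  # True
```

Only the compatibility forms (NFKC and NFKD) fold the styled letters; NFC and NFD keep them as they are.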


@MisterMadge @troublewithwords Here is a quick proof-of-concept. Works for the example in the original toot. I'll package it up nicely later, but would this be useful for instance? Also, no fan of this name, so would love to hear a suggestion for that (if it's useful)

https://github.com/joshbuddy/normie


@joshbuddy
Dude! This is legendary!

I don't have access to a computer this weekend to try it out. What does "Mathematical Sans-Serif Italic Small T Mathematical Sans-Serif Italic Small H Mathematical Sans-Serif Italic Small I Mathematical Sans-Serif Italic Small S" normalize down to?
@troublewithwords

@MisterMadge @troublewithwords that's the hope, right? (i'm new to this world of screen readers and unicode)
@MisterMadge @troublewithwords so, i've made a lot of progress on a browser extension to get rid of hacky unicode usage. what i'm having a hard time finding right now though, is an example of some good ol' math in unicode that breaks with this approach. anyone happen to know where i could find something? i'll keep searching ofc, but if you have anything at hand, would be greatly appreciated
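
A few hedged examples of legitimate notation that blanket NFKC normalization would flatten. The samples are illustrative picks, not cases taken from the extension's test suite, and `flatten` is a hypothetical helper name.

```python
import unicodedata


def flatten(text: str) -> str:
    """Apply NFKC compatibility normalization to the whole string."""
    return unicodedata.normalize("NFKC", text)


# Each sample uses a character whose meaning depends on its styling.
samples = [
    "E = mc\u00b2",       # superscript two: an exponent
    "x \u2208 \u211d",    # double-struck R: the set of real numbers
    "H\u2082O",           # subscript two: a chemical formula
]

if __name__ == "__main__":
    for sample in samples:
        print(f"{sample!r} -> {flatten(sample)!r}")
```

After normalization the exponent reads as "mc2", the real numbers become a plain "R", and the formula becomes "H2O", so a normalizer probably needs some way to leave real math alone.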
@joshbuddy @troublewithwords I have an error at line 1 of index.js
"Uncaught SyntaxError: Cannot use import statement outside a module"
@MisterMadge @troublewithwords i should add things to the readme. it needs to be built now, so run `npm run build` and then load the extension from the `dist` directory
@MisterMadge @troublewithwords just following up, were you able to get it running?
@troublewithwords Thank you for posting this!

@troublewithwords
@ThermiteBeGiants

Honest question: how can we help screen readers handle this situation better? Who makes them, and where do we file bugs?

@spacehobo @troublewithwords @ThermiteBeGiants Not sure how ChromeVox or VoiceOver track this kind of thing, but I can't find a bug for it on espeak? https://github.com/espeak-ng/espeak-ng/issues
@troublewithwords @ThermiteBeGiants @spacehobo From what I’ve heard, they don’t give a shit. They sell their crappy products for thousands of dollars apiece and they can get away with doing the bare minimum.
@ThermiteBeGiants @troublewithwords @spacehobo Also I just read something about doing prompt injection for Bing Chat using a web page written with those kinds of characters, so it suggests LLMs are able to read text written like this. I have yet to try.

@troublewithwords
I would call for better options and improvements in screen readers, too.

After all, there are thousands of non-Latin languages that deserve to get all the benefit of Unicode on platforms such as this that are also not friendly to screen readers, I suspect. As well, I'm sure there are unprivileged communities that use all sorts of non-standard communication modes (think emoji usage among the youth in racialized communities).

@troublewithwords
Finally, even among privileged communities I'm not sure it is right to hang all the pain of such a screen reader incompatibility on just the behaviour of people who want and indeed need to express themselves in the many varied ways available to them. I agree we should be aware of and try to accommodate, but in the long term this should really be fixed in the tool. Either that or extend Mastodon to allow for the cutesy text equivalent to image alt text.
@troublewithwords in case it helps: the @PleaseCaption maintainers are working on a maths block nudge to go with your alt text nudges. Though it kinda strikes me we need a server-side version which watches the local timeline and nudges in the voice of the node, so we’re not relying on people subscribing one at a time.
@troublewithwords Is there a way to alert the screen reader builders that there's a bug with their software though? I'd expect at least something more innocuous such as "Mathematical Sans-Serif Small this".
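
One way the "announce the style once" idea could look, as a sketch only: it assumes styled letters can be recognized by their Unicode names starting with "MATHEMATICAL" and containing "SMALL" or "CAPITAL", and it ignores styled digits and other look-alikes. `split_style` and `announce` are hypothetical names, not part of any screen reader.

```python
import unicodedata
from itertools import groupby


def split_style(ch: str) -> tuple[str, str]:
    """Return (style, plain_char); style is "" for ordinary characters."""
    name = unicodedata.name(ch, "")
    if name.startswith("MATHEMATICAL "):
        for marker in (" SMALL ", " CAPITAL "):
            if marker in name:
                style = name.split(marker)[0].title()
                return style, unicodedata.normalize("NFKC", ch)
    return "", ch


def announce(text: str) -> str:
    """Name the style once per run of styled letters, then read the plain word."""
    parts = []
    for style, run in groupby(map(split_style, text), key=lambda pair: pair[0]):
        word = "".join(plain for _, plain in run)
        parts.append(f"{style}: {word}" if style else word)
    return "".join(parts)


if __name__ == "__main__":
    styled = "\U0001D635\U0001D629\U0001D62A\U0001D634"
    print(announce("say " + styled))  # say Mathematical Sans-Serif Italic: this
```

This keeps the style information for anyone who needs it in a math context while making the word itself pronounceable.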
@lertsenem @troublewithwords the following might help in this regard.
http://freedomscientific.com
email: [email protected]
Note: The website is for #FreedomScientific, who produce the most used #Screenreader for Windows.
The email address provided is for making the appropriate report to #Apple.
#Accessibility

@troublewithwords @Lynessence Yep, As soon as I start hearing that, I check out and scroll right on by.
@troublewithwords it's not a feature it's a bug.

@troublewithwords

I wouldn't do this, and agree that no one should without a very compelling reason.

But I also think there is something badly wrong if screen readers don't have an "ignore fonts" option.

@troublewithwords IMHO that’s the fault of the screen reader and needs to be reported. And why does it report the font for subsequent letters?
@troublewithwords These programs are meant to be readers, not describers, so saying "Mathematical Sans-Serif Italic Small T Mathematical Sans-Serif Italic Small H Mathematical Sans-Serif Italic Small I Mathematical Sans-Serif Italic Small S" should be considered a bug. Because you wouldn't read it that way, would you? cc (because via) @foosel
@troublewithwords @foosel (mind you I'm not saying this is a trivial bug whose solution should be obvious. I have ideas how this might be solved that I know would fail in other circumstances. But that doesn't mean it's not a bug.)
@baeuchle @troublewithwords @foosel But those are the characters which are there. If they were used in their proper context (e.g., in a mathematical equation) then you very much would want them read out like that to disambiguate them from other uses of similar plain letters.
@[email protected] @[email protected] @[email protected] At least on Mastodon 4.1.2+glitch you can use HTML or Markdown. Use the content-type drop down in the middle of the row of options below the post entry box. This is HTML using the u, b and i elements.
@edavies @troublewithwords @foosel even in mathematical context, i would not want them to be read like this. Have you ever read a mathematical formula out loud saying „mathematical character E equals sign mathematical character m multiplication sign mathematical character c superscript 2“?

@baeuchle Of course not, but:

1) The way a mathematician (or chemist or …) would read them out is very context dependent and

2) For most screen readers most of the time the vast range of odd Unicode characters are a really odd corner case and just using the existing Unicode database for naming seems a reasonable compromise vs bloating the code with lots of special cases.

@troublewithwords @foosel

@edavies @troublewithwords @foosel both of these are only saying „doing it correctly would be hard“ to which I agree; that doesn't mean it's not wrong and screen readers shouldn't strive to do better.

Normal letters don't have their unicode name read out, latin small letter d latin small letter o space latin small letter t latin small letter h latin small letter e latin small letter y question mark

@troublewithwords Does italic or boldface type cause the same problem?
@troublewithwords So humans should adapt their communication to the shortcomings of a specific client software? 
@troublewithwords Didn't know this was a thing - at least this kind of formatting. Thanks for spreading awareness :)
@troublewithwords @nazgul What if (and I know this is crazy) we fixed the screen readers?
@xvf17 @troublewithwords To know when to switch between “science mode” and “social media mode”?
@nazgul @troublewithwords There is no world in which the reported behavior is the best choice.
@nazgul @troublewithwords And yes, I see nothing wrong with adding a mode for accessibility
@troublewithwords i'm floored by the number of people who think screenreaders need to be able to cope with people purposefully misusing special characters to be cute, as if that's even truly possible considering how desperate people seem to be to add quirkiness to their visual output, so it's always changing, always new special characters being abused
@troublewithwords Do mathematical expressions sound like garbage like that, too? Or do they use the wording a mathematician would use?

@troublewithwords Agreed. That said the true way to solve this would be to pester Gargron for rich-text support so crude unicode hacks become obsolete. (text-to-speech should also be improved but the majority of them are still proprietary)

Unmodified Mastodon is the only Fediverse software which doesn’t even support displaying rich-text, stripping all the HTML tags except links and span.
Meanwhile the rest of the Fediverse supports both displaying and formatting messages with rich-text.

@troublewithwords @dankeck I'm not sure it is a denial of service as much as spam, and screenreader users are just going to move on to the next post instead of sitting through your mess.
@troublewithwords Perhaps an alternative would be to post that in a picture, and then use alt text?
@troublewithwords
This makes me think, is there such a thing as alt-text for text?