A video showing off a bunch of wooden kinetic contraptions:

  • a shrimp riding a unicycle on a toilet roll as it gets unrolled
  • a swing band consisting of shrimps playing guitar, keytar, trumpet and saxophone, as they swing, dance, or spin around.
  • a hand cranking the mechanism; you can see the wooden tumblers go up and down as it spins. A lid with pyramids on top, above the mechanism, has the text "what is hidden under the pyramids?". As another hand lifts the lid, a blue, a green, and a grey alien are revealed, playing some kind of brass wind instrument, turntables, and a keytar.
  • a frog with top hat, riding a galloping snail
  • a skeleton dancing
  • a skeleton with red hearts for eyes, dropping its jaw
  • a shrimp behind a wooden table with a cutting board, chopping orange and grey foods.
  • 3 beans with faces and limbs, dancing on top of a can of beans; one has a boombox, while another spins on its head.
  • a wooden kitchen table, full with food, with mice swinging and dancing around with the food.
  • 3 pumpkins; one playing a drum, another a harmonica, and the third a saxophone
  • a cat with a blue wizard hat, stirring a cauldron of green brew.
  • a dog riding a small red car
  • A frog with a top hat riding a unicycle on top of a roll of toilet paper being unrolled in its holder.
  • a shrimp in a suit, riding a skateboard over a treadmill, hopping over shells, fish, and other things.
  • a beaver and a hedgehog, clapping their paws.
  • a bunch of pasta dancing at a rave: rave-ioli. There's a purple turntable and big speakers.

#MediaDescription #AltText #DescribedForYou
CC @Natasha_Jay

@Chris Another good question is what the Fediverse (Mastodon specifically) expects in an alt-text for a video. A summary which should probably go outside the video? Or a visual description of what's shown in the video, just like an alt-text for an image, but for moving and constantly changing visuals and maybe even time-coded?

#AltText #AltTextMeta #CWAltTextMeta #VideoDescription #VideoDescriptions #MediaDescription #MediaDescriptions
Netzgemeinde/Hubzilla

@iFixit and it doesn't look like you can attach documents to posts
You can't on Mastodon. I could, both here on Hubzilla and on (streams) where I post my images.

But I wouldn't have to. Vanilla Mastodon has a character limit of 500. Hubzilla has a character "limit" that's so staggeringly high that nobody knows how high it is because it doesn't matter. (streams), from the same creator and the same software family as Hubzilla, has a character "limit" of over 24,000,000 which is not an arbitrary design decision but simply the size of the database field.

By the way: Both are in the Fediverse, and both are federated with Mastodon, so Mastodon's "all media must have accurate and sufficiently detailed descriptions" rule applies there as well unless you don't care if thousands upon thousands of Mastodon users block you for not supplying image and media descriptions.

In theory, I could publish a video of ten minutes, and in the same post, I could add a full, timestamped description that takes several hours to read. Verbatim transcript of all spoken words. Detailed description of the visuals where "detailed" means "as detailed as Mastodon loves its alt-texts" as in "800 characters of alt-text or more for a close-up of a single flower in front of a blurry background" detailed. Detailed description of all camera movements and cuts. Description of non-spoken-word noises. All timestamped, probably with over a hundred timestamps for the whole description of ten minutes of video.
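For what it's worth, a timestamped description like that could in principle be written as a WebVTT file, the W3C caption format that HTML video players also accept as a "descriptions" track. The cue texts below are invented examples, not taken from any real video:

```text
WEBVTT

00:00:00.000 --> 00:00:04.500
A wooden shrimp rides a unicycle along a toilet roll as it unrolls.

00:00:04.500 --> 00:00:09.000
A hand cranks the mechanism; wooden tumblers clatter up and down.
```

Whether any Fediverse software would actually deliver such a track alongside a video is another question; as far as I know, Mastodon only offers a plain-text description field per media attachment.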

Now I'm wondering if that could be helpful or actually required, or if it's overkill and actually a hindrance.

CC: @masukomi @GunChleoc

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #Mastodon #Hubzilla #Streams #(streams) #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #MediaDescription #MediaDescriptions
Hubzilla - Join the Fediverse

@masukomi @iFixit And this is only mostly a transcript of the spoken words.

What if someone actually took upon themselves the effort to describe a video with a timestamped/timecoded combination of visual description, spoken word transcript and non-spoken word audio description? Especially if the visual description is on the same high level of detail that's expected in the Fediverse?

CC: @GunChleoc

#FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #MediaDescription #MediaDescriptions
Netzgemeinde/Hubzilla

I have a question regarding #accessibility and #AltText / #ImageDescription / #MediaDescription.

I need to come up with an image description for the attached image for my job. I'm at a complete loss. This is alt text nightmare mode.

Are there established methods for describing complex figures such as this one?

I would be very happy about some input from people who actually need image descriptions.

@Christopher M0YNG Just out of curiosity: What would be an appropriate textual description for a video?

A description of what the video is about?

Or a detailed, time-coded description of the actual visuals throughout the whole video plus a detailed, time-coded transcript of the audio in the video?

If the latter, what details are required, regardless of topic and content?

CC: @Stefan Bohacek

#VideoDescription #VideoDescriptions #MediaDescription #MediaDescriptions
Netzgemeinde/Hubzilla

I'm thinking about adding at least one of my channels to Trunk. I mean, it isn't like I don't have enough followers; they've risen above 500 again. But Trunk would help people follow me for a better reason than just one cool post or comment, without them having to figure out how to check my profile.

That said, Trunk requires you to volunteer on at least one list, in at least one topic. That's where things get difficult.


For one, there's Described Media. I'm not even kidding: It's a list for people who describe the media which they post. People who add alt-text to their images. Even though everybody in the Fediverse is expected to do it all the time, at least if their posts reach Mastodon in some way.

I do it. But I don't do it "the standard Mastodon way". For one, Mastodon's limitations, especially the 500-character limit for posts, don't apply to me. I don't have any character limit in my posts. Thus, nothing forces me to describe and even explain an image only in alt-text because I've got plenty of space in my posts.

Besides, my images require absolutely massive image descriptions, especially taking all those typical image description guidelines into consideration. That's because none of them are prepared for the edge-cases that are my images. And with "absolutely massive", I don't mean, "800 characters? Are you nuts?! Who's gonna read that?!?" I mean up to over 60,000 characters, and I can guarantee you this is not a typo. Maybe even more in the future.

I'm not quite convinced that I'm a good example of a provider of media descriptions, partly because by adhering to general image description rules, I break most of Mastodon's image description rules, partly because next to nobody has the patience to read one image description that's longer than 120 toots or have it read to them by a screen reader, partly also because my own image descriptions become obsolete so quickly whenever I discover something new that I should do in image descriptions.

Even if none of this mattered, I don't post images often. Maybe once every couple months. That's because I have to schedule my image posts due to how much time they consume. The 60,000-character description took me two full days to research and write, breakfast to after dinner. And it might become even rarer in the future. I've started a dedicated (streams) channel to be able to post images with sensitive content, including but not limited to eyes and faces. But posting these will eat up the time I could also use to post perfectly safe images on this Hubzilla channel.

The Described Media list is rather for people who routinely whip up 200 characters of alt-text in under a minute or so, but who do so at least daily.


An even more obvious list, at least at first glance, would be 3D Virtual & Augmented Reality, seeing as the primary topic of this channel is OpenSim. In fact, in the long run, I could add two or three channels to this list.

But OpenSim does not fit on it. The list is for actual virtual reality, for new virtual reality and augmented reality developments of the 2020s. "The Metaverse" as envisioned by most. It absolutely requires VR or AR headsets, full stop.

OpenSim has been using the term "metaverse" routinely since as early as 2007, the year of its inception. But the list is not about "metaverse". It's about VR.

And OpenSim is what's commonly called a "pancake". It's made for desktop and laptop computers and their 2-D screens. It does not really work on VR headsets. It does not work on stand-alone VR headsets with integrated graphics hardware at all. That's mainly because VR headsets require a constantly guaranteed frame rate of 60fps. It isn't simplified and cartoonish and geared towards mobile graphics hardware like Horizons or Rec Room or the like. Instead, it's largely photo-realistic, high-detail stuff with high-resolution textures.

You may get 60fps out of a dedicated graphics unit on a not-too-highly-detailed sim when you're alone. But have more than a few avatars around, and your fps will drop below 60. Join a party or any other event with a couple dozen avatars, and you're heading for slideshow-level fps. That's because the avatars aren't made by the OpenSim devs and optimised for high performance. They consist almost entirely of user-supplied content, optimised for good looks rather than performance. Some two years ago, one average avatar had more vertices than an entire scene in World of Warcraft. They've only gotten much, much more complex since then.

A liquid-cooled 4090Ti overclocked to kingdom come won't give you 60fps at 1080p at OSgrid's Event Plaza on a Friday night. So, what chances does a stand-alone, passively-cooled headset based on phone hardware have if it has to whip up even more pixels? And none of this is even taking recently-introduced Physically-Based Rendering into account which absolutely requires dedicated graphics hardware with no less than 4GB of dedicated VRAM, preferably at least 8GB.

That is, you couldn't use OpenSim on a stand-alone headset anyway. There are only two OpenSim-compatible viewers available right now; they're only available for desktop operating systems, and their highly complex UIs (pull-down menus like you've last seen in Photoshop etc.) are entirely geared towards desktop and laptop computers.

In brief: OpenSim is not VR, and it's unlikely to ever truly become VR.

Okay, I still have the option to ask one of the four Trunk admins to add an extra "Virtual Worlds" list, arguing that OpenSim, just like Second Life, is not VR and thus doesn't fit onto a VR & AR list. But they might argue that it's close enough to VR & AR for a separate list not being justified.

#Long #LongPost #CWLong #CWLongPost #FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta #MediaDescription #MediaDescriptions #AltText #AltTextMeta #CWAltTextMeta #ImageDescription #ImageDescriptions #ImageDescriptionMeta #CWImageDescriptionMeta #OpenSim #OpenSimulator #Metaverse #VirtualWorlds #VR #VirtualReality #AR #AugmentedReality #Trunk
Trunk for the Fediverse

@Robert Kingett, blind If at all, I would have to do it all myself with no technical aids involved. Nobody would be able to help me with it.

After all, I'm not talking about videos shot in real life with a video camera, especially not scripted ones.

I'm talking about unscripted, spontaneously produced video captures from very obscure 3-D virtual worlds. In order to describe these videos properly, extensive detail knowledge about this super-niche topic, its technology and its culture would be absolutely mandatory, and it would have to be up-to-date by hours at most. This detail knowledge is also necessary to be able to judge what has to be explained and described.

Also, it'd be impossible to properly describe these videos just by watching them. They can only be described by logging into that world, teleporting to the place where the video starts and then looking at everything that's shown in the video from up close, sometimes even from different camera angles than in the video itself.

#Long #LongPost #CWLong #CWLongPost #MediaDescription #MediaDescriptions #AudioDescription #AudioDescriptions
Netzgemeinde/Hubzilla

@Robert Kingett, blind I don't trust anything generated. At least not with super-obscure niche content like what I post.

And audio descriptions in general are why I'll never publish videos in the Fediverse.

I'd have to go into similar detail as for my pictures, only for moving pictures plus sound plus voice-over now. My descriptions would have to be so detailed that the video would have to pause to let the audio description catch up with the visuals. In fact, the video would spend more time paused while the audio description is rambling than actually moving, and it would never spend more than a few seconds moving at a time.

For one, I would have to describe and explain what the video shows at the very same level of detail as I describe my images. And at least once, I've described a single image at such a level of detail that it'd probably take a screen reader one full hour to read the image description aloud.

Besides, I would have to take into account that it's a video. Everything would need timestamps. And instead of only describing the camera position and the camera angle, I would have to describe the camera movements like so:

Seven minutes, eighteen point one three seconds: The camera quickly rotates to the left around a vertical axis through a point roughly two point four metres straight ahead of the avatar. It starts rotating from the direction in which the avatar is facing, roughly twelve degrees to the east of north. The barn which has first appeared at five minutes, fifty-two point two eight seconds comes into view again, including all decoration around it. The camera only rotates around this vertical axis and not around any horizontal axis. The avatar does not rotate with the camera.

Seven minutes, eighteen point six four seconds: The video pauses to let this description catch up.

Seven minutes, eighteen point seven one seconds: The video no longer pauses. The camera reaches a rotation angle of roughly twenty degrees to the south of west. The rotation speed of the camera slows down. It continues to rotate to the left.

Seven minutes, eighteen point nine three seconds: The video pauses to let this description catch up.

Seven minutes, nineteen point zero four seconds: The video no longer pauses. The camera stops rotating at an angle of roughly twenty-five degrees to the west of south.


That is, in order to cater to deaf-blind users, I would have to have two time codes. One, the time code of the original video, not taking the pauses into account. Two, the time code of the described video with catch-up pauses.

And the video with catch-up pauses would be dramatically longer than the original video. Ten minutes of video would take me weeks to describe, probably over a month. And it would end up many hours long, depending on how much there is to describe and explain.

So a time code in the Braille description for deaf-blind users might actually read, "Six minutes, thirty-seven point five five seconds in the original video, fourteen hours, three minutes, forty-nine point two one seconds in this described version of the video."
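At least the double bookkeeping between the two time codes is mechanical, so it could be automated. A minimal sketch in Python; the function name is invented, and the pause values are made-up figures loosely based on the example above:

```python
def described_time(original_s, pauses):
    """Map a timestamp in the original video to the corresponding
    timestamp in the described version, given catch-up pauses as
    (original_time_s, pause_length_s) tuples."""
    # Every pause that starts at or before this point shifts the
    # described-version clock forward by its length.
    offset = sum(length for start, length in pauses if start <= original_s)
    return original_s + offset

# Hypothetical pauses: 0.07 s at 7:18.64 and 0.11 s at 7:18.93.
pauses = [(438.64, 0.07), (438.93, 0.11)]
print(round(described_time(439.04, pauses), 2))  # → 439.22
```

With real descriptions there would be hundreds of such pauses, but the mapping itself stays this simple, so the Braille time codes for both versions could be generated from one list.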

By the way, no, an AI can't do that.

#Long #LongPost #CWLong #CWLongPost #MediaDescription #MediaDescriptions #AudioDescription #AudioDescriptions
Netzgemeinde/Hubzilla

@obrhoff @BlippyTheWonderSlug there's no #AltText, and since I usually use the :fediverse: on the go in #EDGEland, I don't load media by default, so an #ImageDescription / #MediaDescription would be nice.

Plus I do have blind and deaf followers so #accessibility is quite important.