HEIC & its consequences have been a disaster for the human race

@gianni

Because it's patent-encumbered?

@RL_Dane
Among other things, yes.

@gianni

I'd be curious to know the other things 😁

I've started using WebP where possible.

I wish my phone would snap pictures in that format directly.

JPEG is starting to get AARP flyers in the mail. 😂

@RL_Dane
WebP is often worse than JPEG, especially for photographic images. It is better at low fidelity, about even at medium, & worse at high fidelity. It is decent with non-photographic images, though. This surprised me initially, as it goes against the whole purpose of WebP.

Lossless WebP, on the other hand, is fantastic. There hasn't been a better lossless format until lossless JXL.

@gianni

That's really fascinating, thanks for sharing. I've been interested in codecs for many, many years (though I'm not terribly well versed in the subject) -- initially general-purpose ones like StuffIt/ARC/ARJ/ZIP/Compactor Pro/compress/gzip/bzip2, etc., but later JPEG fascinated me with its magical (for the time) powers.

I'm surprised that JPEG XL was still using DCT. I thought all the cool (video codec) kids had moved on to wavelets.

@RL_Dane You'd have to ask Jon about that. I know JPEG2000 used the DWT, but VarDCT is unique to JXL. I'll ping him here in case he's around
@wb
@gianni @RL_Dane Basically all modern codecs (h264, h265, av1, jxl,...) still use the DCT in one way or another. JPEG 2000 is an outlier. The DCT is still the best idea ever for lossy image compression. JXL extends JPEG by having multiple block sizes (not just 8x8 but also e.g. 16x32 and 8x4) and has many other improvements, generalizations and extensions, but the DCT is still one of the most important core coding tools.
@gianni @RL_Dane
VarDCT, or rather the general idea of variable block sizes, is not unique to JXL: most modern video codecs also have something like that. The way JXL does it is more flexible than in video codecs though (more block types and positioning options, fancier entropy coding of the block selection itself), and it is also the first codec to combine variable blocks with progressive decoding (which is not trivial).
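To make the DCT idea concrete, here's a toy sketch in Python (purely illustrative -- not JPEG's or JXL's actual pipeline) of what happens inside one block: transform, quantize the coefficients (the only lossy step here), then invert. A smooth block ends up represented by just a handful of nonzero coefficients.

```python
import math

N = 8  # JPEG's block size; JXL also allows other sizes, e.g. 4x8 up to 32x32

def dct_1d(v):
    """Orthonormal type-II DCT."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

def idct_1d(c):
    """Inverse (type-III DCT with matching scaling)."""
    n = len(c)
    return [
        sum(
            (math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n))
            * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(n)
        )
        for i in range(n)
    ]

def transpose(m):
    return [list(r) for r in zip(*m)]

def apply_2d(f, block):
    """Apply a 1D transform to the rows, then to the columns."""
    return transpose([f(c) for c in transpose([f(r) for r in block])])

# A smooth gradient block, like typical photographic content.
block = [[float(x + y) for x in range(N)] for y in range(N)]

coeffs = apply_2d(dct_1d, block)
step = 4  # uniform quantizer step -- the only lossy operation here
quantized = [[round(c / step) for c in row] for row in coeffs]
recon = apply_2d(idct_1d, [[q * step for q in row] for row in quantized])

nonzero = sum(1 for row in quantized for q in row if q != 0)
err = max(abs(recon[y][x] - block[y][x]) for y in range(N) for x in range(N))
print(f"nonzero coefficients: {nonzero}/64, max reconstruction error: {err:.2f}")
```

Real codecs add perceptually tuned quantization tables, coefficient ordering, and entropy coding on top, but this transform-quantize-invert loop is the core of every DCT-based format.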

@wb @gianni

Fascinating, thank you for taking the time to answer.

So I'm guessing the codec analyses the image and decides on the unique mosaic of block sizes, or is it just one block size per image?

If almost everything is still DCT, why don't newer algorithms have the kind of chunky artifacts as old JPEG? They just seem to get blurrier, not as artifact-y.

@RL_Dane @gianni
JPEG treats 8x8 blocks independently, causing blockiness. More modern codecs apply deblocking filters after inverse DCT. Essentially these are doing some kind of (selective) blur that gets rid of the block edges. Video codecs tend to do it quite aggressively to keep things smooth even at very low bitrates, which can lead to loss of detail and texture.
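A toy illustration of the deblocking idea (a sketch, not any real codec's filter): if the jump across a block boundary is small, it's probably a quantization artifact, so smooth it away; if it's large, it's probably a real edge, so leave it alone.

```python
# Two neighboring 8-sample "blocks" whose DC levels were quantized
# differently, leaving a small step at the boundary (between index 7 and 8).
samples = [100.0] * 8 + [104.0] * 8

def deblock(s, boundary, threshold=8.0):
    """Toy deblocking filter: smooth a couple of samples on each side of a
    block boundary, but only if the jump there is small enough to look like
    a quantization artifact rather than a real edge."""
    out = list(s)
    jump = abs(s[boundary] - s[boundary - 1])
    if jump < threshold:  # a large jump is probably a real edge: leave it
        for i in range(boundary - 2, boundary + 2):
            out[i] = (s[i - 1] + 2 * s[i] + s[i + 1]) / 4
    return out

filtered = deblock(samples, 8)
step_before = abs(samples[8] - samples[7])   # 4.0
step_after = abs(filtered[8] - filtered[7])  # 2.0 -- the seam is softened
print(step_before, step_after)

# A genuine edge (big jump) passes through untouched:
edge = [0.0] * 8 + [50.0] * 8
assert deblock(edge, 8) == edge
```

The "aggressive vs. subtle" trade-off mentioned above corresponds roughly to how wide the smoothing reaches and how permissive the threshold is.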

@wb @gianni

Interesting! I didn't realize the modern codecs were just "covering up" the "edges," so to speak.

Can you elaborate a little bit about what you mean by JPEG treating the blocks "independently?"

Are the newer codecs applying some kind of averaging across blocks, or making the blocks overlap or something?

@RL_Dane @gianni
In modern codecs, you have deblocking filters, sometimes overlapping transforms, and directional prediction which cause dependencies between blocks. One issue with that is that it causes generation loss (accumulated artifacts after repeated lossy recompression) to spread further. In JPEG, "what happens inside the 8x8 block stays within the 8x8 block" (only exception: chroma subsampling/upsampling).
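The block-independence point can be sketched in a few lines (a toy model, not real JPEG): requantizing already-quantized data is a no-op, so a codec whose blocks are independent converges to a fixed point under repeated recompression, while any cross-block mixing between generations keeps introducing new error.

```python
def quantize(coeffs, step=4):
    """Round each coefficient to the nearest multiple of the step size."""
    return [round(c / step) * step for c in coeffs]

coeffs = [13.7, -5.2, 0.9, 27.0]
gen1 = quantize(coeffs)
gen2 = quantize(gen1)
# Requantizing already-quantized values changes nothing: with independent
# blocks, recompression hits a fixed point and generation loss stops.
assert gen1 == gen2

def smooth(v):
    """Stand-in for a filter that mixes neighboring values between generations."""
    return [v[0]] + [(v[i - 1] + 2 * v[i] + v[i + 1]) / 4
                     for i in range(1, len(v) - 1)] + [v[-1]]

# With smoothing in the loop, each generation lands on different values,
# so error keeps accumulating instead of stabilizing.
drift1 = quantize(smooth(gen1))
drift2 = quantize(smooth(drift1))
print(gen1, drift1, drift2)
```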
See also: https://youtu.be/FtSWpw7zNkI
Generation Loss: JPEG, WebP, JPEG XL, AVIF


@wb @RL_Dane Does JXL not have any directional prediction or shared data between blocks? How does it handle generation loss so well?

I assumed the VarDCT built on every coding technique that previous DCT variants used, although I’m not well versed in what those were & I may be off base

@gianni @RL_Dane JXL indeed does not use directional prediction, since it turned out to be mostly useful at very low qualities, not really at the qualities typically used for still images. So it was not included in the spec. Also jxl has more subtle filters (video codecs have more aggressive ones), and defines things at high precision while video codecs use limited precision which is more hardware-friendly (since they need everything to be implementable in hardware).
@wb @RL_Dane That's really interesting! It also seems like an easy way to hold video-based image formats back, if there are places where they can never improve because everything needs to be hardware-compatible (and IIRC, hardware decoding isn't even used for video-based image codecs).
@gianni @RL_Dane
Hardware decodability certainly does constrain bitstream design a lot. For jxl we might at some point define a profile for a limited subset that is suitable for hw decode, but for most use cases of still images, hw decoding is not needed or useful. Hardware encoders can always just use only a suitable subset of the spec, but decoders need to handle anything, and in video codecs, hw decoding is essential. This is one of the main differences between video and image codec design.

@wb @gianni

I vaguely remember seeing HW JPEG acceleration in years past (when it was still computationally quite costly for the CPUs of the era). Also, I wonder if digital cameras of the 2000s used any kind of hardware encoding for JPEG, or if they just had an optimized software stack running on their custom ARM CPUs.

...

@wb @gianni

...

Just remembered working with a fairly early Kodak digital camera in the year 2k that was so slow that the image on the "viewfinder" lcd updated the RGB channels on progressive scans, so fast-moving objects on the preview screen were pretty trippy-looking. ;)
Hah

@RL_Dane
I think smartphone ISPs (image signal processors) use hardware encoding for JPEG right now - not entirely sure though. They definitely do for HEIC if you're on an iPhone.

The early dawn of JPEG hardware was admittedly before my time, but hearing what you're saying, that sounds super interesting. I was born into a world where 'jpeg' and 'image' were basically already synonymous
@wb

@gianni @wb

Yeah, early JPEG was pretty exciting. To my knowledge, it was the first GOOD lossy algorithm of any kind -- QuickTime had lossy video codecs, but they were super simple: selectively updating the screen in rather large blocks, with no complex math or DCT AFAIK.
There were lossy audio codecs for voice, but they were pretty rudimentary. Nothing good for full-spectrum audio (music) until MP3.

...

@gianni @wb

...

I remember waiting *minutes* to decode a small JPEG on my 8 MHz Mac SE. And of course, trying to compress 1-bit images was a miserable experience, so I didn't use it much until I finally got a color machine in '94 ;)

@RL_Dane
That sounds like an incredible experience. Looking back, DCT-based coding seems to be what truly revolutionized lossy multimedia compression, and the way it happened was really interesting! Obviously I wasn't there so I can't say much, but that's really cool
@wb

@RL_Dane @gianni JPEG basically made digital photography possible — early CompactFlash cards would have capacities like 2 to 15 MB, so without lossy compression it wouldn't be very practical (you would only be able to store 1 or a few photos instead of dozens).

It also made network transfer of photos possible — I remember before JPEG (and GIF) we would use blocky ANSI art to get something graphical on our BBS (this was before the web took off).

By comparison, JXL is only a small improvement.

@wb
I've only ever heard about JPEG's origins. Experiencing them - and helping cultivate them - must have been a uniquely incredible experience. I can only imagine seeing an innovation that I was a part of completely change the world like JPEG did.
@RL_Dane

@wb @gianni

Yes, I remember waiting quite a while for GIFs to load -- even progressive (interlaced?? I forget the right term) GIFs weren't that much help.

I wish I still had it, but someone took a "digital" photo of me way back in 1992 with a Kodak Xapshot camera at a Mac Users' Group meeting in Austin.

...

@wb @gianni

...

The camera wasn't actually digital, it recorded onto a video disc, but it got digitized when you connected it to your computer for "upload." IIRC, you actually needed a video capture card to digitize the photos, but I can't remember for sure. ^___^

@gianni @wb

Isn't HEIC basically just still-image-h264/5?
If so, the video encoding hardware could be utilized for sure.

@RL_Dane
It's an h265 I-frame, and I believe video encoding hardware is used. I don't own an iPhone so I haven't looked into it too much.
@wb

@gianni @wb

I have a batch_de-heic script that calls #ImageMagick `convert` to deal with pictures from others' iPhones 😆

I might see if convert can handle JXL. File size is definitely more important than quality for the purposes of these particular photos.

@RL_Dane
Because smartphone photos look like trash (being a bit cheeky here) most of the time, I usually resize them and then convert. ImageMagick does work with JXL if it's compiled with support for it, like the imagemagick-full package in the AUR. I usually compress smartphone pics to AVIF because I'm usually targeting medium/low-medium fidelity
@wb
@gianni @RL_Dane Yes, that's correct. Apple devices also do the encoding in independent tiles of 512x512 pixels, probably to keep the hardware footprint smaller. This is not so great because it leads to visible seams at the tile boundaries (though this is a relatively subtle artifact at high qualities).
@wb
Considering smartphone photos are generally smoothed out oversharpened crap, I'm willing to write the entire sector's decisions off as nonsensical when it comes to image encoding. At least in the Android world, JPEG still largely reigns supreme.
@RL_Dane
@RL_Dane @gianni I think most cameras (including phones) still use hardware jpeg encoding even today. Phones could easily do it in software instead, but the hardware for jpeg encoding is cheap anyway (no royalties, small gatecount footprint). Helps to save battery and allows doing burst photography. Encoders don't have to support the full codec though (unlike decoders), they just need to produce valid bitstreams. JPEG hw encoders e.g. use only baseline jpeg with fixed huffman tables.
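To illustrate why fixed tables make hardware encoders simpler, here's a toy sketch in Python (illustrative only -- real baseline JPEG codes run-length/size categories, not letters): a table tailored to the image needs a statistics pass first, while a baked-in table works in a single pass at a small cost in bits.

```python
import heapq
from collections import Counter

def huffman_lengths(freqs):
    """Code lengths from a Huffman tree built for these exact frequencies."""
    heap = [(f, i, [s]) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, syms1 = heapq.heappop(heap)
        f2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1  # every merge pushes these symbols one level deeper
        heapq.heappush(heap, (f1 + f2, tiebreak, syms1 + syms2))
        tiebreak += 1
    return lengths

# Symbol stream standing in for quantized-coefficient categories.
stream = list("AAAAAABBBCCD")

# Two-pass encoder: gather statistics, then build a tailored table.
optimal = huffman_lengths(Counter(stream))

# One-pass hardware encoder: a fixed table baked into silicon.
fixed = {"A": 2, "B": 2, "C": 2, "D": 2}

bits_optimal = sum(optimal[s] for s in stream)
bits_fixed = sum(fixed[s] for s in stream)
print(bits_optimal, bits_fixed)  # the fixed table costs a few extra bits
```

The fixed table is slightly worse per image, but it removes the buffering and second pass an optimized table would require -- which is exactly the trade a cheap, fast hardware encoder wants.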

@wb @gianni

Ahh, that makes sense. I wouldn't have thought that encoders would be more economical (because of increased complexity of encoding vs decoding), but the point about not having to support such a wide range of possible algorithms makes a lot of sense.

That also explains why cameras record to relatively high bitrates -- not just for best possible quality, but possibly also so that the encoding can be a bit simpler/"faster."

@RL_Dane
Precisely. Only implementing a subset of a codec's possible features to help lower costs & increase speeds is a no-brainer for a hardware encoder, especially when speed is usually the most important factor. You're right about high bitrate too, fancy lossy coding techniques for smaller filesize aren't as important when filesize is secondary to speed & fidelity
@wb

@gianni @wb

That definitely explains why a phone can encode h265 in realtime while really aggressive settings on an hour-long video can take a day to transcode on even a fast CPU. :D

@RL_Dane
Yeah, and that h265 video will probably only look about as good as h264 medium or fast, is my guess.
@wb
@RL_Dane @gianni Encoding can be more complicated than decoding (if you want to do perceptual modeling and compress well), or it can be simpler than decoding (if you just do something simple). Slow software encoding will produce better results than fast software encoding or hardware encoding. This is why I don't believe hardware will help much to make avif encode faster: sure hw avif encode will be fast, but it will compress worse than software jpeg encode, so I don't really see the point...
@wb
I think for video, hardware AV1 encoding couldn't make more sense - my guess is that's what RL_Dane was referring to with "high bitrates." Hardware AVIF, on the other hand, would probably not be a very worthy endeavor, unless you need higher bit depth or HDR
@RL_Dane