
…BC people asked:

This video is about the recent rise of audio interfaces that can record directly in 32-bit IEEE float instead of the standard 16-bit integer audio [or high-end-audio-production 24-bit integer].

This is the best video I know about the subject, talking about the Zoom UAC-232 and how it probably works.

https://www.youtube.com/watch?v=s0g0XXm9XJk

This is the thumbnailed video. I can't vouch for it.

https://www.youtube.com/watch?v=0Wk_VPEi8Z8

The 232 does appear to have problems with software support…you need Reaper

Zoom UAC-232 - 32 Bit float - USB Audio Interface - Review

Laaaaa la laaaa laaaa laaa
@mcc sung to the theme of Katamari Damacy?
@mcc I can hear the cowbell.
@beej I'm glad. That was my hope
@mcc Blue Öyster Cult?
@mcc my beloved REAPER

@mcc

Not worried at all. Unafraid, even.

@mcc "Never clip again"? Clipping will happen when the input to the analog-to-digital converter (ADC) maxes out. Doesn't matter what format the ADC spits out, after that point.
@hyc This particular ADC appears to have an *absurdly* high end to its dynamic range, high enough you'd not actually hit it in practice with even production audio gear

@mcc I'm reminded of design notes for Forth in the 1980s - the rationale for only supporting 16 bit ints. Because none of the real world sensors and actuators you'd ever interface with had anywhere near a full 16 bits of dynamic range.

After you capture this blazing sound, what DAC is going to play it back? What amplifier is going to play it back without destroying your speakers?

@hyc As you'd maybe know if you watched the Julian Krause video, this is intended to be used for professional audio production. In other words, it is assumed that you are recording and then applying a post-production step where you rebalance the sound. The advantage is that you can record and then adjust the sound afterward, instead of knowing how loud the thing is, setting a gain and recording. It potentially saves time. It's probably more for measuring very quiet things than very loud.
@hyc However, you're right insofar as *unless you're lazy*, this offers few advantages over 24-bit and some concrete disadvantages (in the form of limitations on your software workflow)

@mcc @hyc

I have not watched this video, but: A lot of music production software (DAWs and their plugins) uses 32-bit floating-point arithmetic internally, so the added step of data coming in as f32 doesn't seem that weird. That said, I've never found the dynamic range of 24 bits lacking.

@hyc @mcc Maybe watch the video to see it in action, it does work and has some merit.

I think the way it works is by employing multiple ADCs each set at different gain levels to make sure some part of the signal is always captured, and then encoding the end result in a 32-bit float value so it can be boosted or attenuated without maxing it out (hence the "no clipping").

The only practical use of this is to make it possible to salvage a recording after the fact if too much gain was applied.
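The multi-converter idea described above can be sketched roughly as follows. This is a toy model, not how the UAC-232 is known to work: the two gain settings, the 0.9 switch-over threshold, and the names `adc_24bit` and `merge_paths` are all assumptions made for illustration, and a real device would have to blend the paths far more carefully than a hard switch.

```python
FULL_SCALE = 2**23 - 1  # largest positive code of a hypothetical 24-bit ADC


def adc_24bit(voltage, gain):
    """Model one converter path: apply analog gain, then quantize and clip to 24 bits."""
    code = round(voltage * gain * FULL_SCALE)
    return max(-FULL_SCALE - 1, min(FULL_SCALE, code))


def merge_paths(voltage, high_gain=1.0, low_gain=1.0 / 256, threshold=0.9):
    """Use the high-gain path unless it is near clipping, then undo the gain.

    The result is a float where 1.0 is full scale of the high-gain path
    (gain values and threshold are made up for this sketch).
    """
    hi = adc_24bit(voltage, high_gain)
    if abs(hi) < threshold * FULL_SCALE:
        return hi / FULL_SCALE / high_gain
    lo = adc_24bit(voltage, low_gain)
    return lo / FULL_SCALE / low_gain


print(merge_paths(0.5))    # quiet signal: taken from the high-gain path
print(merge_paths(100.0))  # hot signal: taken from the low-gain path
print(merge_paths(300.0))  # beyond both paths: still clips, just much later
```

In this model the ceiling moves from 1.0 up to 256.0 rather than disappearing, which is the sense in which "no clipping" is an overstatement.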

@Tijn @hyc @mcc I mean, that would be somewhat like HDR photography (multiple exposure lengths to reconstruct radiance without risk of clipping in either direction, then possibly postprocess to employ the extra info)

@cvtsi2sd @hyc @mcc Yeah, it is a bit like that. But instead of making different exposure levels visible at once, like in HDR, this simply helps you to turn down the gain of your audio signal if it turns out it was recorded too hot.

So it's useful, but only in a very specific scenario that can be somewhat easily avoided anyway by simply not turning the gain up too high.

@Tijn @hyc @mcc yep I meant the "front-end" part of HDR, without spatial tone-mapping (I guess that would correspond somehow to applying compression to the captured track when mastering?). I do think it's a nice feature, though, avoids a common failure mode, especially in less-than-ideal live recording conditions.
@hyc @mcc the idea is to change the gain down (or up, if recording low volume sounds) in a daw afterwards. you're not gonna clip or lose any sounds, because the recording has such a high dynamic range. this makes it so you don't have to mess about with a recording device's gain before recording (the new Zoom field recorders like the H1e don't even have a gain knob for the microphone, because it isn't supposed to be necessary.)
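A small sketch of what "turn it down afterwards" buys you, assuming the analog front end really did capture the over-range signal. The 4x overshoot and the helper names are invented for the example; `to_float32` just rounds Python's doubles to single precision so the numbers match what a 32-bit float file would hold.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def sine(amplitude, n=48):
    """One cycle of a sine wave; amplitude 1.0 is nominal full scale."""
    return [amplitude * math.sin(2 * math.pi * i / n) for i in range(n)]


def record_int16(samples):
    """Fixed-point capture: anything beyond full scale is clipped for good."""
    codes = [max(-32768, min(32767, round(s * 32767))) for s in samples]
    return [c / 32767 for c in codes]


def record_float32(samples):
    """Float capture: over-range samples are stored as values above 1.0."""
    return [to_float32(s) for s in samples]


hot = sine(4.0)  # roughly 12 dB hotter than nominal full scale

int_rescued = [s * 0.25 for s in record_int16(hot)]
float_rescued = [s * 0.25 for s in record_float32(hot)]

print(max(int_rescued))    # 0.25: the flat top got quieter, not rounder
print(max(float_rescued))  # close to 1.0: the original peak is back
```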

- posted by Lumi
@sorrowl @mcc ok, I suppose it might be handy if you have to mix a couple hundred channels together or whatnot.
I think there are even DAPs (and DACs too) which will play back such audio? I see some DACs sold on aliexpress for under $500 USD which use the AK4493 chipset which can do 32bit 768KHz PCM and even 512DSD (I have no idea what records that, since even Sony discontinued their DSD recorders, the ADC chipset in the Zoom UAC-232 only supports 256DSD as an example) playback supposedly.

But then, I already own a portable recorder which is capable of 32bit 384KHz recording, so Zoom's non portable UAC-232 which does 32bit 192KHz recording, doesn't seem too impressive to me personally.

It is at least cheap? (Maybe around $199 it seems from what I can find online) but higher bit rate and sample rate recorders (and playback systems) exist. They tend to cost a lot more than $200 though some are at least under $1000 USD. Going above 384KHz (for recording, not playback as playback devices are invariably more economical as far as I have observed) seems as if it was more than $2000 USD last I checked a couple of years ago, but presumably that will come down in time.

I agree though the "never clip again" thing is total BS.

Moreover, it's extremely rare still to find any place which sells lossless audio music that is better than 16 bit 44.1KHz.

Even Bandcamp which gets way too much love these days (especially after Epic bought them and fired everyone and basically pulled a complete dick anti-union move before selling them off again to Songtradr) has file size limitations.

So you'll encounter emails from Uwe Schmidt (behind various projects, e.g. Lassigue Bendthaus, Atom™, Señor Coconut, Stereonerds, etc.) apologizing for reducing the length of some of his higher bitrate recordings, so that they are available on Bandcamp in truncated form to fit into Bandcamp's file size limitations rather than not at all at higher bitrate. He has done the due diligence in his own recordings and I definitely hear a difference, but I've found I prefer to source his lossless audio more directly from his own website when possible as a result.

Most music producers, as far as I have observed, tend to use higher bit rate (even the marginal gains 24bit 96KHz as an example) for their "mastering" if at all since it gives them a little more wiggle room; but I still encounter some sticking to 16bit 44.1KHz and even huge commercial deejays have posted videos of themselves outputting shitty lossy MP3s to their USB sticks before going to play some sold out stadium shows, so I know that few actually give a crap about high bit rate lossless recording as nerdier audio sensitive types such as I do.

If I had to guess, whatever audio chip set (the https://www.akm.com/us/en/products/audio/audio-adc/ak5534vn/ by the looks of things thanks to Julian Krause's opening the device) Zoom is using in their UAC-232 just has 32bit 192KHz codecs as a given, so they might as well expose it to their users?

It's probably because even sourcing hardware audio chip sets that don't support such things in 2024 is increasingly rare.

If you look at the data sheet for the AK553x series ADCs, they claim to support up to 32bit 768KHz sampling, so while I am not a Zoom customer, I would be wanting to know why they only expose 192KHz to users and not 384KHz or 768KHz.

There's even 64bit audio recording stuff out there; but I haven't owned any such gear personally and even if it's worth it (to someone like me, it probably would be, but I can't afford it currently), I can only imagine that the recording industry and distribution channels (and most likely, a lot of DAWs) aren't ready for it given that most of them barely seem ready for 24bit 96KHz, still.

CC: @mcc@mastodon.social
AK5534VN | Audio A/D Converters | Audio Components | Products | Asahi Kasei Microdevices (AKM)

111dB 768kHz / 32-bit 4 ch Advanced Audio ADC The AK5534VN is a 32-bit, 768kHz sampling, differential input A / D converter for digital audio systems.

@mcc tbh it seems like a lot of faff to solve a problem that can also be solved by not turning the gain up so high when recording.
@Tijn @mcc right but counterpoint, the dial needs to be set to 11
@Tijn I think the zoom h1n "oops track" is also an okay solution maybe
@mcc I genuinely wonder if this is simply because IEEE float capable ADCs and DSP stacks have become cheap enough to throw in audio gear. ultimately the outlier samples cannot be represented in a speaker driver without dynamic range compression (your speaker cannot go beyond its excursion limits), and the analog front-end on the ADC is going to have input signal limits anyway, so if you dissect the ADC/AFE clip bounds into 2^24 chunks that's 24-bit PCM with the same result.
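To put numbers on the "2^24 chunks" framing, here is a minimal sketch of an ideal linear quantizer. The ±10 V clip bound is an arbitrary assumption, and the figures are for an ideal converter; real parts are limited by analog noise well before this.

```python
import math


def quantize(voltage, v_clip, bits=24):
    """Map a voltage in [-v_clip, +v_clip] to a signed integer code, clipping outside."""
    levels = 1 << (bits - 1)
    code = round(voltage / v_clip * levels)
    return max(-levels, min(levels - 1, code))


def step_size(v_clip, bits=24):
    """Voltage covered by one code when the clip bounds are split into 2**bits chunks."""
    return 2 * v_clip / (1 << bits)


def dynamic_range_db(bits):
    """Ratio of the full range to a single step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2**bits)


V_CLIP = 10.0  # hypothetical front-end clip bound, in volts

print(step_size(V_CLIP))        # roughly 1.2 microvolts per step
print(dynamic_range_db(16))     # about 96 dB
print(dynamic_range_db(24))     # about 144 dB
print(quantize(20.0, V_CLIP))   # over-range input pins at the top code
```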
@gsuberland Krause has an interesting theory about how it works. Is there actually a float capable ADC out there?

@mcc there are some fancy ADCs that do adaptive dynamic range, which you could use for floating point representation (essentially translate the adapted dynamic range level to an exponent and the value to a mantissa). they don't directly output IEEE floats; you'd do that in the DSP.

high performance FPADCs are a subject of ongoing engineering efforts in the test equipment world.
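The "range level becomes the exponent, value becomes the mantissa" translation could look something like this in the DSP. It is a sketch under assumptions: the front end is taken to attenuate in exact powers of two, `range_step` counts those halvings, and the function name is hypothetical.

```python
import math


def ranged_code_to_float(code, range_step, bits=24):
    """Turn a fixed-point ADC code plus its range setting into a float.

    code:       signed integer from the converter, -2**(bits-1) .. 2**(bits-1)-1
    range_step: number of times the front end halved its gain before converting
                (0 = most sensitive range); assumed to be exact powers of two
    The most sensitive range maps to -1.0 .. 1.0, and each step doubles that.
    """
    # ldexp(x, i) computes x * 2**i without any rounding of its own.
    return math.ldexp(code, range_step - (bits - 1))


print(ranged_code_to_float(2**22, 0))   # 0.5
print(ranged_code_to_float(2**22, 3))   # 4.0
print(ranged_code_to_float(-2**23, 0))  # -1.0
```

Because a code of up to 24 bits fits in the 24-bit significand of an IEEE single, this mapping loses nothing when the result is stored as a 32-bit float.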

@mcc the cheaper way to go is to stick an AGC IC on the frontend with really wide reference rails and read the attenuation level from that as an inverse scaling factor when translating the linear values. but that's a bit more fraught and honestly doesn't seem all that beneficial versus just having an ADC do 24-bit PCM across wider rails. or 32-bit if you want fairly pointless number increases.

@mcc @gsuberland

I don't know if there is now, but with clever circuit and FPGA design you can in theory do it with a fixed-point ADC + some very rapidly adjusting auto-gain circuits in front of it (giving you the exponent.)

10 years back my employer did some experiments on that for RF signals, but the designers could not get an auto-gain circuit to slew levels fast enough without distorting the fixed-point values while adjusting the incoming power.

But that was at multi-MHz sample rates.

@mcc @gsuberland

Silly PS: I never know whether it reads better to say "sample rate" or "sampling rate."

@gsuberland @mcc Companies don’t publish much about how they get to 32-bit float but Sound Devices and others appear to be using 2 or more 24 bit ADCs with different, but overlapping input gain levels and combining the results in software. https://patents.justia.com/patent/20160241252 This multi ADC switching is something the field recording community has observed as ultrasonic noise bursts. https://paquinsound.blogspot.com/2021/12/the-problem-with-19232bit-recordings.html
US Patent Application for High Dynamic Range Analog-to-Digital Conversion with Selective Regression Based Data Repair Patent Application (Application #20160241252 issued August 18, 2016) - Justia Patents Search

A multi-stage analog-to-digital conversion method and system use window functions and translation to match high gain frames of data to target frames of data. The technique selects window data packets for the output stream based the stage of data having the highest gain satisfying selection criteria, such as requiring a frame of data for the respective stage to satisfy a predetermined accuracy of fit value compared to a target frame of data for a zero gain stage.

@mcc 32-bit float for audio uses only the ±1 range. An audio interface that produces float samples will be made such that this range coincides with the limits of the ADC chip. Anything outside of that will be clipped by analogue circuitry.

Floating-point is useful in a processing chain since it lets you normalise the range at the end without worrying about clipping or loss of precision at intermediate stages. In an audio interface, it is quite pointless. Likewise for distribution.

@mansr I am working off a video here and not direct experience, but it appears that the Zoom 32-bit interfaces specifically are able to go outside the -1 to +1 range, which creates problems because a lot of audio software (and some entire Audio APIs?) are not equipped to support that.
@mcc That's just idiotic. As you say, a lot of software is ill prepared for that, and it still has to clip somewhere. The noise limit of any practical audio ADC is somewhere around 21 bits, so 32-bit float can easily represent the full signal range without going outside ±1.
@mansr If 1.0 is "normal", and the effective true high range is in the range of something like 256.0, then from an end user UX perspective it might be inconvenient if every single signal were coming in at 1/256 power and your first step in the *normal* case was to boost by 256x.
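For scale, the 256x figure used above (an illustrative number from the thread, not a measured spec) works out as follows. The sketch also checks that moving a float32 sample up or down by 256 is lossless, so where 1.0 sits is a convention and UX question rather than a precision one.

```python
import math
import struct


def to_float32(x):
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def headroom_db(ratio):
    """Amplitude ratio expressed in decibels."""
    return 20 * math.log10(ratio)


print(headroom_db(256))  # about 48 dB above the nominal 1.0 level

sample = to_float32(0.7)
scaled_down = to_float32(sample / 256)    # "everything arrives 256x quieter"
restored = to_float32(scaled_down * 256)  # "boost by 256x as a first step"

# Scaling by a power of two only changes the exponent, so nothing is lost.
print(restored == sample)  # True
```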
@mcc as a 90s PC i do indeed fear the 32-bit float
@VDoesntKnow *turns on flashlight under chin* fdiv
@mcc okay but I want to click on this video
@foo I will always boost this
@foo @mcc so weird how floats are automatically everywhere but rationals are almost never first class
@leon @foo @mcc Rationals are supported by most common languages. I use them when possible, as well as high precision math libraries. Check other languages in [https://en.wikipedia.org/wiki/Rational_data_type].
Rational data type - Wikipedia

@pedromj @foo @mcc yeah, they’re usually supported but they’re definitely not *first class*. If I type a literal 0.5 pretty much anywhere that isn’t going to be a rational, despite the fact that’s the least lossy representation, and I’m almost certainly only going to be able to use them with a small subset of functions.
@pedromj @foo @mcc like if I was designing a language I’d make you use an unsafe-style keyword escape hatch every time you wanted to force it to use a floating point, with compiler warnings explaining to them that it’s almost certainly not what they want because it almost certainly isn’t
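Python is one example of "supported but not first class": rationals live in the standard library's `fractions` module, a bare `0.5` literal is still a float, and many functions quietly hand back floats.

```python
import math
from fractions import Fraction

# Float literals are binary floating point, so decimal fractions are approximated.
print(0.1 + 0.2 == 0.3)  # False

# The same sum with rationals is exact.
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True

# You have to opt in explicitly; a string keeps the decimal value exact.
half = Fraction("0.5")
print(half == Fraction(1, 2))  # True

# Building from a float literal captures the float's approximation instead.
print(Fraction(0.1) == Fraction(1, 10))  # False

# Most of the math module drops back to floats.
root = math.sqrt(Fraction(1, 4))
print(type(root).__name__)  # float
```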

@foo @mcc This isn't a fraction - it's scientific notation. A fraction is easy AF to do:

struct Fraction {
    numerator int;
    denominator int;
}
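Fleshed out a little, the two-integer struct might look like the Python sketch below. The class name `SimpleFraction` and the choice to normalize on construction are assumptions of this example; most of the work beyond the struct itself is keeping values in lowest terms and defining arithmetic.

```python
from dataclasses import dataclass
from math import gcd


@dataclass(frozen=True)
class SimpleFraction:
    numerator: int
    denominator: int

    def __post_init__(self):
        if self.denominator == 0:
            raise ZeroDivisionError("denominator must not be zero")
        # Reduce to lowest terms and keep the sign on the numerator,
        # so that equal values compare equal field by field.
        sign = -1 if self.denominator < 0 else 1
        g = gcd(self.numerator, self.denominator) * sign
        object.__setattr__(self, "numerator", self.numerator // g)
        object.__setattr__(self, "denominator", self.denominator // g)

    def __add__(self, other):
        return SimpleFraction(
            self.numerator * other.denominator + other.numerator * self.denominator,
            self.denominator * other.denominator,
        )

    def __mul__(self, other):
        return SimpleFraction(
            self.numerator * other.numerator,
            self.denominator * other.denominator,
        )


print(SimpleFraction(1, 2) + SimpleFraction(1, 3))  # SimpleFraction(numerator=5, denominator=6)
print(SimpleFraction(2, 4) == SimpleFraction(1, 2))  # True
```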

@mcc This gives major Julian Krause vibes.

But also, not knowing what channel it's from, this kind of thumbnail is always such a toss-up. It's either some highly accurate, in-depth technical info, or a massive pile of nonsense.