The author may have identified that "the idioms come from the use of system frameworks", but they absolutely got wrong just about everything about why apps are not consistent on the web (e.g. I was baffled by their reasons listed under "this lack of homogeneity is for two reasons" section).
First, what he calls "the desktop era" wasn't so much a desktop era as a Windows era - Windows ran the vast majority of desktops (and furthermore, there were plenty of inconsistencies between Windows and Mac). So, as you point out regarding the Win32 API, developers had essentially one way to do things, or at least the far easiest way to do things. Developers weren't so much "following design idioms" as "doing what is easy to do on Windows".
The web started out as a document sharing system, and it only gradually and organically turned over to an app system. There was simply no single default, "easiest" way to do things (and despite that, I remember when it seemed like the web converged all at once onto Bootstrap, because it became the easiest and most "standard" way to do things).
In other words, I totally agree with you. You can have all the "standard idioms" that you want, but unless you have a single company providing and writing easy to use, default frameworks, you'll always have lots of different ways of doing things.