@MostlyCoraGrace The <article> tag generally wraps the only content worth actually keeping / reading on a page...
Answering @inthehands: you might want to take a look at the Readability.js code to see what elements it's looking for on pages to render the actual main article payload. Though I suspect much of that is itself an ad hoc mess.
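For the simple case where a page does use <article>, here's a rough sketch of pulling out just that text with Python's standard-library html.parser. To be clear, this is not how Readability.js works; the names ArticleText and extract_article are made up for illustration, and it ignores everything (scripts, nested junk, pages with no <article> at all) that makes the real problem messy.

```python
from html.parser import HTMLParser


class ArticleText(HTMLParser):
    """Collect text that appears inside <article> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # how many <article> tags are currently open
        self.chunks = []  # text fragments found inside <article>

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == "article" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text seen while inside an <article>.
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_article(html):
    """Return the text inside <article> tags, joined with single spaces."""
    parser = ArticleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = "<nav>Menu</nav><article><h1>Title</h1><p>Body text.</p></article><footer>Ads</footer>"
print(extract_article(page))
```

Everything outside <article> (the nav and footer here) is simply dropped, which is the whole appeal.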
I've suggested a browser built on an FYWD principle.
Some people say that means Fine Young Western Dinosaurs.
Others insist it means Fuck Your Web Design.
The parser would be premised on a strict rejection of Postel's Law.
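To make "rejecting Postel's Law" concrete: instead of being liberal in what it accepts and quietly repairing broken markup, the parser stops at the first error. A minimal sketch of that idea, again on Python's html.parser; StrictParser, StrictMarkupError and parse_strict are hypothetical names, and the rules are deliberately stricter than the HTML spec (which lets you omit </p>, for instance).

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class StrictMarkupError(ValueError):
    """Raised on the first structural error; nothing is repaired."""


class StrictParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # A self-closed tag such as <br/> opens and closes at once.
        pass

    def handle_endtag(self, tag):
        # Refuse anything that does not close the innermost open element.
        if not self.stack or self.stack[-1] != tag:
            raise StrictMarkupError(f"unexpected </{tag}>")
        self.stack.pop()


def parse_strict(html):
    """Return True for well-nested markup, raise StrictMarkupError otherwise."""
    parser = StrictParser()
    parser.feed(html)
    parser.close()
    if parser.stack:
        raise StrictMarkupError(f"unclosed <{parser.stack[-1]}>")
    return True


parse_strict("<article><p>fine</p></article>")  # passes
try:
    parse_strict("<article><p>sloppy</article>")
except StrictMarkupError as err:
    print("rejected:", err)
```

A forgiving browser would silently close that <p> for you; this one just refuses to render the page.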