So, have you ever said to yourself, "Self, I wish that I could read online news more like a newspaper"? Well, do I have your ticket!
Backstory: I frequently read news headlines/stories for a friend no longer able to read. Among the news sources is CNN, particularly its "lite" headlines page. I found myself frustrated, though, that 1) there is no context other than the headline, 2) the headlines themselves are utterly unorganised, and 3) a huge portion of them are fluff (sport, entertainment, style, food, etc.). I was hoping I might be able to do better. I'd hinted at this a few months ago on HN: https://news.ycombinator.com/item?id=42535359.
One bright note is that CNN's URLs contain much of the organising information already, with section heading as well as the date (YYYY/MM/DD). Parsing those is a Simple Matter of Programming (SMOP). Organising them takes ... a bit more work, but what I ended up with in my Mark I implementation was the "lite" headlines page organised by section, with the sections themselves ordered by my own interest in them. That alone was a vast improvement.
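For the curious, the Mark I idea can be sketched roughly as below. This is an illustration in Python rather than the shell scripts I actually use, and it makes assumptions: the `/YYYY/MM/DD/<section>/<slug>` path shape, the `SECTION_ORDER` list, and the function names are all hypothetical stand-ins, not a description of CNN's actual URL scheme or of my scripts.

```python
import re
from collections import defaultdict
from datetime import date

# ASSUMPTION: paths look like /YYYY/MM/DD/<section>/<slug>.
# Real URLs may be shaped differently; adjust the pattern to match.
URL_RE = re.compile(r"^/(\d{4})/(\d{2})/(\d{2})/([a-z0-9-]+)/")

# Hypothetical personal ordering: sections I care about most come first.
SECTION_ORDER = ["world", "politics", "business", "science", "health"]


def parse_url(path):
    """Return (date, section) for a story path, or None if it doesn't match."""
    m = URL_RE.match(path)
    if m is None:
        return None
    year, month, day, section = m.groups()
    return date(int(year), int(month), int(day)), section


def group_by_section(headlines):
    """Group (path, title) pairs by section, ordered by SECTION_ORDER.

    Sections not in SECTION_ORDER sort after the listed ones, alphabetically.
    Paths that don't parse land in an "unclassified" bucket.
    """
    groups = defaultdict(list)
    for path, title in headlines:
        parsed = parse_url(path)
        section = parsed[1] if parsed else "unclassified"
        groups[section].append(title)
    rank = {name: i for i, name in enumerate(SECTION_ORDER)}
    ordered = sorted(groups, key=lambda s: (rank.get(s, len(rank)), s))
    return [(s, groups[s]) for s in ordered]
```

Feeding it a list of `(path, title)` pairs scraped from the headlines page gives back the sections in preferred order, each with its headlines, which is most of what a sectioned front page needs.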
I still wanted to go further; in particular, I wanted to include lede 'graph context for stories. Which meant pulling down the related articles themselves. ...
... Which I've spent the past few days doing.
Mark II such as it is consists of a few shell scripts to pull, parse, and generate the website. What I've laid out is very much styled as a classic newspaper: all text, no images, and of course, no ads. It takes a few minutes to update the page, which I then view locally, or copy to my e-ink tablet. For those wondering, no, this is not publicly available. (Yet?)
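The lede-extraction step might look something like the following. Again, this is a Python sketch and not my actual shell pipeline, and it rests on a loud assumption: that the first non-empty `<p>` element in an article page is the lede. Real pages may put bylines or other matter first, so treat `LedeParser` and `extract_lede` as hypothetical names for illustration only.

```python
from html.parser import HTMLParser


class LedeParser(HTMLParser):
    """Collect the text of the first non-empty <p> element.

    ASSUMPTION: the first non-empty paragraph is the story's lede.
    """

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.buf = []
        self.lede = None

    def handle_starttag(self, tag, attrs):
        # Only start collecting if no lede has been found yet.
        if tag == "p" and self.lede is None:
            self.in_p = True
            self.buf = []

    def handle_endtag(self, tag):
        if tag == "p" and self.in_p:
            self.in_p = False
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self.buf).split())
            if text:
                self.lede = text

    def handle_data(self, data):
        if self.in_p:
            self.buf.append(data)


def extract_lede(html):
    """Return the first non-empty paragraph's text, or None if there is none."""
    parser = LedeParser()
    parser.feed(html)
    parser.close()
    return parser.lede
```

Only the standard library is used here, which keeps the thing easy to run on whatever machine happens to be to hand.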
What's curious for me is how much saner this presentation is than virtually any current online news site, most of which far more resemble picture galleries (with utterly gratuitous images) than information services. The ability, as with the days of print newspapers, to glean the main gist of a story without having to click through, and then ward off cookie, paywall, TOU, nag, autoplay video/audio, dickbars, etc., etc., provides a cognitive ease that's hard to express.
This is also giving me pause to consider why online news looks and acts the way it does, both in terms of UI/UX and content, to which I can only suggest that virtually all the incentives are perverse, on both the publisher and reader perspectives.
Screenshots show a general sense of the layout as well as detail views of parts of the page. There are still some layout glitches (I've been writing Flexbox CSS for approximately a week now 😺 ), but I'm pretty happy with it, and much happier than with the original. Oh, and the design is remarkably responsive even without @media queries.
(There's more design work in the page itself, including internal cross-references and URL rewrites to the https://lite.cnn.com/ page itself, some of which I'm fairly chuffed about, but ... screenshots for now.)
Thoughts are now toward a news-page generator which incorporates a number of different sources, with some sense of categorisation and prioritisation applied to those. Still sorting how to proceed on that.
Oh, and preempting questions about why this particular site and its quality:
CNN is easily parseable and not paywalled. There are other options which fail one or both of these tests, e.g., NPR has nonsemantic URLs, the NY Times is parseable but paywalled, etc. I'm working with what is to hand but am more than open to better alternatives.
The HN link above includes stats on the section distribution itself, which ... is less than ideal, as well as the substantivity of much of what remains (dittos). F'rex, a couple of days ago, one of the three "Science" stories ... largely revolved around a spaced-out pop singeress. Not exactly what I'd call hard-hitting content. Three months on, the relative percentages still hold. Aggregating those to larger groupings, the overall article breakdown as of this posting is:
(For those summing totals: there's one "unclassified" story, an opinion piece.)
Thiruvananthapuram: Karma News MD arrested. Police have arrested Vins Mathew, MD of the online news channel Karma News. Kalamassery blast...#karmanews, #vinsmathew, #kalamasheri, #onlinenews
"Like nearly everyone else on the internet, yesterday the staff of 404 Media learned the name “Luigi Mangione” and sprung into action. This ritual is now extremely familiar to journalists who cover mass shootings, but has now become familiar to anyone following a news story that has captured this much attention. We have a name. Now: Who is this person? Why did they do what they did?
In an incredibly fractured internet where there is rarely a single story everyone is talking about and where it is impossible to hold anyone’s attention for more than a few minutes at a time, the release of the name Luigi Mangione sparked the type of content feeding frenzy normally only seen with mass tragedy and reminiscent of an earlier internet age when people were mostly paying attention to the same thing at once."
https://www.404media.co/luigi-mangione-played-among-us-breathes-air/
The wave of #lawsuits reflect a #MediaIndustry-wide concern that #GenerativeAI will compete with established publishers as a source of information for internet users, while further sapping away dwindling #advertising revenues and undermining the quality of #OnlineNews. #Microsoft
The #Intercept, #RawStory and #AlterNet sue #OpenAI for copyright infringement | #AI
https://www.theguardian.com/technology/2024/feb/28/media-outlets-sue-openai-copyright-infringement