«Models All The Way Down» is a really interesting exploration of how an image data set (in this case LAION-5B) was assembled to be used for training ML/"AI" models!

> "It contains less about how humans see the world than it does about how search engines see the world. It is a dataset that is powerfully shaped by commercial logics."

https://knowingmachines.org/models-all-the-way

Models All The Way Down

LAION-5B is an open-source foundation dataset. It contains 5.8 billion image and text pairs—a size too large to make sense of. We follow the construction of the dataset to better understand its contents, implications and entanglements.

«Here we find another truth about generative AI:
The concepts of what is and isn't visually appealing can be influenced in outsized ways by the tastes of a very small group of individuals, and the processes that are chosen by dataset creators to curate the datasets.
In the case of Midjourney, by a handful of esoteric nerds, and by a 65-year-old mechanical engineer living in Southeastern Wisconsin.»
@gedankenstuecke I also liked it, though I found it a bit frustrating that (on mobile) I couldn't see how far into the page I was or how much longer the story was
@tante yeah, totally agree on that. I had seen it yesterday evening on mobile and decided to leave reading it until I was at a bigger screen.