One of the interesting and disturbing aspects of modern LLMs isn’t what new things they do, but what they make cheap. Low-quality, duplicative content makes retrieving quality information incredibly hard, especially in commercially lucrative domains such as product reviews. Making it 3+ orders of magnitude less expensive to produce that content will not help our information ecosystem, even if AI content is identical in quality and substance to what humans are already producing.
@mdekstrand sure, sure, but it WILL shift the money made towards the owners of the LLMs! I mean… democratize… creativity… uhhhh
@mdekstrand I suspect/hope for the opposite. Product reviews are a trash business and have been for the past ~5 years. By spinning the knob past the pain point we might have a chance that something will finally change.
@MichalBryxi That is possible. And if it's easier to detect LLM text than human bad text, it might make the filtering problem easier. We will see. I expect the near-term will be an increase in informational sewage, though.
@mdekstrand Could we steer them towards the opposite end? E.g. summarizing the good stuff and boosting signal to noise?
@vtijms Maybe, but they still need high-quality inputs (where the bar may not be high but involves "actually using the product"). What seems more likely to me: writing garbage reviews that successfully game the system into synthesizing what *looks* like an actual experience report but isn't.
@mdekstrand If Amazon's and YouTube's reviews were bad, from now on there won't be any signal at all.
@mdekstrand This is a great summary of a big part of the risk with them. They also help us lie to ourselves, or more accurately, they help other people lie to us by producing propaganda and advertisements at scale.
@mdekstrand when people are saying that ChatGPT is going to compose a lot of Internet content in "the future" what they're really saying is the Internet content of the future is going to be remarkably worse than the already bad Internet content we have today.
@mdekstrand It's going to be so painful, having to regularly invest minutes into a document just to discover that it's all empty words because there are no overarching ideas. Next level bullshittery that is difficult to discern because the form is flawless.
@mdekstrand i'm not sure about this -- for product reviews, it always made financial sense to game it … and, so you always had to find the very small number of players who had a reputational stake … that seems unchanged. although, i am curious how and when various entities might risk their reputation for the lure of LLM productivity
@jbigham The particular risk I'm seeing here is a numbers game - finding the 0.2% of pages with a reputational stake is hard. Finding the 0.0002%, because the cost per page of garbage has gone down, is harder.

@mdekstrand crowdsourcing junk content has long been quite cheap… and a lot of the cost has always been actually interfacing with wherever it's being posted/published/etc. i'm honestly not sure how this plays out, but i'm not yet convinced it'll be harder to find good content.

i do think the days when i could supplement my professor salary writing essays about chickens on mturk are gone though.

http://www.cs.cmu.edu/~jbigham/posts/2014/half-workday-as-turker.html

@mdekstrand Thing is, junk content is already pretty pervasive online, and is increasingly dominating search engine results. The online economy seems structured to reward generating it, using low-wage human labor. I'm no economist, but it seems to me that the cost savings of LLMs for this use case are pretty minimal.
@isaac32767 I also am not an economist, but I would not at all be surprised to see cost per page go down by multiple orders of magnitude, especially as LLM costs go down.
@mdekstrand Obviously you're right. I'm just questioning whether reducing costs from $1/page to $0.01/page will make any difference in the demand for junk content. I suspect it will just make it more profitable.
@mdekstrand you make such a good point! It needs to be repeated more frequently to attempt to balance the cacophony of voices evangelizing this type of content creation as some kind of awesome new paradigm.
@mdekstrand This is an especially sharp observation to make the day after DPreview.com announced that it was being shut down by Amazon.
@adamrice That's really sad. They were such a good (and thorough) resource in a sea of garbage. I also didn't know they were owned by Amazon.
@mdekstrand I've had an issue with content farms generally for years. The early days of search engines were gold compared to today; I remember being able to find, say, a nearby computer store very easily using Google or Yahoo!.
Now? It's a slog through a ton of unrelated junk to find relevant information.
I never valued the sterling service the Yellow Pages provided back in the day, when I could find what I wanted more or less immediately.
This is true for most information trawls today.

@PopTarts @mdekstrand

Sounds like another instance of Doctorow's enshittification theory (https://pluralistic.net/2023/01/21/potemkin-ai/#hey-guys).

I am already mostly reading a curated list of webpages that I trust. I expect this will be even more necessary in the future.

@mdekstrand I can’t wait to have my first LLM customer support experience!