Mastodawn

LLM query:

I've found that ChatGPT can parody some styles well (e.g. Lovecraftian horror, Dr Seuss) but for others it has nothing (e.g. Hunter S Thompson, Brecht, Riddley Walker). Why is this?

My suspicion is that it is NOT due to the amount of original text, but to the amount of fan fiction, which is essentially a human-generated enlargement of the original training data.

Reckons?

#chatGPT #LLMs #AI

henry Jun 16, 2023

@tomstafford Cool hypothesis! 😎 I don’t think the datasets have been made public. Could try it with authors that have a distinct style but no fan fiction. #Hemingway? #SalmanRushdie?

Tom Stafford Jun 16, 2023

@henry yep, that'd be the way to test it. Find matched authors with similar amounts of original text and equally distinctive styles [no idea how you'd gauge that] but with / without fan fiction and challenge chatGPT to do parodies
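
A minimal sketch of how the matched-pair test described above could be set up. Everything here is illustrative: the `Author` class, the `has_fanfic` flags, and the pairings are assumptions taken from guesses in this thread, not measured fan fiction counts, and the pairs are not actually matched on amount of original text or distinctiveness of style. The code only builds the prompts; sending them to ChatGPT and judging the parodies is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Author:
    name: str
    has_fanfic: bool  # assumption from the thread, not a measured value


def parody_prompt(author: Author, topic: str) -> str:
    """Build the same parody request for any author, so only the style varies."""
    return f"Write a short parody in the style of {author.name} about {topic}."


def matched_prompts(pairs, topic):
    """Return (with-fanfic prompt, without-fanfic prompt) for each matched pair."""
    rows = []
    for with_ff, without_ff in pairs:
        # Each pair must contrast one author with fan fiction and one without.
        if not with_ff.has_fanfic or without_ff.has_fanfic:
            raise ValueError("each pair needs (has_fanfic=True, has_fanfic=False)")
        rows.append((parody_prompt(with_ff, topic), parody_prompt(without_ff, topic)))
    return rows


# Hypothetical pairings using names mentioned in this thread.
pairs = [
    (Author("H. P. Lovecraft", True), Author("Ernest Hemingway", False)),
    (Author("Dr Seuss", True), Author("Salman Rushdie", False)),
]

prompts = matched_prompts(pairs, "waiting for a delayed train")
for with_ff, without_ff in prompts:
    print(with_ff)
    print(without_ff)
```

Keeping the topic fixed across each pair means any difference in parody quality is more plausibly down to the author than to the subject.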

Dr.Implausible Jun 16, 2023

@tomstafford Ayup, reckon you're on to something. Somewhere between #modelcollapse and #datapoisoning sits "you trained your AI on human behaviour by letting it read fanfic?!" 👀

https://www.economist.com/science-and-technology/2023/04/05/it-doesnt-take-much-to-make-machine-learning-algorithms-go-awry

It doesn’t take much to make machine-learning algorithms go awry

The rise of large-language models could make the problem worse

The Economist

planetscape Jun 16, 2023

@tomstafford Fascinating hypothesis! Hope you keep us in the loop if you learn anything!