Reddit signed a deal with Google to use its data to train LLMs. To celebrate, we made a Firefox extension that lets you replace all your comments with any text of your choosing. All we ask is that you not choose something copyrighted. The New York Times is currently suing OpenAI for training ChatGPT on its copyrighted material, and we wouldn't want Reddit's uniquely valuable data to become hopelessly intermingled with that of litigious copyright-holders.

https://theluddite.org/#!post/reddit-extension

The Luddite

An anticapitalist tech blog. Embrace the technology that liberates us. Smash that which does not.

@theluddite

Oh, I say, this is even more elegant than my replacing the comments with randomized text.

Brilliant thought, one must never use copyrighted text, should one!

@theluddite

I'm now happily getting autobanned from Reddit subs because, hey, fuck Reddit capitalizing on my data.

@theluddite sadly i already deleted my account without redacting the comments...
@theluddite this is brilliant! Growing increasingly annoyed with Reddit and their attempts to retroactively monetize their content

@theluddite

This is great! I was considering working on something similar in the differential security realm.

@theluddite Thanks, it's replacing my comments right now. 🖤
@kolya Yay I'm so glad! I'm also glad that it's working. If I had any idea how popular it'd be, I would've tested it a little more thoroughly 😅
@theluddite So, for an example, one wouldn't want to use the lyrics to Disney songs?
@KindlyWizard @theluddite Look at these comments, aren't they neat? Wouldn't you think my AI model's complete?

@theluddite

Also avoid replacing your comments with AI generated data as this could poison the model and cause it to collapse. The article below talks about AI image models, but the problem can occur on text models too, so be extra careful.

https://www.technologyreview.com/2023/10/23/1082189/data-poisoning-artists-fight-generative-ai/

This new data poisoning tool lets artists fight back against generative AI

The tool, called Nightshade, messes up training data in ways that could cause serious damage to image-generating AI models. 

MIT Technology Review

@theluddite I did that immediately before deleting my account during the API debacle.

The next day (after my account was deleted), Reddit restored all the text to what it was pre-edit. All attributed to [deleted], of course.

They haven’t responded to any requests to actually delete the data, even though some of it contains what could be personally identifiable information.

@charonpdx @theluddite “Immediately before deleting my account” is the key phrase there. You gave up your ability to fight them when you did that, and signaled to them that you were available for restoration.

I did edit-overwrites on all of my content and kept a copy of the permanent links to verify later. To date, nothing has been changed. If they did revert one of my edits, I could go back and edit it again or leave a reply calling them out on it.

@theluddite I'm so glad I bailed from that place awhile ago. It is sad though, I used to cherish that site.
@theluddite hahaha that's some devious level of malcompliance. 😂

@theluddite ROFLMAO!

As if learning was a copyright infringement...

Cuz if so we'd all be #DebtPeon|s to the #Copyrightmafia!

felixreda.eu/2021/07/github-copilot-is-not-infringing-your-copyright/

@theluddite Well, I have nothing better to do.
@theluddite @crmsnbleyd I used to be really active in tech subreddits. Boutta replace all my comments with "oh you just have to type 'sudo rm -rf'!"
@theluddite Would be good if it saved the comments before replacing them.
@theluddite I expect reddit keeps the full history of everything that's ever typed into any text box they control.

@theluddite Funny you should mention! This AM I went to my Reddit overview & paged back thru comments going back however many years…good laughs. Figured wise to download my post & comment history for safe keeping. Went about my biz, pointing & clicking & typing, as one does.

Looked again an hour later & it's inexplicable, but all my comments done been replaced; now they're all the same mashup of text apparently from NYT, LATimes, WashPost, and the Bee Movie script. Weird. ¯\_(ツ)_/¯

@theluddite lmaoooooo absolutely definitely not using Disney scripts for this. Nope. Not me.
@theluddite Matt Taibbi is a very interesting person.

@theluddite The #ccc hackers' code of conduct says not to meddle in other people's data. That line becomes blurry really quickly if other people mistakenly think my data is their data.

Question here is: do you sign away the intellectual property rights to what you create if you post on Reddit? Can a EULA even do that in a legally binding way?

@theluddite @gilgwath 😂 True, true. But has he heard of class action lawsuits, I wonder.
@theluddite ...ah, too bad that I deleted my account back when they started playing around with NFTs...

@theluddite Somewhat related: I’d love to get some initial feedback from #WordPress users on a plugin I’m awaiting official review. It poisons AI bots with garbled content instead of outright blocking them: https://kevinfreitas.net/tools-experiments/

If anyone installs it let me know — I’d love to see how it does on your pages/posts. Don’t worry, it doesn’t alter your content in the database but garbles it on-visit by #LLM bots. Cheers!

Tools & Experiments - Kevin Freitas

WordPress Plugins AI Poison Pill [beta] Download v1.0.20240304 (will update to use official WordPress.org link once approved/live) Email [email protected] with any questions or suggestions. The words you write and publish on your website are yours. Instead of blocking AI/LLM scraper bots from stealing your stuff why not poison them with garbage content instead? This plugin scrambles … Continue reading "Tools & Experiments"

Kevin Freitas
@theluddite 9 years of daily Reddit posts across hundreds of subreddits.... Shredded and worthless.
@theluddite this right here. Why I love the internet. Well done sir!

@theluddite you know what'd be REALLY cool??

a plugin that replaces each comment with a random paragraph from wikipedia that's had 10% of the nouns swapped out, and 10% of any numeric figures adjusted by +/- 80%.
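
A rough sketch of the numeric half of that idea, in Python rather than as an actual plugin. Everything here is an assumption made for illustration: the function name `perturb_numbers`, the reading of "+/- 80%" as a random scale factor between 0.2 and 1.8, and the simple regex, which does not understand thousands separators like `1,000`. Swapping nouns would need a part-of-speech tagger and is not attempted.

```python
import random
import re

# Matches plain integers and decimals such as "42" or "3.14".
# Assumption: no handling of thousands separators, signs, or units.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def perturb_numbers(text, fraction=0.1, max_shift=0.8, rng=None):
    """Scale roughly `fraction` of the numbers in `text` by up to +/- `max_shift`."""
    rng = rng or random.Random()

    def replace(match):
        original = match.group(0)
        # Leave most numbers alone; only about `fraction` of them are touched.
        if rng.random() >= fraction:
            return original
        factor = 1 + rng.uniform(-max_shift, max_shift)
        new_value = float(original) * factor
        if "." in original:
            # Keep the same number of decimal places as the original figure.
            decimals = len(original.split(".")[1])
            return f"{new_value:.{decimals}f}"
        return str(round(new_value))

    return NUMBER.sub(replace, text)
```

Calling `perturb_numbers(paragraph)` with the defaults touches about one number in ten; passing a seeded `random.Random` as `rng` makes the result repeatable.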

@theluddite Copyright is a corporate patriarchal late stage capitalism oppression that undermines natural human need for knowledge and creativity.

Until it's my shitpost in r/buttholes

@theluddite This is absolutely fantastic! Thank you!

My only suggestion: right now it replaces all comments with the same thing. If you paste in some public domain content, say a novel by E. M. Forster, it could randomly pick a section of the pasted text that is roughly the same # of words as what it's replacing.

This would make the replacement more random and less obvious, hopefully throw a wrench in Reddit's auto detect and rollback.
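
Roughly what that selection step could look like, sketched in Python instead of the extension's own code (which is not shown in this thread). The name `pick_replacement` and its parameters are made up for this example, and "roughly the same # of words" is simplified to "exactly the same number of whitespace-separated words, capped at the length of the pasted text".

```python
import random
from typing import Optional


def pick_replacement(source_text: str, original_comment: str,
                     rng: Optional[random.Random] = None) -> str:
    """Return a random run of words from source_text as long as original_comment."""
    rng = rng or random.Random()
    source_words = source_text.split()
    if not source_words:
        raise ValueError("source_text is empty")

    # Aim for the comment's word count, but at least one word and
    # never more words than the pasted source actually contains.
    target_len = max(1, len(original_comment.split()))
    target_len = min(target_len, len(source_words))

    # Pick a random starting word so each comment gets a different passage.
    start = rng.randint(0, len(source_words) - target_len)
    return " ".join(source_words[start:start + target_len])
```

Each comment would get its own call, for example `pick_replacement(novel_text, comment_body)`, so two comments of the same length still usually end up with different passages.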

u has a github I could mebbe sketch up a patch?

GitHub - thiscoldhouse/help-reddit-ai

Contribute to thiscoldhouse/help-reddit-ai development by creating an account on GitHub.

GitHub
@theluddite Also worth noting: before you do dis, request a backup of ur Reddit data (I'm incredibly proud of all my time wasted doing quality Reddit shitposting over the years.)

@theluddite

The one drawback is that Subreddit moderators do not appreciate the move.

At all.

@theluddite this is awesome! I’d choose Alice in Wonderland to really mess with them. 🤨

@theluddite there are some great Lorem ipsum generators online.

Here’s a fun bunch. Plus the classic.
https://mailchimp.com/resources/lorem-ipsum-generator/

14 Lorem Ipsum Text Generators to Spice Up Your Designs | Mailchimp

Improve your web layout wireframes and mockups with one of these lorem ipsum text generators.

Mailchimp
@theluddite
I suggest using text from Strunk & White or a style manual, because AI prose sucks.
@theluddite it would be cool if there was an open-source Chrome addon which opens every Reddit post visible on the feed & comments something made up, or copyrighted, or some AI-generated garbage #linux #firefox #chrome #bot #reddit
@theluddite the absolutely worst kind of content to use would be content generated by an LLM. No copyright worries and expedited model collapse.
@theluddite I recommend creating a fake ISO standard number for the minimum content of Girl Guide in Girl Guide Cookies...
@theluddite Very nice! For a replacement text I decided to go for a part of Molly Bloom's soliloquy. Stay classy 😉