Reddit says it's made $203M so far licensing its data
well, not mine. i used a script to replace all of my comments with gibberish before i deleted them and then my account. if they went back and restored my comments, then all they’ll get is comments full of gibberish, especially since i overwrote them 3 times before deleting them, just in case they tried to roll back to the previous version.
have fun with that!
i bet they do now, but i’ve checked back now and then, and all of my comments and posts are most assuredly gone.
edit: i’ve gone back to check some old haunts, places i know i’ve commented, and i did some searching with google using my old usernames, as google uses its cache to match to the posts/comments, even though they’re not there any more.
i see old posts that are graveyards of deleted comments, some with simply deleted accounts, and many others where both the account and comment are deleted. i don’t see any gibberish comments. the ones i know are mine (because replies quote the comment above, which i recognize as mine), are all just deleted in their entirety, so it seems they didn’t do comment versioning, at least not past the first edit. i see no posts under any former username of mine.
the efforts to scrub my content from reddit last May appear to have worked. sadly, since the API lockdown, those tools no longer work.
i mean… i overwrote them 3 times. and, before deleting the account, i verified that they’d been overwritten multiple times. and they’re all gone now.
exactly how would i go about requesting that data anyway? my accounts have been deleted.
Unfortunately, if you actually deleted your account, there is nothing you can do to access that data. I’m pretty sure we both ran the same script - “nuke Reddit.” I ran it probably every four months for a few years and then I ran it several times right before the API change. I can tell you with certainty that some of your comments are still there. I also ran that one everyone recommended during the API fiasco but I can’t remember its name.
That being said, you probably were able to knock out a lot of it. I wouldn’t worry too much about it. We’re lucky we can even do that much. Most social media sites you can’t get rid of anything no matter how hard you try.
Feel good knowing that they aren’t getting a single cent more out of you
I did do it a couple of weeks before the API policy change, so I have a bit more confidence that they were really deleted. I know that they made some backend changes right before the API policy change, which prevented such actions from being permanent. I really did go back and look through past posts and past comments even from many years back, and my comments really are very much gone. Sure, I can’t go and see all tens of thousands of them, but spot checking from a year ago all the way back to nine or 10 years ago, they’re all gone wherever I’ve looked.
But you’re right, they are not going to get one more cent out of me, in time or effort. Thank goodness I never actually gave them any real money.
Just because it shows [deleted] doesn’t mean the data were deleted. That is most likely just a flag for the comment.
They most likely keep every save since they decided to do the sell-the-data thing. Why would google pay them for what google could easily scrape other than having the full history?
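To make the "it's just a flag" idea concrete, here is a minimal sketch of how a soft-delete flag plus saved edit history could work. Everything in it (the `Comment` class, its fields, the method names) is hypothetical and made up for illustration; nothing here is known to reflect Reddit's actual backend.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    """Hypothetical comment record; NOT Reddit's actual schema."""
    author: str
    versions: list[str] = field(default_factory=list)  # every saved body, oldest first
    deleted: bool = False  # soft-delete flag; nothing is erased

    def edit(self, new_body: str) -> None:
        # An edit appends a new version instead of replacing the old one.
        self.versions.append(new_body)

    def delete(self) -> None:
        # "Deleting" only flips the flag.
        self.deleted = True

    def public_view(self) -> str:
        # What a frontend would show to visitors.
        return "[deleted]" if self.deleted else self.versions[-1]

    def original_body(self) -> str:
        # What an internal export could still read.
        return self.versions[0]


comment = Comment(author="example_user", versions=["my real comment"])
for i in range(3):
    comment.edit(f"gibberish pass {i + 1}")  # overwrite three times
comment.delete()

print(comment.public_view())    # [deleted]
print(comment.original_body())  # my real comment
print(len(comment.versions))    # 4
```

If a site stored comments this way, overwriting before deleting would change only what visitors see, not what the site keeps. Whether any given site actually does this is a guess from the outside.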
I put my account’s comment through a mass-delete app around the time of the big protest, and a couple weeks later I found every single one was restored.
People can be incompetent for years and then suddenly start figuring things out.
you can request your reddit data, and they provide every comment along with edits as far as I remember, it was uncomfortable but i’d never posted anything regrettable at least
imagine getting your hands on u/spez’s reddit data
I did the same, but we’re both fools if we think reddit didn’t keep every character we typed (let alone submitted) in a private, proprietary database.
We weren’t paid for our data. We were given access to a website free of charge. The consent we gave was supposed to be for the operation of the website, not for training AI.
They should fucking pay us.
LOL. I did the same. And I confirmed many months later that the comments were not restored.
Now I hear that Google wants to train their AI on reddit content. Haha. Good luck with that, Lorem Ipsum! 😁
If you actually replaced with “Lorem ipsum” texts, it would probably be easy to filter the garbage from the dataset.
Also, they probably have copies of the comments before the edits that are just not presented in the frontend.
I didn’t. At first, it was basically a long ass message about deleting my comment out of protest. Then a few subreddit mods banned me, so I changed them to “almost makes sense” word salad 😂
I ran the script, changing the text each time, several times for good measure.
They still haven’t reverted it, and it’s been more than just a few months now.
Correct, our data.
Their TOS says they own your content in any current or future formats or derivative works.
I’d say Reddit would win.
The TOS shouldn’t hold up in court. A contract must be an exchange of two things, e.g. money for a product or service. You can’t say “Our service is free of charge!!!” and then in the fine print “(((But also you agree to give us everything we can take free of charge)))”.
The issue is how everyone does it. Facebook and Google started when data had no value, now they’re amongst the wealthiest businesses in the world. Now, Microsoft have joined in, *even though you already pay for their products and services anyway!*
However, the other aspect is that everyone is a victim. Lawmakers are the victim. They still haven’t quite yet realised how much is being taken from them (at least $50 per year, probably more like $1,000 per year if not more for prominent figures) but they are still being abused.
It’s like that form of bank fraud, where the criminal takes pennies from accounts, hoping the user won’t notice and the bank will write it off. Do it to enough people and enough times and you can make millions. They do this to everyone and they make billions.
Either the data is public domain and they don’t have to pay for it, but also cannot charge others for it, or the data is private and they must pay the author a fair share.
No, it isn’t. The website is offered free of charge, regardless of whether you provide them data or content. The exchange for data/content is a second transaction tucked away in the terms and conditions, and the website offers nothing in return for that.
The reason the 2nd exchange is hidden in the terms and conditions is to intentionally hide what the user is giving away, such that the user cannot make a fair value assessment. It is fraudulent and deceptive.
Their TOS says they own your content in any current or future formats or derivative works.
Their ToS could say they own you and your children and grandchildren, but that doesn't make it enforceable. If I post a frame from the movie Akira on Reddit would any reasonable person suggest that they own not only that frame, but also the entire movie that it came from as a derivative work?
I'm not saying you're wrong, just that the idea that they automatically "own" the things that people post on their website is ridiculous. It's a bit like UPS or FedEx saying they own the contents of your package while delivering it.
It is true that Reddit does not hold a valid license to content that is
However, as far as I understand it, the extent to which Reddit—a content provider and social network—is legally required to remedy this is to comply with DMCA requests and review reported content. Perhaps there is a higher standard that I am not aware of?
And yet that exact kind of data is all over reddit in ways that are impractical to enforce through case-by-case DMCA requests. How many memes are there using footage from popular shows? How much fanart?
More importantly, is that stuff not included as part of the data that reddit "owns" when they sell their data to tech companies? Because whether a DMCA takedown has been requested on that kind of data or not, doesn't change the fact that they don't hold the copyright in the first place. How can they sell things that they don't even own?
The answer is that it’s more practical than any alternative.
Copyright holders can’t sue Reddit for selling access to copyrighted content (before Reddit receives a copyright claim) because there is no way Reddit could reasonably distinguish between original and copyrighted content. Reddit users violate copyright law and the ToS in submitting copyrighted content, and Reddit is only required to take action as they are made aware of the content’s copyright status.
It would be trivially easy to circumvent Reddit’s ToS otherwise: I could create some original content, sell my copyright to a friend for $1, and immediately put Reddit in violation of copyright law by submitting the content to Reddit. My friend could go after Reddit, and Reddit could go after me, but my friend would likely get more out of Reddit than Reddit could successfully get out of me.
It’s the same reason publishers can’t sue Cloudflare for hosting a piracy website unless they refuse to take it down, nor can they sue Facebook for ad revenue earned from banners placed next to a copy+paste of a New York Times article. The content providers do not knowingly/intentionally violate copyright law, and they make reasonable attempts to prevent/rectify it. Without such limitations on legal standing, the internet becomes a way bigger mess than it already is.
I think you're conflating two very different things here.
The DMCA covers hosting and dissemination. If a user submits copyrighted data to Reddit that they do not own and Reddit unknowingly hosts it (because, to be fair, they can't know what is or isn't owned or by whom), then Reddit is not liable for copyright infringement as long as they comply with DMCA takedown requests from people who claim to own the original IP.
But again, none of that implies that Reddit themselves (or Twitter, Facebook, TikTok, etc.) can realistically claim ownership over all of the data that is on their website. The reason they are subject to DMCA at all is because there is a globally shared assumption that data that users submit may or may not be owned by some other party, and while the DMCA protects them from being held liable for simply hosting and disseminating that data, it does not magically make them the owner of all data that hasn't had a DMCA claim made against it.
In other words, if I post a picture of Homer Simpson on Reddit (and there are many), it is ridiculous for anyone to suggest that they have any intellectual property rights over that picture, that character, any trademarks, etc., whether someone has made a formal DMCA takedown request or not. And if they don't own the picture, the character, the trademark, etc., then what exactly are they selling (licensing) and where did they get the right to sell it?
They might not be liable for just hosting/distributing it, but just like you can't sell someone else's car, you can't license out someone else's IP.
Yeah, probably not. When you sign up and agree to their ToS, they don’t “own” your content, but you grant them a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use it without compensation.
From their ToS:
Any ideas, suggestions, and feedback about Reddit or our Services that you provide to us are entirely voluntary, and you agree that Reddit may use such ideas, suggestions, and feedback without compensation or obligation to you
There is legal standing, IMO. You can’t take something without consideration, and access to the website was granted free of charge while the data collection was squirrelled away in the fine print. That isn’t a lawful contract; the fine print is for technicalities about the main transaction of X in exchange for Y. You can’t say “we’ll give you X for free!!!” then sneak into the fine print “(((you also give us Y for free)))”. The structure is clearly deceptive in a manner that is designed to prevent a fair assessment of the value being exchanged.
Insurers have to provide a “key facts page” where they summarise in plain English what you’re paying for. The fine print gives the detail, but the front page is still “we give you X in exchange for Y”.
You can’t build a car without paying for the nuts and bolts. Tech companies have placed themselves amongst the wealthiest businesses in the world without paying for the nuts and bolts we provide.
Hell, even Microsoft is in on it now, even though you pay for Windows and Office 365!
So Reddit charges users to create content (paid premium or by showing ads). And then it sells that content.
Making money both going and coming.
And it also asks reddit users to invest in reddit
loool
Literally the only way they could become profitable.
I’m honestly more upset at the deal with google (I think it was google?) than the CEO pay thing, which is all stock options and mostly ragebait.
I expect to see them last 3-5 years and get bought out by some big tech firm, all current execs take their payouts, sell their shares and retire.
"its data".
Ah yes... of course.
In other news, spez's compensation from reddit last year was $193 million, and its COO got a cool $93 million.
C'mon, spez, tell us again how horrible it's been that reddit's never made a profit.
Reddit doesn’t own that data. The community owns it.
Maybe there’s something in the terms of service but that shouldn’t hold water because nobody has ever read that document.