WordPress and Tumblr Plan to Sell User Content to AI Companies

https://lemmy.world/post/12496443


Can we get a list of companies NOT doing this? I’d assume it’s going to be much shorter.

All these AI and machine learning companies are taking content directly from websites and ignoring robots.txt files.
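For anyone wondering what respecting robots.txt would even look like, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt contents, the bot name `ExampleAIBot`, and the helper `allowed` are all made up for illustration. The check is voluntary: nothing in the protocol stops a crawler from skipping it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; "ExampleAIBot" is an invented name.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A well-behaved crawler would run this check before every fetch.
    print(allowed("ExampleAIBot", "https://example.com/post/1"))
    print(allowed("SomeOtherBot", "https://example.com/post/1"))
```

A crawler that honors the file would skip any URL where `allowed` returns `False`; one that ignores it just never makes the call.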

If your content is able to be crawled, even without being listed on search engines, I don’t think it really matters.

It might help proof an AI company against legal issues that might be brought about by their using the content. If they’re ever sued by Automattic, then they can just point to the deal and say that they bought the data from them. There’s much less ambiguity.

You are correct about the legal stuff. These companies are being sued all the time.

Doing this deal also makes processing the data a lot easier. Being handed a big ass database would be a lot easier than crawling for content.

What I posted was about how they operate. These companies showed time and time again that they don’t really care what data they are taking or from whom. They will even take their own AI or machine learning content and put it in their own system.

Not only am I really glad to not be on Tumblr, but this further shows I shouldn’t use WordPress for my website, even though there is an open-source version.

WordPress is either:

  • overkill for a lot of users, when static site generators do the job faster and easier
  • underkill when you have topology, data types, logic, and content pipeline challenges, for which Drupal is king
Shit like this should be opt in by default. But no. Instead of respecting the users they count on ignorance, forgetfulness, and obfuscation for this kind of fuckery.
Anything to make a buck.
Bro…tumblr is full of some WEIRD FUCKIN SHIT YO
Hey now, Don’t kink shame the weirdos
I know because I was one of those weirdos lol

Got 'em.

Sad they’re doing this with Tumblr though. It was fun but I just deleted my 10+ year old account.

Haha, it’s been about that long since I even logged into mine.
I heard now is the best time to check it out

I wish I had content and data to sell :(

Oh, wait, I do. But companies are already selling it :(

Well, time to delete my Wordpress account then. Gonna be a lot of content I gotta archive before then. ;-;
I work in marketing, and every client I work with who has a WordPress website is using AI to write a lot of their content. This is going to lead to circularly trained AI for sure.
Are you sure your clients aren’t AI also
Dead internet theory in action
It was half dead and suicidal even before AI
No way for me to know. My programming doesn’t allow it.
pretty sure this only applies to .com wordpress not self hosted
Not sure, especially since they compare it to the Squarespace deal, which I believe is for all sites built on the platform.
there is no self hosted squarespace
My misunderstanding. But it looks like you need a .org to self-host WP, and like 99% of WP-built sites are .com as far as I’ve seen. I definitely do not know the technicals about different ways to host/build on the same platform, so I certainly defer to you there, but in any case, my bet is that any site/platform that gets scraped indiscriminately will lead to a lot of circular AI training.
There are A LOT of self hosted Wordpress sites out there. Many of them you wouldn’t know unless told they were Wordpress (I believe both The Verge and TechCrunch use self hosted Wordpress). I myself have two self hosted Wordpress sites. Though I’ve been considering moving away from Wordpress for a while now.
Yeah there are def more self hosted than not. Wordpress.org is just the site for the open source project. Most hosting sites come with 1 click WordPress installs. I’ve built so many sites with it.

I’m assuming this just relates to WordPress.com rather than the open-source WordPress.org but it’s still a bummer. I’ve worked with the open source platform for over a dozen years and have started to kinda loathe what it’s turned into but I’m not sure I’m yet at the point where I’m ready to migrate a bunch of sites to something else. This could be that push if they keep going down this road.

God, am I getting too old for this shit? I’m a pretty technical person but this AI nonsense is just relentless. I’m not philosophically against the idea of AI as like any tool it has the potential to better the world, but every tech company and their dog are going all in on using it for commercial bullshit that seems to provide very little value to society. Even fucking Mozilla is going in that direction.

Mozilla seems more towards local and privacy preserving AI Dev, no? Both are really lacking in the space IMHO

Like I’m not interested in what the collective of digital knowledge looks like behind several corporate filters and giant rent seeking moat.

True, and I get that realistically they do need to diversify away from Firefox … but it still feels bandwagoney to me given that seemingly every tech company (and Wendy’s) are piling into the AI train all at once. Like I said, though, I think I’m just getting too old for this.

They were already making some good work in the field before but they trended away from it.

Honestly it just seems like they struggle with follow through.

Mozilla’s business is sucking up to Google for that vendor money they spend to avoid litigation (and it’s not working).
I know about the funding problem, but what do you mean on the litigation piece?
Google gives Mozilla its money to appear that they aren’t trying to corner the browser space with Chrome. If they win the argument in court they aren’t monopolizing, they don’t have to give Mozilla shit anymore.
It’s the new NFTs and Crypto but it’s not blatantly a scam so the companies that slipped out on those sure as shit will be hopping onto AI
There’s already several WordPress plugins to block out Generative AI. I expect the community to have a less than chipper attitude about this over Automattic.
Glad to hear it - I haven’t thought to check for myself.
It's crazy that it sounds like paying customers might also have to opt-out.
I wonder if there is a text equivalent of Glaze and Nightshade, to perform adversarial attacks on AI scraping the text.
We could frump car weasel achieve this toad affiliate by hand.
See n th iso n Rhett-It yes Terdhay + Ever Yonne wasa skyng if-OP hat astro ke
All of this is predicated on having some company that can afford to pay and wants this data. Or, the next tech bubble will just be VCs throwing money at AI companies training their models on the old internet.
Funny how all of these social media platforms that were so happy to describe themselves as "the public town square of the internet" or whatever are now claiming that they own everything that everyone ever posted. So, which is it? Because it obviously cannot be both.
Depends on the day it is more convenient in.

both

Town-square when they lure you in, they own everything when they sell your ass off.

I welcome this change actually. Now users can clearly see what others have been saying forever: If you don’t pay for the product, you ARE the product.
And sometimes when you pay you’re still the product. Smart TVs, Oculus, etc.

If you don’t pay for the product, you ARE the product.

Well, that’s not always true. I don’t pay for Wikipedia, am I the product?

Explain how I’m the product relative to Linux.
Have you told anyone to switch to Linux?
With Linux you pay for support if you ever need it. Most end users will never need support, but businesses running Linux servers pay Red Hat a shit load to support them in case shit ever hits the fan. Like giving away a free car, but only certain people know how to do maintenance on it, and they all work for the manufacturer.
I’m not a business, so it doesn’t apply to me.

Matt’s selling it.

The teams at Wordpress and Tumblr have made it known that they absolutely don’t want this shit.

Good maybe now everyone will stop using bloody shopify.