Mastodawn

Jakub Dec 16, 2023

OpenAI suspends ByteDance’s account after it used GPT to train its own AI model: Isn't it hypocritical to use the copyrighted work of others without permission but deny others the same opportunity? OpenAI scraped the entire Internet but is now acting holier than thou.
https://www.theverge.com/2023/12/15/24003542/openai-suspends-bytedances-account-after-it-used-gpt-to-train-its-own-ai-model
OpenAI can train on various data types such as text, images, books & videos without permission, but competitors cannot access it once it's in their system. So, yes, it is a problematic version of copyright.

((( Geekosaurus ))) Dec 16, 2023

@nixCraft scrapping the internet might be a good idea in some ways but I'm pretty sure you meant "scraped"

nixCraft 🐧 Dec 16, 2023

@kahomono yes, i fixed it. sorry about that.

Freevolt Dec 16, 2023

@nixCraft ngl. In this case, I'm slightly in favor of OpenAI. Deciding which data to collect can be their thing, right? (even though it's quite questionable) Directly learning from it means using the data itself, PLUS the choice of which data was collected, which they can legitimately claim as theirs.

Atanas Laskov 🏳️‍🌈 Dec 16, 2023

@nixCraft in the near future AI models feeding off each other's hallucinations will go completely nuts

Egee Dec 16, 2023

@nixCraft You have some really good takes 👍

Jordan Biserkov Dec 16, 2023

@nixCraft "Open" "A" "I" lives up to its "name" once again.

The name was chosen using the oldest trick in the book: take the product's greatest weakness and advertise its opposite.

Zeki Çatav 🤔 ☕ 🕯️🎶 Dec 16, 2023

@nixCraft When entering OpenAI, the data comes from the left side and copyleft is valid. When leaving OpenAI, the data comes from the right side and copyright is valid. This is the situation. 😋

mirabilos Dec 16, 2023

@nixCraft all violations down to the base, no matter what the stinking OSI pretends

The Keymaker Dec 16, 2023

@nixCraft Do as I say, not as I do.

WowSuchCyber Dec 16, 2023

@nixCraft most of what OpenAI uses comes from https://commoncrawl.org. I wonder why so few talk about that one. You can download the data (if you have the storage...)

Common Crawl - Open Repository of Web Crawl Data

We build and maintain an open repository of web crawl data that can be accessed and analyzed by anyone.

Gilles Bonnet Dec 16, 2023

@nixCraft this is called a circular reference, so in the end, AI feeds itself...

Sumukha S Dec 16, 2023

@nixCraft ByteDance can do the same, right? Build their own AI from the Internet. No one is stopping that. I don’t think this is fair to OpenAI.

Jylie Dec 16, 2023

@nixCraft if you have to train AI with AI, isn't it pointless? a clone of a copy is just more fake than the forgery

lupus_blackfur Dec 16, 2023

Silly you...

Trying to apply reason, thinking, and anything resembling rationality to such things...

😁😁

Timothy R. Butler Dec 16, 2023

@nixCraft I don’t necessarily think so. It’s one thing to scrape content; so much of the Internet depends on spiders, and if AI is actually going to be AI, it needs to be able to “read” just like you and I do. It’s another to actually use a program. I can read anything public on the web freely, but I have to sign a license agreement to use GPT or Excel or Photoshop.

molytov Dec 16, 2023

@nixCraft Corporations built on exploiting the internet for profit and power are now eating each other, huh...