Mastodawn

some_guy Jun 28, 2024

Microsoft’s AI boss thinks it’s perfectly OK to steal content if it’s on the open web

https://lemmy.sdf.org/post/18848765

Buffalox Jun 29, 2024

copying is not theft

GamingChairModel Jun 29, 2024

Yeah, I’m not a fan of AI, but I’m generally of the view that anything posted on the internet and visible without a login is fair game for indexing by a search engine, snapshotting as a backup (like the Internet Archive’s Wayback Machine), or running user extensions on (including ad blockers).

sugar_in_your_tea Jun 30, 2024

Yes, it kind of is. A search engine just looks for keywords and links, and that’s all it retains after crawling a site. It’s not producing any derivative works, it’s merely looking up an index of keywords to find matches.

An LLM can essentially reproduce a work, and the whole point is to generate derivative works. So by its very nature, it runs into copyright issues. Whether a particular generated result violates copyright depends on the license of the works it’s based on and how much of those works it uses. So it’s complicated, but there’s very much a copyright argument there.

TheRealKuni Jun 30, 2024

An LLM can essentially reproduce a work, and the whole point is to generate derivative works. So by its very nature, it runs into copyright issues.

Derivative works are not copyright infringement. If LLMs are spitting out exact copies, or near-enough-to-exact copies, that’s one thing. But as you said, the whole point is to generate derivative works.

sugar_in_your_tea

Derivative works are not copyright infringement

They absolutely are, unless they’re covered by “fair use.” A “derivative work” doesn’t mean you created something that’s inspired by a work, but that you’ve modified the work and then distributed the modified version.