Mastodawn

Fascinating: Judge rules scraping company couldn't violate Meta's terms of service if it wasn't logged in & subject to them. Wonder what implications that might have for AI crawlers and news sites:
Federal judge rules against Meta in data scraping case https://www.courthousenews.com/federal-judge-rules-against-meta-in-data-scraping-case/

Bright Data could not violate Meta's terms of service if it was not a user of Meta's services, a federal judge ruled.

Jason ON Jan 24, 2024

@jeffjarvis

This sort of decision will definitely result in a further walled-off internet.

Cassandrich Jan 24, 2024

@jeffjarvis This is the obvious outcome of the "we get to make private law via terms of service rather than relying on existing real law to meet our needs" bs.

John Bergquist 🍥 Jan 24, 2024

@jeffjarvis I agree with that decision. You can't be bound to arbitrary terms just by ‘stumbling’ into ‘free’ content.

bit 🏳️‍🌈 (he/him) Jan 24, 2024

@jeffjarvis
The ruling seems limited to a terms of service issue, copyright being independent from it.

Jeremy Chatwin Jan 24, 2024

@jeffjarvis The court makes a distinction between a visitor and a user, with a user being a visitor who has logged in. Further, the user is subject to the TOS but the visitor is not.
This suggests that scrapers for AI training, being visitors only, are free to use any data without restrictions, no?
Clearly there must be some restraint on visitors' use of data. Is it only copyright, or is something more needed?

Andreas K Jan 24, 2024

@jeremychatwin @jeffjarvis copyright at best.

And data as such is not copyrightable in the good old USA.

It's another story in the EU, where compilations of data (aka databases) can be copyrighted. (It's also one of the lighthouse showcases demonstrating how copyright is actually harming businesses creating IP that is protected by copyright: since we have had database copyright, database creation in the EU has slowed significantly.)

Jeremy Chatwin Jan 24, 2024

@yacc143 @jeffjarvis my layman's understanding of US law is that copyright is asserted on 'works of authorship published' where published means 'offered up for sale'.
Thus, that subset of data deemed 'works of authorship', for example news articles written by a human, appears to be protected by copyright law, at least in the US.
Scraped data outside of this subset appears to be the subject of the original dispute. Is such data legally a 'database' pursuant to EU law?

Jørn Jan 24, 2024

@jeffjarvis @yawnbox I think this is a good outcome. Scraping should be legal.

What you do with the scraped data afterwards is another story; just because the data was publicly available doesn’t mean you can put it in a plagiarism machine.