Wikimedia urges AI firms to stop scraping its content and to use its paid Enterprise API

The Wikimedia Foundation asks AI companies to stop scraping Wikipedia and to pay for content via its Enterprise API, citing the financial risk posed by declining human traffic and growing AI use.

AlternativeTo

Podcast: What the First AI Copyright Ruling Means for Authors

ALLi Director Orna Ross unpacks the first court ruling on AI’s use of copyrighted books and why it matters to indie authors. She explains how the judge balanced “fair use” with pirate copying, then walks through ALLi’s Four Cs—consent, compensation, clarity, and…
https://selfpublishingadvice.org/podcast-ai-copyright-ruling/

#Podcast #AIcopyrightruling #authorrights #contentscraping #creativepublishing
@indieauthors

Podcast: What the First AI Copyright Ruling Means for Authors

Orna Ross explains what the first AI copyright ruling means for authors, including its impact on rights, fair use, and future protections.

The Self-Publishing Advice Center
Content Scraping: BBC threatens Perplexity with legal action

The AI search engine Perplexity is alleged to be using content from Britain's public service broadcaster. Perplexity, for its part, suspects monopolistic behavior.

heise online

Content delivery partner Cloudflare released new technology to help website owners prevent their content from being scraped without permission by bots that train AI.

#ContentScraping #WebsiteProtection https://www.msn.com/en-us/news/technology/cloudflares-new-free-tool-stops-bots-from-scraping-your-website-content-to-train-ai/ar-BB1pu9LF

MSN

This is sparking interesting discussions. https://www.wired.com/story/perplexity-is-a-bullshit-machine/ Hallucinations and “bullshitting” are definitely an AI thing; I’d probably say it’s their best feature… But Wired’s article focuses on an important topic: scraping content without permission and ignoring robots.txt, among other things. Perplexity is not the first and won’t be the last to do this, and it definitely causes harm to publishers. The question is: “Why is this happening?” It’s not just because AIs need more accurate sources (instead of making stuff up); imho it’s because finding the right content has become increasingly challenging. Search engines are dominated by SEO practices, and search results are disappointing at best. News sites, obviously in need of revenue, are paywalling everything. In many fora, sites like archive.is and unpaywall extensions are often praised under the “free the information” slogan. RSS feeds kind of play a role there too, because in many cases they don’t drive people to visit the original websites. I think this is not much different from what AIs are doing now. I’m not saying this is legal or ethical; it’s just a fact.
So, my question is: is the ball in the court of the AIs, which need to be regulated, or in the publishers’ court, to identify other ways of getting revenue out of this?
#AI #Hallucinations #ContentScraping #SEO #Paywalls #AIRegulation #News #Publishers #Ethics #searchengine #perplexity #chatgpt
Perplexity Is a Bullshit Machine

A WIRED investigation shows that the AI-powered search startup Forbes has accused of stealing its content is surreptitiously scraping—and making things up out of thin air.

WIRED

😤 #Scraperbots are automating data theft, extracting your website's content without permission! 🌐

💣 Learn about the impact of scraper bots and how to prevent them: https://bit.ly/3RiXgya

#contentscraping #bots #webscrapers #webcrawlers #scraping #waf #botmanagement #waap #scrapingbots #apptrana #indusface

Content Scraping Bots - Techniques & Mitigation | Indusface

Discover what content scraping bots are, why they're used, how they scrape content, the types of content they target, and how to effectively prevent them.

Indusface