A pay-to-scrape AI licensing standard is now official

A licensing standard aimed at making AI companies pay for the content they scrape across the web is now official. With the publication of the RSL 1.0 spec, publishers can dictate licensing rules and indicate whether they want their content to appear in AI search.

The Verge

In this Diff article, we explain a recent update we made to Wikipedia's user traffic data, the trends that the data reveals, how the Foundation is responding, and how you can help us.

https://diff.wikimedia.org/fr/2025/11/07/nouvelles-tendances-chez-les-utilisateurs-de-wikipedia/ #Scraping, #ScrapingBots

New trends among Wikipedia users

Isiwal, CC BY-SA 4.0 via Wikimedia Commons. In March, the Wikimedia Foundation shared the global trends that are affecting our movement. These trends have continued to shape not only…

Français
How crawlers impact the operations of the Wikimedia projects

Since the beginning of 2024, the demand for the content created by the Wikimedia volunteer community – especially for the 144 million images, videos, and other files on Wikimedia Commons – has grow…

Diff

If #Cloudflare is to be believed, #Lemmy instances have a built-in AI scraping bot operating under the hood. Do you think the developers have snuck it in?

In my logs, the requests below were all blocked by Cloudflare because they were identified as "AI Bots". The logs show many more blocked requests from Lemmy instances; this is just a sample. Other Lemmy requests from these servers get through. Only a few are blocked as AI Bots.

Cloudflare says it uses AI to determine whether a request is legitimate or comes from an AI bot trying to scrape.

207.204.58.144
AS19045 DIRECTCOM
United States
User agent: Lemmy/0.19.5; +https://lemmy.cryonex.net

23.127.223.238
AS7018 ATT-INTERNET4
United States
User agent: Lemmy/0.19.3; +https://lemux.minnix.dev

2a01:cb19:f85:ec00:82fa:5bff:fe51:ed4a
AS3215 France Telecom - Orange
France
User agent: Lemmy/0.19.5; +https://lemmy.sidh.bzh

50.247.53.42
AS7922 COMCAST-7922
United States
User agent: Lemmy/0.19.5; +https://toast.ooo

69.42.19.234
AS11404 AS-WAVE-1
United States
User agent: Lemmy/0.19.5; +https://lemmy.schlunker.com

155.138.226.183
AS20473 AS-CHOOPA
United States
User agent: Lemmy/0.19.5; +https://lemmy.mbl.social
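
A rough way to look for patterns in a sample like this is to parse the entries and count them by Lemmy version and country. The sketch below is only an illustration. It assumes each entry is four lines (IP, ASN and network name, country, user agent) separated by blank lines, as pasted above. The names `SAMPLE`, `UA_PATTERN`, and `parse_records` are made up for this example and are not part of Cloudflare or Lemmy.

```python
import re
from collections import Counter

# The six entries from the post, copied as-is (assumed layout: IP, ASN, country, UA).
SAMPLE = """\
207.204.58.144
AS19045 DIRECTCOM
United States
User agent: Lemmy/0.19.5; +https://lemmy.cryonex.net

23.127.223.238
AS7018 ATT-INTERNET4
United States
User agent: Lemmy/0.19.3; +https://lemux.minnix.dev

2a01:cb19:f85:ec00:82fa:5bff:fe51:ed4a
AS3215 France Telecom - Orange
France
User agent: Lemmy/0.19.5; +https://lemmy.sidh.bzh

50.247.53.42
AS7922 COMCAST-7922
United States
User agent: Lemmy/0.19.5; +https://toast.ooo

69.42.19.234
AS11404 AS-WAVE-1
United States
User agent: Lemmy/0.19.5; +https://lemmy.schlunker.com

155.138.226.183
AS20473 AS-CHOOPA
United States
User agent: Lemmy/0.19.5; +https://lemmy.mbl.social
"""

# Matches the user-agent lines exactly as they appear in the sample.
UA_PATTERN = re.compile(
    r"User agent: Lemmy/(?P<version>[\d.]+); \+https://(?P<host>\S+)"
)


def parse_records(text):
    """Split the pasted sample into one dict per blocked request."""
    records = []
    for block in text.strip().split("\n\n"):
        lines = [line.strip() for line in block.splitlines()]
        if len(lines) != 4:
            continue  # skip anything that does not fit the assumed layout
        ip, asn_line, country, ua_line = lines
        match = UA_PATTERN.match(ua_line)
        if match is None:
            continue
        asn, _, network = asn_line.partition(" ")
        records.append({
            "ip": ip,
            "asn": asn,
            "network": network,
            "country": country,
            "version": match.group("version"),
            "host": match.group("host"),
        })
    return records


records = parse_records(SAMPLE)
versions = Counter(record["version"] for record in records)
countries = Counter(record["country"] for record in records)

for version, count in versions.most_common():
    print(f"Lemmy {version}: {count} blocked request(s)")
for country, count in countries.most_common():
    print(f"{country}: {count} blocked request(s)")
```

Run against the full log rather than this six-entry sample, the same tally could show whether the blocks cluster around particular versions, networks, or instances.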

#MastoAdmin #AIBots #Scrapers #Scraping #ScrapingBots #privacy

lemmy.cryonex.net

😤 #Scraperbots are automating data theft, extracting your website's content without permission! 🌐

💣 Learn about the impact of scraper bots and how to prevent them: https://bit.ly/3RiXgya

#contentscraping #bots #webscrapers #webcrawlers #scraping #waf #botmanagement #waap #scrapingbots #apptrana #indusface

Content Scraping Bots - Techniques & Mitigation | Indusface

Discover what content scraping bots are, why they're used, how they scrape content, the types of content they target, and how to effectively prevent them.

Indusface