#Cloudflare 09:00: I will protect you from all these scrapers smoking your CPU and web server bandwidth.

Takes a lunch 🥪

Cloudflare 13:00: Hey AI-model hot boys… I have a 1-click #scraper for you to fill up that model!

https://developers.cloudflare.com/changelog/post/2026-03-10-br-crawl-endpoint/

Crawl entire websites with a single API call using Browser Rendering

Browser Rendering's new /crawl endpoint lets you submit a starting URL and automatically discover, render, and return content from an entire website as HTML, Markdown, or structured JSON.


New rule: Every time I notice an overnight outage with my website, a new scraper gets added to my robots.txt file.

Welcome to the list, "Amzn-SearchBot".
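For anyone keeping a similar list, here is a minimal sketch of checking that a new robots.txt entry does what you expect, using Python's standard `urllib.robotparser`. The robots.txt body and the `is_allowed` helper are assumptions made for illustration, not the poster's actual file, and robots.txt only helps against crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named scraper, leave everyone else alone.
ROBOTS_TXT = """\
User-agent: Amzn-SearchBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.org is a placeholder host.
    print(is_allowed(ROBOTS_TXT, "Amzn-SearchBot", "https://example.org/page"))
    print(is_allowed(ROBOTS_TXT, "SomeBrowser", "https://example.org/page"))
```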

#Amazon #Scraper #AIScraper

リト - GitHub - D4Vinci/Scrapling: 🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

Source: github.com | 🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl! - D4Vinci/Scrapling

Has anyone seen a huge surge of #ai #bot #scraper traffic from #singapore recently? Our services were hit (non-destructively), but it's been worse for some orgs we know. Any reporting / info on this? Just a new cluster of models being built?

Could I get some admins of bigger instances to look in their access.log and see if there's more than just the good-for-nothing scraper using "Apache-HttpClient" as part of its user agent, or if there's any actual fedi software using this?

See: https://wolfi.ee/@jase/statuses/01KHB81TRFKKXC1C72M8Z8GFYD

I'm going to be adding that to my user agent blocklist of #BadBotBlocker, as I have nothing but the public timelines scraper using it, but want to be sure.
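For admins who want to check their own logs, here is a rough sketch that tallies the full user agent strings containing "Apache-HttpClient". It assumes the common "combined" access log format, where the user agent is the last quoted field. The sample lines, IP addresses, and the `count_matching_agents` name are made up for illustration.

```python
import re
from collections import Counter

# Assumption: "combined" log format, user agent is the last quoted field on the line.
LINE_RE = re.compile(r'^(?P<ip>\S+) .*"(?P<ua>[^"]*)"\s*$')

# Made-up sample lines using documentation-reserved IP ranges.
SAMPLE = [
    '203.0.113.5 - - [10/Mar/2026:09:00:00 +0000] "GET /api/v1/timelines/public HTTP/1.1" 200 512 "-" "Apache-HttpClient/4.5.13 (Java/17)"',
    '198.51.100.7 - - [10/Mar/2026:09:00:01 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]


def count_matching_agents(lines, needle="Apache-HttpClient"):
    """Count requests per full user agent string that contains needle (case-insensitive)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and needle.lower() in match.group("ua").lower():
            counts[match.group("ua")] += 1
    return counts


if __name__ == "__main__":
    # For a real log, pass an open file instead, e.g. open("access.log").
    for agent, hits in count_matching_agents(SAMPLE).most_common():
        print(hits, agent)
```

Seeing the full strings, rather than just a hit count, makes it easier to tell a lone scraper apart from legitimate software that happens to use the same HTTP client.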

#Scraper #ScraperNoScraping #FediAdmin #MastoAdmin #SysAdmin #Fediblock

Wolfie Jase is lewd your friendly neighborhood terrorist (@[email protected])

Sensitive content: Public timeline fedi scraper, using Apache-HttpClient in its user agent

One Open-source Project Daily

Fast and simple video download library and CLI tool written in Go

https://github.com/iawia002/lux

#1ospd #opensource #bilibili #crawler #download #downloader #go #golang #iqiyi #qq #scraper #tumblr #video #youku #youtube

Does anyone know whether there's a way to extract the #Shifts from #Microsoft #Teams?

I'd like to add the data to the family #calendar.

Is there an #api for that, or failing that, a #scraper?

#msteams #frage

Robots.txt Generator - Retro Terminal Edition - More than 200 bots in the free version. Pure HTML, JavaScript, and a bit of CSS. No third parties, no framework, no CDN, no cookies, no tracking, no ads, no Big Tech nonsense, no AI, very privacy-friendly. Simple and effective, in retro style. Coming online soon.
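The announced tool is a browser app and its internals aren't shown, so the following is only a rough sketch of the general idea behind a robots.txt generator: turn a list of bot names into one blocking group. The `build_robots_txt` function and the "ExampleBot" name are hypothetical.

```python
def build_robots_txt(blocked_bots, allow_others=True):
    """Build robots.txt text that disallows every listed bot from the whole site."""
    # Several User-agent lines may share one group of rules.
    lines = [f"User-agent: {bot}" for bot in blocked_bots]
    lines.append("Disallow: /")
    if allow_others:
        # An empty Disallow value means nothing is disallowed for other agents.
        lines += ["", "User-agent: *", "Disallow:"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "ExampleBot" is a placeholder, not a real crawler name.
    print(build_robots_txt(["Amzn-SearchBot", "ExampleBot"]), end="")
```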

#teufelswerk #HTML #javascript #app #entwicklung #code #retro #css #robotstxt #generator #stopbots #bots #crawler #scraper #keineKI #cookieless #datenschutz

Jonathan Corbet (@[email protected])

So @lwn is currently under the heaviest scraper attack seen yet. It is a DDOS attack involving tens of thousands of addresses, and that is affecting the responsiveness of the site, unfortunately. ...