Blocking Crawls From Cloudflare's Browser Crawl Endpoint

Short post detailing how to identify and block requests from Cloudflare's browser rendering service.

www.bentasker.co.uk
🚫 Oh, the irony! A guide on remembering EVERYTHING you've ever Googled—except it forgot how to let you in! 🤔 Apparently, even openresty's #memory has its limits, and yours will too if you keep waiting for this. 📚❌
https://ellanew.com/2026/03/02/ptpl-197-record-retrieve-from-a-personal-knowledgebase #irony #openresty #techhumor #googling #HackerNews #ngated
🚫 Oh, the profound journey of AI grief—denied access to enlightenment by a 403 wall of wisdom. It's like trying to philosophize with a brick wall while #OpenResty laughs in 1s and 0s. 🤖💔
https://sellsbrothers.com/the-5-stages-of-ai-grief #AIgrief #techphilosophy #403forbidden #digitalwisdom #brickwall #HackerNews #ngated
🚫😆 In a groundbreaking exposé on AI's true purpose, we learn that it's actually designed to perfect the art of saying "NO ENTRY" with style. #openresty is the future, folks - the future of rejections! 🚀🔒
https://www.chrbutler.com/what-ai-is-really-for #AI #NoEntry #FutureOfRejections #TechExposé #InnovationInStyle #HackerNews #ngated
🚨BREAKING NEWS🚨: The German government bravely stands against Chat Control by... blocking their own press release with a "403 Forbidden" 😂. Looks like #openresty has taken up a new role as the gatekeeper of free speech in Germany! 🙈💻
https://xcancel.com/paddi_hansen/status/1975595307800142205 #BreakingNews #FreeSpeech #Germany #ChatControl #HackerNews #ngated
Patrick Hansen (@paddi_hansen)

Great news and big win for privacy in the EU! 🇪🇺🇩🇪 Germany’s ruling CDU/CSU party made it clear today: there will be no chat control - as pushed for by other EU countries - with this German government.

Nitter

What is the difference between GET and POST? I'd say it is just the method that differs.

RFC 7231 defines HTTP semantics and content. Unfortunately, it does not clearly specify what should happen to a GET request that carries a body:

"A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request."

RFC 9110 likewise says:

"A client SHOULD NOT generate content in a GET request unless it is made directly to an origin server that has previously indicated, in or out of band, that such a request has a purpose and will be adequately supported. An origin server SHOULD NOT rely on private agreements to receive content, since participants in HTTP communication are often unaware of intermediaries along the request chain."

However, this design goal does not rule out GET explicitly rejecting a request body, nor does it rule out the specification stating that a request body should not be sent.

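Unfortunately, on today's internet there is a long way between our request and the server. In between there can be caching proxies such as Squid, CDNs such as #Cloudflare, and reverse proxies such as #OpenResty. For GET to carry a body, you have to make sure all of these components handle it correctly. Likewise, on the browser side, most browsers ignore a body in GET or warn about it.

In other words, unless you fully control the whole path from the browser to the final server, passing a body with GET is not a recommended approach. There is a new method named QUERY that is still at the draft stage; if you are interested, see the links below.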

Further reading:
- https://evertpot.com/get-request-bodies/
- https://www.ietf.org/archive/id/draft-ietf-httpbis-safe-method-w-body-02.html (archived)
- https://httpwg.org/http-extensions/draft-ietf-httpbis-safe-method-w-body.html
- https://www.baeldung.com/cs/http-get-with-body
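
To make the point about GET bodies concrete, here is a minimal Python sketch using only the standard library. It assumes a direct connection to an origin server you control; the handler `EchoLengthHandler`, the helper `get_with_body`, and the `/search` path are hypothetical names made up for illustration. It only shows that a client can put a body on a GET and that a cooperating origin can read it. It says nothing about what Squid, Cloudflare, OpenResty, or a browser would do with the same request.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoLengthHandler(BaseHTTPRequestHandler):
    """Hypothetical origin that reports how many body bytes a GET carried."""

    def do_GET(self):
        # A GET body is only readable if the client announced its length.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else b""
        reply = f"received {len(body)} body bytes".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(reply)))
        self.end_headers()
        self.wfile.write(reply)

    def log_message(self, *args):
        # Keep the example quiet: suppress per-request logging.
        pass


def get_with_body(payload: bytes) -> str:
    """Send a GET with a body straight to a local server; return its reply."""
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), EchoLengthHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        # http.client sets Content-Length from the bytes payload.
        conn.request("GET", "/search", body=payload,
                     headers={"Content-Type": "application/json"})
        text = conn.getresponse().read().decode()
        conn.close()
        return text
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(get_with_body(b'{"q":"openresty"}'))
```

This works here because nothing sits between client and server. Once a cache, CDN, or reverse proxy is on the path, the body may be dropped or the request rejected, which is the situation the RFC 9110 wording warns about.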

Request bodies in GET requests

🖥️ My ultra-budget server powering http://websysctl.alfonsosiciliano.net has been running smoothly for the past 2 months. So far, so good!

📈 #Crawlers hit tens of thousands of sysctl parameter pages daily. That's fine, since robots.txt allows it. But why keep requesting non-existent pages as if the site were built with WordPress? 😤 Fortunately, the stack (#FreeBSD + #OpenResty 🌐 + #Lapis ✏️ + a custom-built #database 📦) stays well within the limited resources of my $5/month cloud server.

The code might soon be #OpenSource, stay tuned!

#UNIX #sysctl #WebDev #WebServer #ThePowerToServe #coding #Lua #kernel

I've said this before, but I can't say it enough: OpenResty (nginx+lua) is painfully underappreciated. It's fantastic, and the only platform I trust for very high volume, high performance, mission critical web application servers.

#openresty #nginx #webdev

I can’t exactly recall when this started happening but I want to say it was when I updated my container that runs #nginx / #openresty from #ubuntu 22 to 24.04.

Now I get these uptime kuma notifications at least every hour. There are no errors in OpenResty and no changes to my configuration. No changes in my DNS provider either. Checked #pihole and no obvious errors there (and no changes to the config). Anyone seen this before?