Mastodawn

a very weeny construct 💀Sep 5, 2025

woo! connected to a test instance of prosody xmpp on port 443 (https) using xmpp's direct-tls protocol

which means

hopefully, certain employers won't block the traffic anymore
hopefully, it can pass through sniproxy the same way https traffic does, which will enable ipv4 users to talk to my ipv6 xmpp server

there's still different ways i can think of to wire all this together, wondering if any would be better

a very weeny construct 💀Sep 5, 2025

other ways i could wire it up:

instead of using a separate ip address so it and caddy can both listen on port 443, i could have caddy reverse proxy to it
might need to put it behind a proxy anyway because it might not handle PROXY protocol from sniproxy
might need to put it behind a proxy anyway so iocaine can slap the llm scrapers that try it
maybe run it in a container and figure out how to get those their own ip address

a very weeny construct 💀Sep 11, 2025

wow incus (anti-ubuntu fork of lxd) and its web ui is pretty slick

also libvirt and virt-manager connected to lxc offers the ability to create an application container or an operating system container

(compared to incus which says application containers require docker?)

this feels like a deep rabbit hole, hope i can get to grips with it soon

a very weeny construct 💀Sep 14, 2025

all right incus there you go, a whole-ass lvm volume group all to yourself, let's see what you do with it

also: learning wtf a network bridge is and how to use one 🧑‍🎓📖 instead of the usual winging it with vague guesswork and assumptions from context

a very weeny construct 💀Sep 16, 2025

is xmpp's direct-tls protocol (usually on port 5223) the same as its unencrypted protocol (usually port 5222) wrapped in tls? same as imaps and pops and https? so could i terminate the tls with haproxy and reverse proxy to an unencrypted xmpp server?

the protocol is all spec'd in RFCs for anybody to look at but they don't wanna get in my brain

also, aw, the wikipedia article for xmpp describes as an example a transport for icq, rip https://en.wikipedia.org/wiki/XMPP

a very weeny construct 💀Sep 16, 2025

the docs don't reveal the info i want, and i don't want to try reading reams of source code in an unfamiliar language right now, so i'll set up an experiment and see how it behaves i guess

(the only activity that ever feels slightly close to doing science in my field of software-jiggling)

a very weeny construct 💀Sep 17, 2025

my ipv4-only client
-> ipv4-to-ipv6 sniproxy port 443
-> ipv6-only vm
-> haproxy to conditionally unwrap proxy protocol
-> prosody xmpp server

... experiment is working ✨🤩✨
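
a minimal sketch of what "conditionally unwrap proxy protocol" could look like, assuming the v1 (human-readable) PROXY header. whether sniproxy sends v1 or v2 in this setup isn't stated, and `ProxyInfo` and `split_proxy_v1` are hypothetical names for illustration, not anything from haproxy:

```python
from typing import NamedTuple, Optional, Tuple


class ProxyInfo(NamedTuple):
    family: str      # "TCP4", "TCP6" or "UNKNOWN"
    src_addr: str    # original client address as reported by the proxy
    dst_addr: str
    src_port: int
    dst_port: int


def split_proxy_v1(data: bytes) -> Tuple[Optional[ProxyInfo], bytes]:
    """Strip a PROXY v1 header if present; otherwise pass the bytes through."""
    if not data.startswith(b"PROXY "):
        # no header: treat it as a direct connection
        return None, data
    line, sep, rest = data.partition(b"\r\n")
    if not sep:
        raise ValueError("incomplete PROXY header")
    parts = line.decode("ascii").split(" ")
    if parts[1] == "UNKNOWN":
        return ProxyInfo("UNKNOWN", "", "", 0, 0), rest
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY header")
    _, family, src, dst, sport, dport = parts
    return ProxyInfo(family, src, dst, int(sport), int(dport)), rest
```

the header is only worth trusting when the connection comes from the known proxy's address, since anybody else can forge it.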

a very weeny construct 💀Sep 18, 2025

calling it now: even though haproxy has lots of sharp edges, i like it, or its configuration mechanism, way more than caddy's

i seem to be able to make stuff work in haproxy that takes struggle and uncertainty in caddy
caddy's magical get-your-free-ssl-cert-automatically is nice when you want to stand up an experiment but i like cronned certbot for "prod"
otoh caddy has a nice builtin static webserver 🤷

i think i'm going to be stuck using both for a while

a very weeny construct 💀Sep 18, 2025

migrating xmpp services from my old vps to my colocataires vm is the last thing remaining to do before i'm able to delete the old account (and stop paying for it)

(dreamcompute hasn't been bad, but i like colocataires way better)

now that i've proven to myself that it can work over port 443 and ipv6-only, it's time to configure it properly

next: see if i can move the service over without xmpp clients complaining

but first, sleep 😴

Colocataires: Host with Friends

We offer artisanal server colocation and virtual machine hosting in our Ottawa, Canada-based data center. Pick from our standard rackmount or VM options or propose something custom.

a very weeny construct 💀Sep 28, 2025

all the clients we usually use on android and linux are now connecting to my new xmpp server at colocataires, with no settings changes, on the https port so it looks like website traffic 😮

i don't have stun/turn turned back on yet, so voice and video calls probably won't work just yet

i don't have the converse.js web client set back up yet but that's mostly for emergency use

this may be success enough to decommission the old servers! 🎉

a very weeny construct 💀Sep 28, 2025

also need to double check to make sure i haven't left any gaping security holes open, like exposing a server accepting proxy protocol to the fetid internet

a very weeny construct 💀Oct 4, 2025

https://conversejs.org/ is back up on my domain and working just fine 🎉

old dreamcompute vps is turned off, sitting there just in case for a bit, then i can delete my account 🎉

(i don't hate dreamcompute but i like https://colocataires.dev way better)

think i'm just gonna leave stun/turn not running for now. if both parties are on ipv6-capable networks, calls should work. let's see how often that's an issue

Converse

Converse.js - Open source, web-based XMPP chat client. Self-hosted, customizable web chat with end-to-end encryption.

a very weeny construct 💀Oct 4, 2025

if i want to set up stun/turn, i should abandon my somewhat irrational ipv6-purist intentions and pay the loonie for an ipv4 address

if i'm gonna do that then maybe i can keep almost all of my vm ipv6-only, except for one container that runs coturn?

if i'm gonna do that then maybe i can figure out how to make that container, that only does whatismyipaddress and proxy video calls, shareable with my datacenter neighbors

Loonie - Wikipedia

a very weeny construct 💀Oct 6, 2025

my prosody xmpp setup in its new server mostly works great (assuming ipv6 capable network) but somewhere i've introduced a timeout that closes the connection after a uniform number of seconds

pretty sure it has something to do with haproxy, though from a skim of the docs these timeouts are supposed to apply to the initial connection setup, not inactivity

also after a chat with someone more knowledgeable i think i'm resigned to eventually acquire that ipv4 address

a very weeny construct 💀Oct 7, 2025

i think i might have fixed haproxy closing the socket on my long-lived idle xmpp connections by setting timeout tunnel 1h

i'll check again in several hours to be sure

wish i knew more precisely why this fixes the issue. are clients and/or the server sending keepalive messages more often than 1h but less often than 10m? is the tcp keepalive stuff not being used? someday perhaps but more likely i'll leave it unexamined as long as it works

a very weeny construct 💀Oct 8, 2025

recent impulses have been like

"this is too long for toots i should blog about it"
"but i don't want to put anything new or deploy a new website until i have something installed to block the scrapers like iocaine"
"so let's install it"
"ehh not enough brain rn, maybe next time"

so i set up some rules to block a good portion of bots (until they smarten up)

which frees me up to actually post some blog 👍

i'll install iocaine properly after that

iocaine - the deadliest poison known to AI

a very weeny construct 💀Oct 16

i want to set up a photo storage server

photoprism seems like a good browsing interface but what i'm more concerned about rn is the upload

so a client on each android phone that backs up photos to the server

but i want to be able to turn the server off for a while, as a normal/expected thing one does, and not have the clients moan about it. they should just retry occasionally until the server comes back online

anybody have a setup like this running already?

a very weeny construct 💀Oct 28

musing about how to do high(ish)-availability systems on the cheap, goblin style

i got this sweet vm on a friend's server
someday i wanna convince employers they're better off in there, than on aws
if the host, or the whole datacenter, or the hose from the internet to that datacenter get orbital-lasered to slag in the middle of the business day, how to automatically shunt traffic to backup systems?
dns records are too cached, too slow to flip

...

a very weeny construct 💀Oct 28

also don't want to send all traffic through a load balancing proxy, expensive and another single point of failure

but wait i have some tradeoffs that might enable some tricks:

i want high availability for existing users but new users can wait for dns to propagate
the site can be a "progressive web app" that gets mostly cached in the browser

so in this specific case i think i could do client-side js failover. maybe even a service worker?

a very weeny construct 💀Oct 28

also trying to learn incus vs podman vs docker nope vs kubernetes not u

a very weeny construct 💀Oct 28

wait, how often does the whole region disappear anyway?

that was never a concern multiple employers ago when i got to help out at the datacenter

they did redundant everything inside the rack, regular cable-yank failover tests and everything, but no geographical redundancy iirc

maybe i'll inquire about a vm on another host within the same rack when i get closer to dragging clients on board and just forget about higher availability than that for now

a very weeny construct 💀Oct 30

embarrassed to admit that i've today taken one halfhearted step toward learning wtf snmp is by way of (re)reading the rrdtool tutorial

no, not smtp the email sending thing. snmp the monitoring of hardware status thing

all because i want to put up some pretty charts of computer doing inscrutable computer thing

(accuracy? that's like number seven or twelve down the list of nice-to-haves)

RRDtool - rrdtutorial

a very weeny construct 💀Nov 3

well, actually,
my ipv4-only client
-> colocataires' ipv4-to-ipv6 sniproxy on port 443
-> my ipv6-only vm
-> haproxy to unwrap proxy protocol
-> prosody xmpp server
... experiment is not working ✨😕✨

so:

did it never work and i mistakenly thought that it did?
or
did it work at first but i broke it?

an easy fix would be to get an ipv4 address which obviates the need for sniproxy. but dammit before i do that i want answers: is this setup possible? if so, what'd i mess up?

a very weeny construct 💀Nov 11

(maniacal cackling)

i have finally got iocaine installed. wasn't even hard, just needed to sit down and do the steps and brain is real good at not that sometimes

hooked it up to the apt-installable anarchism faq for its markov corpus and the biggest canadian flavored apt-installable wordlist i could get

feels good. like the invulnerability you get from your favorite winter gloves and jacket before going out to play in the blizzard

now it's safe to blog again 🎉

a very weeny construct 💀Nov 12

i um only just now noticed that the apt-installable anarchism faq, in uncompressed markdown format, which i fed to iocaine for its markov corpus,

is twelve megabytes. of text.

almost 1.9 million words.

iocaine seems to be doing just fine so far

a very weeny construct 💀Nov 15

accidentally set caddy to syslog every request sent to iocaine 3 and oh gosh my website is pumping so much poison markov trash into chatgpt and claude rn 😍 💕

and it's using less cpu and memory than systemd-journald to do so

might need to look into setting bandwidth limiters on this thing

a very weeny construct 💀Nov 18

i'm still casting around for anti-cloud(flare) mechanisms of regional failover. like if the cable to the datacenter i use gets cut, or there's political upheaval, how to automatically shunt traffic to a different datacenter faster than a dns update would propagate through caches

i'm vaguely aware of this technology called anycast but i don't know much

https://grebedoc.dev/ uses https://rage4.com/ to do it

https://en.wikipedia.org/wiki/Anycast

a very weeny construct 💀Nov 20

yeah eat it, ai scraper assholes

(gradually improving my monitoring, iocaine stats newly added to my collectd/rrdtool dashboard)

a very weeny construct 💀Nov 22

tiddlywiki doesn't come with a basic to-do feature, to make checkboxes and tick them off without having to tediously edit the page and type some [x]s

but it does have a plugin mechanism. found two plugins (both by the same author) that do checklists: Kara and Todolist

installation instructions made me nervous though, since i'm using tiddlyPWA that is rather different on the backend...

Kara 0.9.7

In tiddler plain checklist and interstitial journaling plugin

a very weeny construct 💀Nov 22

turns out though, Todolist is super easy to install on a self-hosted TiddlyPWA! drag and drop, click the button to reupload the wiki file to the server, and done.

entirely through the browser, way easier than messing around with directories. woah tiddlypwa is easier than stock tiddlywiki!?

Todolist Plugin 1.5.0

Organize, prioritize, and plan your work

a very weeny construct 💀Nov 25

i haven't put any rate limiters on here yet (i definitely will), but seems like claude and chatgpt limit themselves to 25 requests per second to my websites. i wonder how they picked that number, and if they'll ramp it up. and if i ratelimit, will they send more requests from other ip addresses. etc.

feels so good to know these assholes' language models are chugging down low-effort ungrammatical poison after ignoring my robots.txt

a very weeny construct 💀Nov 28

should i do traffic shaping using tc, haproxy, or shove yet another plugin into caddy?

should i slow the response down to a trickle for all the llm scrapers, or randomly drop their connections? 😈

a very weeny construct 💀Dec 2

despite it being part of linux since version 2.2, which is about as long as i've been daily-driving it, i hadn't heard of tc until this past month. that's "traffic control," a tool to control the kernel's network traffic limiting, smoothing, and prioritization

and for a command with such a tiny name wow it's a lot

i only want to restrict the bandwidth of one process so i think i'll look for easier mechanisms before i attempt to swallow this whole burrito

a very weeny construct 💀Dec 3

til: trickle, a lightweight userspace bandwidth shaper

could i just wrap iocaine with this and be done?

... except trickle doesn't work on statically linked executables, like iocaine. womp womp

i guess i could do a trick like wrap socat with it, then talk to iocaine through that,

but that feels more complicated than just switching back to haproxy and using its builtin traffic shaping features

GitHub - mariusae/trickle: Trickle is a userland bandwidth shaper for Unix-like systems.

a very weeny construct 💀Dec 4

what bits of haproxy, lighttpd, nginx, caddy, static-web-server should i string together?

requirements:

iocaine can plug in somewhere
can control the bandwidth of iocaine's garbage generator
static web server
- for multiple domains ("virtual hosts")
- uses sendfile() for speed
- precompressed files trick

a very weeny construct 💀Dec 4

haproxy: has traffic shaping, proxy, fastcgi. no sws; have to proxy one. which?
- lighttpd: small, sendfile(), correct webdav. can do reverse proxy itself, makes haproxy redundant? can i plug in iocaine?
- nginx: i quit it because it was segfaulting when i tried to configure too many features. but if i'm only using it for static files maybe it's ok
- static-web-server: i like rust. don't like a copy paste chunk of toml per configured domain

a very weeny construct 💀Dec 4

caddy: proxy, fastcgi, builtin static fileserver, traffic shaping requires a module that doesn't do quite what i want. getting tired of guessing my way around caddyfile syntax. don't need its magic certificate management.
- use 'tc' for traffic shaping? big learning curve
- front it with haproxy? lots of redundant features, feels heavy

dang i gotta draw up a feature matrix or something

a very weeny construct 💀Dec 5

it's pretty weird that it took me this long to actually do but

tonight i have set up for the first time a program running on a computer inside my home, that people may access like a normal website, without learning my cable modem's ip address in the process, and if someone starts ddosing me i can just unplug and let the household continue watching videos unaware

(i'm having my @colocataires vps proxy traffic through a tailscale vpn to my closet fileserver)

a very weeny construct 💀Dec 5

safe(r) home-hosting by reverse proxy from a little computer in a datacenter is one of those things that seems like complex esoteric engineering from afar

but once you've experienced it, and then again when you've set it up yourself, all of a sudden it makes sense and is totally normal and a whole mess of possibilities for what you can cheaply and casually build on the internet blasts wide open

like the first time you experience nerd astral projection

a very weeny construct 💀Dec 7

llm scrapers ignoring my robots.txt and pounding on my small website 28 times per second, 24/7. 600kbps of my available bandwidth wasted just on markov trash

it's easy to imagine how they'll ddos any service that does a bit of compute on each request

a very weeny construct 💀

it's not super exciting but if you're the kind of weirdo who wants to look at my vm's gauges, they are viewable here:

https://telemetry.orbital.rodeo/

i have been cobbling it together using collectd, rrdtool, and scripts instead of the far more reasonable and popular prometheus / grafana combo. because it might be more lightweight? haven't measured

for now it updates only when i run the command, so don't sit there wondering

no light mode or explanatory text (yet) soz

a very weeny construct 💀Dec 9

i learned how to make haproxy throttle iocaine's output so the scrapers continue to download delicious poison but now only at 56kbps (down from 600kbps) 🎉

a very weeny construct 💀Dec 9

oof, it's somewhat heavy though. went from about 3% avg cpu use to about 6%

the throttling that haproxy does just gets buffered up by caddy in front of it and the result is a long initial delay before a fast transmission of data. like latency

which could probably be implemented more simply with a sleep statement somewhere

i wonder what other strategies i can use to slow down crawlers. thinking random connection drops or http errors 429, 402, and 451

a very weeny construct 💀Dec 11

somewhere on my to-do list: look into how yunohost compares to just installing various bits of software on your vps. is it heavier, easier, less customizable, can you put other stuff alongside it, etc.

a very weeny construct 💀Dec 17

problem: once detected, how best to slap back at ai scrapers? return poison quickly? tarpit? throttled poison drip? drop their ip's packets at the firewall?

idea: drop packets during business hours to free up bandwidth for legit visitors; fast poison otherwise to collect ip addresses for next day's ip ban. "party all night sleep all day" strategy

a very weeny construct 💀Dec 17

lots of bot traffic hitting port 80 (http) on my vm just to get redirected to port 443 (https) where they get a "go away, bot" error

who am i keeping port 80 open for?

who types in "orbital.rodeo," lands on http, and doesn't know to or can't try https instead?

many people use hsts and abandon 80

caddy auto-magically puts a redirect on 80 for my sites but i'm increasingly annoyed by its magic. wanna go back to haproxy

think i'll shut it

HTTP Strict Transport Security - Wikipedia

a very weeny construct 💀Dec 19

oh, whoopsie

if an ipv4->ipv6 proxy is telling my webserver the clients' ipv4 addresses using proxy protocol, blocking those ipv4 addresses at the webserver firewall isn't going to do much 🤦

i have a v4 address now, i was just lazy about reconfiguring dns to send v4 web traffic direct to my vm instead of through the v4-v6 proxy. time to get on that

a very weeny construct 💀Dec 19

oh that's more like it
after blocking ai scrapers at the firewall, cut my cpu use down to 1% and bot traffic to almost nothing
you love to see it!

a very weeny construct 💀Dec 20

🤔 i wonder how many innocents i'll accidentally shut out if i adopt a policy of, "any /24 prefix with 3 or more scrapers within it dooms the lot"?

🤔 i could set up a "pls let me back in" automation. tell me my biceps are eleven out of ten in this web-form and you get added to an inclusion list that takes effect before the block list
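
a rough way to estimate the collateral damage before adopting that policy, assuming the scraper addresses are already collected as plain strings. `doomed_prefixes` is a hypothetical helper, and it only looks at ipv4:

```python
import ipaddress
from collections import defaultdict


def doomed_prefixes(addresses, prefixlen=24, threshold=3):
    """Return prefixes containing at least `threshold` distinct scraper addresses."""
    seen = defaultdict(set)
    for text in addresses:
        addr = ipaddress.ip_address(text)
        if addr.version != 4:
            continue  # this sketch ignores ipv6
        # strict=False masks off the host bits to get the enclosing prefix
        net = ipaddress.ip_network(f"{addr}/{prefixlen}", strict=False)
        seen[net].add(addr)  # a set, so repeat offenders count once
    return sorted(net for net, members in seen.items() if len(members) >= threshold)
```

each returned /24 covers 256 addresses, so multiplying the result count by 256 and subtracting the known scrapers gives an upper bound on how many bystanders get caught.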

a very weeny construct 💀Dec 23

i could implement both of those defense mechanisms

reduce bookkeeping on my part by being a bit overeager about blocking whole prefixes instead of individual ip addresses

definitely want to do something like @alex's butlerian jihad where i block all networks from any ASN abusing my sites

but also, have a cooldown that sends traffic from blocked prefixes to a "let me back in" form that allowlists individual addresses

a very weeny construct 💀Dec 23

oh cool, while i wasn't paying attention anubis has grown dataset poisoning features like what iocaine does and a (paid) collaborative reputation database mechanism

a very weeny construct 💀Dec 23

haha oops i accidentally banned my own ip. fixed it but guessing i'll have to flush the ban lists and rebuild in case i caught any more i shouldn't have

one super nice thing i'm doing this time around is using a wireguard-based vpn for all my ssh'ing. so even when i blocked my own ip address my ssh session was unaffected and i could fix it. and zero log spam from vulnerability scanners constantly trying the door 😌

a very weeny construct 💀Dec 23

i want to block any requests from google and facebook; also i want to block any isp who would tolerate scrapers

the database of ip range ("prefix") assignments is downloadable but it's big. 590 entries just for as32934 (facebook). too big to just dump into the firewall

but there's often nothing between multiple records for any given asn. maybe i could treat that as a single range, which would let me express the set of ranges to block more concisely 🤔

a very weeny construct 💀Dec 23

ooo python's builtin ipaddress library has collapse_addresses and address_exclude functions, and pyasn uses those. if i study those functions i think i should be able to come up with a "collapse_addresses" variant that absorbs unallocated gaps between allocated subnets for a more concise specification

https://github.com/python/cpython/blob/3.14/Lib/ipaddress.py#L304
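
one possible shape for that variant, as a sketch rather than anything pyasn does: collapse first, then merge neighbours whose gap is at most `max_gap` addresses, then turn each merged span back into cidr blocks with `summarize_address_range`. `collapse_with_gaps` and `max_gap` are made-up names, and it assumes ipv4 only:

```python
import ipaddress


def collapse_with_gaps(networks, max_gap=0):
    """Collapse IPv4 networks, also absorbing gaps of up to `max_gap` addresses."""
    nets = sorted(ipaddress.collapse_addresses(
        ipaddress.ip_network(n) for n in networks))
    spans = []  # list of [first_address_int, last_address_int]
    for net in nets:
        first = int(net.network_address)
        last = int(net.broadcast_address)
        # gap size = number of addresses strictly between the two spans
        if spans and first - spans[-1][1] - 1 <= max_gap:
            spans[-1][1] = max(spans[-1][1], last)
        else:
            spans.append([first, last])
    result = []
    for first, last in spans:
        result.extend(ipaddress.summarize_address_range(
            ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)))
    return result
```

for example, 192.0.2.0/25 and 192.0.2.192/26 are separated by 64 addresses, so `max_gap=64` turns the two entries into a single 192.0.2.0/24. the catch is that an absorbed gap might belong to somebody else, so it's worth checking gaps against the full prefix database before blocking them.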

a very weeny construct 💀Dec 24

haha while searching ASNs for "AWS" to block i learned of the existence of AS214513 EEPYPAWS and AS401962 CUDDLE-PAWS

i wonder what other fun autonomous system names are out there, and what they're doing

when it often feels like the internet is like six giant websites consuming everything, it's great to feel lost in a massive database of tiny organizations doing a niche highly technical thing like registering an autonomous system for internet shenanigans

a very weeny construct 💀Dec 28

Q.: is communicating to an xmpp server's "direct tls" aka "xmpps" port the same (in terms of protocol) as communicating through a tls tunnel / reverse proxy to its plaintext xmpp port?
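
for poking at this by hand: per XEP-0368, a direct tls client does the tls handshake first (offering the alpn protocol `xmpp-client`, with sni set to the xmpp domain) and then opens an ordinary xmpp stream inside it. a minimal probe sketch, where `probe_direct_tls` and `stream_header` are hypothetical helper names and the host, port, and domain are whatever you pass in:

```python
import socket
import ssl


def stream_header(domain: str) -> bytes:
    """Opening <stream:stream> tag a client sends once the channel is up."""
    return (
        "<?xml version='1.0'?>"
        f"<stream:stream to='{domain}' version='1.0' "
        "xmlns='jabber:client' "
        "xmlns:stream='http://etherx.jabber.org/streams'>"
    ).encode("utf-8")


def probe_direct_tls(host: str, port: int, domain: str, timeout: float = 10.0) -> bytes:
    """TLS first, then the stream header; returns the server's first reply bytes."""
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(["xmpp-client"])  # ALPN name from XEP-0368
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=domain) as tls:
            tls.sendall(stream_header(domain))
            return tls.recv(4096)
```

comparing the `<stream:features>` the server sends back on the direct tls port against what it sends through a tls-terminating proxy to the plaintext port should show where the two paths diverge, for instance whether any auth mechanisms are offered.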

a very weeny construct 💀Jan 7

so, iiuc, i shouldn't be blocking crawl bots from google, facebook, etc. network ranges, because then i can't feed them poison urls, whereupon i won't be able to identify the more carefully-disguised requests from residential botnets masquerading as browsers

but i do want to very much limit the bandwidth they may consume

so instead of

ip saddr @miscreants drop

let's try

ip saddr @miscreants limit rate over 1/second drop
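
to make the behavior of that second rule concrete, here's a toy token bucket in python. it's a model for reasoning, not nftables' actual code: `PacketLimiter` is a made-up class and the burst size is an assumption. as far as i understand it, a plain `limit` in a rule keeps one shared bucket for everything matching the rule, not one per source address:

```python
class PacketLimiter:
    """Toy token bucket: packets that find no token are 'over' the rate."""

    def __init__(self, rate_per_second: float, burst: int):
        self.rate = rate_per_second
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = None  # timestamp of the previous packet

    def over_limit(self, now: float) -> bool:
        # refill tokens for the time elapsed since the previous packet
        if self.last is not None:
            elapsed = now - self.last
            self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return False  # within the rate: the packet is let through
        return True       # over the rate: this is what `drop` would hit
```

with a shared bucket, one noisy network in the set can use up the allowance for all the others, which may or may not be what you want.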

a very weeny construct 💀Jan 19

update: rate limiting packets at the firewall from networks controlled by my biggest bot offenders (facebook, microsoft, google, apple) has accomplished exactly what i wanted: i continue logging and feeding them poison, but their bandwidth is greatly reduced

my implementation might be causing their connections to close mid-request but i don't mind very much. lots of sockets (50) in SYN-RECV state compared to ESTAB (3) rn which could eventually be a problem

a very weeny construct 💀Jan 20

oh wait whoops. lots of sockets in SYN-RECV wasn't the fault of my inexpert ratelimiting. an asn outside my "big tech" filtered set was sending me SYN packets and not following up with--

oh my gosh, was i being syn-flooded? was someone angrily trying to deny service to the maybe 3 legit people that want to see my website??

anyway i added them to the limiter so now they're holding around ~3 sockets in SYN-RECV state instead of ~40

a very weeny construct 💀Jan 25

graph: sockets in SYN-RECV state. if i understand correctly, this occurs when somebody says "hey let me connect" and my server says "ok you can connect" and then they just never reply

eventually it stops waiting but until then the socket is in use. so if some miscreant fires off a ton of "hey let me connect" without replying they can clog up the pipes

in previous experiments this line averaged either 40 or 0

love being able to just turn the firehose off
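
a sketch of one way to get the number behind a graph like that, assuming linux and the text format of `/proc/net/tcp`, where the fourth column is a hex state code and `03` means SYN_RECV. `count_tcp_states` is a hypothetical helper, not part of collectd:

```python
from collections import Counter

# hex state codes as they appear in the "st" column of /proc/net/tcp
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_tcp_states(proc_net_tcp_text: str) -> Counter:
    """Count sockets per TCP state from the contents of /proc/net/tcp."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        code = fields[3].upper()
        counts[TCP_STATES.get(code, code)] += 1
    return counts
```

feeding it the contents of `/proc/net/tcp` (and `/proc/net/tcp6` for ipv6 sockets) and reading `counts["SYN_RECV"]` gives the value to plot.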

a very weeny construct 💀Jan 25

i've seen some example nftables rules that monitor how quickly any one ip address is opening new connections, and if it exceeds some threshold instantly blocks them. going to study that and get it working for mine too

a very weeny construct 💀Jan 10

seems from experimenting like the answer to this is "no"; you can't put a tls terminating reverse proxy in front of unencrypted xmpp and have it become xmpps. like you can with http and other protocols

closest i got was instructing my xmpp clients to use "legacy tls" mode- they then could successfully tls handshake and connect but wouldn't authenticate my user

but why. starttls is more complex and less secure, why is it so prevalent in xmpp

a very weeny construct 💀Jan 7

too big to just dump into the firewall

whoopsie, that was a wrong assumption on my part based on a bad time i had with way too many iptables firewall rules created by fail2ban many years ago

these days i'm using nftables and its set structure to hold ip addresses, which is backed by hash tables or red-black trees in the kernel (similar in spirit to the trie behind the routing tables), and you can dump addresses in there all day long. with the right flags (interval, auto-merge, timeout) it will manage merging them into ranges and auto expiring them if you want. works great

a very weeny construct 💀Jan 27

so upthread i was surprised that you can just shovel truckloads of ip addresses into nftables' "set" structure, for blockin' purposes

but i want to do stuff like detect if several addresses within some autonomous system's range are coordinating for shenanigans, and block the whole damn asn

this example, on the nftables wiki itself, loads a whole ass maxmind geoip db into nftables' "map" structure and my first reaction was "surely not"

https://wiki.nftables.org/wiki-nftables/index.php/GeoIP_matching

a very weeny construct 💀Jan 27

i mean are there limits? how many rules and addresses can i dump into nftables tables, chains, maps and sets (which, iiuc, all live in the kernel) before it crashes

a very weeny construct 💀Jan 27

anyway it's goblin week, or it was recently? so maybe imma try implementing automatic, immediate, asn matching and blocking in nftables rules 😈

a very weeny construct 💀Mar 18

woah cool i just learned about the nftables feature concatenations

i'm already 🤩 about nftables' very fast sets and maps but today i learned that you can store essentially tuples of data in them

which in some cases can let you test multiple conditions at once, replacing multiple rules with a fast set-membership check

Concatenations - nftables wiki

a very weeny construct 💀Mar 27

about 1.5 days after asking iocaine to not just poison but also block ai scrapers masquerading as browsers, i have about 36000 ip addresses blocked at the firewall

this is for a site that is not advertised anywhere, disliked by search engines, and contains maybe 10 blog posts that rarely change. AND which preemptively blocks several whole gafam corporate ASNs so not even counting them

so i expect more popular sites are seeing many multiples of this traffic

a very weeny construct 💀Mar 27

anyway, thinking again about how to analyze this ever growing set of blocked ai scraper addresses, most of which are probably "residential ips."

calculate for each asn the percentage of its ip range that i've blocked, and above a certain threshold block the whole range? (that would be more efficient than recording every single bad address)
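
a sketch of that calculation, assuming a mapping from asn to its announced ipv4 prefixes is already available from the downloaded database. `blocked_fraction`, `asns_over_threshold`, and the threshold value are all hypothetical:

```python
import ipaddress


def blocked_fraction(asn_prefixes, blocked_addresses):
    """Share of an ASN's announced IPv4 space that appears in the block list."""
    # collapse first so overlapping or nested prefixes are not counted twice
    nets = list(ipaddress.collapse_addresses(
        ipaddress.ip_network(p) for p in asn_prefixes))
    total = sum(net.num_addresses for net in nets)
    if total == 0:
        return 0.0
    blocked = {ipaddress.ip_address(a) for a in blocked_addresses}
    hits = sum(1 for addr in blocked if any(addr in net for net in nets))
    return hits / total


def asns_over_threshold(prefixes_by_asn, blocked_addresses, threshold):
    """ASNs whose blocked share is at or above `threshold` (0.0 to 1.0)."""
    return sorted(
        asn for asn, prefixes in prefixes_by_asn.items()
        if blocked_fraction(prefixes, blocked_addresses) >= threshold
    )
```

the linear scan over prefixes is fine for a handful of asns. for tens of thousands of blocked addresses against the whole prefix database, a trie or sorted-interval lookup would be the next step. a raw percentage also favors small networks, since a handful of bad addresses is a big share of a /24 and a rounding error in a /12.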

a very weeny construct 💀Mar 27

ideas contd.:

have an unblocked subdomain where a legit user of a blocked ip might fill out a form and click a "let me back in" button to get onto an allow-list

double extra forever-ban anybody that uses the "let me back in" button then starts snarfing down poison again

bakachu Mar 27

@pho4cexa this is a delightful idea

packetcat Mar 27

@pho4cexa yep, the amount of unwanted bot traffic I see hitting client sites at $WORK is truly staggering.

@pho4cexa jesus

@pho4cexa Interesting. We only see 176 routes active from AS32934 on our edge router.

```
bird> show route protocol bgp_he_v4 where 32934 ~ bgp_path all count
176 of 1038011 routes for 1038011 networks in table master4
```

Also, it's a 1 litre Lenovo MiniPC and it can handle over 1 million IPv4 routes without breaking a sweat -- have you considered creating null routes for all of the networks you don't like instead of firewall rules?

a very weeny construct 💀Dec 24

@insom the database i downloaded has lots of redundant entries for some reason; many of the network assignment records are subnets of other records

i hadn't considered null routes before! are they more efficient than firewall rules for the same purpose? i'd naively assume that a null route would make replies to malicious networks impossible, but would still allow requests from them to arrive; i guess that's not the case? i'll read up about it!

Aaron Brady Dec 24

@pho4cexa Yup, the initial packet could arrive (SYN) but you'd never send a (SYN+ACK) so a session wouldn't be established.

The Linux kernel uses a radix tree to efficiently store the routing table / make routing decisions which is pretty compact and low CPU.

I suspect that every line in iptables would be iterated over for every single packet arriving and I don't think there's any structure more advanced than a linked list at work there.

It'd be fun to try!

Alex Schroeder Dec 23

@pho4cexa Time for that magic allowlist at the top of the chain that lists your IP numbers, just in case.

Alex Schroeder Dec 23

@pho4cexa Yeah, I have an allowlist based on the follows and follow-requests of my account on the single-user fedi instance I run, for example. Haven’t updated it in a while but the idea is I want to block Hetzner and OVH and all that without damaging my fedi experience.

Now that I think about it, checking my followers would make sense, too.

Anyway, the allow list must be based on something – MX records of the email addresses in your contacts would be a candidate, too. That kind of thing. I just haven’t heard of anybody affected by it.

https://alexschroeder.ch/view/2025-08-03-gotosocial
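
The ordering idea can be sketched as follows, assuming the allowlist and blocklist are plain lists of CIDR strings: the allowlist is checked first, so an address in an allowed range is accepted even when a broader blocked range also covers it. The function name and the sample ranges are made up for illustration.

```python
import ipaddress

def verdict(addr, allow, block):
    """Return 'accept' or 'drop'; allowlist entries win over blocklist entries."""
    ip = ipaddress.ip_address(addr)
    if any(ip in ipaddress.ip_network(n) for n in allow):
        return "accept"
    if any(ip in ipaddress.ip_network(n) for n in block):
        return "drop"
    return "accept"

# Hypothetical data: one followed instance lives inside a blocked hosting range.
allow = ["203.0.113.42/32"]
block = ["203.0.113.0/24"]

print(verdict("203.0.113.42", allow, block))
print(verdict("203.0.113.99", allow, block))
```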

Alex Schroeder Dec 23

@pho4cexa Actually, looking at what I've been doing recently, I suspect that some of that didn't work. I think what I've done is use address ranges in CIDR notation for fail2ban to ban and unban, but fail2ban just manages the info, the actual banning at the firewall failed: something about the rules set up by nft that don't allow prefixes. Yikes! I've been fooled by numbers going up (IP ranges listed as banned but not actually being banned). 😥
https://alexschroeder.ch/view/2025-12-23-santa-bots

a very weeny construct 💀Dec 23

@alex oof, glad you figured it out though. i'm not using fail2ban (yet) in my setup, just scripts that add stuff to some nft sets i created by hand, otherwise i might have run into the same problem

Alex Schroeder Dec 23

@pho4cexa The big (and only?) benefit of using fail2ban in this context is that it takes care of expiring the bans. How do you expire bans, if you let them expire at all?

a very weeny construct 💀Dec 23

@alex i haven't bothered with expiring bans yet, facebook's ip range can get fucked forever 😁

but if i do decide i want them i plan to read more about how to use nft set element timeout and expiry. hoping that will give me all the tooling i need:
https://wiki.nftables.org/wiki-nftables/index.php/Element_timeouts
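
A small sketch of what that could look like, assuming a hand-made set in an `inet` table: the helpers below only build lines for an `nft -f` file, one defining a set with a default timeout and one adding an element with its own expiry. The table name `filter`, the set name `blocklist`, and the timeouts are placeholders, and the exact set options should be checked against the wiki page before use.

```python
def nft_set_definition(family, table, set_name, default_timeout):
    """Build an `add set` line for a set whose elements expire by default."""
    return (
        f"add set {family} {table} {set_name} "
        f"{{ type ipv4_addr; flags timeout; timeout {default_timeout}; }}"
    )

def nft_add_element(family, table, set_name, addr, timeout=None):
    """Build an `add element` line; a per-element timeout overrides the default."""
    element = addr if timeout is None else f"{addr} timeout {timeout}"
    return f"add element {family} {table} {set_name} {{ {element} }}"

# Placeholder names and values, for illustration only.
print(nft_set_definition("inet", "filter", "blocklist", "7d"))
print(nft_add_element("inet", "filter", "blocklist", "192.0.2.1", "1d"))
print(nft_add_element("inet", "filter", "blocklist", "192.0.2.2"))
```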

Alex Schroeder Dec 23

@pho4cexa I think I figured out what I was missing. The nft table that fail2ban creates contains sets without flags interval; so prefixes weren't allowed. I added the answer to https://alexschroeder.ch/view/2025-12-23-santa-bots

Luddicus Mus Dec 9

@pho4cexa Please share! This would be a great addition to the iocaine + haproxy docs! O:)

a very weeny construct 💀Dec 9

@algernon i'm no expert here and i probably got this wrong. i'm already seeing bandwidth higher than this limit so i suspect this ends up being 7KB per client not per backend. that said:

listen slowpoison
    bind localhost:42068
    mode http
    filter bwlim-out my-limit default-limit 7k default-period 1s
    http-response set-bandwidth-limit my-limit
    server iocaine localhost:42069

then have caddy proxy to port 42068 instead of 42069
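
Some back-of-the-envelope arithmetic for that config, assuming `7k` means 7 KiB per second and that the limit applies to each stream separately, as the post suspects: a single response trickles out slowly, but total outbound bandwidth still scales with the number of concurrent clients. The function names and sample numbers are made up for illustration.

```python
LIMIT_BYTES_PER_SEC = 7 * 1024  # assumption: "7k" is read as 7 KiB per 1s period

def seconds_to_send(body_bytes, limit=LIMIT_BYTES_PER_SEC):
    """Time one throttled stream needs to deliver a response body."""
    return body_bytes / limit

def aggregate_rate(concurrent_streams, limit=LIMIT_BYTES_PER_SEC):
    """Total outbound bytes per second if every stream gets its own limit."""
    return concurrent_streams * limit

# Hypothetical numbers: a 420 KiB page and 10 scrapers connected at once.
print(seconds_to_send(420 * 1024))
print(aggregate_rate(10))
```

If the total needs a cap rather than each stream, the limit would have to be shared across connections instead of applied per stream.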

Jamey Sharp Dec 9

@pho4cexa my first thought was that it takes server resources to keep those connections open, so keeping each connection open for longer is like you're doing a https://en.wikipedia.org/wiki/Slowloris_(cyber_attack) attack on yourself. but then that got me wondering if a version of iocaine with its own TCP stack could be stateless server-side and keep all the state it needs in the TCP sequence numbers. which would be a ton of work to invest in something that shouldn't need to exist in the first place, but it would also be very entertaining imo 😂

a very weeny construct 💀Dec 9

@jamey yes indeed i probably am lol

next step is to start dropping connections rather than feeding them poison, once i've got a few on the line

Luddicus Mus Dec 9

@jamey @pho4cexa I think that'd be out of scope for iocaine, but... I plan to turn iocaine into a library, and write a reverse proxy that just barely has the functionality to replace the Caddy + iocaine + bazmeg (fail2ban-like thing, but tailored for my particular need) + vector combo with just one thing.

Now that would be a place where this could be done. It would be very difficult, indeed, and I suspect, would require noticeably more CPU than it does now, because I'd have to make the response streamable, in smaller chunks, and that complicates the architecture a lot.

Jamey Sharp Dec 9

@algernon @pho4cexa hehe, I would be delighted to see that. if you want to chat about it as you go I'd be interested!

Alex Schroeder Dec 8

@pho4cexa I always wondered whether showing stats (in my case: Munin) would be a security problem or not. I don’t think you can do anything from the web UI except click through and get to a CGI interface and use that to abuse the system. Hm. Might be interesting for users of services in my site to see.

a very weeny construct 💀Dec 9

@alex yeah agreed, i'd be a bit wary if all these data-heavy graphs were generated on every request, but in my case they're static hosted images generated outside the request/response cycle. only other concern is revealing to people the set of domains my vm hosts, just for privacy reasons, and i think i inspected it well enough to avoid problems

Alex Schroeder Dec 23

@pho4cexa I keep wondering whether exposing this info is a security issue but today I decided to give it a try and removed basic auth from the static Munin pages: https://alexschroeder.ch/munin/

a very weeny construct 💀Dec 23

@alex nice! how do you find munin? your dashboard is much more complete and nicer to browse than mine, (https://telemetry.orbital.rodeo) but i'm being super miserly about how many cpu cycles i want to devote to the task. so i'm hesitant about exploring nicer tools like munin or prometheus+grafana.

rn mine's collectd/rrdtool with a graph creation script i run by hand when i want to. which is admittedly way too ducktape and chicken wire. proof of concept
