We are in the Twitter generation of scientific papers.

Instead of "here are several paragraphs of text explaining the context and framework of our work", I'm reviewing papers that are "I have broken my work up into paragraphs and each paragraph is a subsection with its own heading and contains multiple boldfaced words in case you didn't catch that I was referencing other people's work".

I'm not five. I can read long form works without gimmicks.

#GetOffMyLawn #ScientificPublishing

Me in the 2000s: Why would anyone want video calling on their phone?

Me in the 2020s: Why would anyone want video calling on their phone?

#GetOffMyLawn

Boooo! Games Workshop slowly ditching brand names and just turning everything into "Warhammer".

I remember when there used to be Marauder Miniatures as well as Citadel Miniatures!

https://www.warhammer-community.com/en-gb/articles/okgv4njl/citadel-colour-is-becoming-warhammer-colour/

#GamesWorkshop #Warhammer #GetOffMyLawn

@blogdiva

Pretty much gopher here too.

I had to dial into a VAX, and all I could do was telnet to other sites. So I would telnet to a "gopher site" (really a gopher client that was publicly available via telnet) to browse the internet. I didn't have ftp access, and many gopher sites didn't have sz installed, but boombox . micro . umn . edu was one of the few, and through that I was able to download FreeBSD and other stuff.

Shortly after that, I found slirp, and a local free net provided a free shell, so I used that.

#getOffMyLawn

IN OTHER NEWS

i just reckoned i have been online for 42 years. started using the internet at the University of Puerto Rico. compared to my friends, twas living in the future using Gopher:

https://en.wikipedia.org/wiki/Gopher_%28protocol%29

my digital footprint is older than most Millennials and Zoomers.

which protocol did you first use to pop your internet cherry?

#getOffMyLawn

The correct levels of silver paint are:
* Boltgun Metal
* Chainmail
* Mithril Silver

#Oldhammer #MiniaturePainting #GetOffMyLawn

Everybody needs to stop beginning sentences with "Yet..."

#GetOffMyLawn #English #gripe

Army of Bots

For some months now I have had a simple detection against "bad" bots in place. Bots that scrape *everything* they find and are very likely vacuuming up all the content they get to feed the data grinders that train the LLMs of the world. Bots that not only ignore the "robots.txt" protocol, but actively treat entries in the robots.txt file as an invitation to visit the content that is listed there as "disallowed".

I always had a hunch that listing addresses in a publicly reachable text file and flagging them as "please stay out of there" wasn't the best idea, but well, it was the only thing we had back in the days when the only bots out there were the crawlers of the search engines.

(…) There are two important considerations when using /robots.txt:
* robots can ignore your /robots.txt. Especially malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers will pay no attention.
* the /robots.txt file is a publicly available file. Anyone can see what sections of your server you don't want robots to use.
robotstxt.org

Now, with all the content-sucking and scraping that the "AI" corporations have let loose on the web, it is not unusual to see a massive spike in bot-related visits even in the personal-website space. And those scrapers are ruthless: they hammer the servers at high frequency and repeatedly, and are killing the web as we know it along the way.

(…) Many of these scrapers are so sophisticated that it is hard, or impossible, to detect them in action. They often ignore the websites’ programmatic pleas not to be scraped, and are known to hit the more fragile parts of a website repeatedly.
opendemocracy.net

I created a directory with a random name in the top-level of my website.

I then added this directory to the robots.txt file with a disallow. This directory is not linked anywhere. Its name is so random and cryptic that it is highly unlikely that a "name guessing" bot will find it (like those exploit-searching idiot scripts that hammer on "wp-admin" or "typo3" URLs even on sites that don't use WordPress or TYPO3…). Inside the directory is an index script that

a) sends me an email,
b) logs the visit with user-agent-string and IP address and
c) saves the data in a nosql db.
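
A minimal sketch of such a trap in Python, under loud assumptions: the post doesn't say which language or database the real script uses, so the path `TRAP_PATH`, the JSON-lines file standing in for the NoSQL db, and the `notify` callback standing in for the email are all hypothetical. The `urllib.robotparser` check only illustrates that a well-behaved crawler would never request the trap.

```python
import json
import time
from urllib.robotparser import RobotFileParser

# Hypothetical trap path; a real one should be random and never linked anywhere.
TRAP_PATH = "/zq7f3k1x9v/"

# The only place the trap path is ever mentioned.
ROBOTS_TXT = f"""User-agent: *
Disallow: {TRAP_PATH}
"""


def polite_bot_may_fetch(url, user_agent="ExampleBot"):
    """Return True if a robots.txt-respecting crawler would fetch the URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


def record_trap_hit(store_path, ip, user_agent, notify=None):
    """Log one visit to the trap directory and optionally notify the owner."""
    entry = {"ip": ip, "user_agent": user_agent, "time": time.time()}
    # (b) and (c): append the visit to a JSON-lines file (stand-in for the NoSQL db).
    with open(store_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    # (a): pass the entry to a caller-supplied notifier, e.g. something that mails it.
    if notify is not None:
        notify(entry)
    return entry
```

Anything that ends up in that file asked for a URL it could only have learned from the "disallowed" list, which is the whole point of the trap.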

In front of my website I have a script that checks the current visitor's IP address against the NoSQL db, and if the IP matches, an HTTP 403 status is served.
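
The check itself could look roughly like this in Python. Again only a sketch under assumptions: a JSON-lines file of logged trap hits stands in for the NoSQL db, and `load_blocked_ips` and `status_for` are made-up names, not the real script.

```python
import json
from http import HTTPStatus


def load_blocked_ips(store_path):
    """Collect every IP address that was logged by the trap."""
    blocked = set()
    try:
        with open(store_path, encoding="utf-8") as fh:
            for line in fh:
                line = line.strip()
                if line:
                    blocked.add(json.loads(line)["ip"])
    except FileNotFoundError:
        pass  # no trap hits recorded yet, so nobody is blocked
    return blocked


def status_for(ip, blocked_ips):
    """Return 403 for IPs seen in the trap, 200 for everyone else."""
    if ip in blocked_ips:
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK
```

Since the scrapers show up with a different IP each time, an exact-match lookup like this only ever blocks repeat visits from the same address.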

Here's a best-of of the user agent strings that recently "visited" my hidden dir:

PetalBot
Googlebot/2.1
Claude-SearchBot/1.0
Thinkbot/0 +In_the_test_phase,_if_the_Thinkbot_brings_you_trouble,_please_block_its_IP_address._Thank_you.

That last one is superb, considering that it alone shows up several times in my log, of course with a different IP each time.

Plus, there's a load more that pretend to be "normal" web browsers, of course. 🙄

It is a crude, symbolic, fist-shaking, yelling-at-clouds kind of thing, especially compared to the things that Matthias Ott shared in his post, but it is better than nothing.

#Bots #getoffmylawn #LLM #scrapers

https://webrocker.de/?p=29781

RE: https://ext.sportsbots.xyz/statuses/2021040960260784524

Zooming in on what someone is texting and then posting it online STINKS. But no one cares anymore as long as it is good content 🤦🏽‍♂️ #GetOffMyLawn

Boy can you tell the print quality improvements over the years at Steve Jackson Games... first printing through to current. The original rubber bands I'd had on my cards had long since disintegrated. #Munchkin #GetOffMyLawn #BoardGames #TTRPG #ChaoticQueer #GeekCrafts