Andreas Wagner

@anwagnerdreas@hcommons.social
1,078 Followers
1.4K Following
5.3K Posts

I am #DigitalHumanities Coordinator at Max Planck Institute for #LegalHistory and #LegalTheory (#mpilhlt) #Frankfurt

Also collaborator of salamanca.school project of #adwl #Mainz

While I mostly toot about work, I do have hobbies...
#Capoeira #Stratocaster

#LawFedi #Histodons
#NLP #TEIXML #Golang #Python #Elm #XQuery

If you're reading this on bsky, follow @ap.brid.gy so I can see your replies.

orcid: https://orcid.org/0000-0003-1835-1653
hcommons: https://hcommons.org/members/anwagnerdreas/
pronouns: he/him
Over the past year, FromThePage--like many cultural heritage websites--has faced increasing pressure from scrapers harvesting data to train AI systems. These attacks reached a crescendo on Monday morning, when a Chinese bot-swarm overwhelmed our countermeasures. We are currently rearchitecting our systems to include a commercial service to fight this problem, and apologize for outages during this period.
I spent a couple of hours yesterday getting Audacity building, reproducing and diagnosing the bug, and wrapping my head around the complex logic in this part of the code so that I could implement a correct fix. To have copilot review my work, which I contributed back for free, is just so incredibly disrespectful to my time and effort.
You know back in my day, we had static analysis tooling that would give you exactly this kind of feedback, except it was correct. Now we have shit which only looks at the vibes of the source text and does no semantic analysis whatsoever, so of course it's just fucking wrong

Sent a pull request to Audacity fixing a crash bug I'd been running into frequently. The cause was an out-of-bounds memmove. Classic C++ areas.

Anyway I got a fucking copilot review on my PR which left two comments, both completely wrong, one of which suggesting I reintroduce the out of bounds memory access. I'm furious!

"Indian court rules #transWomen are #women and ‘legally entitled to recognition’"

'In a landmark ruling for the country, after rejecting claims that womanhood was preserved only for those who can bear children, the High Court of #AndhraPradesh ruled that #trans women were “legally entitled” to recognition as women.

...tying the definition of women to pregnancy was “legally unsustainable” and contradicted #India’s constitution'

https://www.thepinknews.com/2025/06/26/india-trans-women-high-court-decision/

#transRights #LGBT #LGBTQIA #LGBTQ

Indian court rules trans women are women and ‘legally entitled to recognition’

A High Court in India has ruled that trans women are women despite what Indian law may say in a landmark decision for the country.

PinkNews | Latest lesbian, gay, bi and trans news | LGBTQ+ news
I think this is a particularly Dutch thing. The recent NATO summit was very disruptive for almost everyone in The Hague, and especially for people who lived nearby. Now that the summit is over, folks were invited to visit the venue, sit on the chairs where Trump & other notables sat etc. Thousands of people showed up, and it sounds like a great time was had by all. Photos: https://nos.nl/artikel/2572548-duizenden-nemen-kijkje-in-world-forum-na-navo-top-leuker-dan-a10-festival
Thousands take a look inside the World Forum after the NATO summit: 'More fun than the A10 festival'

Many of the visitors are residents of The Hague who live near the World Forum. The NATO summit was preceded by months of preparation.

Great! A bunch of us here wanted it. Now it exists. 👍

It's a "dark archive" of the arXiv - a non-public backup to save the data in case of attack by hackers or the US government. The arXiv, I hope you know, is the biggest source of modern math and physics papers.

Who got the job done? The TIB: the Technische Informationsbibliothek, run by the Leibniz Information Centre for Science and Technology, in Hannover, Germany.

They write:

"The TIB has now set up a so-called dark archive for the arXiv content in order to be able to make the backed-up data accessible if the data stored in the USA is lost. The archive functions as a silent reserve: the complete copy of the content is stored decentrally at the TIB, but is not publicly accessible. This means that the data stock – almost 10 terabytes – is protected against potential outages and can be activated in an emergency.

The TIB is currently working on processes to keep the archive up to date: new submissions and updated versions must be backed up regularly in order to preserve the state of research as completely as possible.

“Building a Dark Archive is an expression of our longstanding commitment for a reliable, international academic provision, and as a partner of arXiv. Even though the Dark Archive today only works in the background, it is a key element in safeguarding digital research contents in the long term, because in case of a crisis, we could open the archive,” explains Dr Irina Sens, Deputy Director of the TIB."

We should call it the darXiv.

More details here:

https://blog.tib.eu/2025/05/14/protecting-science-tib-builds-dark-archive-for-arxiv/

Protecting Science: TIB builds Dark Archive for arXiv - TIB-Blog

Research and science are international; it is not for nothing that we speak of international specialist communities. Although a service such as arXiv is operated by an institution based in the USA, namely Cornell University, it is used by researchers worldwide. Part of arXiv's funding has also been internationalised since 2010 with the introduction of arXiv membership. The TIB finances the German contribution together with the Helmholtz Association of German Research Centres (HGF) and the Max Planck Society (MPG). The TIB has now set up a so-called dark archive for the arXiv content in order to make the backed-up data accessible in the event that the data located in the USA is lost.

TIB-Blog

@elotroalex.bsky.social No, that's Boudica. A bodega is a broil, fuss or a commotion, especially one over something of exaggerated importance.

https://fed.brid.gy/r/https://bsky.app/profile/did:plc:n7ql7xohro5n7wzgcft6ol7g/post/3lsjpgntl3k2e

Alex Gil (@elotroalex.bsky.social)

No that's a balestra. Bodega was a queen of the ancient Iceni tribe who almost defeated the Roman Empire. [contains quote post or other embedded content]

Bluesky Social
How do content moderation systems treat Indigenous languages online? CDT’s latest research report from Dhanaraj Thakur dives into how platforms moderate Quechua, a widely spoken but low-resource Indigenous language of South America—and reveals serious linguistic inequities. https://cdt.org/insights/moderating-quechua-content-on-social-media/
Quechua speakers use platforms like Facebook to celebrate and preserve their language and culture—but face algorithmic discrimination, unjust content removals, and targeted harassment. Moderation systems built for Spanish or English are failing them. https://cdt.org/insights/moderating-quechua-content-on-social-media/
Platforms are using LLMs to scale moderation—but experts say these tools aren’t ready for Quechua. Without native speakers on moderation teams or policies tailored to the language, harms go unchecked. Read the full report: https://cdt.org/insights/moderating-quechua-content-on-social-media/