Leon P Smith

@leon_p_smith@ioc.exchange
Communications engineer and mathematician. Longtime functional programming and Haskell enthusiast, occasional Schemer. Inventor of corecursive queues, postgresql-simple, an aggregate theory of concrete mathematics, and self-documenting cryptography. Currently aspiring to become an Epistemic Frame Engineer.
Pronouns: he/him

Great! A bunch of us here wanted it. Now it exists. 👍

It's a "dark archive" of the arXiv - a non-public backup to save the data in case of attack by hackers or the US government. The arXiv, I hope you know, is the biggest source of modern math and physics papers.

Who got the job done? The TIB: the Technische Informationsbibliothek, also known as the Leibniz Information Centre for Science and Technology and University Library, in Hannover, Germany.

They write:

"The TIB has now set up a so-called dark archive for the arXiv content in order to be able to make the backed-up data accessible if the data stored in the USA is lost. The archive functions as a silent reserve: the complete copy of the content is stored decentrally at the TIB, but is not publicly accessible. This means that the data stock – almost 10 terabytes – is protected against potential outages and can be activated in an emergency.

The TIB is currently working on processes to keep the archive up to date: new submissions and updated versions must be backed up regularly in order to preserve the state of research as completely as possible.

“Building a Dark Archive is an expression of our longstanding commitment for a reliable, international academic provision, and as a partner of arXiv. Even though the Dark Archive today only works in the background, it is a key element in safeguarding digital research contents in the long term, because in case of a crisis, we could open the archive,” explains Dr Irina Sens, Deputy Director of the TIB."

We should call it the darXiv.

More details here:

https://blog.tib.eu/2025/05/14/protecting-science-tib-builds-dark-archive-for-arxiv/

Protecting Science: TIB builds Dark Archive for arXiv - TIB-Blog

Research and science are international; it is not for nothing that we speak of international specialist communities. Although a service such as arXiv is operated by an institution based in the USA, namely Cornell University, it is used by researchers worldwide. Part of arXiv‘s funding has also been internationalised since 2010 with the introduction of arXiv membership. The TIB finances the German contribution together with the Helmholtz Association of German Research Centres (HGF) and the Max Planck Society (MPG). The TIB has now set up a so-called dark archive for the arXiv content in order to make the backed-up data accessible in the event that the data located in the USA is lost.

TIB-Blog

"Zohran Mamdani’s victory proves it: The ‘gotcha’ mode of fighting antisemitism has to go. …

When we reduce understanding of antisemitism to buzzwords — and say that we expect certain answers to certain questions and, if we don’t hear them, that means the candidate is an antisemite who has no place holding office — we confuse the definition of antisemitism. And we do nothing to actually, tangibly advance Jewish safety."

~ Emily Tamkin

#Mamdani #Jews #diversity #antisemitism
/9

Kinda hit me this morning how AI is an assault on gifting economies: reddit, Wikipedia, github, AO3 (even books/art, although those are more tangled with money-making) are all gifting economies that run on the idea that we all benefit by sharing. People freely give because it makes life better.
1/n

@slava if you are interested in presentations and monoids, you might be interested that I finally got around to writing down and starting to develop my philosophy of math education.

I selected the Stern-Brocot tree, the Symmetry Group of the Square, Pascal's Triangle, and computer programming as starting points for a study plan based around iterative deepening depth-first search.

Then I accidentally realized combining the Stern-Brocot tree and the Symmetry Group of the Square results in the general modular group GL(2,Z), a fact from which you can derive presentations of GL(2,Z).

It turns out that the Stern-Brocot tree is isomorphic to the free monoid SL(2,N) given by the presentation <L, R>, and this is a submonoid of GL(2,Z), SL(2,Z), PGL(2,Z), and PSL(2,Z).
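
The correspondence between Stern-Brocot paths and words in L and R can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the construction from the linked notes: the matrix convention (L = ((1, 0), (1, 1)), R = ((1, 1), (0, 1)), with a node's fraction read off as (a + b)/(c + d)) and the helper names `path_to`, `matrix_of`, `fraction_of`, and `det` are choices made here for the example.

```python
from fractions import Fraction

# Assumed convention: 2x2 matrices as nested tuples ((a, b), (c, d)).
L = ((1, 0), (1, 1))
R = ((1, 1), (0, 1))
I = ((1, 0), (0, 1))


def mul(x, y):
    """Ordinary 2x2 matrix product."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def det(mat):
    (a, b), (c, d) = mat
    return a * d - b * c


def path_to(q):
    """Word in 'L' and 'R' leading from the root 1/1 to the positive rational q."""
    if q <= 0:
        raise ValueError("the Stern-Brocot tree only contains positive rationals")
    m, n = q.numerator, q.denominator
    word = []
    while m != n:
        if m < n:
            word.append("L")  # target is smaller than the current node
            n -= m
        else:
            word.append("R")  # target is larger than the current node
            m -= n
    return "".join(word)


def matrix_of(word):
    """Multiply out a word in L and R, starting from the identity (the root)."""
    acc = I
    for step in word:
        acc = mul(acc, L if step == "L" else R)
    return acc


def fraction_of(mat):
    """Read the tree node back off a matrix: (a + b) / (c + d)."""
    (a, b), (c, d) = mat
    return Fraction(a + b, c + d)


if __name__ == "__main__":
    for q in (Fraction(2, 3), Fraction(5, 3), Fraction(1, 1)):
        word = path_to(q)
        mat = matrix_of(word)
        print(q, repr(word), mat, det(mat), fraction_of(mat))
```

Every word multiplies out to a matrix with non-negative integer entries and determinant 1, and reading the fraction back off the matrix recovers the node the word leads to, which is the sense in which each node of the tree corresponds to exactly one word in L and R.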

https://github.com/constructive-symmetry/constructive-symmetry/blob/master/T002_Tools_of_Math_Construction/Part03_Aggregate_Theory.md

constructive-symmetry/T002_Tools_of_Math_Construction/Part03_Aggregate_Theory.md at master · constructive-symmetry/constructive-symmetry

A Philosophy of Math Education. Contribute to constructive-symmetry/constructive-symmetry development by creating an account on GitHub.

GitHub

Setting aside copyright/commercial/other aspects, a thought about writing blog posts that LLMs train on. When people write posts, they (a) feel good about helping others, and (b) hope to get some credit and visibility for doing so.

When mediated through LLMs, there is no longer the satisfaction that your consumer is a human (who might comment, thank, share, etc.), nor the cred that comes from people remembering the author, posting on HN, etc.

That is, it totally destroys the incentive structure.

@Anibyl I believe the correct answer is 0!

NIST published guidance in 2024 advising that you don’t need to change your password that frequently.

#nist #passwords #infosec

https://cybersecuritynews.com/nist-rules-password-security/

NIST Recommends New Rules for Password Security

The National Institute of Standards and Technology (NIST) has released updated guidelines for password security, marking a significant shift from traditional password practices.

Cyber Security News

It's me! On #numberphile! Talking about my favourite polyhedron!

https://youtu.be/3X2aQIMx5bs?feature=shared

After many years of never being able to line up doing a Numberphile video, I was really glad that Brady and I ended up in the same place to do this. I love the animations and especially the sound effects!

The 9-sided Enneahedron - Numberphile

YouTube

I wonder, too, if part of the hidden perniciousness of this kind of policymaking is the view that families (ahem, ALL OF US) who need supportive health infrastructure are "not who we want here anyway."

Alternate and more real frame: healthy families and families who need structural medical support ARE OFTEN THE SAME FAMILIES AT DIFFERENT MOMENTS and the trajectory doesn't run unidirectionally

Our kids are fine now, actually. Knock on wood, but we've had no major incidents in almost a decade. They don't need lifelong assistance; they needed to survive childhood, and so did our family

Lots of families have seasons of needing help, and seasons of everything is great

You are missing the long season of fine because you sucked ass when we needed help 🤷🏻‍♀️

Sent a pull request to Audacity fixing a crash bug I'd been running into frequently. The cause was an out-of-bounds memmove. Classic C++ areas.

Anyway I got a fucking copilot review on my PR which left two comments, both completely wrong, one of which suggested I reintroduce the out-of-bounds memory access. I'm furious!

You know back in my day, we had static analysis tooling that would give you exactly this kind of feedback, except it was correct. Now we have shit which only looks at the vibes of the source text and does no semantic analysis whatsoever, so of course it's just fucking wrong

I spent a couple of hours yesterday getting Audacity building, reproducing and diagnosing the bug, and wrapping my head around the complex logic in this part of the code so that I could implement a correct fix. To have copilot review my work, which I contributed back for free, is just so incredibly disrespectful to my time and effort.

@hailey You could close the PR if you don't want to deal with refuting a bullshit generator.

@be @hailey Except that would mean the bug is still there.

And someone else may claim to make the fix, but apply the exact same reversal that Copilot insists is better without understanding why it doesn't fix the issue.

@be @hailey I would delete it so they can't easily reuse the free contribution.
@hailey And they're doing it because they believe that if they keep spamming the programming community with this bullshit enough, it'll eventually learn to code itself better and then ✨✨✨magical AGI bullshit heaven!!!✨✨✨

@hailey Not to self promo but we're working on rebasing off Audacity 3.7.4 (without the Muse stuff) and chances are we have this bug on our rebase branch. If you want to help us with our rebase efforts we'd be happy! No CLA or AI is involved, either!

@gperson will be happy to help you with anything too. Just ask him and he'll help you out the best he can! (That's me writing this too. Hello 😄👋)

@tenacity @gperson I had already planned to send this fix across! I wasn't aware of the muse drama or Tenacity until today but I'll be switching over going forward :)
@hailey @tenacity for the uninitiated, what was/is the muse drama (if someone has the time to answer this or point me in a direction)?
Basic telemetry for the Audacity by crsib · Pull Request #835 · audacity/audacity

Please, see our response: #889 Dear all, Due to the large amount of worry about this PR, (which we completely understand), we want to clarify exactly what is going on: Telemetry is strictly optio...

GitHub
@crypticcelery there may have been additional happenings that i'm not aware of, but the main ones are that they introduced a CLA that would (i believe) hypothetically let MuseCY Holdings make proprietary releases based on the currently GPL codebase, and that they have had a couple of kerfuffles over privacy policies and telemetry (where, imo, the reaction was more negative than would usually be warranted, probably in part because of MuseCY's lack of transparency)
@jaxter184 I did not know about this, just the dumb analytics controversy. Welp, guess I'm going back to refusing to use Audacity.
@tenacity @gperson btw I would be interested in contributing some work to move the processing for the spectrogram view off the UI thread to keep the UI snappy and responsive (it currently chugs a bit while h-scrolling with spectrogram on), but since this is a larger piece of work + I see that Tenacity is in the middle of an upstream rebase effort, I wanted to check in first to see what kind of appetite there would be for accepting such a change. Let me know!

@hailey @tenacity I'd say go for it! Just keep in mind that we'd merge it after the rebase is complete, so maybe create the PR once it's all done and then we'll go from there! 😄

If you'd like to help us out with the rebase effort, you can port our dynamic compressor over. That plus the Matroska exporter are the main things we still need, and I'm handling the exporter right now. (There are also a few other things, but these are the major ones that need taking care of.)

@gperson @tenacity should I base it off the rebase branch or main for now?
@hailey @tenacity Base it off the rebase branch ('audacity-3.7-rebase'). That way, it's less work done when we ultimately merge that branch back into main.
@hailey @tenacity Right now I have the default branch as 'audacity-3.7-rebase' as that's where active development is happening. Eventually, I'll merge it back into 'main' where normal things will continue.

@gperson @hailey @tenacity great to see this, for once things are working as they should:

(1) some "big tech" thing behaves in a way that is disgusting and unacceptable
(2) people move away from that awful thing to something better, in this case @Codeberg

🎉 😄

(I mean, too often it's like "this big tech thing is truly horrible. Now I'm going to keep using it as if nothing happened".)

@tenacity @hailey @gperson hi, does tenacity use conan still?
@strlcat @tenacity @hailey Tenacity has long gotten rid of Conan. Instead, it uses vcpkg on Windows, macOS, and Linux. Most modern Linux distros can build Tenacity fully featured just fine, so you don't need vcpkg on Linux.
@tenacity @hailey @gperson niice!! Great work and much appreciated! I just built it on my Slackware RISC-V and it works, and no conan is required! ❤😻
@strlcat @tenacity @hailey Wow! Did you have any issues? That's pretty awesome that it just built successfully! 😄
@gperson @tenacity @hailey well, the sound is heavily garbled, which I suppose is because the system uses Slackware defaults, e.g. runs pulseaudio, and Tenacity says it outputs sound through ALSA via PortAudio. But this might be just because that machine is slow. I will try to build Tenacity on my gaming laptop today or tomorrow (I think there's no wxWidgets yet)
@tenacity @hailey @gperson okay quick update, I killed pulseaudio, redirected to pure alsa, and only after selecting my actual sound card it worked. Idk if software like Tenacity requires exclusive access to sound card by default, but at least it works there

@strlcat @tenacity @hailey Tenacity doesn't need exclusive access to your sound card, but in this case it seems to have worked better since PulseAudio is out of the way.

Fun fact: if you build and use a recent development version of PortAudio, Tenacity will support PulseAudio directly. You can experiment with that if interested, but I haven't tried that.

@gperson @tenacity @hailey issue fixed! I built latest portaudio, and it works flawlessly! I played with some effects, no problems found. Tenacity runs on riscv64 machine perfectly well.

@strlcat @tenacity @hailey This is very great news! It also seems that Tenacity could perform better with direct PulseAudio support in PortAudio, so that's going to be great when PortAudio 19.8 is out!

This is also exactly what we intended! We want Tenacity to be buildable across a wide variety of platforms and architectures without issue! Granted, we still have work to be done with that too, but it's very great to know Tenacity works on Linux RISC-V! Hooray! 🥳🎉

@hailey The only reason I keep my github account alive these days is to stick solidarity emotes on other people’s issues and comments. I should be surprised that the audacity committers are like this, but I’m really not.
@hailey I don’t blame you for being furious, all of this is absolutely appalling! 
@hailey oh, for sure. extremely disrespectful.

@hailey I can relate so much.

It is called social coding for a reason. Learning from each other and being nice to each other is so much part of PR work.

@hailey
Is aishittification a word yet?

@mcorbettwilson @hailey I call AI "Artificial Incompetence." Once in that mindset, I have rarely been disappointed.

Those so-called AI systems keep making the kind of mistakes that would get humans fired.

@mcorbettwilson @hailey No but i'd happily propagate it.

Also pretty sure i'm going to use AI for complicated 3D printing model generation.

But in terms of video and images, the glossiness, sepia and utter lack of humor make AI ever so boring.

No worries, Robert Crumb is still King.

@mcorbettwilson
It certainly should be! 👌
@hailey

@hailey

hey at least the team didn't send you an email threatening to get you deported to China

https://github.com/Xmader/musescore-downloader/issues/5

(musescore and audacity are owned by the same company)

How to respond to the takedown request email? · Issue #5 · Xmader/musescore-downloader

Hi, I'm Musescore developer. You need to takedown this repository: https://github.com/Xmader/musescore-downloader and any other your public repositories with same code. Because you illegaly use our...

GitHub

@guenther @hailey

The more I read into the Copilot stuff and the API stuff, the shadier it gets.

This is really sad, because I feel like @tantacrul (their Head of Software) is really very critical of exactly this kind of late-capitalist nonsense.

@teun I don't care what he is supposedly critical of. If your company threatens to deport people, that crosses a line.

(Also the entire analytics/privacy shitshow where they tried to forbid people under 13 to use Audacity (because children cannot legally consent to their data being collected) and Audacity's license change from GPL to MIT.)

But the China thing is just a whole other level of moral bankruptcy.

@hailey

@guenther @hailey

I fully agree. I'm just saying that I cannot quite put these different things together in my head. I hope they will do something drastic in response to this because it doesn't look good.

@teun nah, look at the dates in that github thread. happened five years ago, nobody cares.

@hailey

@hailey such a strong feeling of deja vu from this thread
it has happened before and it will happen again ;n;

@hailey

Why the fuck is the Audacity team using copilot?

@hailey is it ok to take your screenshots and put them on LinkedIn to dunk on AIs?

I promise I will write an alt text on the image, and can remove your username if you'd rather not have it visible.

Happy to hear that no, you don't want me to do that (this is why I ask!).

@hailey Your rage is both magnificent and eloquent! 👌
@hailey replacing the compiler with sentiment analysis
@jcoglan @hailey I refuse to accept Sentimental Code Analysis as an overload for SCA (Static Code Analysis, Software Composition Analysis).
@hailey static analysis was much better but it did not (does not) have correct output... You can absolutely confuse it with several valid code patterns (the one my company uses gets confused with using unique pointers for cleanup functions). But it was at least much better than that garbage
@hailey god daaaamn this is true
@hailey the static analyzers gave 99% false positives so ... just ignore it all.

@hailey For example the Masterscope static analysis tool of Interlisp which still runs nicely on Medley Interlisp:

https://files.interlisp.org/medley/library/MASTERSCOPE.TEDIT.pdf

In the 1970s it could answer queries such as these (and didn't make stuff up):

. IS FOO BOUND BY ANY ON PATH TO ’PARSE

. WHICH ENTRIES OF ANY CALLING ’Y BIND ANY GLOBALVARS ON ’FOO

. WHO SETS ANY BOUND IN X OR CALLED BY Y

@hailey And those static analysis tools took many years of research and engineering to solve the issue of false positives.

Meanwhile, an LLM being wrong 90% of the time is no concern at all.