Thomas Pornin

It is not well documented, but it turns out that if you try to run crypto benchmark code on an STM32F407 ("discovery") board (with an ARM Cortex M4 CPU) with instructions in SRAM (instead of Flash), then you get extra delays, unless you set SYSCFG_MEMRMP accordingly. After a weekend of dabbling, I can now run my benchmarks (for Falcon/FN-DSA) at 168 MHz with no wait states or cache issues. Details here: https://github.com/pornin/c-fn-dsa/tree/main/bench_cm4

(As a side note, I have done some more assembly optimization work, so signing cost is now 19.7 million cycles at n=512, down from 22.0 previously.)

I put together some notes on a few tricks in assembly for ARM Cortex M0+ and M4, e.g. how to make a conditional constant-time swap of two registers in 3 cycles.
https://github.com/pornin/arm-asm-notes
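
For readers unfamiliar with the idea, here is a minimal portable C sketch of a conditional constant-time swap. It is not the 3-cycle assembly sequence from the notes, only the generic mask-based technique that such tricks improve upon; the function name `ct_cswap32` is made up for this illustration, and whether a given compiler keeps the code branch-free is not guaranteed.

```c
#include <stdint.h>

/*
 * Hypothetical helper (name chosen for this sketch).
 * Swaps *a and *b if ctl == 1; leaves them unchanged if ctl == 0.
 * ctl must be exactly 0 or 1. No branch or memory address depends
 * on ctl in the source code.
 */
static inline void
ct_cswap32(uint32_t *a, uint32_t *b, uint32_t ctl)
{
    /* Expand the 0/1 control value into 0x00000000 or 0xFFFFFFFF. */
    uint32_t mask = (uint32_t)(0u - ctl);
    /* t holds the bits in which *a and *b differ, or 0 if ctl == 0. */
    uint32_t t = (*a ^ *b) & mask;
    *a ^= t;
    *b ^= t;
}
```

Calling `ct_cswap32(&x, &y, bit)` with a secret `bit` performs the same sequence of operations whether or not the swap happens.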

New paper: https://eprint.iacr.org/2025/435
To some extent, it’s a blog post in scientific article format. In that text I argue that writing constant-time code has become mostly infeasible in general (because there are JIT compilers everywhere, especially where you cannot look, and they’ve become smart).

It’s still worthwhile to write code in a “constant-time” way, but you’ll get the real thing only by adjusting things for your specific hardware.

Constant-Time Code: The Pessimist Case

This note discusses the problem of writing cryptographic implementations in software, free of timing-based side-channels, and many ways in which that endeavour can fail in practice. It is a pessimist view: it highlights why such failures are expected to become more common, and how constant-time coding is, or will soon become, infeasible in all generality.

IACR Cryptology ePrint Archive

On a whim, I did something which is arguably a bad idea: a pure JavaScript implementation of FN-DSA (Falcon). It's there:
https://github.com/pornin/js-fn-dsa

Key pair generation, and to a lesser degree signature generation, are not constant-time (not that making constant-time code is really feasible in JavaScript). The JavaScript story of handling secret keys is terrible anyway. The signature verification code, though, should be fine.

It all fits in a single 2530-line comment-heavy source file, with no external dependencies; it's not that big! I have cursorily tested it in Chrome, Safari (on iOS) and Firefox. Presumably it should also run on Edge and Node.js, but I have not tried.

Falcon on the ARM Cortex-M4: it is now about twice as fast.
https://eprint.iacr.org/2025/123

Compared with optimized Dilithium (ML-DSA) on the same hardware, signing is "only" 5.4 times slower, but verifying is 4 to 6 times faster (and uses a lot less RAM as well). Falcon/FN-DSA might be deemed usable even on such microcontrollers.

(Among other things, I use a combined addition/subtraction routine, when computing x+y and x-y for two floating-point values x and y. I think I saw that somewhere before but I don't remember where; it's a nice trick and it should be credited properly, so if anybody has a pointer that would be helpful.)

Falcon on ARM Cortex-M4: an Update

This note reports new implementation results for the Falcon signature algorithm on an ARM Cortex-M4 microcontroller. Compared with our previous implementation (in 2019), runtime cost has been about halved.

IACR Cryptology ePrint Archive
For optimizing and benchmarking cryptographic primitives I often need to read the in-CPU cycle counter (the actual one, not the fixed-frequency time stamp counter), and operating systems are uncooperative. I have working solutions on Linux on x86, ARMv8 and RISC-V; for the last two, this requires a custom kernel module. I aggregated my notes and code here: https://github.com/pornin/cycle-counter
It _might_ also work on an Apple Silicon system if that system runs Asahi Linux (i.e. Linux directly on the machine, not in a VM inside macOS). Could somebody with access to such a system try that?

Following up on my Rust implementation of "prospective FN-DSA" (i.e. Falcon + support for pre-hashing as in ML-DSA), I made C and Go versions, with the same features, and full reproducibility across all three implementations:
https://github.com/pornin/rust-fn-dsa
https://github.com/pornin/c-fn-dsa
https://github.com/pornin/go-fn-dsa

Everything ought to be correct, fast (as fast as I could make it; I am not unhappy with the performance), and constant-time (again, within what can be ensured in a world where compilers and hidden JIT compilers are smart enough to recognize disguised Booleans).
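
To illustrate what a "disguised Boolean" looks like, here is a small C sketch. It is not taken from any of the three repositories; the names `ct_eq32` and `ct_select32` are hypothetical. The 0/1 result of a comparison is computed with arithmetic only, then expanded into a mask to pick one of two values. This is the kind of pattern that a sufficiently smart compiler may recognize and turn back into a conditional jump.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if x == y, 0 otherwise,
 * using only bitwise and arithmetic operations.
 */
static inline uint32_t
ct_eq32(uint32_t x, uint32_t y)
{
    uint32_t q = x ^ y;                      /* 0 if and only if x == y */
    /* For q != 0, either q or -q has its top bit set. */
    uint32_t nz = (uint32_t)(q | (uint32_t)(0u - q)) >> 31;
    return nz ^ 1u;
}

/*
 * Hypothetical helper: returns a if ctl == 1, b if ctl == 0.
 * ctl must be exactly 0 or 1.
 */
static inline uint32_t
ct_select32(uint32_t ctl, uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)(0u - ctl);    /* all-zeros or all-ones */
    return b ^ (mask & (a ^ b));
}
```

For example, `ct_select32(ct_eq32(x, y), u, v)` yields `u` when `x == y` and `v` otherwise, with no comparison operator or branch in the source; what the generated machine code does still has to be checked for each compiler and target.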

I made a Rust implementation of FN-DSA/Falcon. Of course it is not (yet) the "real" FN-DSA since NIST has not published any draft standard yet; I will align the implementation with the drafts as they go live.
https://github.com/pornin/rust-fn-dsa
https://crates.io/crates/fn-dsa

It's as fast as the C code (faster for verification, actually, thanks to some more AVX2). It's portable and presumed secure (best effort on constant-time, but no guarantee: recent LLVM versions are good at recognizing disguised Booleans and reinserting conditional jumps). My _ambition_ is that this becomes the "canonical" FN-DSA implementation in the Rust ecosystem.

The “definitive” paper on double-odd elliptic curves is now formally published in the “Communications in Cryptology” IACR journal. The paper has all the useful formulas, for all finite fields (including binary fields).
https://doi.org/10.62056/akmp-4c2h
A Prime-Order Group with Complete Formulas from Even-Order Elliptic Curves

A shorter summary: