Unofficial IACR eprint updates. Only posts about new papers, follow @eprintrevision for updates about revisions.
Website: https://eprint.iacr.org
#eprint Accelerating FAEST Signatures on ARM: NEON SIMD AES and Parallel VOLE Optimization by Seung-Won Lee, Ha-Gyeong Kim, Min-Ho Song, Si-Woo Eum, Hwa-Jeong Seo (https://ia.cr/2026/499)
Accelerating FAEST Signatures on ARM: NEON SIMD AES and Parallel VOLE Optimization

FAEST is a post-quantum digital signature candidate whose performance is dominated by repeated AES-CTR-based PRG calls in the VOLE-in-the-Head phase, yet its reference implementation provides no FAEST-specialized ARM NEON acceleration path. We present an ARM-oriented optimization that accelerates this bottleneck using general-purpose NEON SIMD instructions without relying on ARMv8 Crypto Extensions. The proposed implementation combines a register-resident 256-byte S-box with TBL/TBX-based four-stage SubBytes, 4-way and 8-way parallel AES block processing, a fixed-size PRG path specialized for the FAEST tree structure, and pthread-based batch-level parallelization of independent VOLE tasks. Evaluated on all 12 parameter sets of FAEST v2 on Raspberry Pi 4 and Apple M2, the combined optimization achieves speedups of up to $136.9\times$ and $330.1\times$, respectively, over the pure-C reference. On RPi4, the single-thread NEON implementation outperforms OpenSSL's software AES, and on M2, the full NEON-plus-pthread configuration outperforms the best available reference configuration, including hardware-accelerated OpenSSL, across all tested parameters.

IACR Cryptology ePrint Archive
#eprint Expander properties of superspecial isogeny digraphs with level structure by Thomas Decru, Krijn Reijnders (https://ia.cr/2026/500)
Expander properties of superspecial isogeny digraphs with level structure

Charles, Goren and Lauter proved that the supersingular $\ell$-isogeny graph is a Ramanujan graph, which is an optimal expander. Jordan and Zaytman argued that this is no longer true in dimension two, but Florit and Smith showed that those graphs exhibit good expansion properties nonetheless. Castryck, Decru and Smith however have pointed out that the higher-dimensional analogue setting should only consider a subset of all edges, namely the paths corresponding to $(\ell^k,\ell^k)$-isogenies, so-called good extensions, instead of all $(\ell^a,\ell^b,\ell^c,\ell^d)$-isogenies in general, which contain bad extensions too. Such bad extensions lead to many small cycles in the graph, which are a cryptographic problem due to collisions and a graph-theoretic nuisance as these superfluous edges counteract part of the expansion properties. Restricting to good extensions makes the resulting graph directed, as outgoing edges now depend on the incoming edge. We study $(\ell,\ell)$-level surfaces and $(\ell)^g$-isogeny digraphs restricted to good extensions for concrete small dimensions and degrees $\ell$. These graphs exhibit excellent expander properties: by our heuristic evidence, they are Ramanujan graphs for all primes $\ell$ in dimension 1, and for $\ell = 2$ in dimension 2. Our main conjecture implies that this would still be the case for $\ell=3$ in dimension 2, but not for any larger $\ell$ in dimension 2, or any $\ell$ in dimension 3 and up. Furthermore, we generalize the work of Florit and Smith from $\ell = 2$ to general primes $\ell$, by classifying all abelian surfaces with nontrivial automorphism groups and their actions on their maximal isotropic $(\ell,\ell)$-subgroups.

#eprint More Brisés in Ballet: Extending Differential and Linear Cryptanalysis by Emanuele Bellini, Gabriele Bellini, Alessandro De Piccoli, Michela Gallone, David Gerault, Yun Ju Huang, Paul Huynh, Matteo Onger, Simone Pelizzola, Andrea Visconti (https://ia.cr/2026/501)
More Brisés in Ballet: Extending Differential and Linear Cryptanalysis

In this work, we present new cryptanalytic results on the Ballet block cipher family, a simplified Lai-Massey ARX construction with a linear key schedule, winner of the symmetric algorithm category in the 2018–2020 Chinese National Cryptographic Algorithm Competition. Despite winning the competition, the cipher has received limited attention outside the Chinese Association for Cryptologic Research (CACR) community. We provide the first classical key recovery attacks in the literature, new explicit differential and linear trails (up to 15 rounds for differential, and 16 for linear, while the original paper only provided a bound for 9 rounds), improved impossible differential trails (8 rounds instead of 7), and the first differential-linear analysis of Ballet (up to 20 rounds). Our results lead to key recovery attacks on up to 16 rounds of Ballet-128/128/46 and 17 rounds of Ballet-128/256/48, thereby extending the cryptanalytic understanding of this ARX-based design and contributing new insight into its security margin, an area that the designers themselves note warrants further study.

#eprint Efficient RLWE based Chosen-Ciphertext Secure Dual-Receiver Encryption and Sender-Binding KEM in the Standard Model by Laurin Benz, Robert Brede (https://ia.cr/2026/502)
Efficient RLWE based Chosen-Ciphertext Secure Dual-Receiver Encryption and Sender-Binding KEM in the Standard Model

The key encapsulation mechanism (KEM) is an often-used primitive in communication, closely related to public key encryption (PKE). Dual-receiver encryption (DRE) is another primitive closely related to PKE that allows a sender to encrypt a message to two different receivers. Most applications of DRE need the soundness property, which guarantees that both receivers decrypt any ciphertext to the same message. Additionally, IND-CPA security is often not enough, and therefore schemes should satisfy a stronger notion like IND-CCA2. Meanwhile, an alternative to IND-CCA2 for KEMs is the IND-SB-CPA security notion, which was proven to be strong enough to realize secure channels while in theory enabling the construction of more efficient schemes. Most IND-CCA2 security proofs rely on the FO transformation, which is only secure in the ROM, and the standard model DREs and KEMs are far from efficient. We fill this gap by providing a sound DRE and a KEM satisfying IND-CCA2 and IND-SB-CPA security respectively. Both schemes are based on RLWE, proven secure in the standard model, and have key sizes of 150 KB and ciphertext sizes of 100 KB, improving upon previous results by a factor of 10x to 100x.

#eprint SwiftSNNI: Optimized Scheduling for Secure Neural Network Inference (SNNI) on Multi-Core Systems by Kanwal Batool, Saleem Anwar, Francesco Regazzoni, Andy Pimentel, Zoltán Ádám Mann (https://ia.cr/2026/503)
SwiftSNNI: Optimized Scheduling for Secure Neural Network Inference (SNNI) on Multi-Core Systems

Secure Neural Network Inference (SNNI) enables privacy-preserving inference on encrypted data with strong cryptographic guarantees. However, practical deployments suffer from high preprocessing overhead, significant communication costs, and sequential execution. These limitations lead to low throughput, underutilized system resources, long queueing delays, and poor scalability. This work introduces SwiftSNNI, a unified, resource-aware scheduling framework for SNNI. It implements a hybrid offline–online strategy that orchestrates offline preprocessing ($T_{\text{pre}, i}$) and online inference ($T_{\text{on}, i}$) jobs to maximize parallelism. By formulating SNNI scheduling as a constrained optimization problem, SwiftSNNI overlaps $T_{\text{pre}, i}$ phase execution of future requests with active $T_{\text{on}, j}$ jobs. SwiftSNNI also incorporates optional advance notices to enable proactive $T_{\text{pre}, i}$, which further reduces average input delay ($D$). Evaluations using five benchmark neural networks (M1, M2, HiNet, AlexNet, VGG-16) under diverse workloads and stochastic arrival rates confirm substantial performance gains. Compared to a parallelized sequential baseline (MS-SHARK), SwiftSNNI achieves up to 97% lower average input delay ($D$), an 81% reduction in makespan ($\approx 5.4 \times$ speedup), and a $5.6 \times$ increase in throughput. Furthermore, SwiftSNNI reduces average waiting time ($W$) by over 99%, demonstrating robust starvation prevention for high-concurrency workloads. SwiftSNNI supports concurrent execution, scales to larger neural networks, and provides an efficient runtime for SNNI deployments. The SwiftSNNI implementation is available online at https://github.com/KanwalBat00l/SwiftSNNI.

#eprint Compression And Decompression Under FHE Using Error-Correcting Codes and Copy-And-Recurse by Adi Akavia, Hayim Shaul, Ofer Shayevitz (https://ia.cr/2026/504)
Compression And Decompression Under FHE Using Error-Correcting Codes and Copy-And-Recurse

Compression has been a fundamental problem in computer science for decades. Simply put, we want to represent a low-entropy vector $v$ of size $n$ with fewer than $n$ elements so that $v$ can be reconstructed (decompressed) from the shorter representation. Since compressed vectors require less storage and less communication, compression algorithms are part of almost every digital system. When the vector is encrypted with fully homomorphic encryption (FHE), the problem becomes significantly harder. Some works (e.g., [TCHES'19, CCS'21, EuroCrypt'23, USENIX'24]) have considered the problem of compressing an encrypted vector, but they all assumed the decompression step happens in cleartext. This is a significant restriction. For example, any system with an untrusted agent that needs to receive data and analyze it cannot use existing compression algorithms. In this paper, we give the first (to the best of our knowledge) non-trivial compression-decompression algorithms that are both FHE-friendly. Our algorithms use the copy-and-recurse technique together with the known duality between compression and error-correcting codes. Our experiments show that our decompression algorithm is faster than the folklore decompression algorithm. This is useful in systems with an agent-in-the-middle that is bounded by communication and by computation.

#eprint SCALE-FL: Scalable Cryptography-based Aggregation with Lightweight Enclaves for Federated Learning by Micah Brody, Antonia Januszewicz, Jiachen Zhao, Nirajan Koirala, Taeho Jung (https://ia.cr/2026/505)
SCALE-FL: Scalable Cryptography-based Aggregation with Lightweight Enclaves for Federated Learning

Privacy-Preserving Federated Learning (PPFL) emphasizes the security and privacy of contributors' data in scenarios such as healthcare, smart grids, and the Internet of Things. However, ensuring the security and privacy throughout PPFL can be challenging, given the complexities of maintaining relationships with many users across multiple epochs. Additionally, under a threat model in which the aggregating server and corrupted users are colluding adversaries, honest users' inputs and output data must be protected at all stages. Two common tools for enforcing privacy in federated learning are Private Stream Aggregation (PSA) and Trusted Execution Environments (TEE). However, PSA-only approaches still expose the raw aggregate to the server (and thus to colluding parties). TEE-only aggregation typically incurs non-negligible per-client per-epoch overhead at scale because the TEE must handle per-client communication and maintain per-client state/key material. This paper presents SCALE-FL, a novel solution for PPFL that maintains security while achieving near-plaintext performance using a state-of-the-art PSA protocol to collect user information and a TEE to hide information about the raw aggregate. By using a PSA protocol for aggregation, we can maintain the privacy of information on the untrusted server without requiring per-user key storage or use by the TEE. Then, the aggregate is securely processed by the TEE in plaintext, without the heavy encryption required on an untrusted server. Finally, we ensure the security of user inputs in the federated learning output by using Differential Privacy (DP). The additional overhead introduced by SCALE-FL is 1% of the overhead of the plain FL executions.

#eprint Unclonable Encryption in the Haar Random Oracle Model by James Bartusek, Eli Goldin (https://ia.cr/2026/506)
Unclonable Encryption in the Haar Random Oracle Model

We construct unclonable encryption (UE) in the Haar random oracle model, where all parties have query access to $U,U^\dagger,U^*,U^T$ for a Haar random unitary $U$. Our scheme satisfies the standard notion of unclonable indistinguishability security, supports reuse of the secret key, and can encrypt arbitrary-length messages. That is, we give the first evidence that (reusable) UE, which requires computational assumptions, exists in ``microcrypt'', a world where one-way functions may not exist. As one of our central technical contributions, we build on the recently introduced path recording framework to prove a natural ``unitary reprogramming lemma'', which may be of independent interest.

#eprint Bridging Programmability, Efficiency, and Bounded Trust: A Hybrid Privacy-Preserving Smart Contract Framework by Youheng Wang, Rujia Li, Zhaoyang Xie, Kaikai Feng, Qingjie Chen, Yang Gao, Sisi Duan (https://ia.cr/2026/498)
Bridging Programmability, Efficiency, and Bounded Trust: A Hybrid Privacy-Preserving Smart Contract Framework

Privacy-preserving smart contracts (PPSCs) extend blockchain computation from transparent execution to confidential applications, enabling mutually distrustful parties to jointly compute contract logic on private inputs. Existing PPSC designs can be categorized into two main paradigms: trusted hardware–based systems and cryptographic systems. Trusted hardware-based systems provide general programmability, and their performance is usually close to non-confidential computation, but the hardware has to be trusted. In contrast, cryptographic systems require much lower trust in the hardware, but their performance is usually much lower. In this paper, we propose a hybrid PPSC framework that combines trusted hardware with cryptographic techniques, achieving both general programmability and reduced reliance on trusted hardware. Specifically, the TEE executes the smart contracts, but needs to authenticate the computation. A proof of the encrypted computational results is sent on-chain, and the blockchain authenticates the computation and aggregates the computational results using cryptographic approaches such as homomorphic encryption. In this way, the confidential smart contract via TEE is both efficient and generally programmable, without being trusted. Meanwhile, the on-chain cryptographic approach does not introduce high overhead, as it only authenticates and aggregates the results. We formalize the system model and security goals, and prove the correctness using the Universal Composability framework. Our implementation and evaluation, with Intel SGX as the trusted hardware and Solidity as the smart contract language, show that our approach achieves almost no performance degradation compared to non-confidential computation.

#eprint SIMD HSS and aHMAC from Interval Encoding with Application to One-Bit-Per-Gate Garbling by Jaehyung Kim, Hanjun Li, Huijia Lin, Zeyu Liu (https://ia.cr/2026/485)
SIMD HSS and aHMAC from Interval Encoding with Application to One-Bit-Per-Gate Garbling

Primitives enabling homomorphic computation over secret-shared values--Homomorphic Secret Sharing (HSS) and algebraic Homomorphic MACs (aHMAC)--have recently emerged as efficient alternatives to ciphertext-based primitives such as fully homomorphic encryption (FHE) and attribute-based encryption (ABE). Leveraging the distributed nature of secret sharing, direct constructions of HSS and aHMAC are simple, lightweight, avoid costly bootstrapping, and have many applications including one-bit-per-gate garbled circuits. Despite encouraging progress, all existing direct schemes still lack one key feature: efficient Single Instruction Multiple Data (SIMD) evaluation, a capability that has been critical to the efficiency of FHE. This gap leaves the potential of substantial efficiency improvements untapped. We present the first SIMD evaluation techniques for HSS and aHMAC, based on variants of the RLWE assumption. Using a new interval coefficient encoding, our approach embeds $\sqrt{n}$ integer-valued slots per ring element and supports $\sqrt{n}$-fold batch addition and multiplication in just $O(\log n)$ ring operations, achieving a multiplicative $\tilde O(\sqrt{n})$ improvement in amortized efficiency over prior direct constructions. Building on top of these improvements, we show a streamlined one-bit-per-gate SIMD garbling scheme with similar efficiency gains in the online phase. Our efficiency gains are concrete. Concrete operation counts and microbenchmark based estimates show $6\times$--$10\times$ improvements in amortized multiplication cost over prior non-SIMD constructions, with up to $25\times$--$50\times$ speedups for aggregation-heavy workloads such as matrix--vector multiplication. These results demonstrate the practical potential of SIMD techniques for secret-sharing-based homomorphic computation.
