RegexPSPACE: Regex LLM Benchmark

RegexPSPACE는 LLM의 공간 복잡도 한계를 평가하기 위해 PSPACE-완전 정규표현식 문제(동등성 결정과 최소화)를 기반으로 한 최초의 벤치마크를 제안한다. 100만 개 이상의 정규표현식 인스턴스를 포함하는 대규모 데이터셋을 구축하고, 6개 LLM과 5개 LRM을 대상으로 평가를 수행해 LLM의 장황함과 반복 같은 공통 실패 패턴을 발견했다. 이 연구는 LLM과 LRM의 고급 추론 능력과 공간 계산 한계를 정량적으로 분석하는 새로운 평가 프레임워크를 제공한다.

https://arxiv.org/abs/2510.09227

#llm #benchmark #regex #reasoning #pspace

RegexPSPACE: A Benchmark for Evaluating LLM Reasoning on PSPACE-complete Regex Problems

Large language models (LLMs) show strong performance across natural language processing (NLP), mathematical reasoning, and programming, and recent large reasoning models (LRMs) further emphasize explicit reasoning. Yet their computational limits, particularly spatial complexity constrained by finite context windows, remain poorly understood. While recent works often focus on problems within the NP complexity class, we push the boundary by introducing a novel benchmark grounded in two PSPACE-complete regular expression (regex) problems: equivalence decision (RegexEQ) and minimization (RegexMin). PSPACE-complete problems serve as a more rigorous standard for assessing computational capacity, as their solutions require massive search space exploration. We perform a double-exponential space exploration to construct a labeled dataset of over a million regex instances with a sound filtering process to build the benchmark. We conduct extensive evaluations on 6 LLMs and 5 LRMs of varying scales, revealing common failure patterns such as verbosity and repetition. With its well-defined structure and quantitative evaluation metrics, this work presents the first empirical investigation into the spatial computational limitations of LLMs and LRMs, offering a new framework for evaluating their advanced reasoning capabilities. Our code is available at https://github.com/hyundong98/RegexPSPACE .

arXiv.org

On Monday, November 17, at 3:30pm ET, I get to give the next VCGT talk on the computational complexity of the game #BattleSheep: https://sites.google.com/view/virtual-cgt/seminar

Abstract: Battle Sheep is a board game published by Blue Orange Games where players take turns moving stacks of sheep tokens around a hexagonal board, always leaving at least one sheep behind. In this talk we'll learn the basics of the game, play once, and finally show that determining the winnability of the game is PSPACE-complete. This talk assumes no prior knowledge of computational complexity.

#CombinatorialGames #ComputationalComplexity #PSPACE

Virtual CGT

The seminar started after the Virtual Combinatorial Games Workshop in June 2020. There will be a mix of research talks, tutorials, and coffee meetings. If you wish to attend, please contact Melissa Huggan (email address at the bottom) to be added to the email list for the video call link.

The second day of #Integers2025 was excellent! I heard some great talks, especially one from Carrie Finch-Smith. Me and the other three gamesters I know about here also took some time and I think we found a game to be #PSPACE complete, so we're definitely going to be writing that up!

Tonight we got to see a tree that owns itself. Cool stuff!

The organizers have been totally awesome. Great conference so far!

#NumberTheory #CombinatorialGames

🤩#call4reading

✍️Towards a #quantum-inspired proof for #IP = #PSPACE #by Ayal Green, Guy Kindler, and Yupan Liu

🔗10.26421/QIC21.5-6-2 (#arXiv:1912.11611)

#Quantumwalks #Singularcontinuousmeasure

I finally wrote a #CombinatorialGames piece I've wanted to write for a long time. It's about how Col's computational complexity went confusedly unsolved for over 35 years and how I got scooped at the end of that but still got a cool result. https://combinatorialgametheory.blogspot.com/2024/08/the-curious-case-of-cols-computational.html

#ComputationalComplexity #PSPACE

The Curious Case of Col's Computational Complexity

Combinatorial Game Theory blog. Algorithmic, computational complexity, CGT, abstract games, Nim, Col, Snort, Kayles.

I'm going to be talking about Algorithmic CGT in India in a few weeks. I really want to start with the fundamentals of #PSPACE, so I want to present the Boolean Formula Game (https://en.wikipedia.org/wiki/Formula_game). Whenever I talk about a game, I like to play it with the audience.

So... I coded a working version: http://kyleburke.info/DB/combGames/booleanFormula.html

It's not particularly fun, mostly because it's often very easy to win as the True player. (I didn't implement any deep strategies to generate more interesting formulas. Advice is welcome!)

Nevertheless, I'm very excited to use this next week and in future talks. #CombinatorialGames #ComputationalComplexity #WebGames

Formula game - Wikipedia

In our effort to put courses online, we continue lectures on Algorithmic Lower Bound Course. Now you can watch

Lesson 4-11: Algorithmic Lower Bounds by Mohammad Hajiaghayi - NP-Completeness and Beyond

(FEEL FREE TO SUBSCRIBE TO YOUTUBE @hajiaghayi FOR FUTURE LESSONS Premiering on WEDNESDAYS)

https://youtu.be/VZyffnAb1r0 (Lesson 4: 3-Partition Problem & Proving NP-Hardness)

https://youtu.be/4fCD9_1eQw0 (Lesson 5: Puzzle Problem NP-Hardness & 3-Partition)

https://youtu.be/FIyEj72-UJQ (Lesson 6: 3-SAT Problem & Proving NP-Hardness)

https://youtu.be/tbSJzaKx2pA (Lesson 7: Puzzle Problem NP-Hardness via 3-SAT)

https://youtu.be/voRVebBsh94 (Lesson 8: Fine-grained Subcubic Complexity: Part 1)

https://youtu.be/gRURSM6QARo (Lesson 9: Fine-grained Subcubic Complexity: Part 2)

https://youtu.be/qPw82bTAXkc (Lesson 10: Fine-grained Subquadratic Complexity 1)

https://youtu.be/C6j4avVkI7U (Lesson 11: Fine-grained Subquadratic Complexity 2)

#AlorithmicComplexity,

#3SAT,

#3Partition,

#subquadratic,

#subcubic,

#Finegrained,

#HardnessExploration,

#NP,

#PSPACE,

#NPComplete,

#LogSpace,

#ExponentialComplexity,

#ParallelComputation,

#PvsNP,

#NPSPACE,

#NonDeterministicSpace, hashtag

#SavitchTheorem,

#ComplexityClasses,

#Reductions,

#ImportantProblems,

#CommunicationComplexity, hashtag

#GeometricProblems,

#AlgorithmDesign,

#ComputationalComplexity,

#TheoreticalComputerScience,

#AlgorithmicLowerBounds

For comprehensive handwritten lecture notes on this course, visit the instructor's website:

http://www.cs.umd.edu/~hajiagha/
The course textbook "Computational Intractability: A Guide to Algorithmic Lower Bounds" by Demaine, Gasarch, and Hajiaghayi is available for free at:

https://hardness.mit.edu/

Lesson 4: Algorithmic Lower Bounds by Mohammad Hajiaghayi: 3-Partition Problem & Proving NP-Hardness

YouTube
🎥 Now available: Algorithmic Lower Bounds Lessons 2 & 3 by Mohammad Hajiaghayi - NP-Completeness and Beyond
🔗 Lesson 2: youtu.be/zNkKT0Y_tus
🔗 Lesson 3: youtu.be/QvgknFAm_qg
📺 Subscribe for more @hajiaghayi
Dive into #complexity, #NPvsP, #PSPACE, #reductions, and more!