- https://github.com/coveragepy/coveragepy/issues/2099
- https://github.com/coveragepy/coveragepy/issues/2128
- https://github.com/coveragepy/coveragepy/issues/2137
| GitHub | https://github.com/devdanzin |
#Python is for artists too! #stickers
https://villares-shop.fourthwall.com/products/python-is-for-artists-too
labeille Package Registry stats
Top 3.15 Blockers (364 packages):
* PyO3 / Rust / maturin: 111
* C extension build failures: 108
* pydantic-core (transitive PyO3): 69
* numpy / scipy / meson: 43
Once PyO3 adds 3.15 support, ~180 more packages will unlock (PyO3 direct + pydantic-core transitive)
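The ~180 figure is the sum of two rows in the blocker list above. A quick check, where the dictionary keys are labels made up for this sketch:

```python
# Blocker counts copied from the list above; key names are illustrative only.
blockers = {
    "pyo3_direct": 111,
    "c_extension_build": 108,
    "pydantic_core_transitive": 69,
    "numpy_scipy_meson": 43,
}
total_blocked = 364

# Packages expected to unlock once PyO3 supports 3.15.
unlocked_by_pyo3 = blockers["pyo3_direct"] + blockers["pydantic_core_transitive"]
share = unlocked_by_pyo3 / total_blocked

print(unlocked_by_pyo3)   # 180
print(f"{share:.1%} of the 3.15-blocked packages")
```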
Skip Reasons (418 packages):
* Monorepo subpackage (Azure, GCloud, etc.): 214
* No test suite found: 70
* No source repository: 52
* Type stub packages: 42
labeille Package Registry stats
We've grown the registry: https://github.com/devdanzin/labeille/tree/main/registry/packages
* Total packages: 1,500
* Enriched (information collected and present): 1,500 (100%)
* Fully runnable on CPython 3.15: 654 (43.6%)
* Skipped (no tests, monorepo, etc.): 418 (27.9%)
* 3.15-specific blockers (skip_versions): 364 (24.3%)
* pytest: 95.1% (1,427 packages)
* unittest: 4.8% (72 packages)
* GitHub: 96.4% of repos
* Same JIT crash found in 7 packages
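The percentages above can be reproduced from the raw counts. A small sketch, with category names chosen only for this example:

```python
# Counts copied from the stats above; key names are illustrative only.
total = 1500
counts = {
    "runnable_on_3.15": 654,
    "skipped": 418,
    "blocked_on_3.15": 364,
    "pytest": 1427,
    "unittest": 72,
}

for name, n in counts.items():
    print(f"{name}: {n / total:.1%}")

# The three status categories do not add up to the full registry;
# the stats above do not say where the remainder falls.
status_sum = counts["runnable_on_3.15"] + counts["skipped"] + counts["blocked_on_3.15"]
print(total - status_sum, "packages outside the three listed categories")
```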
The most important and tedious part of labeille is the registry.
So far it holds 350+ PyPI packages, each with a repo URL, install and test commands, and metadata about whether it has C extensions, which Python versions to skip, and whether it needs xdist disabled.
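To make that concrete, here is a rough sketch of what one entry could look like. Only skip_versions is a name taken from the registry stats. Every other field name, the package, and the URL are hypothetical; this is not labeille's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RegistryEntry:
    """Hypothetical shape of one registry entry (illustrative field names)."""
    package: str
    repo_url: str
    install_command: str
    test_command: str
    has_c_extensions: bool = False
    skip_versions: List[str] = field(default_factory=list)
    disable_xdist: bool = False

    def runnable_on(self, python_version: str) -> bool:
        # A package is runnable unless this version is explicitly skipped.
        return python_version not in self.skip_versions


# Made-up package: needs an editable install, a marker filter, and no xdist.
entry = RegistryEntry(
    package="example-pkg",
    repo_url="https://example.invalid/example-pkg.git",
    install_command="pip install -e .[test]",
    test_command="python -m pytest -m 'not slow' -p no:xdist",
    skip_versions=["3.15"],
    disable_xdist=True,
)

print(entry.runnable_on("3.14"))  # True
print(entry.runnable_on("3.15"))  # False
```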
"Just run pytest" doesn't work for all packages. Some need specific test markers or editable installs. Some have tests that might hang. Some need extra dependencies that aren't in their dev requirements.
I built labeille to find CPython JIT crashes, but it's really a general "run real-world test suites at scale" platform.
It also works for:
* Checking which packages pass their tests on a new CPython version
* Testing free-threaded (no-GIL) CPython compatibility
* Measuring coverage.py or memray overhead across hundreds of packages
* Comparing CPython vs PyPy performance on real code
The registry of 350+ packages with install/test commands is the core.
labeille can compare two test runs and show what changed and the likely reason.
When a package goes from PASS to CRASH, labeille checks which repo commit each run tested. If the commit is the same, it's likely a CPython/JIT regression. Otherwise, the change might come from the package itself:
requests: PASS → CRASH
Repo: abc1234 → abc1234 (unchanged — likely a CPython/JIT regression)
flask: CRASH → PASS
Repo: 222bbbb → 333cccc (changed)
This makes it possible to say things like "3 of these are JIT regressions".
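The comparison rule can be sketched in a few lines of Python. This is a minimal illustration of the logic described above, not labeille's code. RunResult and both function names are invented for the example; the package data is taken from the sample output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    status: str  # "PASS" or "CRASH"
    commit: str  # repo commit the tests ran against


def is_likely_jit_regression(old: RunResult, new: RunResult) -> bool:
    # PASS -> CRASH with an unchanged repo points at the interpreter.
    return (old.status, new.status) == ("PASS", "CRASH") and old.commit == new.commit


def explain_change(old: RunResult, new: RunResult) -> str:
    if old.status == new.status:
        return "no status change"
    if is_likely_jit_regression(old, new):
        return "repo unchanged: likely a CPython/JIT regression"
    if old.commit != new.commit:
        return "repo changed: the package itself may be responsible"
    return "repo unchanged: interpreter behavior changed"


before = {"requests": RunResult("PASS", "abc1234"), "flask": RunResult("CRASH", "222bbbb")}
after = {"requests": RunResult("CRASH", "abc1234"), "flask": RunResult("PASS", "333cccc")}

for name in before:
    print(f"{name}: {explain_change(before[name], after[name])}")

regressions = [n for n in before if is_likely_jit_regression(before[n], after[n])]
print(f"{len(regressions)} of these are likely JIT regressions")
```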
labeille has a bisect command that binary-searches through a package's git history to find the commit that triggers a JIT crash:
labeille bisect requests --good=v2.30.0 --bad=HEAD --target-python /path/to/cpython-jit
https://github.com/devdanzin/labeille#bisecting-crashes
Commits that won't build get skipped automatically (like git bisect skip), each revision gets a fresh venv so dependency versions don't leak between steps, and you can filter by crash signature when a package has several distinct crashes.
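The idea of bisecting with skips can be sketched as below. This is a generic binary search written for illustration and is not how labeille implements it. The check callback is a stand-in for "create a fresh venv, install this revision, run the tests", and all names are hypothetical.

```python
from typing import Callable, Optional, Sequence, Set


def bisect_first_bad(commits: Sequence[str],
                     check: Callable[[str], str]) -> Optional[str]:
    """Find the first bad commit; commits[0] is known good, commits[-1] known bad.

    check(commit) returns "good", "bad", or "skip" (e.g. the commit won't build).
    Returns None when skipped commits make the answer ambiguous.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    skipped: Set[int] = set()
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Try untested commits nearest the midpoint first.
        candidates = sorted(range(lo + 1, hi), key=lambda i: abs(i - mid))
        for i in candidates:
            if i in skipped:
                continue
            verdict = check(commits[i])
            if verdict == "skip":
                skipped.add(i)
                continue
            if verdict == "good":
                lo = i
            else:
                hi = i
            break
        else:
            return None  # everything between lo and hi was skipped
    return commits[hi]


# Toy history: c5 introduces the crash, c3 does not build.
history = [f"c{n}" for n in range(8)]


def fake_check(commit: str) -> str:
    n = int(commit[1:])
    if n == 3:
        return "skip"
    return "bad" if n >= 5 else "good"


print(bisect_first_bad(history, fake_check))  # c5
```

If the unbuildable commit sits directly before the first crashing one, the search cannot tell the two apart and returns None, much like git bisect reporting a range of candidate commits.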