So You Want to Solve Python Packaging: A Practical Guide

First, the technical: Python is used by vastly different groups of people, some that don't identify as "developers". Those groups often have disparate expectations about how packaging should work. Some don't even know what a package is.

Some don't even know they're using Python! Here's some examples: Python's in the Linux Standard Base and a bunch of critical Linux stuff is written in Python. Distros gotta package those & their deps into their package database (deb/rpm).

Most distros want nothing to do with the language-specific package manager. They want to manage everything through rpm/deb/portage/whatever and they don't want you fucking around with system packages. Ever got burned by the Python included with macOS? Yeah, same deal.

So OS vendors want Python to be invisible to the user. They want it for system purposes, and they want to distribute python apps, scripts, and packages on their own terms. Cool. Let's pick another group: academics and researchers.

They want to do their research. They don't want to program Python. They want to work their data, create visualizations, and very importantly: they want fellow researchers to be able to use their code. These folks don't really want to think about packaging.

The packages they use, however, are complex fucking monsters. They're a mix of C, C++, Fortran, Haskell, Julia, and god knows what else. They don't want to waste time installing build tools and compiling these things. Their packages need to be precompiled and ready to go.
Precompilation is *hard*, especially for high-performance libraries. You can't just distribute a build with all the fancy vector extensions enabled cause someone on a different processor won't be able to use it. You wanna see a nightmare? Look at TensorFlow.
Fundamentally these users do not want to think about this shit, and they're a *huge* group of users. You know who does think about this shit? Web developers, and every time someone comes along with "Python packaging sucks and someone should fix it" they're a web dev.

That's because web devs have different expectations. They *expect* to work with a packaging tool. They expect to find and install dependencies. They don't expect to work with a ton of native dependencies. They don't have the same problems.

This only scratches the surface of the technical complexity here. The reason there are so many tools for managing Python dependencies is because Python is not a monoculture and different folks need different things.

But let's assume for a moment that you can overcome those technical challenges. You can create a tool and workflow that works for the vast majority. Now you have to deal with people. You gotta convince a bunch of unpaid volunteers that you're right and that they should help.
You gotta convince a bunch of unpaid volunteers maintaining existing tools to give up their projects for your solution. Projects they built from the ground up for their own use case. You gotta write several PEPs and get them accepted.
You gotta deal with the politics: The PyPA which is completely volunteer and has all the responsibility of maintaining existing tools and practically no real authority or resources. They aren't a unified body, more of a loose collection of people that chat sometimes.
You gotta deal with the Python Core team and the steering council. They have consistently abdicated the details of packaging to the community. They aren't, at this time, very interested in taking over packaging and telling the community how to manage their dependencies.
You gotta deal with downstream distributors and major users. Linux distros, Apple, Google, AWS, Anaconda, and so many more. Google's using Bazel to build their Python projects, good luck with that one!
You gotta deal with the users and the stans. Wanna know why I stopped working on Python packaging? I got harassed for *months* because KR picked a fight with Reddit right when I dared to include pipenv on http://packaging.python.org. Fuck that.

So you want to fix Python packaging: you fucking can't. get lost.
Go ahead and bookmark this so you can link to it every few months when another baby faced, naive, precious little developer thinks that they can slay the hydra because they only see one head.

@stargirl

I'm still upset that Python 3 broke Python 2.7 compatibility, I've still got systems built and running Python 2.7 and I always will.

@stargirl @rastilin This. If I want to run some perl code, pretty much any old code will run on any interpreter version. With python it's a crapshoot as to what will happen on any given machine.
@stargirl Absolutely fantastic metaphor for this. 💯✨
@stargirl people and data/code distribution systems are fricking hard, yup. Also I think Python is doing very fine. We are not Perl /ducks 👀
@peteriskrisjanis @stargirl I don’t remember having the same level of problems with CPAN that I have with python environments.
@stargirl I can’t even get the few dozen python devs in one company to agree on a packaging and dep management standard. The entire community standardizing on anything seems impossible.
@stargirl Well one thing that royally fucked up are distributions and their tools.
Getting what's effectively a binary-cache is great, but it shouldn't be a massive pain for users to tweak things slightly for their needs, be it a bit of configuration, patches, slightly specific versions, …
And should also be very easy to create packages, specially dirty ones that just get the job done before production.

To me it doesn't make sense for python to somehow try to manage a part of the world bigger than it is, because it simply can't.
@stargirl Best thread I've read in a while. 😂 🙌 👏 👏 🥂
@stargirl They actually did the forbidden thing - re-invented the wheel. 😈
@stargirl Thanks for helping to explain why Python packaging seems so utterly bonkers broken to me, a web dev.

@stargirl We're doing something related to what you wrote in this thread at ActiveState, but our target audience is mostly enterprise customers.

I agree that having a single packaging solution that works great for every use case is naive, but I don't see why each of them can't be "solved" "individually".

@stargirl #Nix offers a genuine solution and a path towards packages as capabilities. #nixpkgs has a mature #Python subsystem. No involvement with PSF is required.
Python Packaging Strategy Discussion - Part 1

(NixOS developer, maintain some Python packages in nixpkgs and has knowledge of the issues we encounter classically with the Python packaging ecosystem wrt to native dependencies. I don’t represent the NixOS project or community.) I read the whole thread and found the remarks of @steve.dower very interesting. Firstly, I have to say we know that Nix ecosystem is probably a small subset of Python users and probably the one for which Python packaging works more or less the best in our closed garde...

Discussions on Python.org
@corbin @stargirl this might be true for NixOS, but using Nix on other systems (like Fedora, as I have), my experience is that it gets weird-and-broken pretty quick in ways that are confusing, and the community can be quick to point out your error is simply because you are not using NixOS.
@whack @stargirl Sure; along with NixOS, the container environment ("FROM nixos/nix") and Darwin are better-supported than classic LSB distros. Aside from #X11 and other system services, Nix is self-contained; however, Nix is not the goal, but part of the journey towards package-capability security.

@whack @corbin @stargirl

> my experience is that it gets weird-and-broken pretty quick in ways that are confusing,

This usually happens somewhere in the vicinity of loading drivers (e.g. libcuda or libGL), which currently is only accommodated by wrappers (nixglhost, nixGL) and, yes, sucks. It's hard to improve upon because of how other distributions mix userspace drivers and normal libraries, but there are some cursed ideas being talked about involving libcapsule...

@whack @corbin @stargirl

> community can be quick to point out your error is simply because you are not using NixOS.

They're wrong to set such a low bar, we'll silence them!

@nobody @whack @stargirl It's a fair critique, though, and one that spawns from the *same* memetic issue as in Python packaging, so it's worth examining.

Sure, a Nix package *should* only rely on being located in /nix/store next to its dependencies. But as you've listed, there's piles of ambient authorities and mutable state which also must be correctly configured in order for some packages to run.

This is *exactly* analogous to Python packages which aren't pure statically-imported Python modules. In a simpler world, we would deploy Python apps by tossing all Python modules into a single ZIP file, but because CPython allows extension modules and exposes a C API, a practical Python app must also depend on C runtime and C linkage.

This isn't hypothetical. RPython, the toolkit for building PyPy, is pure Python 2.7. My rpypkgs flake https://github.com/rpypkgs/rpypkgs "installs" RPython packages by unpacking all of them to a common build directory.
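
The "toss all Python modules into a single ZIP file" deployment mentioned above can be sketched with nothing but the standard library. This is a minimal illustration, not how rpypkgs works: the module names `bundle_demo_greeting` and `bundle_demo_app` and the helpers `build_bundle` and `import_from_bundle` are made up for the example.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path


def build_bundle(zip_path, modules):
    """Write each {module_name: source} pair into one ZIP archive."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name, source in modules.items():
            archive.writestr(f"{name}.py", source)


def import_from_bundle(zip_path, module_name):
    """Put the archive on sys.path and import a module out of it."""
    sys.path.insert(0, str(zip_path))
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Hypothetical pure-Python modules, invented for this sketch.
MODULES = {
    "bundle_demo_greeting": "def greet(name):\n    return 'hello ' + name\n",
    "bundle_demo_app": (
        "from bundle_demo_greeting import greet\n\n"
        "def main():\n    return greet('packaging')\n"
    ),
}

bundle = Path(tempfile.mkdtemp()) / "app.zip"
build_bundle(bundle, MODULES)
app = import_from_bundle(bundle, "bundle_demo_app")
print(app.main())  # hello packaging
```

This only works because both modules are pure Python. The standard zipimport machinery does not load compiled extension modules from an archive, which is where the C runtime and linkage problem comes back in.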

@stargirl I agree. The easier solution is to teach people TypeScript and npm, rewrite those libraries people care about, and leave Python in the history pile next to Perl.
@stargirl As much as I love Python, I hate the packaging "experience."
@stargirl I'm firmly in the Anaconda (miniforge, actually) camp. With libmamba thankyouverymuch 😉
@stargirl You just convinced me what I've suspected for years - I'm an academic / researcher.

@stargirl Rephrasing @corbin, I'd say that "Python Packaging" is actually solved. Mostly. Everything-pyproject.toml (maybe with PEP-725) is comprehensive for what's truly in scope for "python packaging". There will necessarily be some native stuff that is absolutely out of scope for "python packaging", in fact most of it. There will always have to be a "system layer" under "python packages".

Nixpkgs is one consumer that integrates PEP-compliant "python packages" with its "native" stuff

@stargirl @corbin Nobody asked for this opinion, but I think what needs solving instead is python imports. Specifically, first class multi-tenancy support which would remove the urge for writing things like https://discuss.python.org/t/allowing-multiple-versions-of-same-python-package-in-pythonpath/2219
@stargirl @corbin Also bootstrap. Bootstrapping python packages is still cursed but, unlike imports, not fundamentally deficient. Also hey speaking of web devs, their stuff is a lot more broken than python, I'm sure their stuff cannot be bootstrapped at all

@nobody @stargirl I'll note that this suffers from Zooko's triangle, a trilemma in nominal logic: https://en.wikipedia.org/wiki/Zooko%27s_triangle

Imagine every Python module were content-addressed somehow, and packages too. This only kicks the can down the road, because you'd probably rather import from a petname than from a SHA-256 hash, so somebody must now set up a petname-to-hash map, and all of the existing political issues of the Cheeseshop reappear upon the maintenance of that map. (This is why politics is an inevitable part of maintaining large ports trees too.)
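
A toy sketch of that thought experiment, assuming SHA-256 over the module source as the content address. The `Store` class, the `resolve` helper, and the two petname maps are hypothetical names for illustration, not an existing tool.

```python
import hashlib


def content_address(source: bytes) -> str:
    """Name a blob of module source by the SHA-256 of its bytes."""
    return hashlib.sha256(source).hexdigest()


class Store:
    """In-memory content-addressed store: hash -> source bytes."""

    def __init__(self):
        self._blobs = {}

    def add(self, source: bytes) -> str:
        digest = content_address(source)
        self._blobs[digest] = source
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


store = Store()
v1 = store.add(b"VERSION = 1\n")
v2 = store.add(b"VERSION = 2\n")

# The hashes are unambiguous; the human-friendly names are not.
# Each petname map is a policy decision somebody has to own.
alice_names = {"protobuf": v1}
bob_names = {"protobuf": v2}


def resolve(petnames, name):
    """Look up a petname, then fetch the content it points at."""
    return store.get(petnames[name])


print(resolve(alice_names, "protobuf"))  # b'VERSION = 1\n'
print(resolve(bob_names, "protobuf"))    # b'VERSION = 2\n'
```

The store itself needs no coordination. All of the disagreement lives in who gets to publish the map from `"protobuf"` to a hash.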

@corbin @stargirl Um to be clear, the intention would not be to make developers write `import from package_name_${hash}`, but to generate hints for the import system at build/install time
@corbin @stargirl Maybe I should elaborate: imagine you have a project that uses protobuf-python, and you want to use it together with tensorflow. Chances are, tensorflow's outdated protobuf does not work for you. Rather than trying to patch your project or patch tensorflow for compatibility with the older/newer protobuf, I'd want a solution where we instruct python to resolve `import protobuf` into one "package" when the import happens under `${prefix}/tensorflow`, and another package elsewhere

@corbin @stargirl There's two obvious challenges:

1) multi-tenancy support in python import cache (needs implementing),

2) potential symbol collision issues for native libraries (potentially solved by dlmopen, I suspect, but maybe not on windows and what do I know actually)
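
Challenge 1 is easy to reproduce: the import cache (`sys.modules`) is keyed by module name alone, so a second version with the same name is never loaded. A minimal sketch, where `tenant_demo_lib` and the helpers `make_version` and `load_as` are invented for the example. It covers pure-Python modules only and says nothing about challenge 2.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def make_version(root, version):
    """Create a directory holding one version of a hypothetical library."""
    directory = Path(root) / f"v{version}"
    directory.mkdir()
    (directory / "tenant_demo_lib.py").write_text(f"VERSION = {version}\n")
    return directory


root = tempfile.mkdtemp()
dir_v1 = make_version(root, 1)
dir_v2 = make_version(root, 2)

# First tenant imports version 1.
sys.path.insert(0, str(dir_v1))
first = importlib.import_module("tenant_demo_lib")

# Second tenant points sys.path at version 2 and imports the same name.
sys.path[0] = str(dir_v2)
importlib.invalidate_caches()
second = importlib.import_module("tenant_demo_lib")

# The cache wins: both tenants see version 1.
print(first.VERSION, second.VERSION, first is second)  # 1 1 True


def load_as(alias, path):
    """Load a file as a module under a different name, bypassing the cache."""
    spec = importlib.util.spec_from_file_location(alias, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


other = load_as("tenant_demo_lib_v2", dir_v2 / "tenant_demo_lib.py")
print(other.VERSION)  # 2
```

Loading under an alias works for a single leaf module, but anything inside it that does a plain `import tenant_demo_lib` still goes through the name-keyed cache, which is why this needs support in the import system itself.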

@corbin @stargirl Same for e.g. combining packages that use pydantic1 and pydantic2

@nobody @stargirl Sure. Those hints would be populated from a mapping either maintained via Cheeseshop or via local choices. Either way, there's politics going on; each individual mapping is a policy choice. In terms of Zooko's triangle, this would be the "centralized" option, which means that the import keyword would now functionally depend on whatever build/runtime configuration you've imagined.

Not that this is a bad thing! Check out PEP 302 for the hooks you'd need to prototype an actual implementation of your idea. Allen Short explained how to use PEP 302 here https://web.archive.org/web/20180411011138/http://washort.twistedmatrix.com/2011/01/introducing-exocet.html and I used it to add hot-reload plugin functionality to a Python Minecraft server.
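
A rough sketch of what such a hook can look like on current Python. PEP 302 originally specified `find_module`/`load_module`; modern versions use `find_spec` (PEP 451) on the same `sys.meta_path` mechanism, which is what this uses. The `HintFinder` class, the `hint_demo_lib` name, and the hint map are hypothetical, and this is not how Exocet works.

```python
import importlib
import importlib.abc
import importlib.util
import sys
import tempfile
from pathlib import Path


class HintFinder(importlib.abc.MetaPathFinder):
    """Resolve selected module names from an explicit name -> file map."""

    def __init__(self, hints):
        self.hints = dict(hints)

    def find_spec(self, fullname, path=None, target=None):
        location = self.hints.get(fullname)
        if location is None:
            return None  # not ours; let the normal finders handle it
        return importlib.util.spec_from_file_location(fullname, str(location))


# A file whose name has nothing to do with the import name.
pinned = Path(tempfile.mkdtemp()) / "some_pinned_build.py"
pinned.write_text("VERSION = 'pinned-by-hint'\n")

# The "local choices" mapping: this install decides what the name means.
sys.meta_path.insert(0, HintFinder({"hint_demo_lib": pinned}))

module = importlib.import_module("hint_demo_lib")
print(module.VERSION)  # pinned-by-hint
```

Note that `find_spec` is told which name is being imported but not which module is asking for it, so resolving the same name differently per importer takes more than this.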

@corbin @stargirl Thanks, I'll read that. I've no idea what Cheeseshop is though, but if we had this multi-tenancy support I'd just go and implement a setup hook for Nixpkgs to generate these hints. I wouldn't really care for much beyond that

@corbin @stargirl Nice, I'm not sure how I missed this. I suppose the PEP-302 example you shared is exactly the "local choices" implementation. I also now noticed this: https://github.com/mitsuhiko/multiversion. Similarly local.

> means that the import keyword would now functionally depend on whatever build/runtime configuration you've imagined.

Well it does today too

@corbin @nobody @stargirl petname as display only solves this problem (among others)
@stargirl Keanu Reeves???
@neutrinoceros @stargirl i believe the person referred to here is the original author of requests. (maybe you were joking, i don't know, but just in case, you can follow that trail if interested).
@tshirtman thanks ! I was joking, but I also had no idea who it really was.
@neutrinoceros @stargirl probably Kenneth Reitz, author of Requests and Pipenv, among others.
@neutrinoceros @stargirl I found a short summary on Hacker News, see comment by legostormtrooper: https://news.ycombinator.com/item?id=18612590
Follow on to the article, and its link to the heated May thread for more than anyone probably really needs to know anymore, which I surmise is stargirl's "picked a fight with Reddit" reference.

@stargirl fwiw, that packaging guide saved my sanity. Even if the landscape as a whole remains significantly fractured, that place is a good thing. Thanks for what you've contributed.

@stargirl an aside, on "volunteering"
.. I ran across a piece that @webmink has cooking.

I'm more familiar with the ASF philosophy and practice, and what he's written rings true for me. Curious if any of it rings true for PSF folk.

https://the.webm.ink/on-volunteering

@msw @stargirl @webmink thanks for the link. I found the follow up article on the hazardous framing of open source in monetary terms more resonant/powerful (https://the.webm.ink/getting-back-to-a-social-frame)