The C-Shaped Hole in Package Management

System package managers and language package managers are solving different problems that happen to overlap in the middle.

Andrew Nesbitt
@andrewnez as a data point, opam does track 'system dependencies' and helps automate their installation as part of the OCaml package installation; e.g. https://github.com/ocaml/opam-repository/blob/master/packages/conf-cairo/conf-cairo.1/opam
@andrewnez Have PEPs 725 and 804 crossed your radar in the past couple of years? (they're currently stalled as far as I know, but they're Python's most recent attempt at closing the external dependency gap without relying solely on bundling)
@ancoghlan yeah I've referenced them a few times and studied them as part of this working group: https://github.com/chaoss/wg-package-metadata
@andrewnez This is good and I'm stoked you're diving into this particular rabbit hole. More awareness will help build bridges and resolve the issues!
@andrewnez the title feels weird. If Python packaging Rust dependencies has the same problem, why is it the "C-shaped" hole? Isn't it just an "other ecosystems" hole?
@equinox I also struggled picking a title, but the C bit was much bigger than all the others, even though the problem is a general one

@andrewnez I'm German, so of course I can conjure a word for this 😂

"Tellerrandproblem"

('edge of the plate problem' or 'plate rim problem')

Or maybe the more scary sounding "Tellerrandklippe" ('edge of the plate cliff')

But yeah I can see how this is hard to slap a title on.

[Tellerrand/"edge of the plate" is commonly used in German to express the boundary of applicability of something, some system, your knowledge, etc.]

@andrewnez I think this touches the core issue:

> None of these mechanisms really declare
> C dependencies in a machine-readable way.

C libraries can't even express their own *API*/*ABI* in a machine-readable way¹, so no surprise that the bigger steps are also missing!

C people have been quite content with the status quo, as it's a crucial part of upholding their ABI monopoly.

¹ Outside of your package manager shipping with its own C compiler that parses C header files.

@andrewnez Syft also has the problem of trying to figure out what a binary blob of stuff is. This feels like maybe it's time for something bigger

We look for various strings in the binary today (YARA would be even better, but there isn't a nice YARA library for Go that we could find)

https://github.com/anchore/syft/blob/main/syft/pkg/cataloger/binary/classifiers.go#L18
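
For readers unfamiliar with the approach, here is a minimal sketch of string-based binary classification: scan a blob for version banners that libraries tend to embed. The patterns, the `PATTERNS` table, and the `classify` helper are hypothetical illustrations, not Syft's actual classifier rules.

```python
import re

# Hypothetical patterns for illustration only; these are NOT Syft's real rules.
# Each regex captures the version from a banner string embedded in the binary.
PATTERNS = {
    "openssl": re.compile(rb"OpenSSL (\d+\.\d+\.\d+[a-z]?)"),
    "zlib": re.compile(rb"deflate (\d+\.\d+(?:\.\d+)*) Copyright"),
}


def classify(blob: bytes) -> dict[str, str]:
    """Return {library: version} for every pattern found in the blob."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(blob)
        if match:
            found[name] = match.group(1).decode("ascii")
    return found


# A fake "binary": raw bytes with two banner strings between NUL bytes.
sample = (
    b"\x00\x01OpenSSL 3.0.13 30 Jan 2024\x00"
    b" deflate 1.3.1 Copyright 1995-2024 Jean-loup Gailly\x00"
)
print(classify(sample))  # {'openssl': '3.0.13', 'zlib': '1.3.1'}
```

This only works when the library leaves a recognizable banner behind, which is part of why it feels like a stopgap rather than real metadata.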

@andrewnez > Same library, four names, no mapping between them.
Do you know about repology?
https://repology.org/project/openssl/versions
There's also an API/database: https://repology.org/api/v1
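
As a rough sketch of how such data could become a name mapping: group per-repository package records by repository. The records below are hand-written examples, and the field names (`repo`, `srcname`, `binname`) are an assumption about the shape of Repology's per-project JSON, so check the API documentation before relying on them.

```python
from collections import defaultdict

# Hand-written sample records; field names are ASSUMED to mirror Repology's
# per-project JSON. Nothing here was fetched from the live API.
RECORDS = [
    {"repo": "debian_12", "srcname": "openssl", "binname": "libssl-dev"},
    {"repo": "debian_12", "srcname": "openssl", "binname": "openssl"},
    {"repo": "fedora_40", "srcname": "openssl", "binname": "openssl-devel"},
    {"repo": "alpine_3_20", "srcname": "openssl", "binname": "openssl-dev"},
    {"repo": "homebrew", "srcname": "openssl@3"},
]


def names_by_repo(records):
    """Map each repository to the sorted package names it uses for a project."""
    grouped = defaultdict(set)
    for record in records:
        # Prefer the binary package name; fall back to the source name.
        name = record.get("binname") or record.get("srcname")
        if name:
            grouped[record["repo"]].add(name)
    return {repo: sorted(names) for repo, names in grouped.items()}


for repo, names in names_by_repo(RECORDS).items():
    print(f"{repo}: {', '.join(names)}")
```

The result is a lookup from "which distro am I on" to "what is this library called here", which is the mapping the quoted sentence says is missing.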

@andrewnez Symbol names of a library are not sufficient to guarantee ABI compatibility or to express dependency info: the types may differ, and they are an important part of compatibility. Did you review https://sourceware.org/libabigail/manual/libabigail-overview.html before writing your own library symbol parser? Several GNU C libraries use it to guard against ABI breaks; compare libidn2 and libtasn1, for example. It produces stable, machine-readable XML with library information. Or am I missing what you want to achieve?
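
To make the point about types concrete, here is a toy sketch in which two versions of a library export identical symbol names while one signature changes. The `V1`/`V2` tables and both helper functions are hypothetical; this is not libabigail's data model or output format.

```python
# Hypothetical ABI descriptions: symbol -> (return type, parameter types).
# Invented for illustration; not libabigail's XML or API.
V1 = {
    "parse": ("int", ("const char *", "int")),
    "free_ctx": ("void", ("ctx *",)),
}
V2 = {
    "parse": ("int", ("const char *", "size_t")),  # second parameter widened
    "free_ctx": ("void", ("ctx *",)),
}


def names_match(old, new):
    """Name-only check: every old symbol still exists in the new version."""
    return set(old) <= set(new)


def abi_breaks(old, new):
    """Type-aware check: report removed symbols and changed signatures."""
    breaks = []
    for symbol, signature in sorted(old.items()):
        if symbol not in new:
            breaks.append(f"{symbol}: removed")
        elif new[symbol] != signature:
            breaks.append(f"{symbol}: signature changed")
    return breaks


print(names_match(V1, V2))  # True
print(abi_breaks(V1, V2))   # ['parse: signature changed']
```

The name-only check passes even though callers compiled against the old `parse` would now pass an argument of the wrong type, which is the gap a type-aware comparison closes.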