@jperkel @benmarwick @minecr @fertiglab @ctb @Nature You write: “Tools written in languages such as C, Perl and Fortran can be hard to encapsulate into environments”.
In https://www.nature.com/articles/d41586-020-02462-7 you mentioned our work with #Guix, which fits exactly this space: reproducible software environments, independent of the language, with provenance tracking.
@jperkel
"Manage your environments" is definitely good advice. But to share environments with others, or to archive them for future use, the environments themselves must be reproducible. And to ensure that code runs reproducibly inside an environment, the latter must be containerized.
Conda environments are neither reproducible nor containerized. That's the price to pay for being cross-platform and not requiring root. A compromise. Would have been nice to point that out!
No-root is in principle possible for #Nix and #Guix under #Linux, but not available today.
Cross-platform reproducibility is not possible, period. You can run #Nix under #macOS as well as under #Linux, but you get reproducibility only within each platform.
Moreover, reproducibility with #Nix under #macOS is limited because #Nix depends on code it cannot control (managed by Apple).
Good to know, I wasn't aware that this was already implemented in #Nix!
@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick cross-platform bit-for-bit reproducibility is a different beast indeed, I misunderstood what you meant by that (I thought you meant running the software on multiple platforms).
Though, what you care about is the leaves, not the intermediary nodes. It could be argued that every case of non-reproducibility is a *bug*, because only arch-specific and platform-specific code should change, not the end result. But I agree, that's almost never the case.
That's a good point and comes down to the question of why you want reproducibility. For me, it's debuggability. I want a platform that guarantees the same results for 100% identical code. That's a condition for debugging the problems you mention, which one could indeed call bugs.
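One way to make "same results for 100% identical code" checkable is to compare outputs bit for bit. A minimal sketch, assuming the results are written to files; the function names `sha256_of` and `bitwise_identical` are made up for illustration:

```python
import hashlib


def sha256_of(path):
    """Return the SHA-256 hex digest of the file's raw bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large result files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bitwise_identical(path_a, path_b):
    """True if the two files have the same digest, i.e. the same bytes."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Run the same code twice in the same environment and call `bitwise_identical` on the two output files. A `False` says that something outside the code changed, which is exactly the debugging signal described above.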
BTW, no cross-platform reproducibility also means no reproducibility across processors. You can run #Guix on Intel and ARM chips, but you can't expect to get reproducibility across hardware platforms.
Basically, you build a software stack layer by layer, starting from the hardware. You can't swap foundations and be certain to get the same results.
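A small illustration of why the foundations matter for numerical results: floating-point addition is not associative, so anything in the stack that changes the order of operations can change the last bits of a result. This sketch only reorders a sum by hand. That a different compiler, math library or processor would do the equivalent is an assumption here, not something the sketch demonstrates.

```python
def sum_in_order(values):
    """Add the values strictly left to right, as a naive loop would."""
    total = 0.0
    for value in values:
        total += value
    return total


values = [0.1, 0.2, 0.3]

# Same numbers, two evaluation orders.
forward = sum_in_order(values)             # (0.1 + 0.2) + 0.3
backward = sum_in_order(reversed(values))  # (0.3 + 0.2) + 0.1

# The two sums differ in the last bit even though the inputs are identical.
print(forward == backward, forward, backward)
```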
@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick you seem to be talking about execution.
There's no reason native compilation should give a different result from cross-compilation.
When it does, that's an (almost universal) bug among compilers and interpreters.
@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick wrt macOS, indeed. Impurity is higher but does not prevent attempts.
I would encourage going further on those platforms using the appropriate tools (kernel extensions, etc.)
And I have seen we have some people resuming their work on Nix for Windows.
@khinsen @benmarwick @minecr @fertiglab @ctb @Nature @jperkel @raito On requiring ‘root’ privileges for #reproducibility: https://hpc.guix.info/blog/2017/09/reproducibility-and-root-privileges/
Linux unprivileged user namespaces are more widely available today than they were when that blog post was written but still lacking typically on #HPC clusters.
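For anyone wanting to check their own cluster, a rough heuristic sketch. It assumes the usual sysctl files under /proc: `user/max_user_namespaces` on mainline kernels, and `kernel/unprivileged_userns_clone`, which only exists on some distributions (Debian-style kernels). The function name and the `proc_root` parameter are made up for illustration. A `True` only means these knobs do not forbid it; other site policies may still block user namespaces.

```python
from pathlib import Path


def userns_looks_enabled(proc_root="/proc"):
    """Heuristic: do the sysctl knobs allow unprivileged user namespaces?"""
    sysctl = Path(proc_root) / "sys"

    # Mainline knob: a limit of 0 disables user namespaces entirely.
    try:
        limit = int((sysctl / "user" / "max_user_namespaces").read_text())
    except (OSError, ValueError):
        return False  # cannot confirm (old kernel, not Linux, unreadable)
    if limit == 0:
        return False

    # Distribution-specific knob: absent on many systems, so a missing
    # file is not treated as a failure.
    try:
        clone = int((sysctl / "kernel" / "unprivileged_userns_clone").read_text())
    except (OSError, ValueError):
        return True
    return clone != 0
```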
@khinsen @raito @jperkel @Nature @ctb @fertiglab @minecr @benmarwick It's quite something. None of the Nix+Docker tutorials work on a Mac.
So what's the point of this thing?
What exactly do you mean by "this thing"? Nix? Docker? The tutorials? The Mac?
@raito @khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick Asking on the Discourse about cross-compilation I'm concluding that it's mostly theoretical…
And on my Mac I'll be stuck with two cross-compiles that both need to work:
- Compile from macOS to Linux container running on my M1
- Compile from macOS to Linux container running on target architecture (x86 or something)
@raito @khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick At this point I'd kill for a fat binary.
The job to be done is: I want to run a command in the Rust project directory on my Mac and then have something that I can deploy somewhere.
Indeed. It's always a matter of priorities and compromise.
A problem today is that Conda is frequently described as the only or the obviously best solution. And it is often presented as guaranteeing reproducibility, which is not true.
One piece of advice I give to students is not to use complex computational tools until they understand their limitations. That's what I find missing in your article.