Today @Nature, by me: why you might want to do your computing work inside computational environments (e.g., conda, renv). With @ctb @fertiglab @minecr @benmarwick et al. https://www.nature.com/articles/d41586-023-01469-0
The sleight-of-hand trick that can simplify scientific computing

Computational environments and the tools to manage them can help researchers to deliver code that is reproducible, documented and shareable.

@jperkel @khinsen @benmarwick @minecr @fertiglab @ctb @Nature Fully agree with your conclusion: software environments are all too often overlooked as part of reproducibility efforts!

@jperkel @benmarwick @minecr @fertiglab @ctb @Nature You write: “Tools written in languages such as C, Perl and Fortran can be hard to encapsulate into environments”.

In https://www.nature.com/articles/d41586-020-02462-7 you mentioned our work with #Guix, which fits exactly this space: reproducible software environments, independent of the language, with provenance tracking.

@khinsen

Challenge to scientists: does your ten-year-old code still run?

Missing documentation and obsolete environments force participants in the Ten Years Reproducibility Challenge to get creative.

@jperkel
"Manage your environments" is definitely good advice. But to share environments with others, or to archive them for future use, the environments themselves must be reproducible. And to ensure that code runs reproducibly inside an environment, the latter must be containerized.

Conda environments are neither reproducible nor containerized. The price to pay for cross-platform and no-root. A compromise. Would have been nice to point that out!

@Nature @ctb @fertiglab @minecr @benmarwick

@jperkel

You can have reproducible and containerized environments, using #Guix. The price to pay: #Linux only, and the manager (Guix) can only be installed by an administrator. It's a different compromise.

@Nature @ctb @fertiglab @minecr @benmarwick

@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick (it does not have to be this way: cross-platform is within reach for either Guix or Nix, and the same goes for no-root in certain cases, which could be most cases.)
@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick (well, maybe for Guix cross-platform is harder because of GNU constraints.)

@raito

No-root is in principle possible for #Nix and #Guix under #Linux, but not available today.

Cross-platform reproducibility is not possible, period. You can run #Nix under #macOS as well as under #Linux, but you get reproducibility only within each platform.

Moreover, reproducibility with #Nix under #macOS is limited because #Nix depends on code it cannot control (managed by Apple).

@jperkel @Nature @ctb @fertiglab @minecr @benmarwick

@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick no-root is available for Nix, through unprivileged user namespaces or, if you really really want, through a disabled sandbox/alternative store directory *or* PRoot.
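
Whether the unprivileged-user-namespace route is open on a given Linux machine can be probed before trying a rootless install. The following is a minimal sketch, not part of Nix or Guix. It assumes the kernel exposes `/proc/sys/user/max_user_namespaces`, and that some distributions additionally expose `/proc/sys/kernel/unprivileged_userns_clone`. The helper names `read_int` and `interpret` are hypothetical.

```python
from pathlib import Path


def read_int(path):
    """Return the integer stored in a sysctl file, or None if it is missing or unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def interpret(max_user_namespaces, unprivileged_userns_clone):
    """Turn the two sysctl values into a rough verdict (a heuristic, not a guarantee)."""
    if max_user_namespaces is None:
        return "unknown"  # not Linux, or a kernel without this sysctl
    if max_user_namespaces == 0:
        return "disabled"  # user namespaces switched off entirely
    if unprivileged_userns_clone == 0:
        return "disabled"  # distribution-specific switch is off
    return "probably enabled"


if __name__ == "__main__":
    verdict = interpret(
        read_int("/proc/sys/user/max_user_namespaces"),
        read_int("/proc/sys/kernel/unprivileged_userns_clone"),
    )
    print("Unprivileged user namespaces:", verdict)
```

A "probably enabled" verdict only says the kernel switches are on; site policy or a security module may still block namespace creation.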

@raito

Good to know, I wasn't aware that this was already implemented in #Nix!

@jperkel @Nature @ctb @fertiglab @minecr @benmarwick

@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick yeah a lot of people in HPC contexts use it because well HPC :-)
@khinsen @raito I've got a short guide on rootless #nix for #HPC here: https://www.jboy.space/blog/nix-on-hpc.html
Using the Nix Package Manager on an HPC Cluster

HPC clusters are great. They provide lots of CPUs and vast pools of memory that make computationally intensive analyses possible. Without them, I wouldn’t…

John D. Boy

@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick cross platform bit-to-bit reproducibility is a different beast indeed, I misunderstood what you meant by that (i.e. running the software on multiple platforms).

Though, what you care about is the leaves, not the intermediary nodes. It could be argued that every instance of non-reproducibility is a *bug*, because only arch-specific and platform-specific code should change, not the end result. But I agree, that's almost never the case.

@raito

That's a good point and comes down to the question of why you want reproducibility. For me, it's debuggability. I want a platform that guarantees the same results for 100% identical code. That's a condition for debugging the problems you mention, which one could indeed call bugs.

@jperkel @Nature @ctb @fertiglab @minecr @benmarwick
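
The "same results for identical code" condition can be checked mechanically by comparing the bytes two runs produce. This is only an illustrative sketch: the helper names `digest` and `first_difference` are hypothetical, and it assumes the outputs are small enough to hold in memory.

```python
import hashlib


def digest(data: bytes) -> str:
    """SHA-256 hex digest of an output, for a quick equality check."""
    return hashlib.sha256(data).hexdigest()


def first_difference(a: bytes, b: bytes):
    """Return the index of the first differing byte, or None if a and b are identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    if len(a) != len(b):
        # One output is a strict prefix of the other.
        return min(len(a), len(b))
    return None


if __name__ == "__main__":
    run_1 = b"result: 0.6\n"
    run_2 = b"result: 0.6000000000000001\n"
    print("identical:", digest(run_1) == digest(run_2))
    print("first differing byte:", first_difference(run_1, run_2))
```

Equal digests say the runs agree bit for bit. When they differ, the offset of the first differing byte is a starting point for debugging.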

@raito

BTW, no cross-platform reproducibility also means no reproducibility across processors. You can run #Guix on Intel and ARM chips, but you can't expect to get reproducibility across hardware platforms.

Basically, you build a software stack layer by layer, starting from the hardware. You can't swap foundations and be certain to get the same results.

@jperkel @Nature @ctb @fertiglab @minecr @benmarwick
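
One plausible way in which swapping the foundation changes results (an illustration, not something stated in the thread) is floating-point arithmetic. Addition of floats is not associative, so a compiler or processor that groups the same operations differently can change the last bits. A minimal sketch:

```python
# The same three numbers, summed with two different groupings.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two sums differ in the last bit, although the inputs are identical.
print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

Both results are correct to within rounding error, but they are not bit-for-bit identical, which is the kind of difference strict reproducibility has to rule out.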

@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick you seem to be talking about execution.

There's no reason that native compilation != cross compilation.

That's an (almost universal) bug among compilers and interpreters.

@khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick wrt macOS, indeed. Impurity is higher but does not prevent attempts.

I would encourage going further on those platforms using the appropriate tools (kernel extensions, etc.)

And I have seen we have some people resuming their work on Nix for Windows.

@khinsen @benmarwick @minecr @fertiglab @ctb @Nature @jperkel @raito On requiring ‘root’ privileges for #reproducibility: https://hpc.guix.info/blog/2017/09/reproducibility-and-root-privileges/

Linux unprivileged user namespaces are more widely available today than they were when that blog post was written, but they are typically still lacking on #HPC clusters.

Guix-HPC — Reproducibility vs. root privileges

@khinsen @raito @jperkel @Nature @ctb @fertiglab @minecr @benmarwick It's quite something. None of the Nix+Docker tutorials work on a Mac.

So what's the point of this thing?

@alper

What exactly do you mean by "this thing"? Nix? Docker? The tutorials? The Mac?

@raito @jperkel @Nature @ctb @fertiglab @minecr @benmarwick

@khinsen @raito @jperkel @Nature @ctb @fertiglab @minecr @benmarwick I mean the use case for Nix on Mac is not great if every tutorial breaks for me at the point where they build a Docker container.
@alper @khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick While I imagine you did indeed have a bad experience, I know that a lot of people and companies are running Nix+Docker, so please don't hesitate to report the bugs or your situation so we can look into it. It's the only real way to make this thing have a point for you too.

@raito @khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick Asking on the Discourse about cross-compilation I'm concluding that it's mostly theoretical…

And on my Mac I'll be stuck with two cross-compiles that both need to work:
- Compile from macOS to Linux container running on my M1
- Compile from macOS to Linux container running on target architecture (x86 or something)

@alper @khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick I don't really use Docker, but can you even run a cross-compiled Docker container on a different arch? What's your use case? Building golden containers for many arches?

@raito @khinsen @jperkel @Nature @ctb @fertiglab @minecr @benmarwick At this point I'd kill for a fat binary.

The job to be done is: I want to run a command in the Rust project directory on my Mac and then have something that I can deploy somewhere.

Fair point, Konrad, thank you -- these tools definitely represent a compromise. Conda and the like aren't perfect, but they are a relatively easy lift, making them accessible to non-experts.
@khinsen @Nature @ctb @fertiglab @minecr @benmarwick

@jperkel

Indeed. It's always a matter of priorities and compromise.

A problem today is that Conda is frequently described as the only or the obviously best solution. And it is often presented as guaranteeing reproducibility, which is not true.

One piece of advice I give to students is not to use complex computational tools until they understand their limitations. That's what I find missing in your article.

@Nature @ctb @fertiglab @minecr @benmarwick