
nopersonalspace

Kubernetes? docker-compose? How should I organize my container services in 2024?

https://lemmy.world/post/10510362


Currently, I run Unraid and have all of my services set up there as Docker containers. While this is nice and easy to set up initially, it has some major downsides:

- It’s fragile. Unraid is prone to bugs/crashes with Docker that take down my containers. It’s also not resilient, so when things break I have to log in and fiddle.
- It’s mutable. I can’t use any infrastructure-as-code tools like Terraform, and configuration sort of just exists in the UI. I can’t really roll back or recover easily.
- It’s single-node. Everything is tied to my one big server that runs the NAS, but I’d rather have the NAS as a separate, fairly low-power appliance and then have a separate machine to handle things like VMs and containers.

So I’m looking ahead and thinking about what the next iteration of my homelab will look like. While I like Unraid for the storage stuff, I’m a little tired of wrangling it into a container orchestrator and hypervisor, and I think this year I’ll split that job out to a dedicated machine.

I’m comfortable with, and in fact prefer, IaC over fancy UIs, and so would love to be able to use Terraform or Pulumi or something like that. I would prefer something multi-node, as I want to be able to tie multiple machines together. And I want something that is fault-tolerant, as I host services for friends and family that currently require a lot of manual intervention to fix when they go down.

So the question is: how do you all do this? Kubernetes, docker-compose, HashiCorp Nomad? Do you run k3s, Harvester, or what? I’d love to get an idea of what people are doing and why, so I can get some ideas as to what I might do.

sabreW4K3 Jan 9, 2024

I can’t remember what I was watching, but they said Kubernetes is designed for such a large scale that the only reason most people have heard of it is that some product manager asked what Google uses and then demanded the same thing to replicate Google’s success. Hobbyists followed, and now a bunch of people are using something that’s poorly optimized for such small-scale systems.

CubitOom Jan 10, 2024

You should try out all the options you listed and the other recommendations and find what works best for you.

I personally use Kubernetes. It can be overwhelming, but if you’re willing to learn some new jargon then try a managed Kubernetes cluster, like AKS or DigitalOcean Kubernetes. I would avoid managing a Kubernetes cluster yourself.

Kubernetes gets a lot of flak for being overly complicated, but what that statement overlooks is all the things that Kubernetes does for you.

If you can spin up Kubernetes with cert-manager, external-dns, and an ingress controller like Istio, then you’ve got a whole automated data center for your Docker containers.

Toribor Jan 10, 2024

In my opinion trying to set up a highly available fault tolerant homelab adds a large amount of unnecessary complexity without an equivalent benefit. It’s good to have redundancy for essential services like DNS, but otherwise I think it’s better to focus on a robust backup and restore process so that if anything goes wrong you can just restore from a backup or start containers on another node.

I configure and deploy all my applications with Ansible. You can programmatically create config files, pass secrets, build or start containers, cycle containers automatically after config changes, basically everything you could need.

Sure it would be neat if services could fail over automatically but things only ever tend to break when I’m making changes anyway.

Lem453 Jan 10, 2024

This. I used to have a Kubernetes setup, but how much redundancy can you really have at home? Do you have a generator? Multiple Internet lines?

The fact is most hardware is highly reliable. Having good backups to restore from is all you need and you gain a huge improvement in simplicity which adds reliability in and of itself.

CubitOom Jan 10, 2024

I would say that if you are going to host it at home then Kubernetes is more complex. Bare-metal Kubernetes control plane management has some pitfalls. But if you were to use a cloud provider like Linode or DigitalOcean and use their Kubernetes service, then the only real extra complexity is learning how to manage Kubernetes, which is minimal.

There is a decent hardware investment needed to run kubernetes if you want it to be fully HA (which I would argue means it needs to be a minimum of 2 clusters of 3 nodes each on different continents) but you could run a single node cluster with autoscaling at a cloud provider if you don’t need HA. I will say it’s nice not to have to worry about a service failing periodically as it will just transfer to another node in a few seconds automatically.

Decronym Jan 10, 2024

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

Fewer Letters    More Letters
DNS              Domain Name Service/System
Git              Popular version control system, primarily for code
NAS              Network-Attached Storage
k8s              Kubernetes container management package

[Thread #417 for this sub, first seen 10th Jan 2024, 04:15]


Samsy Jan 10, 2024

I used to just organise my docker-compose containers without any frontend. But then I discovered CasaOS, which makes things pretty simple. An app store and an SMB-shared file manager gave me a really good workflow. Things that aren’t in the app store can be handled outside of Casa, too.

PS. But never make the mistake of integrating the containers handled outside of Casa into it; this messes things up.

forwardvoid Jan 10, 2024

Portainer + caddy + watchtower, this will give you the benefits of containers without the complexity of Kubernetes. As someone who professionally works with Kubernetes, I agree with what other people have said here: “only run it if you want to learn it for professional use”.

Portainer is a friendly UI for running containers. It supports docker compose as well. It helps with observability and ops. Caddy is an easy proxy with automatic Let’s Encrypt support. Watchtower will update and restart your containers if there’s an update.

corsicanguppy Jan 10, 2024

First, hire a team of energetic full-time container bros. Half of them will help architect your setup, and other half will focus entirely on supporting the container cult.

forwardvoid Jan 10, 2024

Containers are bad hmmkay… cause… cause… they’re bad… hmmkay

FooBarrington Jan 10, 2024

I am happy with my simple docker-compose setup - one root folder with one subfolder per project containing the compose file and any configuration mounted into the container. Traefik automatically exposes all services I want under a well-known URL using a single line in each compose file. Watchtower updates the containers.

This has been running stable for over two years with probably 2-3 reboots in between. If my current NUC ever breaks I’ll set it up again using Podman instead of Docker, but aside from that I couldn’t be happier!

vegetaaaaaaa Jan 10, 2024

Podman pods + systemd units to manage pods lifecycle. Ansible to deploy the base OS requirements, the ancillary services (SSH, backups, monitoring…), and the pods/containers/services themselves.

RedFox Jan 11, 2024

I really enjoy these types of conversations; I learn a lot.

Since you’ve gotten lots of good advice on container manager, I’ll encourage your desire for IaC/DevOps CM, etc.

I believe all the leading CM choices support what you’re wanting to do. I can’t guide you on which one to choose, but just browse through the options or functions your favorite offers for the Kx container solution you go with.

I use SALT because of Security Onion, an open source IDS. I have all my *nix systems being babysat by SALT, and can have a new x-arr media server, NGINX, blog, etc. running in the time it takes to deploy the template (I use vSphere) and have salt apply the desired state. Back up and restore a mount folder, np. IaC is only limited by your imagination. I have salt also specifying all the containers I have running, defining the config files, etc. Basically a poor man’s/simpleton’s kub.

I suspect you already know this, but if there isn’t a module that directly does what you want like running SQL specific functions, you can just have it run programmatic CLI files on the host, or in the container for you.

I am in the process of moving my IaC code from manager file system to Gitlab. I imagine you’d do this from jump street. Have fun.

nico Jan 11, 2024

I see no one else has mentioned my stack, so I suggest:

Nomad for managing containers if you want something high availability. Essentially the same as k8s but much much much simpler to deploy, learn, and maintain. Perfect for homelabs imo. Most of the concepts of Nomad translate well to k8s if you do want to learn it later. It integrates really well with Terraform too if you are also hoping to learn that, but it’s not a requirement.

NixOS for managing the bare metal. It’s a lot more work to learn than, say, Debian, but it is just as stable, and all configuration will be defined as code, down to the bootloader config (no bash scripts!). This makes it super robust. You can also deploy it remotely. Once you grow beyond a handful of nodes it’s important to use a configuration management tool, and Nix has been by far my favourite so far.

If you really want everything to be infra-as-code, you can manage cloud providers via Terraform too.

For networking I use wireguard, and configure it with NixOS. Specifically, I have a mesh network where every node can reach every node without extra hops. This is a requirement if you don’t want a single point of failure (hub and spoke) to disconnect your entire cluster.
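To illustrate why a full mesh avoids the single point of failure that a hub-and-spoke layout has, here is a minimal Python sketch. It is an illustration only, not the configuration from the linked repo: the node names and the helper functions `full_mesh`, `hub_and_spoke` and `reachable` are hypothetical, and real WireGuard peers would also need keys, endpoints and allowed IPs.

```python
def full_mesh(nodes):
    """Every node lists every other node as a direct peer."""
    return {n: {m for m in nodes if m != n} for n in nodes}


def hub_and_spoke(nodes, hub):
    """The hub peers with everyone; every other node peers only with the hub."""
    return {
        n: ({m for m in nodes if m != hub} if n == hub else {hub})
        for n in nodes
    }


def reachable(peers, start, down=frozenset()):
    """Nodes reachable from `start` when the nodes in `down` are offline."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for nxt in peers[current]:
            if nxt not in seen and nxt not in down:
                seen.add(nxt)
                stack.append(nxt)
    return seen


# Hypothetical 7-node cluster
nodes = [f"node{i}" for i in range(1, 8)]
mesh = full_mesh(nodes)
star = hub_and_spoke(nodes, hub="node1")

# With node1 offline, the mesh stays connected but the spokes are isolated.
mesh_view = reachable(mesh, "node2", down={"node1"})
star_view = reachable(star, "node2", down={"node1"})
```

In the mesh, losing any one node leaves the remaining six able to reach each other directly. In the hub-and-spoke layout, losing the hub leaves every spoke alone.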

Everything in my setup is defined ‘as-code’, immutable, and multi-node (I have 7 machines) which seems to be what you want, from what you say in your post. I’ll leave my repo here, and I’m happy to answer questions!

–

My opinions on the alternatives:

Docker compose is great but doesn’t scale if you want high availability (i.e., have a container be rescheduled on node failure). If you don’t want high availability, anything more than Docker might be overkill.

Ansible and Puppet are alright but are super stateful, and require scripting. If you want immutability you will love Nix/NixOS.

GitHub - Cottand/selfhosted: My home-lab setup, a cluster of 7 servers running 50-70 containers


johntash Jan 12, 2024

Hey, your stack is pretty similar to mine. One thing I recently started testing is Seaweedfs. I saw it listed in your repo too, how are you liking it so far? And do you use it on all of your nodes?

nico Jan 12, 2024

I struggled a bit to get it up and running well, but now I am happy with it. It’s not too hard to deploy (at least easier than the alternatives), it has CSI which for me was big, and it has erasure coding. The dev that maintains it (yes, the one dev) is very responsive.

It has trade-offs, so whether I recommend it depends on your needs. Backing store for stateful workloads like postgres DBs? Absolutely not. Large S3 store (with an option for filesystem mount) for storing lots of files? Yes! In that regard it’s good for stuff like Lemmy’s pictrs or immich. I use it as my own Google Drive. You can easily replicate in your own cluster, or back it up to an external cloud provider. You can mount it via FUSE on your personal machine too.

Feel free to browse through my setup - if you have specific questions I am happy to answer them.

johntash Jan 13, 2024

Thanks! I’ll do some testing over the weekend and see how it goes.

While I’d love to be able to use it for postgres, I figured that wouldn’t work out well so probably won’t try it any time soon. I do have several apps that use sqlite databases though, do you think those would have any issues? e.g. trilium, ntfy, ghost

The main downside to most of the distributed/clustered storage that I’ve tried is they always seem to corrupt sqlite db files due to not supporting locking or some other posix feature. Reading through some older github issues, it looks like that is something the dev of seaweedfs fixed hopefully.

nico Jan 13, 2024

The problem with using seaweedfs to back your DBs is more about the filesystem itself than its implementation of POSIX features. When you are writing to a file and the connection to seaweedfs breaks (container restart, wifi, you name it), you might end up with a half-written file. If you upload pictures, this is unlikely, but DBs usually do several writes per second, so it is more likely that one of those gets interrupted. In my case, my grafana sqlite DB would get corrupted every other week.

What I recommend is using DBs natively in your node’s filesystem, and backing them up to seaweedfs periodically instead. That way your DBs ‘work’ but you can get them running again, and the backup is replicated in the distributed filesystem.

johntash Jan 13, 2024

What I do right now is I have a rclone sidecar container that uploads files in a directory every few seconds, and I also have another init sidecar that runs before the main application and downloads those files (incl sqlite dbs) to the normal disk. This works okay but feels pretty clunky and can still result in stuff getting corrupted because I’m just backing up the db files and not using any sqlite commands to actually back up the db to another file that isn’t in-use first.

How do you handle a job going from one nomad node to another? Or do you pin jobs like grafana to specific hosts?

nico Jan 14, 2024

Nomad has host volumes - so you can tell it to mount a folder from the machine on the container, and it will only schedule that container on machines that have that folder. So yes, effectively you pin the workload, thus introducing a SPOF - I do not love it but Grafana only supports sqlite and postgres, so making those HA would require failover setups which is a bit much for a homelab :')

For backing up, you can run the sqlite command periodically (as a cron job or a Nomad periodic job) and then upload the backup to some external, safe storage (could be seaweedfs or S3!). For postgres you can use something like this.

Stateful workloads with Nomad host volumes | Nomad | HashiCorp Developer

Configure and deploy a host volume to support a MySQL workload that requires persistent storage.

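The periodic SQLite backup described above could be sketched in Python with the standard library's `sqlite3` module, whose `Connection.backup` method takes a consistent snapshot even while the application still has the database open. This is a minimal sketch under assumptions: the function name `backup_sqlite` and the paths are hypothetical, and uploading the resulting file to seaweedfs or S3 is left out.

```python
import sqlite3


def backup_sqlite(src_path, dest_path):
    """Copy a (possibly in-use) SQLite database to dest_path.

    Uses SQLite's online backup API rather than copying the raw file,
    so the snapshot is consistent even if the app is writing.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

A cron job or Nomad periodic job would call something like `backup_sqlite("/data/grafana.db", "/backups/grafana.db")` (hypothetical paths) and then upload the backup file, rather than syncing the live database file itself.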

jkrtn Jan 12, 2024

Could you give a quick example of using NixOS configuration to launch a machine or deploy something remotely? I’m just starting to move beyond a single machine at home. I’d really like to transition to infra as code.

nico Jan 13, 2024

I recommend starting with ZeroToNix’s docs and then moving on to nixos.wiki, but here is a minimal, working example that I could deploy to a hetzner VPS that only has nix and ssh installed:

    { config, pkgs, ... }: {
      # generated, this will set up partitions and bootloader in a separate file
      imports = [ ./hardware-configuration.nix ];

      zramSwap.enable = true;
      networking.hostName = "miki";

      # configures SSH daemon with a public key so we can ssh in again
      services.openssh.enable = true;
      users.users.root.openssh.authorizedKeys.keys = [
        ''ssh-ed25519 AAAAC3NzaC1lNDI1NTE5AAAAIPJ7FM3wEuWoVuxRkWnh9PNEtG+HOcwcZIt6Qg/Y1jka''
      ];

      # creates a timmy user with sudo access and wget installed
      users.users.timmy = {
        isNormalUser = true;
        extraGroups = [ "networkmanager" "wheel" "sudo" ];
        packages = with pkgs; [ wget ];
      };

      # open up SSH port
      networking.firewall.allowedTCPPorts = [ 22 ];

      # start nginx, assumes HTML is present at `/var/www`
      services.nginx = {
        enable = true;
        virtualHosts."default" = {
          forceSSL = true; # Redirect HTTP clients to an HTTPS connection
          default = true; # Always use this host, no matter the host name
          root = "/var/www"; # Set the web root (a string, so Nix does not copy the directory into the store)
        };
      };

      system.stateVersion = "22.11";
    }

This sets up a machine, configures the usual stuff like the ssh daemon, creates a user, and sets up an nginx server. To deploy it you would run nixos-rebuild --target-host <user>@<host> switch. Other tools exist (I use colmena but the idea is the same). Note how easy it was to set up nginx! If I was setting Nomad up, I would just do services.nomad.enable = true.

As you can see some things you will have to learn (the nix language, what the configs are…) but I think it is worth it.

NixOS · Zero to Nix

Your guide to learning Nix and flakes


jkrtn Jan 14, 2024

This is such a wealth of information, thank you! I’m really excited to try this out.