@arichtman @vwbusguy @mttaggart exposed api endpoints, super secret secrets hanging out in env vars, rbac not configured or not present, public api access, shared usernames, images that are 2-5 years old with trivial kernel privesc bugs, containers built by people who don't do security and spread far and wide. it's just a risk matryoshka doll full of exploitable surfaces and configs, and all the corners and edges full of "industry best practices" written by non-security people

@Viss @arichtman @mttaggart I'm more bothered by the fact that k8s Secret objects aren't actually encrypted by default (they're just base64 encoded) than by scoped injection via env vars.

https://12factor.net/config

The Twelve-Factor App

A methodology for building modern, scalable, maintainable software-as-a-service apps.
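On the base64 point above: a minimal Python sketch (the secret value is made up) showing that what a Secret manifest stores is a reversible encoding, not encryption.

```python
import base64

# A made-up value as it would appear under `data:` in a Secret manifest.
# base64 is an encoding: the API server stores this string verbatim in etcd
# unless encryption at rest is explicitly configured.
encoded = base64.b64encode(b"s3cr3t-db-password").decode()

# Anyone who can read the Secret object gets the plaintext back for free:
decoded = base64.b64decode(encoded).decode()
print(decoded)  # prints the original password, no key required
```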

@Viss @arichtman @mttaggart This is especially concerning given that most k8s deployments don't have any kind of RBAC setup at all. The gears are there for it, but few (OpenShift and Rancher being notable exceptions) implement it.
@vwbusguy @arichtman @mttaggart one time i made a very attractive lady literally snotlaugh by saying "kubernetes appears to have been invented to solve a litany of problems that nobody actually appears to have"
@Viss @arichtman @mttaggart This just tells me you didn't have the wonderful joy of trying to run Docker Swarm in production in its early days and I'm happy for you in that regard. Sweet glory did Kubernetes solve a lot of problems compared to that.
@vwbusguy @arichtman @mttaggart this feels like one of those sorta 'if you go back further in time, you see that docker actually introduced a lot of problems, which were then fixed by k8s' scenarios, so if your context window begins at docker, then yeah it's a 'measurable improvement', but if it begins 'before you installed docker', then you're still at a net negative
@Viss @arichtman @mttaggart To be fair, you're not wrong for a whole lot of use cases. If you built your empire on a LAMP stack, that doesn't translate well in a scalable way in a Kubernetes world because it was stateful and built for vertical scaling. Forcing that into Kubernetes means retooling some core architectural things for the stack for an outcome that might not be demonstrably better.
@vwbusguy @arichtman @mttaggart unless you're dealing with like, dozens or hundreds of containers that are geographically distributed, i get the impression kubernetes is just massive overhead and lots of extra attack surface. I can see how in narrow circumstances it can be useful, but so far literally every single k8s deployment i've seen is "way more overhead and complexity and attack surface, for not enough benefit"
@Viss @vwbusguy @arichtman I believe this is generally correct. The scale at which its utility becomes apparent will never be achieved by the vast majority of those who use it. The choice was informed by hype and a desire to believe they would one day require, as K8s puts it, "planetary scale."
@mttaggart @Viss @arichtman This was definitely true in some shops. I actually remember hearing a Red Hat person advising a customer once, roughly nine years or so ago, that what they wanted OpenShift for could be done better on some regular machines running RHEL. I admired the honesty and restraint from oversell in that particular moment.

@Viss @arichtman @mttaggart Again, I agree with you that this is true for a lot of use cases and shops. That said, you can't pretend that things were gloriously secure en masse in the older days of LAMP, Tomcat, and ASPX. Moving to Kubernetes in some cases allowed for better hygiene in general around secrets, hardening, and idempotency. For stuff like multi-tenant JupyterHub, Kubernetes is highly practical. For serving your company's blog - maybe not.

https://jupyter.org/hub

Project Jupyter

The Jupyter Notebook is a web-based interactive computing platform. The notebook combines live code, equations, narrative text, visualizations, interactive dashboards and other media.

@vwbusguy @arichtman @mttaggart that's the tug though. everyone's ham-fisting it in everywhere, using it for their core business infra or making it part of ci/cd pipelines. nobody is using it 'the right way'

@Viss @arichtman @mttaggart CI/CD pipelines make sense - not having designated hardware sit idle when workers aren't running, worker agents that go away when the job is done leaving only the intended artifacts (meaning less attack surface for workers), idempotency, etc.

Of course you don't *have* to do it this way, but there's a clear case to be made.
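The ephemeral-worker pattern described above can be sketched as a Kubernetes Job; this is a hypothetical manifest (job name, image, and command are illustrative). `ttlSecondsAfterFinished` garbage-collects the finished pod, so nothing lingers for an attacker to land on.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: ci-worker-build-1234     # illustrative name
spec:
  ttlSecondsAfterFinished: 300   # clean up 5 minutes after the run
  backoffLimit: 1
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: registry.example.com/ci/worker:latest  # hypothetical image
          command: ["run-build.sh"]
```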

@vwbusguy @arichtman @mttaggart that description is not how i have seen it deployed, though
@Viss @arichtman @mttaggart That's how I have it deployed 😀 . All on prem with Jenkins and Rancher RKE2 k8s backends.
@vwbusguy @Viss @arichtman This conversation is quite the piece of evidence that you are the exception to the rule. Your knowledge is impressive, and rare. Certainly more so than orchestrated container deployments. Y'all are both right.

@mttaggart @vwbusguy @arichtman this is just the 2024 version of

- there is a 'way to do it right'
- most people do not do it that way
- the thing is almost certainly being used when it doesn't need to be
- the folks deploying the thing in most cases are not familiar enough with it, or with architecture in general, to adequately harden it
-- or they just don't care to, usually because compliance

it used to be LAMP, now it's containers

@mttaggart @vwbusguy @arichtman i guess the tl;dr for me is:

"if you give people a giant red George Jetson button that does a thing, then people will just instinctively mash that button without ever considering the consequences. and you end up with a bunch of output that the button masher wasn't expecting and doesn't know what to do with, which oftentimes ends up as someone else's problem, who won't be happy with this arrangement"

@Viss @mttaggart @arichtman I think it often happens more like this:
@vwbusguy @mttaggart @arichtman nailed it. but now with k8s you can cloud scale that debt at warp factor 9 :D
@vwbusguy @mttaggart @arichtman just like java was 'write once, exploit everywhere', now you can take "architectural and technical misconfigurations and lack of hardening" and cloud-scale it :D

@Viss @vwbusguy @arichtman I really do think a giant piece of it—especially in the tech industry/startup space itself—is a decision-making process that assumes:

  • Old == bad
  • We will be the next 1M user unicorn and should build for that today.
    @mttaggart @vwbusguy @arichtman 100% of the folks who follow that logic pathway are vc types or money types or upper-level execs, who are solving for "return on their personal cash money investment" and not "to build some shit that actually works, or has longevity, or to solve some problem"

    @Viss @vwbusguy @arichtman I'm not so sure about that. Having lived in the dev space for long enough, the dev/founder folks do this as well.

    @mttaggart @vwbusguy @arichtman a lot of founders, especially founders who set out to score vc money tend to think the same way as the vc.

    i've done a loooooooot of M&A assessment work, and some of the environments i've seen smack of those scenes in Home Alone where it's all cardboard cutouts, strings, and shadow puppets to give the illusion that some shit exists there

    @mttaggart @Viss @arichtman

    1. Containers are old. They're basically jails, and Solaris shipped its containers (Zones) back in the mid-2000s.
    2. Getting this right is a tricky problem. Arguably one viable reason *to* use public cloud is that you don't expect to scale big soon, so the cost to do so could be relatively low in OpEx dollars.

    @mttaggart @Viss @arichtman The "magic" about containers in either direction tends to go away once you realize that containers are just Linux processes. That's all they are: processes wrapped in cgroups and namespaces, with jail-style filesystem isolation. That's why when you run `ps` on a Linux host you see the actual container process, not a hypervisor. Requests and limits? That's CFS bandwidth control.
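The "requests and limits are CFS" point is concrete arithmetic. A sketch (with a hypothetical `parse_millicores` helper) of how a CPU limit maps to a cgroup's CFS quota against the default 100 ms period:

```python
# Kubernetes CPU limits become CFS bandwidth parameters on the container's
# cgroup: the kubelet turns "500m" (half a core) into cpu.cfs_quota_us
# against the default cpu.cfs_period_us of 100 ms, and the scheduler caps
# the process's runtime per period. parse_millicores is a hypothetical helper.

CFS_PERIOD_US = 100_000  # default CFS period: 100 ms, in microseconds

def parse_millicores(cpu: str) -> int:
    """Parse a k8s CPU quantity like "500m" or "2" into millicores."""
    if cpu.endswith("m"):
        return int(cpu[:-1])
    return int(float(cpu) * 1000)

def cfs_quota_us(cpu_limit: str, period_us: int = CFS_PERIOD_US) -> int:
    """Microseconds of CPU time this limit allows per CFS period."""
    return parse_millicores(cpu_limit) * period_us // 1000

print(cfs_quota_us("500m"))  # 50000: half a core per 100 ms period
print(cfs_quota_us("2"))     # 200000: two full cores
```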

    @vwbusguy @Viss @arichtman While the concept of containers is old, I think we can both agree that the "productization" of them is less so.

    And as far as scale, I'm referring specifically to choosing a container orchestrator as the deployment target from day one.

    @vwbusguy @Viss @mttaggart @arichtman
    This is often the case with freelancers

    @mttaggart @Viss @arichtman We didn't even get into immutable Linux hosts yet, either ;-)

    And to be clear, I also think Viss is right. Where we've disagreed here, I'm also agreeing with him at least somewhat.

    @Viss @arichtman @mttaggart I literally wrote my own CNI before I realized what I was doing, just trying to get a reliable network service I could proxy between containers. It constantly polled etcd and spliced config updates into nginx with regex on each of the hosts, with logic to proxy sub-paths, etc.
    @Viss @arichtman @mttaggart I never published it because it felt so dirty and I didn't want to maintain it and it was a problem I didn't have to solve with Kubernetes.
    @vwbusguy @Viss @arichtman @mttaggart what do you think about Docker Swarm today? I tried k8s in my homelab and I hated it. Just not a great fit for such a low scale. Now I run Docker Swarm and I hate it much less. Still not great though but I see no alternative...
    @DrRac27 @vwbusguy @Viss @arichtman I'm a big fan of Swarm as a "just enough" solution for distributed container management. And I actually think the secrets management is decent.

    @DrRac27 @Viss @arichtman @mttaggart If you want a small scale lightweight k8s, then I recommend k3s. You can run k3s on one node.

    https://k3s.io/

    K3s

    @vwbusguy @Viss @arichtman @mttaggart that's what I tried first, but I liked it even less. In k8s I at least had to learn how it works, and every upgrade has a defined path. In k3s the install is `curl | sh`, and what about upgrades? Just swapping out the binary and hoping nothing breaks? I got it up and running with Ansible, but I was not feeling great about it and expected it to break all the time. With Swarm I just install the Debian package and use the community.docker.docker_swarm Ansible module
    @DrRac27 @Viss @arichtman @mttaggart Upgrading k3s is just running that same script again; it upgrades the components for you. You can also revert versions, and you can back up etcd in case you want to start fresh. The etcd interface on single-node k3s is actually just backed by a SQLite database.
    @DrRac27 @Viss @arichtman @mttaggart Coincidentally, Ansible is the reason I got into using k3s. I've been running AWX on it for years in my dayjob for an environment where I didn't have k8s established but just wanted to run Ansible AWX there.
    @vwbusguy @Viss @arichtman @mttaggart ok, good to know. I still don't think it is right for me, but at least I learned something, thanks!
    @DrRac27 @Viss @arichtman @mttaggart For things that I run in a container that don't need all the overhead of Kubernetes, I use podman with systemd to manage, so they end up running more like traditional Linux services, but getting updates through `podman pull` instead of yum update. Podman plays nicer with rootless, firewalld, cgroups2, etc., and has a fairly straightforward migration path to k8s if you end up needing to go bigger down the road.
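A sketch of what that looks like in practice, in the style of `podman generate systemd` output (unit name, image, and ports here are illustrative, not from the original posts):

```ini
[Unit]
Description=Podman container: web
Wants=network-online.target
After=network-online.target

[Service]
Restart=on-failure
ExecStartPre=/usr/bin/podman pull docker.io/library/nginx:latest
ExecStart=/usr/bin/podman run --rm --name web -p 8080:80 docker.io/library/nginx:latest
ExecStop=/usr/bin/podman stop -t 10 web

[Install]
WantedBy=multi-user.target
```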
    @DrRac27 @Viss @arichtman @mttaggart My general opinion is that podman with a proxy in front (e.g., Caddy, nginx) can do most of what Swarm can with less overhead, and if you really need more than that, then you probably should be thinking about Kubernetes anyway.

    @DrRac27 @Viss @arichtman @mttaggart And if multitenancy with security is your end goal, then check out Kata Containers.

    It lets you orchestrate container workloads as tiny VMs.

    https://katacontainers.io/

    Kata Containers - Open Source Container Runtime Software

    Kata Containers is an open source container runtime, building lightweight virtual machines that seamlessly plug into the containers ecosystem.

    @vwbusguy @Viss @arichtman @mttaggart I would love to use podman or Kata, but then I have no orchestration, right? If one node goes down for whatever reason (reboot, crash, I want to change hardware or reinstall), no other node picks up that node's tasks? Can I build a sane failover with something like keepalived? If I had more time I would just write something myself; I can't believe nobody has done it yet...

    @DrRac27 @vwbusguy @Viss @arichtman Yeah so this is why I teach starting with Swarm for orchestration, then moving to Podman/k3s once the need arises.

    I like Podman a lot, but your concerns are real. I'd also add that while yes, much of Swarm functionality is achievable to a degree with Podman and a reverse proxy, that is additional deployment complexity for a solution designed to reduce it.

    Container Essentials

    Build. Deploy. Scale.

    @mttaggart @DrRac27 @Viss @arichtman That's a valid point. In my setup, I have config management and monitoring services, making podman more practical, but if you don't already have those things, podman is less useful. It also ultimately depends on your SLA. IOW, can you afford the downtime vs added complexity trade off?
    @DrRac27 @Viss @arichtman @mttaggart That's an absolutely fair point, and you're generally right. I would use Ansible to automate it. While systemd can trigger a restart on a failed container process, podman's health check mostly just notifies journald that there might be a problem; it doesn't proactively do anything about a container whose process is running but unhealthy.
    @vwbusguy @Viss @arichtman @mttaggart I think I don't fully understand. How would you automate failover with Ansible?
    These last few days I was working a lot on my homelab, and if I weren't so invested in Swarm yet I would have tried k8s again 😅 Swarm does not even support devices like GPUs or Zigbee sticks (without hacking), and I wanted to run a registry that is only reachable on localhost (so inside the whole cluster via the built-in load balancer), but that isn't supported in swarm mode either.

    @DrRac27 @Viss @arichtman @mttaggart Hey, so I was wrong about this.  They actually did add support for this as of podman 4.3.

    https://www.redhat.com/sysadmin/podman-edge-healthcheck

    Podman at the edge: Keeping services alive with custom healthcheck actions

    New Podman feature allows you to automate what happens when a container becomes unhealthy, which is crucial for services in remote locations or critical systems.
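Per the article above, the invocation looks roughly like this (image and check command are illustrative, not from the article):

```shell
# podman >= 4.3: restart the container when its own healthcheck fails,
# instead of only logging the state change to journald.
podman run -d --name web \
  --health-cmd 'curl -fsS http://localhost:80/ || exit 1' \
  --health-interval 30s \
  --health-on-failure restart \
  docker.io/library/nginx:latest
```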

    Enable Sysadmin
    @Viss @vwbusguy @arichtman @mttaggart @ceejbot from the outside, it looks like that, but I always assumed I was just missing something. Is that really how it is?
    @mk30 @vwbusguy @arichtman @mttaggart @ceejbot the easiest way to put this without getting into the weeds is "kubernetes is not for everyone". it very clearly was not designed with security in mind from the beginning, and since its inception it's been this sorta trapeze act of bolting stuff on here and there, or using third party tools to 'make it safer somehow'. so it got (my opinion) very top heavy and complex, very fast, and i would argue it's not as "hardenable" as many security folks would prefer
    @Viss @vwbusguy @arichtman @mttaggart @ceejbot that makes sense. Thanks for explaining it 🐱