Eric Curtin

269 Followers
71 Following
510 Posts
Docker Engineer working in AI

The world's first inference server with Gemma 4 and TurboQuant:

brew tap ericcurtin/inferrs
brew install inferrs
inferrs run --turbo-quant google/gemma-4-E2B-it

https://github.com/ericcurtin/inferrs

Gemma 4 and TurboQuant coming to Docker Model Runner soon. Gemma 4 is available on Docker Hub.
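For reference, a minimal sketch of pulling and running it through Docker Model Runner, assuming Gemma 4 ships under the ai/ namespace on Docker Hub like earlier Gemma releases (the exact tag here is a guess):

# pull the model from Docker Hub, then run a one-shot prompt against it
docker model pull ai/gemma4
docker model run ai/gemma4 "Summarize TurboQuant in one sentence"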

Very happy to share what our team at #docker has been cooking for the last few months. It's our evolution of the `docker sandbox` command, now available as a standalone binary, `sbx`: a modern Docker take on AI agent sandboxing. Check it out at https://docs.docker.com/ai/sandboxes
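If you want to try it, a hypothetical first invocation, assuming the standalone binary mirrors the `docker sandbox run` subcommands from the docs (the agent name is just an example):

# launch a coding agent inside an isolated sandbox (assumed CLI shape)
sbx run claude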
Heading to Brussels for #FOSDEM26 (Jan 31-Feb 1)? Want to learn about Docker Model Runner and other AI offerings at Docker? Dorin and I will be speaking in the AI Plumbers DevRoom on Saturday and hope to see you there!

https://www.youtube.com/watch?v=m061eizaEko

Some points to clarify on wayoa: the Metal rendering is for the macOS window; we'd expect the Linux application and VMM to use GPU acceleration via KosmicKrisp. https://github.com/ericcurtin/wayoa and https://github.com/ericcurtin/vmm are open to contributions.

Weird World Of Wayland Compositors On MacOS

Apple GPU support for vLLM? Monorepos vs microrepos?

https://github.com/vllm-project/vllm/pull/29629

Recently we did a webinar on Docker Model Runner. I prepared a slide with my take on when to use llama.cpp and when to use vLLM. Here is what I came up with (and I feel edge deployments kinda fit in the middle; there are use cases for both), but I'm curious: what do people think?
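To make the split concrete, here is roughly how each stack gets served; model names and paths are placeholders, and both expose an OpenAI-compatible HTTP API:

# llama.cpp: one small binary, GGUF weights, a natural fit for laptops and edge
llama-server -m gemma.gguf --port 8080

# vLLM: Python server built for batched, high-throughput GPU inference
vllm serve google/gemma-2-2b-it --port 8000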

This feature in the upcoming systemd release started from the efforts of the Red Hat In-Vehicle OS team (Brian Masney, Alexander Calhoun). Probably just under 2 years in the making, but it finally made it! Thanks to Francesco Valla for driving it to completion.

"in certain fixed-function usecases it might make sense to load modules via this infrastructure during boot"

Often in automotive, we know exactly which hardware and software stack will be deployed, and there is little plug-and-play from the boot perspective. Bypassing udevd and loading boot-critical kernel modules early, in parallel, can lead to significant boot speedups. And by starting udevd later, we still get the benefits of udev.
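As a sketch of what static, udev-free module declaration looks like, here is the long-standing modules-load.d mechanism; the new feature's exact configuration may differ, and the module names are hypothetical examples for a CAN-equipped board:

# declare boot-critical modules statically, one per line
printf '%s\n' can_dev mcp251xfd | sudo tee /etc/modules-load.d/vehicle.conf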

One myth about systemd is that it's too fat to boot quickly. If we build up systemd from scratch, starting with just the systemd binary itself, we can start critical boot services in milliseconds, as fast as any init system we would write from scratch in a few lines of C. The core of systemd is lightweight.

https://mastodon.social/users/pid_eins/statuses/115620451885638963
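To illustrate the lightweight-core point, a sketch of a unit that starts as early as systemd allows by opting out of the default dependency ordering; the service name and binary are hypothetical:

# a unit with no implicit ordering deps, started as soon as systemd can
cat <<'EOF' | sudo tee /etc/systemd/system/early-critical.service
[Unit]
Description=Hypothetical boot-critical service
DefaultDependencies=no

[Service]
ExecStart=/usr/bin/critical-daemon
EOF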

WOAAAA!!!! #smolBSD with libkrun on aarch64!! By the one and only @slp ❤️

https://x.com/slpnix/status/1984400857077572027?t=h59EGz3mWrbKtTDdlq6fOA&s=19

Show HN: docker/model-runner – an open-source tool for local LLMs | Hacker News