For a while now, I’ve self-hosted a runner to integrate with Gitea Actions. But something about it worried me: in an age where software supply chain security is becoming more and more important, I felt I needed to improve the security of the setup. The answer I arrived at, while not perfect (yet), is a great improvement over what I had before. This post is a walkthrough of what I had, what I chose to replace it with and why, how I did it, and where I could still improve it.

In the beginning, there was act

The act runner started a long while ago, to offer the ability to run GitHub Actions locally. This allowed for fast feedback loops, since a dev could now simulate GitHub Actions (GHA) locally at no cost, and it doubled as a local task runner. This seemed like a perfect fit for a project like Gitea; in our day and age, a git forge without an integrated CI/CD pipeline is pretty much a no-go for any “serious work”.

As of roughly 3 weeks ago, the fork became more pronounced. “act_runner” became just “runner”, the semver jumped to v1.0.0, and the pace of development seems to have accelerated considerably - I fear we all know why, but let’s hope this doesn’t translate into a rotting of the software. In any case, this ties into my worries.

I have to admit that I haven’t been a very good sysadmin. For a while, I had been running the runner directly on my system. Old-school, bare-metal. Now, I’m not an idiot sandwich: the runner was properly configured with its own user, dirs, and permissions. More importantly, I was only running it for my own pipelines, so I had a reasonable degree of confidence that this was okay.

But every day, I kept reading about Shai-Hulud, GitHub tokens getting stolen, worms spreading, and CI/CD pipelines getting pwned left and right. And while I only write software in Go, and the ecosystem has been relatively safe so far, there is no reason for me to relax. On the contrary, the time to improve the situation is before problems happen.

What’s next? Not Docker

The current recommendation for the runner is to run the jobs in a Docker container. But I’m not exactly a fan of Docker; while it certainly has its uses, and it has made life much easier for many tasks, it also introduces certain trade-offs that I’d rather avoid.

It’s true that Docker has come a long way since its inception, and since it started leveraging cgroups and allowing for rootless mode, most of the security worries have gone away. But it still doesn’t come with sane defaults, and rootless mode still has its kinks. So, if I have to spend time properly configuring Docker, which I’ve already done before, I’d rather spend that time learning something new. So, let’s leverage the facilities that a modern Linux gives us, without having to reach for Docker.

If saying “I don’t like Docker” didn’t piss off enough people, I’m about to rustle even more jimmies. Yes, I reached for systemd. More concretely, I reached for three systemd components that really made this (relatively) easy, (very) secure, and (not so much) fun: systemd-nspawn for making use of kernel namespaces, cgroups, and seccomp; systemd-networkd to make private virtual links easy; and systemd-resolved to help with a kink in my system.

The container solution

systemd-nspawn’s manpage has a very apt description of what it is and what it does, so I’ll just paste it here verbatim:

systemd-nspawn may be used to run a command or OS in a lightweight namespace container. In many ways it is similar to chroot(1), but more powerful since it virtualizes the file system hierarchy, as well as the process tree, the various IPC subsystems, and the host and domain names.

Now, let’s be clear: there’s nothing here that is exclusive to systemd. In a sense, nspawn is just like Docker, since it repackages and creates abstractions around very powerful technologies. The nice thing is that it also comes with first class integration with the rest of the systemd ecosystem, and we’re going to make full use of that.

Still, in general, I wanted a container that allowed me to run a process with:

  • no capabilities
  • no access to host files (separate and read-only rootfs)
  • no access to host processes (separate PID namespace)
  • no access to host network services (separate network namespace)
  • no new privileges
  • a restricted set of syscalls

This can be achieved in less than 10 lines of a configuration file:

[Exec]
# Disable user namespacing — avoids UID mapping problems with bind-mounts.
PrivateUsers=no

# Block setuid/setgid binaries (sudo, ping, etc.)
NoNewPrivileges=yes

# Seccomp filter: nspawn always applies a syscall allow list; these groups are added to its defaults
SystemCallFilter=@system-service @process @basic-io @file-system @network-io @signal @ipc @mount

[Network]
# Private network namespace with veth pair + automatic NAT
VirtualEthernet=yes

[Files]
# Rootfs is immutable — all writes must go through bind-mounts
ReadOnly=yes

# Writable bind-mounts (survive rootfs rebuilds)
Bind=/var/lib/gitea-runner-container:/var/lib/gitea-runner
Bind=/var/cache/gitea-runner-container:/var/cache/runner
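
The file above doesn’t show how the “no capabilities” item from the list is handled. One possible way to get there, sketched here as an assumption rather than as the exact setup used in this post, is a drop-in for the runner’s service unit inside the container. The unit name gitea-runner.service, the drop-in path, and the gitea-runner user are all hypothetical:

```ini
# /etc/systemd/system/gitea-runner.service.d/hardening.conf (inside the container)
# Hypothetical unit name and path; adjust to however the runner is started.
[Service]
# Run as an unprivileged user (hypothetical account name)
User=gitea-runner
# Empty assignments reset both sets, leaving the process with no capabilities
CapabilityBoundingSet=
AmbientCapabilities=
# Same idea as the nspawn setting, applied to this one service
NoNewPrivileges=yes
```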

The networking headaches

Now, the networking part is always where the fun is, isn’t it? As I don’t like to make my life easier, there was something that I didn’t want to give up: having the container use my private DNS resolver. What’s the point of self-hosting if you don’t go full-on? And, let’s not forget that we still want a separate network namespace - just sharing the network from the host would be dangerous in case the container gets compromised.

Thankfully, systemd-networkd and systemd-resolved come to the rescue. With networkd running on both the host and the container, the configuration becomes very simple, especially given that I wanted a relatively static setup so that DNS resolution could be configured against a known address. In our case, resolved is just a very simple stub resolver; on the host, it points to an authoritative DNS resolver that can only be reached through a WireGuard tunnel.

The first step is masking the generic container network configuration, which is as easy as ln -sf /dev/null /etc/systemd/network/80-container-ve.network. With that out of the way, I configure the interface for our container by editing /etc/systemd/network/90-ve-gitea-runner.network:

[Match]
Name=ve-gitea-runner

[Network]
Address=172.30.0.1/28
DHCPServer=yes
IPMasquerade=both

[DHCPServer]
EmitDNS=yes
DNS=172.30.0.1

The important part here isn’t so much the address that the container ends up having, but rather the IP of the gateway: that’s where we’re going to point our DNS resolution. So, in the host’s resolved configuration, we add an extra stub listener on that address, so that resolved answers the queries coming from the container:

[Resolve]
DNSStubListenerExtra=udp:172.30.0.1:53

And on the container’s resolved, we need to point it to the host:

[Resolve]
DNS=172.30.0.1
Domains=~.
DNSStubListenerExtra=udp:[::1]:53

That last line was added due to some processes in the container expecting IPv6 resolution; this was the way to make resolved also forward those queries to the host.
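
For completeness, the container’s end of the veth pair also needs a networkd configuration. systemd ships a stock 80-container-host0.network that normally takes care of this, so the following is only a sketch of an explicit equivalent. It assumes the container-side interface has the default name host0; the file name 70-host0.network is made up, chosen so that it sorts before the stock file:

```ini
# /etc/systemd/network/70-host0.network (inside the container)
[Match]
Name=host0

[Network]
# Request an IPv4 lease from the DHCP server networkd runs on the host side
DHCP=ipv4
```

With the host configuration shown earlier, the container should end up with an address in 172.30.0.0/28 and a default route via 172.30.0.1.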

That was it! I think you can see now why I reached for that extra little bit of systemd goodness. With the proper services enabled on both host and container, everything works as expected. Except for one thing: if you have a firewall, you’ll also have to configure it. I’m using iptables, but it should be easy enough to configure the same rules for nftables:

-A INPUT -i ve-+ -p udp --dport 67 -j ACCEPT
-A INPUT -i ve-+ -p udp --dport 53 -j ACCEPT
-A FORWARD -i ve-+ -o eth0 -j ACCEPT
-A FORWARD -i eth0 -o ve-+ -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
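
For nftables, a rough equivalent might look like the following. It is a sketch that assumes an existing inet filter table with input and forward chains (those names are an assumption; use whatever your ruleset defines) and the same eth0 uplink as above. The iptables wildcard ve-+ becomes "ve-*" in nftables:

```nft
# Load with: nft -f <file>
# DHCP and DNS requests from the containers to the host
add rule inet filter input iifname "ve-*" udp dport { 53, 67 } accept
# Outbound traffic from the containers
add rule inet filter forward iifname "ve-*" oifname "eth0" accept
# Replies to connections the containers opened
add rule inet filter forward iifname "eth0" oifname "ve-*" ct state established,related accept
```

As with -A in iptables, add rule appends to the end of the chain, so these only take effect if no earlier rule already drops the traffic.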

Wrapping it up, and some considerations

Now, with all of this configured, what’s left is to register the runner from inside the container with the Gitea forge, and build away! In fact, if you’re reading this post, it just so happens that this migration was successful, and I’m now using a containerized CI/CD pipeline! Hurray!

Of course, I made everything sound so nice and peaceful, but there are still some drawbacks to this, namely:

  • Your caches and workspace can still get poisoned by a rogue dependency
  • Updating the container means stopping it, disabling ReadOnly mode, issuing the update command, re-enabling ReadOnly, and starting the container again
  • This isn’t a proper VM, so there are still ways to chain together exploits that could, in theory, allow for escaping the container, escalating privileges, and so on

While I don’t feel any of these are deal-breakers, they certainly leave room for improvement. And given that the runner already supports ephemeral runners, integrating that with systemd-nspawn is certainly doable and would go a long way towards improving the security of the system. Maybe one day :-)