Hardware security devices for remote PAM auth
Sometimes, laziness can be a great motivator.
In effect, what I wanted was a zero-trust setup for remote authentication. Partly because it’s way more secure; but the real motivator was that I got tired of typing or copy/pasting passwords.
For a while now, I’ve been using both a Nitrokey and a Yubikey for authentication and signing on my local machine. This brings several nice properties: combined with my password manager workflow, it means I can securely retrieve a password with barely any work, as the touch of a button is enough. And it’s not just more convenient, it’s also more secure. The keys add a physical layer to the workflow, making it depend on the presence of a device; they also remove the need for other apps or devices, which shrinks the attack surface. Easier to use and manage, and more secure at the same time? This is exactly my kind of jam!
But one situation has been bugging me for a while. Local auth is fine and dandy, but if I could also use this flow to manage auth on my remote servers, that would be so good! I was already using the keys for ssh authentication, which is a great first step to protect a remote server (remember: always disable password auth!); but if I could also make su(do) auth work only with these hardware devices, that would again be the best of both worlds. I could increase both the security of my servers and the convenience of using them! No more typing su(do) passwords. But the way to set this up properly eluded me for a while. I even went as far as making a LinkedIn post asking for help. Unfortunately, that didn’t really help, nor were the various LLMs of any use.
What follows is a story of many hours of frustration and debugging.
The journey starts with pam_ssh_agent_auth. This is a project made to forward ssh-agent authentication to PAM, which in essence means that a valid auth with an SSH key would allow me to authenticate with PAM. Exactly what I was searching for! With this set up, I could plug the module in as the source of truth for PAM, putting su or sudo auth on a remote host at my fingertips (quite literally). But, alas, this wasn’t meant to be, as there were two (quite serious) problems with this module. The first is that it hadn’t seen any update in almost 6 years at this point; while this isn’t a deal-breaker per se, I wouldn’t be comfortable having an abandoned piece of software in a part of a Linux system as critical as PAM. But even if I were okay with that, there was the second problem: the module doesn’t support the ECDSA-SK or ED25519-SK key types. These are exactly the types I’m using with my hardware keys: even though RSA keys are supported, 4096-bit RSA operations are much more computationally intensive than elliptic curve cryptography (ECC), which on a system that lives inside a USB key makes a world of difference (remember: the crypto operations happen inside the hardware, not on your computer!). So, no dice.
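For reference, the -SK key types mentioned above are the FIDO-backed keys OpenSSH (8.2+) can generate, where the private key material stays on the token and the file on disk is just a handle. A minimal sketch (filenames and the comment string are my own choices, not from any particular guide):

```
# Generate an ED25519-SK key pair; requires a FIDO2 token plugged in,
# and each signing operation will require a touch on the device.
ssh-keygen -t ed25519-sk -f ~/.ssh/id_ed25519_sk -C "hardware-backed key"

# ECDSA-SK works the same way, for tokens without ed25519 support:
ssh-keygen -t ecdsa-sk -f ~/.ssh/id_ecdsa_sk
```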
I then proceeded to check pam-ssh-agent. This was much more promising: it’s actively maintained and written in Rust, and modern niceties like ECC-based keys work. So, after quickly installing it, I configured it according to the instructions. Pretty easy stuff; but when I tried to run sudo from a normal user, I wasn’t able to authenticate. Running journalctl -f --facility authpriv led to this log:
Apr 24 20:36:22 pod sudo[133830]: pam_ssh_agent(sudo:auth): Authenticating user 'lol' using ssh-agent at '/run/user/1000/gnupg/S.gpg-agent.ssh'
Apr 24 20:36:22 pod sudo[133830]: pam_ssh_agent(sudo:auth): authorized keys from '/etc/security/authorized_keys'
Apr 24 20:36:22 pod sudo[133830]: pam_ssh_agent(sudo:auth): Agent did not know of any of the allowed keys
Seems like this was a problem with my ssh-agent. And indeed, if I jumped into that user and ran ssh-add -L to list the known keys, nothing came up. Something was missing, but I wasn’t sure what, since I had configured agent forwarding, so this should have worked.
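For context, the PAM side of this setup is short. What I mean by “configuring it according to the instructions” boils down to something like the following sketch (module name and the stack-include line are what the Arch package uses; double-check against your distro, and note that sudo strips the environment, so keeping SSH_AUTH_SOCK may need to be done explicitly):

```
# /etc/pam.d/sudo -- try agent-based auth first, fall back to the usual stack
auth      sufficient  pam_ssh_agent.so
auth      include     system-auth

# /etc/sudoers (edit via visudo) -- preserve the agent socket variable
Defaults env_keep += "SSH_AUTH_SOCK"
```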
Now, an intermission. When I configured my hardware keys, I opted to use gpg-agent as my ssh-agent. This meant I only needed to configure one agent in one place, streamlining my configuration at the cost of a few extra configuration options in GnuPG and my environment. Turns out that, for the purpose of using the hardware key for remote auth, this is a pretty big deal. I discovered this once I started to debug the ssh connection with ssh -vvv <host> and saw messages like Agent forwarding disabled: couldn't create listener socket.
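For anyone replicating this, the gpg-agent-as-ssh-agent setup I’m alluding to looks roughly like this (standard GnuPG options; the profile lines assume a POSIX-ish shell):

```
# ~/.gnupg/gpg-agent.conf -- make gpg-agent also speak the ssh-agent protocol
enable-ssh-support

# In your shell profile, point SSH tooling at gpg-agent's ssh socket:
#   unset SSH_AGENT_PID
#   export SSH_AUTH_SOCK="$(gpgconf --list-dirs agent-ssh-socket)"
```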
Turns out some work needs to be done before your remote machine can see your local ssh-agent (which is just gpg-agent emulating ssh-agent). The first part is configuring sshd on your remote host to remove stale sockets on connect, which is done by setting StreamLocalBindUnlink yes in sshd_config. That’s easy enough. The other part might not be.
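Concretely, the first part is a one-liner on the remote host (remember to reload sshd afterwards):

```
# /etc/ssh/sshd_config on the remote host:
# remove a stale forwarded socket before binding a new one on connect
StreamLocalBindUnlink yes
```

Without this, the second connection onwards finds the old socket file still on disk, and the forwarding silently fails.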
This is where I really hit a wall. The second part of the configuration is on your local host, and requires you to specify the sockets to be forwarded to the remote host. The Arch Linux wiki has a good section on this, but the configuration was falling apart because it needs to be based on the gpgconf output of the remote host. Even though I was using Arch on both machines, the remote machine is a server, which means there is no dbus activation on login; this in turn means the default paths for the sockets (i.e., /run/user/...) weren’t being created. So I had to point the forwarding of the sockets at the directory where the gnupg agent was actually looking. Your ~/.ssh/config needs one of these snippets:
Host remote_name
...
RemoteForward /run/user/1000/gnupg/S.gpg-agent /run/user/1000/gnupg/S.gpg-agent.extra
RemoteForward /run/user/1000/gnupg/S.gpg-agent.ssh /run/user/1000/gnupg/S.gpg-agent.ssh
But if you’re using non-standard dirs (remember, check with gpgconf --list-dir agent-extra-socket and gpgconf --list-dir agent-ssh-socket), this might be it:
Host remote_name
...
RemoteForward /home/<user>/.gnupg/S.gpg-agent /run/user/1000/gnupg/S.gpg-agent.extra
RemoteForward /home/<user>/.gnupg/S.gpg-agent.ssh /run/user/1000/gnupg/S.gpg-agent.ssh
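To figure out which variant applies to you, query the socket paths on each host; something along these lines (the left-hand RemoteForward paths come from the remote host, the right-hand ones from the local host):

```
# On the local host: where gpg-agent exposes its forwardable sockets
gpgconf --list-dirs agent-extra-socket
gpgconf --list-dirs agent-ssh-socket

# On the remote host: where its GnuPG expects the agent sockets to live
# (these are the paths to forward *to*)
gpgconf --list-dirs agent-socket
gpgconf --list-dirs agent-ssh-socket
```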
This worked! I could now run sudo ls on the remote machine, and my local hardware token would blink, asking for confirmation of the auth request. ssh-add -L also displayed all the proper keys, exactly the same as on my local machine, which meant the agent was being properly forwarded. I’m a happy camper, even if this ended up being quite a bit more difficult than I was expecting at first. But it’s always good when you get to debug a new part of your system, as it teaches you a lot about how things work under the hood. I now have a better grasp of how PAM, ssh, and gpg work together, and that’s invaluable! And I get to be a bit lazier and not type any passwords!
In reality, what I achieved here is a passwordless, hardware-attested authentication workflow; a mini zero-trust architecture, if you will. This is actually pretty close to how real production systems should work. And understanding the full auth stack is a huge win, since security should always be a prominent consideration in everything we do.