gok8ctl - Golang CLI tool for managing k8s clusters
After finishing my previous project, I felt an itch to continue building. One of the things that was noticeable during that process was that my coding skills were not being pushed: while the app I built did teach me about integration of Prometheus inside a Golang application, there wasn’t much else new for me there. So, for my next project, I chose to concentrate on building software.
The question then became: what should I build? It occurred to me that if I wanted to concentrate on improving my software development skills, I should choose a domain I already have some experience in, so as to remove an extra layer of complexity from the project. As I have some experience with Kubernetes, and given that I love CLIs, the union of those two things seemed like a natural choice. I even already had a cluster running that I could easily test things on!
So, gok8ctl was born. Now, let me be honest: there’s very little reason to use this in production environments instead of something like k9s, which is a much more mature and feature-complete application. I built this as an exercise to push the boundaries of what I know. Still, it works, and for that I’m happy. This write-up marks the release of v1.0.0, as I implemented all the features I wanted for a first release. Let us get started!
The architecture#
One of the things I wanted to achieve with this project, from the get-go, was a clear separation of concerns: the public-facing command line interface would need to live separately from the internal implementation details. Also, each internal package would need a clear boundary and responsibility, as this is bound to make the code more maintainable and testable in the long run.
For that, I ended up settling on the following layout:
├── cmd/ # CLI command definitions (user-facing layer)
├── internal/ # Private application code
│ ├── k8s/ # Kubernetes client operations
│ ├── config/ # Configuration management
│ ├── cache/ # Caching layer
│ ├── retry/ # Retry logic
│ ├── errors/ # Custom error types
│ └── output/ # Terminal output formatting
└── main.go # Entry point
Another decision I took early on was to use the standard library as much as possible. As I discover more and more of Go, the batteries-included approach and the overall quality of the standard library feel like a blessing, so I wanted to lean on them wherever I could.
There are, however, three exceptions to this rule which I feel are worth mentioning. The first two are spf13/cobra and the official Kubernetes Go client library. The rationale for these is simple: cobra is an extremely solid, stable, and widely used CLI framework, adopted by projects like kubectl, docker, and hugo. The k8s Go client library was a compromise made to simplify development considerably: given that the objective was to create something that integrates with k8s, writing a client library from scratch would have taken me far too long for something that is already a very well-solved problem, courtesy of the creators of k8s. Finally, it made sense to rely on an external library to (un)marshal YAML, since implementing one by hand is not an easy task and, again, feels completely out of scope for a project like this.
Concurrency and context#
One of the areas where Go excels is the simplicity of implementing concurrency in an idiomatic way. Running certain requests concurrently helps quite a bit with how performant the application feels: the API calls for fetching deployments, services, pods, and events are subject to external factors (such as network latency) that can cause real or perceived performance problems, so parallelizing them pays off. I adopted a very simple strategy for these concurrent calls:
// internal/k8s/status.go
var wg sync.WaitGroup
var mu sync.Mutex

wg.Add(1)
go func() {
    defer wg.Done()
    list, err := c.clientset.AppsV1().Deployments(c.namespace).List(ctx, metav1.ListOptions{})
    mu.Lock()
    defer mu.Unlock()
    // ... handle result
}()
We can see pretty much all the pillars of Go concurrency on full display: WaitGroups to coordinate the completion of multiple goroutines; Mutexes to protect shared state during concurrent writes; and goroutines themselves, launched with the go keyword.
Another pattern that was used throughout the codebase is leveraging the context package. I made sure that all long-running operations respect context cancellation, so as to allow the user to interrupt operations cleanly with Ctrl+C without leaving orphaned connections or goroutines. We can see this in action inside the cmd/root.go code:
// cmd/root.go
func Execute() error {
    ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
    defer cancel()
    return rootCmd.ExecuteContext(ctx)
}
Errors and cache#
One place where I felt I could spend some time was the creation and usage of an internal error library, to have a more tailored and structured way of handling errors at runtime. Not that there’s anything wrong with the stdlib’s errors package, but since I wanted to increase my proficiency, and given that explicit error handling is one of the cornerstones of Go, this felt like a good place to explore different possibilities. It ended up being quite simple to structure and write, as the whole package is around 50 lines of code:
// internal/errors/errors.go
// Sentinel errors for common cases
var (
    ErrNotFound     = errors.New("not found")
    ErrUnauthorized = errors.New("unauthorized")
    ErrForbidden    = errors.New("forbidden")
    ErrTimeout      = errors.New("timeout")
    ErrInvalidInput = errors.New("invalid input")
)

// ResourceError represents an error related to a specific resource
type ResourceError struct {
    Kind string
    Name string
    Err  error
}

func (e *ResourceError) Error() string {
    return fmt.Sprintf("%s %q: %v", e.Kind, e.Name, e.Err)
}

func (e *ResourceError) Unwrap() error {
    return e.Err
}

// NewResourceError creates a new ResourceError
func NewResourceError(kind, name string, err error) error {
    return &ResourceError{Kind: kind, Name: name, Err: err}
}

// NotFound creates a not found error for a resource
func NotFound(kind, name string) error {
    return NewResourceError(kind, name, ErrNotFound)
}

// IsNotFound checks if an error is a not found error
func IsNotFound(err error) bool {
    return errors.Is(err, ErrNotFound)
}

// Wrap wraps an error with additional context
func Wrap(err error, msg string) error {
    if err == nil {
        return nil
    }
    return fmt.Errorf("%s: %w", msg, err)
}
As you can see, there’s nothing out of the ordinary here. But this allowed me to understand better how error handling works in Go, as well as giving me a more structured way to think about what kind of errors I wanted to return throughout the codebase.
Another place where it made sense to spend some time was on strategies for dealing with the fact that this software might be used where networking is less than ideal. Since API calls might fail for that reason, I implemented both a simple cache and exponential backoff retry logic.
The cache package implements a simple data structure that stores a value together with a time-to-live, retrievable through a Get() function. Also worth noting: the stored value is generic (typed as any), and a background cleanup goroutine handles eviction, with proper shutdown via a stop channel:
// internal/cache/cache.go
// entry holds a cached value with expiration
type entry struct {
    value     any
    expiresAt time.Time
}

// Cache is a simple thread-safe cache with TTL support
type Cache struct {
    data   sync.Map
    ttl    time.Duration
    stopCh chan struct{}
}

// Get retrieves a value from the cache
func (c *Cache) Get(key string) (any, bool) {
    v, ok := c.data.Load(key)
    if !ok {
        return nil, false
    }
    e := v.(entry)
    if time.Now().After(e.expiresAt) {
        c.data.Delete(key)
        return nil, false
    }
    return e.value, true
}
Unit testing#
Tests are fundamental to guaranteeing that continuous development doesn’t end up breaking the expected behavior of the application. To that end, I added a reasonably comprehensive test suite, focused on table-driven tests spanning the core packages of this application:
internal/cache/cache_test.go
internal/config/config_test.go
internal/errors/errors_test.go
internal/k8s/client_test.go
internal/k8s/exec_test.go
internal/k8s/status_test.go
internal/output/color_test.go
internal/output/writer_test.go
internal/retry/retry_test.go
Unfortunately, the coverage is not total, as there are some places (especially around the cache and k8s packages) that need more work to reach adequate coverage. Fortunately, this means there is space for improvement, and that is always good news for someone who wants to get better!
$ go test -cover ./...
ok git.assilvestrar.club/lourenco/gok8ctl/internal/cache 0.023s coverage: 51.5% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/config 0.003s coverage: 81.5% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/errors 0.003s coverage: 100.0% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/k8s 2.187s coverage: 54.6% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/output 0.002s coverage: 83.3% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/retry 0.014s coverage: 100.0% of statements
Makefile and CI/CD pipeline#
As the CI/CD pipeline depends on the Makefile, it’s only natural to talk about both together. The first thing of note is that, by default, the Makefile produces fully static binaries that can run anywhere without glibc dependencies, thanks to the CGO_ENABLED=0 flag; this is a neat trick, especially if you want to run the tool on systems that use musl instead of glibc. Another thing of note is the version metadata injected at build time:
VERSION?=dev
COMMIT?=$(shell git rev-parse --short HEAD)
DATE?=$(shell date -u +"%Y-%m-%dT%H:%M:%SZ")
LDFLAGS=-ldflags "-X main.version=$(VERSION) -X main.commit=$(COMMIT) -X main.date=$(DATE)"
Which means that if you run the tagged version from the upstream git repository, you get:
gok8ctl v1.0.0
commit: 218fe87
built: 2025-12-07T16:38:24Z
While if you build it from source, by default you get:
gok8ctl dev
commit: 218fe87
built: 2025-12-08T13:03:28Z
As you can see, the commit ID is the same, but the version differs. This is achieved by setting up the release step in the ci.yaml file so that when you tag something in git and push said tag, the release job is triggered:
release:
  runs-on: self-hosted
  needs: test
  if: startsWith(github.ref, 'refs/tags/v')
  steps:
    - uses: actions/checkout@v4
    - name: Set up Go
      uses: actions/setup-go@v5
      with:
        go-version: '1.25'
    - name: Get version from tag
      id: version
      run: echo "VERSION=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
    - name: Build release binaries
      run: |
        VERSION=${{ steps.version.outputs.VERSION }} make cross-build
    - name: Create release
      uses: softprops/action-gh-release@v1
      with:
        files: |
          dist/gok8ctl-linux-amd64
          dist/gok8ctl-linux-arm64
        generate_release_notes: true
The last thing of note is the clear separation of stages in the CI/CD pipeline: first the test suite runs; if that succeeds, the build process starts; and if that also succeeds and there’s a tagged release, the build is finally published. This is, as far as I understand, the gold standard for modern software delivery, which is why I decided to implement it like that.
Putting the app into production#
Finally, I’d like to highlight some functionality of this software. Since we already have a k8s cluster running from our previous project, we can use it to make sure everything is working as expected. We start by triggering a restart of the argocd-server deployment:
$ gok8ctl restart argocd-server -n argocd
time=2025-12-08T14:17:48.742+01:00 level=INFO msg="triggered restart" deployment=argocd-server
Deployment "argocd-server" restart triggered
This will allow us to have a look at whether the logs functionality is working correctly:
$ gok8ctl logs argocd-server -n argocd --tail 10
time=2025-12-08T14:20:25.191+01:00 level=INFO msg="streaming logs" pods=1 resource=argocd-server
{"level":"info","msg":"Configmap/secret informer synced","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"Loading TLS configuration from secret argocd/argocd-secret","time":"2025-12-08T13:17:50Z"}
{"level":"warning","msg":"Static assets directory \"/shared/app\" does not exist, using only embedded assets","time"
:"2025-12-08T13:17:50Z"}
{"level":"info","msg":"invalidated cache for resource in namespace: argocd with the name: argocd-notifications-cm","
time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"invalidated cache for resource in namespace: argocd with the name: argocd-notifications-secre
t","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"argocd v3.2.0+66b2f30 serving on port 8080 (url: , tls: true, namespace: argocd, sso: false)"
,"time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"Enabled application namespace patterns: argocd","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"0xc0005ba070 subscribed to settings updates","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"Starting rbac config informer","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"RBAC ConfigMap 'argocd-rbac-cm' added","time":"2025-12-08T13:17:50Z"}
Let us then see the status of the whole namespace:
$ gok8ctl status --events -n argocd --pods
Namespace: argocd
────────────────────────────────────────────────────────────
Fetched: 14:35:17
DEPLOYMENTS
NAME READY UP-TO-DATE AVAILABLE AGE STATUS
argocd-applicationset-controller 1/1 1 1 13d ✓ ready
argocd-dex-server 1/1 1 1 13d ✓ ready
argocd-notifications-controller 1/1 1 1 13d ✓ ready
argocd-redis 1/1 1 1 13d ✓ ready
argocd-repo-server 1/1 1 1 13d ✓ ready
argocd-server 1/1 1 1 13d ✓ ready
SERVICES
NAME TYPE CLUSTER-IP EXTERNAL-IP PORTS AGE
argocd-applicationset-controller ClusterIP 10.43.221.241 <none> 7000/TCP,8080/TCP 13d
argocd-dex-server ClusterIP 10.43.181.239 <none> 5556/TCP,5557/TCP,5558/TCP 13d
argocd-metrics ClusterIP 10.43.83.205 <none> 8082/TCP 13d
argocd-notifications-controller-metrics ClusterIP 10.43.139.81 <none> 9001/TCP 13d
argocd-redis ClusterIP 10.43.111.30 <none> 6379/TCP 13d
argocd-repo-server ClusterIP 10.43.192.18 <none> 8081/TCP,8084/TCP 13d
argocd-server ClusterIP 10.43.30.148 <none> 80/TCP,443/TCP 13d
argocd-server-metrics ClusterIP 10.43.146.147 <none> 8083/TCP 13d
RECENT EVENTS
AGE TYPE REASON OBJECT MESSAGE
16m Normal Killing Pod/argocd-server-65544f4864-852kq Stopping container argocd-server
16m Normal SuccessfulDelete ReplicaSet/argocd-server-65544f4864 Deleted pod: argocd-server-65544f4864-852kq
16m Normal ScalingReplicaSet Deployment/argocd-server Scaled down replica set argocd-server-65544f486...
17m Normal Started Pod/argocd-server-764f54784c-2949h Started container argocd-server
17m Normal Pulling Pod/argocd-server-764f54784c-2949h Pulling image "quay.io/argoproj/argocd:v3.2.0"
17m Normal Pulled Pod/argocd-server-764f54784c-2949h Successfully pulled image "quay.io/argoproj/arg...
17m Normal Created Pod/argocd-server-764f54784c-2949h Created container: argocd-server
17m Normal SuccessfulCreate ReplicaSet/argocd-server-764f54784c Created pod: argocd-server-764f54784c-2949h
17m Normal ScalingReplicaSet Deployment/argocd-server Scaled up replica set argocd-server-764f54784c ...
106751d Normal Scheduled Pod/argocd-server-764f54784c-2949h Successfully assigned argocd/argocd-server-764f...
PODS
NAME READY STATUS RESTARTS AGE NODE
argocd-application-controller-0 1/1 Running 2 13d omega
argocd-applicationset-controller-fc5545556-4nf9f 1/1 Running 2 13d omega
argocd-dex-server-f59c65cff-vpf5b 1/1 Running 2 13d omega
argocd-notifications-controller-59f6949d7-vnbgd 1/1 Running 2 13d omega
argocd-redis-75c946f559-zxjh8 1/1 Running 2 13d omega
argocd-repo-server-6959c47c44-cjw96 1/1 Running 2 13d omega
argocd-server-764f54784c-2949h 1/1 Running 0 17m omega
All seems to be good. Let’s just drop into the shell of the pod that we have just restarted:
$ gok8ctl shell argocd-server-764f54784c-2949h -n argocd
Connecting to argocd-server-764f54784c-2949h/argocd-server...
argocd@argocd-server-764f54784c-2949h:~$ uname -a
Linux argocd-server-764f54784c-2949h 6.17.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 24 Nov 2025 15:21:09 +0000 x86_64 x86_64 x86_64 GNU/Linux
So all is in working order, and our application seems to behave as expected. Brilliant!
Closing thoughts#
In closing, I want to make a few remarks about the journey of building this small CLI application:
- It was extremely fun to create an interface that abstracts processes I already had some insight into. That knowledge made everything much smoother, and it was certainly a good choice for this project: it let me concentrate on the engineering patterns I was missing and now feel I have a better grasp of.
- There’s still room for improvement! Test coverage needs to go up, and there is a decent amount of functionality that could still be added so the application covers more of the interaction surface that k8s offers. In this regard, k9s might serve as a source of inspiration on which functionalities to implement next.
- Building with the constraint of making the most use of the Golang standard library was the right choice. It taught me a lot of interesting development patterns that are widely used elsewhere, and provided a very good basis for further work in other projects.
With this, I must say goodbye for now. This concludes the write-up for this small project, hope you have enjoyed reading, and see you in the next one!