gok8ctl - Golang CLI tool for managing k8s clusters
After finishing my previous project, I felt an itch to continue building. One of the things that was noticeable during that process was that my coding skills were not being pushed: while the app I built did teach me about integration of Prometheus inside a Golang application, there wasn’t much else new for me there. So, for my next project, I chose to concentrate on building software.
The question then became: what should I build? It occurred to me that if I wanted to concentrate on improving my software development skills, I should choose a domain I already have some experience in, so as to remove an extra layer of complexity from the project. As I have some experience with Kubernetes, and given that I love CLIs, the union of those two things seemed like a natural choice. I even already had a cluster running that I could easily test things on!
So, gok8ctl was born. Now, let me be honest: there’s very little reason to use this in production environments instead of something like k9s, which is a much more mature and feature-complete application. I built this as an exercise to push the boundaries of what I know. Still, it works, and for that I’m happy. This write-up marks the release of v1.0.0, as I implemented all the features I wanted for a first release. Let us get started!
The architecture#
One of the things I wanted to achieve with this project, from the get-go, was a clear separation of concerns: the public-facing command line interface would need to live separately from the internal implementation details. Also, each internal package would need a clear boundary and responsibility, as this is bound to make the code more maintainable and testable in the long run.
For that, I ended up settling on the following layout:
├── cmd/ # CLI command definitions (user-facing layer)
├── internal/ # Private application code
│ ├── k8s/ # Kubernetes client operations
│ ├── config/ # Configuration management
│ ├── cache/ # Caching layer
│ ├── retry/ # Retry logic
│ ├── errors/ # Custom error types
│ └── output/ # Terminal output formatting
└── main.go # Entry point
Another decision I took early on was to use the standard library as much as possible. As I discover more and more of Go, the batteries-included approach and the overall quality of the standard library feel like a blessing, so I wanted to lean on them wherever I could.
There are, however, three exceptions to this rule which I feel are worth mentioning. The first two are spf13/cobra and the official Kubernetes Go client library. The rationale for these is simple: cobra is an extremely solid, stable, and widely used CLI framework, adopted by projects like kubectl, docker, and hugo. The k8s Go client library was a compromise made to simplify development considerably: given that the objective was to create something that integrates with k8s, writing a client library from scratch would have taken me far too long for something that is already a very well-solved problem, courtesy of the creators of k8s. Finally, it made sense to rely on an external library to (un)marshal YAML, since implementing one by hand is not an easy task and, again, feels completely out of scope for a project like this.
Concurrency and context#
One of the areas where Go excels is the simplicity of implementing concurrency in an idiomatic way. Running certain requests concurrently helps quite a bit with how performant the application feels: the API calls for fetching deployments, services, pods, and events are subject to external factors (such as network latency) that can cause real or perceived performance problems, so parallelizing them pays off. I adopted a very simple strategy for these concurrent calls:
// internal/k8s/status.go
var wg sync.WaitGroup
var mu sync.Mutex

wg.Add(1)
go func() {
    defer wg.Done()
    list, err := c.clientset.AppsV1().Deployments(c.namespace).List(ctx, metav1.ListOptions{})
    mu.Lock()
    defer mu.Unlock()
    // ... handle result
}()
We can see pretty much all the pillars of Go concurrency on full display: WaitGroups to coordinate the completion of multiple goroutines; Mutexes to protect shared state during concurrent writes; and goroutines themselves, launched with the go keyword.
Another pattern that was used throughout the codebase is leveraging the context package. I made sure that all long-running operations respect context cancellation, so as to allow the user to interrupt operations cleanly with Ctrl+C without leaving orphaned connections or goroutines. We can see this in action inside the cmd/root.go code:
// cmd/root.go
func Execute() error {
    ctx, cancel := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
    defer cancel()
    return rootCmd.ExecuteContext(ctx)
}
Errors and cache#
One place where I felt I could spend some time was the creation and usage of an internal error library, to have a more tailored and structured way of handling errors at runtime. Not that there’s anything wrong with the stdlib’s errors package, but since I wanted to increase my proficiency, and given that explicit error handling is one of the cornerstones of Go, this felt like a good place to explore different possibilities. It ended up being quite simple to structure and write, as the whole package is around 50 lines of code:
// internal/errors/errors.go
// Sentinel errors for common cases
var (
    ErrNotFound     = errors.New("not found")
    ErrUnauthorized = errors.New("unauthorized")
    ErrForbidden    = errors.New("forbidden")
    ErrTimeout      = errors.New("timeout")
    ErrInvalidInput = errors.New("invalid input")
)

// ResourceError represents an error related to a specific resource
type ResourceError struct {
    Kind string
    Name string
    Err  error
}

func (e *ResourceError) Error() string {
    return fmt.Sprintf("%s %q: %v", e.Kind, e.Name, e.Err)
}

func (e *ResourceError) Unwrap() error {
    return e.Err
}

// NewResourceError creates a new ResourceError
func NewResourceError(kind, name string, err error) error {
    return &ResourceError{Kind: kind, Name: name, Err: err}
}

// NotFound creates a not found error for a resource
func NotFound(kind, name string) error {
    return NewResourceError(kind, name, ErrNotFound)
}

// IsNotFound checks if an error is a not found error
func IsNotFound(err error) bool {
    return errors.Is(err, ErrNotFound)
}

// Wrap wraps an error with additional context
func Wrap(err error, msg string) error {
    if err == nil {
        return nil
    }
    return fmt.Errorf("%s: %w", msg, err)
}
As you can see, there’s nothing out of the ordinary here. But this allowed me to understand better how error handling works in Go, as well as giving me a more structured way to think about what kind of errors I wanted to return throughout the codebase.
Another place where it made sense to spend some time was on strategies for dealing with the fact that this software might be used where networking is less than ideal. Since API calls might fail for that reason, I implemented both a simple cache and exponential backoff retry logic.
The cache package implements a simple data structure that stores a value together with a time-to-live, retrievable through a Get() function. Also worth noting: the stored value is generic (typed as any), and a background cleanup goroutine handles eviction, with proper shutdown via a stop channel:
// internal/cache/cache.go
// entry holds a cached value with expiration
type entry struct {
    value     any
    expiresAt time.Time
}

// Cache is a simple thread-safe cache with TTL support
type Cache struct {
    data   sync.Map
    ttl    time.Duration
    stopCh chan struct{}
}

// Get retrieves a value from the cache
func (c *Cache) Get(key string) (any, bool) {
    v, ok := c.data.Load(key)
    if !ok {
        return nil, false
    }
    e := v.(entry)
    if time.Now().After(e.expiresAt) {
        c.data.Delete(key)
        return nil, false
    }
    return e.value, true
}
Unit testing#
Tests are fundamental to guaranteeing that continuous development doesn’t end up breaking the expected behavior of the application. To that end, I added a reasonably comprehensive test suite, focused on table-driven tests spanning the core packages of this application:
internal/cache/cache_test.go
internal/config/config_test.go
internal/errors/errors_test.go
internal/k8s/client_test.go
internal/k8s/exec_test.go
internal/k8s/status_test.go
internal/output/color_test.go
internal/output/writer_test.go
internal/retry/retry_test.go
Unfortunately, the coverage is not total, as there are some places (especially around the cache and k8s packages) that need more work to reach adequate coverage. Fortunately, this means there is space for improvement, and that is always good news for someone who wants to get better!
$ go test -cover ./...
ok git.assilvestrar.club/lourenco/gok8ctl/internal/cache 0.023s coverage: 51.5% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/config 0.003s coverage: 81.5% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/errors 0.003s coverage: 100.0% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/k8s 2.187s coverage: 54.6% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/output 0.002s coverage: 83.3% of statements
ok git.assilvestrar.club/lourenco/gok8ctl/internal/retry 0.014s coverage: 100.0% of statements
Makefile and CI/CD pipeline#
As the CI/CD pipeline depends on the Makefile, it’s only natural to talk about both together. The first thing of note is that, by default, the Makefile produces fully static binaries that can run anywhere without glibc dependencies, thanks to the CGO_ENABLED=0 flag; this is a neat trick, especially if you want to run the tool on systems that use musl instead of glibc. Another thing of note is the version metadata injected at build time:
VERSION?=dev
COMMIT?=$(shell git rev-parse --short HEAD)
DATE?=$(shell date -u +"%Y-%m-%dT%H:%M:%SZ")
LDFLAGS=-ldflags "-X main.version=$(VERSION) -X main.commit=$(COMMIT) -X main.date=$(DATE)"
Which means that if you run the tagged version from the upstream git repository, you get:
gok8ctl v1.0.0
commit: 218fe87
built: 2025-12-07T16:38:24Z
While if you build it from source, by default you get:
gok8ctl dev
commit: 218fe87
built: 2025-12-08T13:03:28Z
As you can see, the commit ID is the same, but the version differs. This is achieved by setting up the release step in the ci.yaml file so that when you tag something in git and push said tag, the release job is triggered:
release:
  runs-on: self-hosted
  needs: test
  if: startsWith(github.ref, 'refs/tags/v')
  steps:
    - uses: actions/checkout@v4
    - name: Set up Go
      uses: actions/setup-go@v5
      with:
        go-version: '1.25'
    - name: Get version from tag
      id: version
      run: echo "VERSION=${GITHUB_REF#refs/tags/}" >> $GITHUB_OUTPUT
    - name: Build release binaries
      run: |
        VERSION=${{ steps.version.outputs.VERSION }} make cross-build
    - name: Create release
      uses: softprops/action-gh-release@v1
      with:
        files: |
          dist/gok8ctl-linux-amd64
          dist/gok8ctl-linux-arm64
        generate_release_notes: true
The last thing of note is the clear separation of stages in the CI/CD pipeline: first the test suite runs; if that succeeds, the build process starts; and if that also succeeds and there’s a tagged release, the build is finally published. This is, as far as I understand, the gold standard for modern software delivery, which is why I decided to implement it like that.
Putting the app into production#
Finally, I’d like to highlight some functionality of this software. Since we already have a k8s cluster running from our previous project, we can use it to make sure everything is working as expected. We start by triggering a restart of the argocd-server deployment:
$ gok8ctl restart argocd-server -n argocd
time=2025-12-08T14:17:48.742+01:00 level=INFO msg="triggered restart" deployment=argocd-server
Deployment "argocd-server" restart triggered
This will allow us to have a look at whether the logs functionality is working correctly:
$ gok8ctl logs argocd-server -n argocd --tail 10
time=2025-12-08T14:20:25.191+01:00 level=INFO msg="streaming logs" pods=1 resource=argocd-server
{"level":"info","msg":"Configmap/secret informer synced","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"Loading TLS configuration from secret argocd/argocd-secret","time":"2025-12-08T13:17:50Z"}
{"level":"warning","msg":"Static assets directory \"/shared/app\" does not exist, using only embedded assets","time"
:"2025-12-08T13:17:50Z"}
{"level":"info","msg":"invalidated cache for resource in namespace: argocd with the name: argocd-notifications-cm","
time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"invalidated cache for resource in namespace: argocd with the name: argocd-notifications-secre
t","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"argocd v3.2.0+66b2f30 serving on port 8080 (url: , tls: true, namespace: argocd, sso: false)"
,"time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"Enabled application namespace patterns: argocd","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"0xc0005ba070 subscribed to settings updates","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"Starting rbac config informer","time":"2025-12-08T13:17:50Z"}
{"level":"info","msg":"RBAC ConfigMap 'argocd-rbac-cm' added","time":"2025-12-08T13:17:50Z"}
Let us then see the status of the whole namespace:
$ gok8ctl status --events -n argocd --pods
Namespace: argocd
────────────────────────────────────────────────────────────
Fetched: 14:35:17
DEPLOYMENTS
NAME READY UP-TO-DATE AVAILABLE AGE STATUS
argocd-applicationset-controller 1/1 1 1 13d ✓ ready
argocd-dex-server 1/1 1 1 13d ✓ ready
argocd-notifications-controller 1/1 1 1 13d ✓ ready
argocd-redis 1/1 1 1 13d ✓ ready
argocd-repo-server 1/1 1 1 13d ✓ ready
argocd-server 1/1 1 1 13d ✓ ready
SERVICES
NAME TYPE CLUSTER-IP EXTERNAL-IP PORTS AGE
argocd-applicationset-controller ClusterIP 10.43.221.241 <none> 7000/TCP,8080/TCP 13d
argocd-dex-server ClusterIP 10.43.181.239 <none> 5556/TCP,5557/TCP,5558/TCP 13d
argocd-metrics ClusterIP 10.43.83.205 <none> 8082/TCP 13d
argocd-notifications-controller-metrics ClusterIP 10.43.139.81 <none> 9001/TCP 13d
argocd-redis ClusterIP 10.43.111.30 <none> 6379/TCP 13d
argocd-repo-server ClusterIP 10.43.192.18 <none> 8081/TCP,8084/TCP 13d
argocd-server ClusterIP 10.43.30.148 <none> 80/TCP,443/TCP 13d
argocd-server-metrics ClusterIP 10.43.146.147 <none> 8083/TCP 13d
RECENT EVENTS
AGE TYPE REASON OBJECT MESSAGE
16m Normal Killing Pod/argocd-server-65544f4864-852kq Stopping container argocd-server
16m Normal SuccessfulDelete ReplicaSet/argocd-server-65544f4864 Deleted pod: argocd-server-65544f4864-852kq
16m Normal ScalingReplicaSet Deployment/argocd-server Scaled down replica set argocd-server-65544f486...
17m Normal Started Pod/argocd-server-764f54784c-2949h Started container argocd-server
17m Normal Pulling Pod/argocd-server-764f54784c-2949h Pulling image "quay.io/argoproj/argocd:v3.2.0"
17m Normal Pulled Pod/argocd-server-764f54784c-2949h Successfully pulled image "quay.io/argoproj/arg...
17m Normal Created Pod/argocd-server-764f54784c-2949h Created container: argocd-server
17m Normal SuccessfulCreate ReplicaSet/argocd-server-764f54784c Created pod: argocd-server-764f54784c-2949h
17m Normal ScalingReplicaSet Deployment/argocd-server Scaled up replica set argocd-server-764f54784c ...
106751d Normal Scheduled Pod/argocd-server-764f54784c-2949h Successfully assigned argocd/argocd-server-764f...
PODS
NAME READY STATUS RESTARTS AGE NODE
argocd-application-controller-0 1/1 Running 2 13d omega
argocd-applicationset-controller-fc5545556-4nf9f 1/1 Running 2 13d omega
argocd-dex-server-f59c65cff-vpf5b 1/1 Running 2 13d omega
argocd-notifications-controller-59f6949d7-vnbgd 1/1 Running 2 13d omega
argocd-redis-75c946f559-zxjh8 1/1 Running 2 13d omega
argocd-repo-server-6959c47c44-cjw96 1/1 Running 2 13d omega
argocd-server-764f54784c-2949h 1/1 Running 0 17m omega
All seems to be good. Let’s just drop into the shell of the pod that we have just restarted:
$ gok8ctl shell argocd-server-764f54784c-2949h -n argocd
Connecting to argocd-server-764f54784c-2949h/argocd-server...
argocd@argocd-server-764f54784c-2949h:~$ uname -a
Linux argocd-server-764f54784c-2949h 6.17.9-arch1-1 #1 SMP PREEMPT_DYNAMIC Mon, 24 Nov 2025 15:21:09 +0000 x86_64 x86_64 x86_64 GNU/Linux
So all is in working order, and our application seems to behave as expected. Brilliant!
Closing thoughts#
In closing, I want to make a few remarks about the journey of building this small CLI application:
- It was extremely fun to create an interface that abstracts processes I already had some insight into. That knowledge made everything much smoother, and it was certainly a good choice for this project: it let me concentrate on the engineering patterns I was missing and now feel I have a better grasp of.
- There’s still room for improvement! Test coverage needs to go up, and there is a decent amount of functionality that could still be added so the application covers more of the interaction surface that k8s offers. In this regard, k9s might serve as a source of inspiration on which functionalities to implement next.
- Building with the constraint of making the most use of the Golang standard library was the right choice. It taught me a lot of interesting development patterns that are widely used elsewhere, and provided a very good basis for further work in other projects.
With this, I must say goodbye for now. This concludes the write-up for this small project, hope you have enjoyed reading, and see you in the next one!