Modern infrastructure work is, by many measures, better than it has ever been. We live in a time when a lot of the routine daily problems have been automated away by cloud providers, better tooling, or just improved workflows. However, in place of watching OS upgrades we now have the endless tedium of writing configuration files. Between Terraform, Kubernetes, AWS, and various frameworks, I feel like the amount of time I spend actually programming is getting dangerously low.
I'm definitely not alone in feeling the annoyance around writing configurations, especially for Kubernetes. One tool every business seems to make at some point in their Kubernetes journey is a YAML generator. I've seen these vary from internal web apps to CLI tools, but the idea is that developers who don't have time to learn how to write the k8s YAML for their app can quickly generate the basic template and swap out a few values. You know people hate writing YAML for k8s when different companies on different continents all somehow end up writing variations on the same tool.
These internal tools vary from good to bad, but I'm here to tell you there is a better way. I'm going to tell you about what worked well for me based on the size of the company and the complexity of how they were using k8s. The internal tools are typically trying to solve the following common use-cases:
- I need to quickly generate new YAML for launching a new service
- I have the same service across different clusters or namespaces for a non-production testing environment and a production environment. I need to insert different environment variables or various strings into those files.
We're going to try to avoid writing too much internal tooling and keep our process simple.
Why would I do any of this vs. using my internal tool?
My experience with the internal tools goes something like this. Someone writes it, ideally on the dev tools team, but often it is just a random person who hates doing the research on what to include in the YAML. The tool starts out by generating the YAML for just Deployments, then slowly expands the options to include things like DNS, volumes, etc. Soon the tool has a surprising amount of complexity in it.
Because we've inserted this level of abstraction between the app developer and the platform, it can be difficult to determine what exactly went wrong and what to do about it. The tool is mission critical because it produces configs that go to production, but it isn't really treated that way, since leadership is convinced someone checks the files before they go out.
Spoiler: nobody ever checks the files before they go out. Then the narrative becomes how complicated and hard to manage k8s is.
Scenario 1: You are just starting out with k8s or don't plan on launching new apps a lot
In this case I really recommend against writing internal tools you are going to have to maintain. My approach for stacks like this is pretty much this:
- Go to https://k8syaml.com/ to generate the YAML I need per environment
- Define separate CD steps that run kubectl against the different testing and production namespaces or clusters, each with its own distinct YAML file (there's a sketch of what this can look like right after this list)
- Profit
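To make that second bullet concrete, here's a minimal sketch of what those CD steps could look like as plain kubectl calls. The file names, namespaces, and context names are made up for illustration; in most CD systems each call would live in its own gated job or stage.

#!/usr/bin/env bash
set -euo pipefail

# Step 1: apply the testing YAML to the testing namespace/cluster
kubectl --context testing-cluster apply -f k8s/testing.yaml --namespace testing

# (run whatever smoke tests you have here before promoting)

# Step 2: apply the production YAML to the production namespace/cluster
kubectl --context production-cluster apply -f k8s/production.yaml --namespace production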
One thing I see a lot of with k8s is over-complexity starting out. Just because you can create all this internal tooling on top of the platform doesn't mean you should. If you aren't anticipating making a lot of new apps or are just starting out and need somewhere to make YAML files that don't involve a lot of copy/pasting from the Kubernetes docs, check out k8syaml above.
Assuming you are pulling the latest image from some container registry and that strings like "testdatabase.url.com" won't change often, you'll be able to grow for a long time on this low-effort approach.
Here's a basic example of a config generated using k8syaml that we'll use for the rest of the examples.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: testing
  labels:
    app: web
spec:
  selector:
    matchLabels:
      octopusexport: OctopusExport
  replicas: 1
  strategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: web
        octopusexport: OctopusExport
    spec:
      volumes:
        - name: test-volume
          persistentVolumeClaim:
            claimName: test-claim
      containers:
        - name: nginx
          image: nginx
          ports:
            - containerPort: 80
          env:
            - name: database
              value: testdata.example.com
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - web
                topologyKey: kubernetes.io/hostname
Scenario 2: You have a number of apps and maintaining YAML has started to turn into a hassle
This is typically where people start to over-complicate things. I advise folks to solve this problem as much as you can with the CI/CD process as opposed to attempting to build tooling that builds configs on the fly based on parameters passed to your tool's CLI. That approach is often error-prone, and even if it works now it's going to be a pain to maintain down the line.
Option 1: yq
This is typically the route I go. You create some baseline configuration, include the required environment variables, and then insert the correct values at deploy time from the CD stack by defining a new job for each target and inserting the values into the config file with yq. yq, like the far more famous jq, is a YAML processor that is easy to work with and write scripts around.
There are multiple versions floating around but the one I recommend can be found here. You will also probably want to set up shell completion which is documented in their help docs. Here are some basic examples based on the template posted above.
yq eval '.spec.template.spec.containers[0].name' example.yaml returns nginx. So in order to access our environment variable we just need to run yq eval '.spec.template.spec.containers[0].env[0].value' example.yaml. Basically, if yq returns a - in the output, you're looking at an array and you need to specify which element of the array you want; forgetting the index is a common first-time user error.
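For example, asking yq for the whole env block of our example file returns a list (note the leading -), which is the hint that you need an index to get at a single value (output shown as comments):

yq eval '.spec.template.spec.containers[0].env' example.yaml
# - name: database
#   value: testdata.example.com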
So again using our example above, we can define the parameters we want to pass to our YAML as environment variables in our CD job and then insert them into the file at deployment time using something like this:
NAME=awesomedatabase yq -i '.spec.template.spec.containers[0].env[0].value = strenv(NAME)' example.yaml
Ta-dah, our file is updated:
containers:
  - name: nginx
    image: nginx
    ports:
      - containerPort: 80
    env:
      - name: database
        value: awesomedatabase
If you don't want to manage this via environment variables, it's equally easy to create a "config" YAML file, load the values from there, and then insert them into your Kubernetes configuration. However, I STRONGLY recommend adding a check to your CD job which confirms the values are actually set to something. Typically I do this by putting a garbage value into the YAML and then checking inside the stage to confirm I changed it. You don't want your app to fail because an environment variable or value somehow got removed and you didn't catch it.
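Here's a rough sketch of what that check can look like inside a CD stage, assuming the base example.yaml ships with a garbage placeholder of REPLACE_ME (the placeholder, variable name, and value are all just illustrative):

#!/usr/bin/env bash
set -euo pipefail

# Insert the real value over the placeholder baked into the base YAML
DATABASE=awesomedatabase yq -i '.spec.template.spec.containers[0].env[0].value = strenv(DATABASE)' example.yaml

# Refuse to deploy if the garbage value somehow survived, or the field is empty
current=$(yq eval '.spec.template.spec.containers[0].env[0].value' example.yaml)
if [ "$current" = "REPLACE_ME" ] || [ -z "$current" ]; then
  echo "database value was never set in example.yaml, aborting deploy" >&2
  exit 1
fi

kubectl apply -f example.yaml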
yq has excellent documentation available here. It walks you through all the common uses and many less common ones.
Why not just use sed? You can, but doing a blind find-and-replace on a file that might be going to production makes me nervous. I'd prefer to specify the structure of the specific item I am trying to replace, then confirm that some garbage value has been removed from the file before deploying it to production. Don't let me stop you though.
Option 2: Kustomize
Especially now that it is built into kubectl, Kustomize is a robust option for doing what we're doing above with yq. It's a different design though, one favoring safety over direct file manipulation. The structure looks something like this.
Inside of where your YAML lives, you make a kustomization.yaml file. So you have the basics of your app structure defined in YAML and then you also have this custom file. The file will look something like this:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- service.yaml
- deployment.yaml
- test.yaml
Nested as a sub-directory under your primary app folder would be an overlay directory, in this example overlays/sandbox. Inside of sandbox you'd have another YAML file, overlays/sandbox/kustomization.yaml, and that file would look like this:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../base
patchesStrategicMerge:
- test.yaml
So we have the base test.yaml and the new overlay test.yaml. In the overlay we can define whatever changes we want merged into the base test.yaml at the time of running. So for example we could have a new set of options:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: backend-test
spec:
  minReplicas: 2
  maxReplicas: 4
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 90
Then when we deploy this specific option we could keep it simple and just run kubectl apply -k overlays/sandbox. No need to manually merge anything, and it is all carefully sorted and managed.
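For reference, the directory layout the examples above assume looks roughly like this; the base/overlays naming is just the common Kustomize convention, not a requirement:

app/
  base/
    kustomization.yaml
    service.yaml
    deployment.yaml
    test.yaml
  overlays/
    sandbox/
      kustomization.yaml
      test.yaml

Running kubectl apply -k overlays/sandbox from inside app/ builds the base resources, merges in the sandbox patch, and applies the result.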
This is really only scratching the surface of what Kustomize can do and it's well worth checking out. In my experience dev teams don't want to manage more YAML, they want less. However, spending the time to break configuration into these templates is both extremely safe (since we're explicitly calling out the configuration we want merged in) and has minimal tooling overhead, so it's worth exploring.
"I thought you said you use yq?" Yeah, I do, but I should probably use Kustomize. Typically though, when apps start to reach the level of complexity where Kustomize really shines, I start to push for option 3, which I think is the best of all of them.
Option 3: Use the client library
As time goes on and the complexity of what folks end up wanting to do in k8s increases, one issue is that the only way anyone knows how to interact with the cluster is through INSERT_CUSTOM_TOOL_NAME. Therefore the tool must be extended to add support for whatever new thing they are trying to do, but this is obviously dangerous and requires testing by the internal dev tools person or team to ensure the behavior is correct.
If you are an organization that lives and dies by Kubernetes, has an internal YAML or config generator but is starting to reach the limit of what it can do, and doesn't want to spend the time breaking all of your environments into Kustomize templates, just use the official client libraries, which you can find here. Especially for TDD shops, I don't know why this doesn't come up more often.
I've used the Python Kubernetes client library and it is great, with tons of good examples to jumpstart your work. Not only does it let you manage Kubernetes pretty much any way you would like, but frankly it's just a good overview of the Kubernetes APIs in general. I feel like every hour I've spent writing scripts with the library and referencing the docs, I learn something new (which is more than I can say for a lot of the Kubernetes books I've read in my life...). You can check out all those docs here.
Conclusion
The management of YAML configuration files for Kubernetes often comes up as a barrier to teams correctly managing their own application and its deployment. Writing these configs by hand is error-prone at first, and it is only natural to want to leverage the opportunity k8s provides to automate the problem away.
I would encourage folks starting out in k8s who have encountered their first headache writing YAML to resist the urge to build their own tooling. You don't need it, either to make the base config file or to modify the file to meet the requirements of different environments. And if you do decide you need that level of certainty and confidence, write your tool like you would any other mission-critical code.