~ Structured configuration in Go

Posted on Sat 9 2022 to Programming

There comes a point during the development of a piece of software when a configuration language is needed; you can only do so much via flags before it becomes too tedious. The language chosen should provide a format that is as easy for a person to parse as it is for a computer. Typically, most people would reach for YAML, TOML, or sometimes even JSON. For the development of Djinn CI, none of these fitted my needs, so I developed my own, specifically for Go.

What a configuration language should be

In my opinion, a configuration language should allow for a declarative way of configuring a piece of software. The syntax of the language should be easy for a person to parse, since they will be spending a good amount of time reading and writing said configuration. A configuration language should be light on visual noise, that is, anything that might impair a person's ability to read the language. It should also allow for comments, so the person writing the configuration can explain what the configuration is for.

The last two points, visual noise and comments, rule out JSON as a configuration language. It is fine for serializing data and exchanging it between programs, but should be avoided as a primary configuration format. This does not rule out YAML or TOML though, which are fine configuration languages depending on what is being configured. I should stress that there is no single configuration language that will meet the requirements for how every piece of software is configured. The language chosen will vary depending on how you want to expose your software for configuration.

Note: When I use the term "primary configuration format" I am referring to the configuration that a person would need to edit themselves. JSON is fine for storing configuration that is edited by a program. My main gripes with JSON as a configuration format arise when I have to edit it myself.

Starting with TOML

When starting out with the development of Djinn, I initially settled on TOML. It's simpler than YAML, and much stricter: no assumptions will be made about the string yes, for example. Below is an example of some of the configuration in TOML,

[net]
listen = ":8443"

[net.tls]
cert = "/var/lib/ssl/server.crt"
key  = "/var/lib/ssl/server.key"

This was fine to start off with. However, Djinn requires a certain level of nested structure as part of the configuration. Take provider configuration for example: a provider can be configured for each third party you want to integrate with. In TOML this looked like this,

[[provider]]

[[provider.github]]
client_id     = "..."
client_secret = "..."

[[provider.gitlab]]
client_id     = "..."
client_secret = "..."

This was not great. The main reason being that it made the configuration less readable, and harder to parse at a glance. There were other instances of this throughout the configuration of Djinn too, such as the configuration of drivers,

[[driver]]

[[driver.qemu]]
disks  = "/var/lib/djinn/qemu"
cpus   = 1
memory = 2048

I wanted Djinn and its components to have configuration that was easy to read. With TOML, I was quickly running up against its limitations with regard to nested configuration structures. So I started exploring other options.
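To make the nesting problem concrete, here is a rough sketch of the shape this configuration might take on the Go side. The type and field names below are hypothetical and chosen only for illustration; they are not taken from the Djinn source. Providers and drivers are keyed by name, which is the part TOML's table syntax expressed awkwardly.

```go
package main

import "fmt"

// All types below are hypothetical; they only mirror the TOML shown above.

// TLS holds the certificate and key paths for the listener.
type TLS struct {
	Cert string
	Key  string
}

// Net groups the listen address with its nested TLS settings.
type Net struct {
	Listen string
	TLS    TLS
}

// Provider holds the credentials for one third-party integration.
type Provider struct {
	ClientID     string
	ClientSecret string
}

// Driver holds the settings for one build driver.
type Driver struct {
	Disks  string
	CPUs   int
	Memory int64
}

// Config is the top-level structure. Providers and drivers are keyed by
// their name, for example "github" or "qemu".
type Config struct {
	Net       Net
	Providers map[string]Provider
	Drivers   map[string]Driver
}

// ExampleConfig builds the same values as the TOML snippets above.
func ExampleConfig() Config {
	return Config{
		Net: Net{
			Listen: ":8443",
			TLS: TLS{
				Cert: "/var/lib/ssl/server.crt",
				Key:  "/var/lib/ssl/server.key",
			},
		},
		Providers: map[string]Provider{
			"github": {ClientID: "...", ClientSecret: "..."},
			"gitlab": {ClientID: "...", ClientSecret: "..."},
		},
		Drivers: map[string]Driver{
			"qemu": {Disks: "/var/lib/djinn/qemu", CPUs: 1, Memory: 2048},
		},
	}
}

func main() {
	cfg := ExampleConfig()
	fmt.Println(cfg.Net.TLS.Cert)
	fmt.Println(len(cfg.Providers), "providers configured")
}
```

Two levels of nesting plus named groups is all that is needed here, which is why a block-based syntax ends up being a more natural fit than repeated table headers.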

HCL

HCL is the configuration language from HashiCorp. If you've worked with Terraform then you will be familiar with it. As stated in the readme for the project,

HCL attempts to strike a compromise between generic serialization formats such as JSON and configuration formats built around full programming languages such as Ruby. HCL syntax is designed to be easily read and written by humans, and allows declarative logic to permit its use in more complex applications.

This appeared to fulfill my need for something that makes nested structures of configuration easier to work with. Let's take a look at how Djinn may be configured with HCL,

net {
    listen = ":8443"

    tls {
        cert = "/var/lib/ssl/server.crt"
        key  = "/var/lib/ssl/server.key"
    }
}

provider "github" {
    client_id     = "..."
    client_secret = "..."
}

provider "gitlab" {
    client_id     = "..."
    client_secret = "..."
}

driver "qemu" {
    disks  = "/var/lib/djinn/qemu"
    cpus   = 1
    memory = 2048
}

This is better. However, the requirement for = to assign a value to a parameter and the quotes around the labels irk me. Furthermore, it would be nice if I could use size units when specifying an amount of something in bytes.
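To illustrate what size units buy you, below is a minimal sketch of turning a literal such as 2KB into a byte count using only the standard library. ParseSize and the unit table are hypothetical names for this example, and treating 1KB as 1024 bytes is an assumption of the sketch, not a statement about how any particular library behaves.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sizeUnits lists the recognised suffixes, longest first so that "KB" is
// matched before the bare "B". Binary multiples (1KB = 1024 bytes) are an
// assumption made for this sketch.
var sizeUnits = []struct {
	suffix string
	mult   int64
}{
	{"TB", 1 << 40},
	{"GB", 1 << 30},
	{"MB", 1 << 20},
	{"KB", 1 << 10},
	{"B", 1},
}

// ParseSize converts a literal such as "2KB" into a number of bytes.
// A bare number is treated as a byte count. Overflow is not checked.
func ParseSize(s string) (int64, error) {
	s = strings.TrimSpace(s)

	for _, u := range sizeUnits {
		if strings.HasSuffix(s, u.suffix) {
			n, err := strconv.ParseInt(strings.TrimSuffix(s, u.suffix), 10, 64)
			if err != nil {
				return 0, fmt.Errorf("invalid size %q: %w", s, err)
			}
			return n * u.mult, nil
		}
	}

	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	return n, nil
}

func main() {
	for _, lit := range []string{"2KB", "512MB", "4096"} {
		n, err := ParseSize(lit)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s = %d bytes\n", lit, n)
	}
}
```

Durations need no such helper in Go, since time.ParseDuration in the standard library already accepts strings like "1h30m".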

Structured configuration

Structured configuration is the type of configuration language I wanted for Djinn, whereby parameters could be grouped together into blocks, and nested within each other. Hence, the structure. The language I came up with was heavily influenced by HCL and libucl, and has support for duration and size literal values. Below is what the language looks like,

net {
    listen ":8443"

    tls {
        cert "/var/lib/ssl/server.crt"
        key  "/var/lib/ssl/server.key"
    }
}

provider github {
    client_id     "..."
    client_secret "..."
}

driver qemu {
    disks  "/var/lib/djinn/qemu"
    cpus   1
    memory 2KB
}

As you can see, it is very similar to HCL, but with less of what I call visual noise. The library developed for this is called config. It is used for decoding the configuration; there is no support for encoding as of yet. With this library you will be able to configure support for environment variable expansion and support for includes. I have found that this strikes the balance I require of a configuration language: declarative, with limited visual noise, and easy for people to read. This is hardly a silver bullet, and no doubt will demonstrate its limitations depending on what it is you're trying to configure. Nonetheless, I have found it to be flexible for my use cases. You can see examples of this language in the djinn-ci/djinn repository itself in the dist directory.
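For readers unfamiliar with environment variable expansion, the idea can be sketched with nothing more than os.Expand from the standard library. This is not how the config library implements it; the ${NAME} syntax, the ExpandEnv helper, and the DJINN_TLS_DIR variable are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
)

// ExpandEnv replaces ${NAME} and $NAME references in a raw configuration
// string with whatever lookup returns for NAME. Passing the lookup function
// in keeps the helper easy to test without touching the real environment.
func ExpandEnv(raw string, lookup func(string) string) string {
	return os.Expand(raw, lookup)
}

func main() {
	// DJINN_TLS_DIR is a hypothetical variable used only for this sketch.
	os.Setenv("DJINN_TLS_DIR", "/var/lib/ssl")

	raw := `cert "${DJINN_TLS_DIR}/server.crt"`

	fmt.Println(ExpandEnv(raw, os.Getenv))
}
```

Expanding before parsing, as done here, is the simplest approach, though it means a variable's value could itself contain syntax that the parser then interprets.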