Using secrets in Gallium Data
When writing filter logic in Gallium Data, it's not unusual to need access to some secrets: passwords, keys, that sort of thing.
By now, we all know that embedding secrets in code is just not a good idea, though it still happens thousands of times every day. That's why lots of smart people have come up with various ways of managing secrets. In this article, we'll take a look at a few examples.
Gallium Data typically runs as a Docker container, so the most obvious way to handle secrets is with Docker secrets. Docker secrets are only available in Docker Swarm, so if you're not running in swarm mode, this is not an option for you.
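You create a Docker secret with the docker secret create command, which takes the name of the secret and reads its value from a file or from standard input.
When starting Gallium Data as a Docker service, you then specify that it should have access to that secret, using the --secret option of docker service create.
This exposes the secret as a file in the container, in this case /run/secrets/secret_key_1.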
You can then easily read that secret in your filter code by reading that file.
That's all there is to it. Obviously, you'd want to do some error handling, for instance if the secret file is not found, or does not contain what you expected.
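The read and the error handling mentioned above can be sketched as follows. This is a minimal sketch, not Gallium Data's own API: readSecret and defaultReadFile are hypothetical helper names, and defaultReadFile assumes that filter code runs in a JavaScript engine with Java interop (Java.type). It falls back to Node.js's fs module so that the sketch can also be tried outside Gallium Data.

```javascript
// Hypothetical helper (not part of Gallium Data): reads a secret and
// validates it. readFile is any function that takes a path and returns
// the file's content as a string, or throws if the file cannot be read.
function readSecret(path, readFile) {
  let raw;
  try {
    raw = readFile(path);
  } catch (err) {
    throw new Error("Cannot read secret file " + path + ": " + err);
  }
  // Secret files often end with a newline, so trim surrounding whitespace.
  const secret = String(raw).trim();
  if (secret.length === 0) {
    throw new Error("Secret file " + path + " is empty");
  }
  return secret;
}

// ASSUMPTION: inside a filter, Java classes are reachable via Java.type.
// Outside of such an engine (e.g. under Node.js), fall back to fs.
function defaultReadFile(path) {
  if (typeof Java !== "undefined") {
    const Files = Java.type("java.nio.file.Files");
    const Paths = Java.type("java.nio.file.Paths");
    return Files.readString(Paths.get(path));
  }
  return require("fs").readFileSync(path, "utf8");
}

// Usage in filter code, with the path from the Docker example:
//   const key = readSecret("/run/secrets/secret_key_1", defaultReadFile);
```

Passing the file reader in as a parameter keeps the validation logic independent of the runtime, which also makes it easy to exercise without a real secret file.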
Beyond just Docker, it's common to run Gallium Data instances in Kubernetes, which offers a Secrets service that's quite similar to Docker's. Kubernetes can expose secrets either as files, like Docker, or as environment variables, which work equally well. There are other, more complex ways of doing this with kubelets too.
There are multiple ways to create a secret in Kubernetes; the simplest is the kubectl create secret command.
Once that's done, the secret can be exposed to your Gallium Data instances by declaring it as a volume in your pod definition (in the spec section) and mounting that volume in the container.
The container can then read the secret from the file /etc/secretVol/secret_key_1.
Exposing the secret as an environment variable is not all that different: you also declare it in your pod definition, this time in the env section of the container.
You can then read it from your filter code like any other environment variable.
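A minimal sketch of that read, with a check that the variable is actually set. The helper names requireEnvSecret and defaultGetEnv and the variable name SECRET_KEY_1 are hypothetical; use whatever name your pod definition declares. As an assumption, defaultGetEnv relies on Java interop (Java.type) being available in filter code, and falls back to process.env under Node.js.

```javascript
// Hypothetical helper (not part of Gallium Data): returns the value of an
// environment variable, or throws if it is missing or empty.
// getEnv is any function that maps a variable name to its value.
function requireEnvSecret(name, getEnv) {
  const value = getEnv(name);
  if (value === null || value === undefined || String(value) === "") {
    throw new Error("Environment variable " + name + " is not set");
  }
  return String(value);
}

// ASSUMPTION: inside a filter, Java classes are reachable via Java.type.
// Outside of such an engine (e.g. under Node.js), use process.env.
function defaultGetEnv(name) {
  if (typeof Java !== "undefined") {
    return Java.type("java.lang.System").getenv(name);
  }
  return process.env[name];
}

// Usage in filter code (SECRET_KEY_1 is a made-up variable name):
//   const key = requireEnvSecret("SECRET_KEY_1", defaultGetEnv);
```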
If you need to change the secret, the easiest thing to do is usually to just restart the pod, but there are ways to get notified of the change -- see the Kubernetes documentation for details.
Azure Key Vault
If you live in the Windows world, then there's a good chance you're using Azure in one way or another. Microsoft offers an excellent secret management solution called Key Vault, which gives you (and your enterprise managers) a great deal of flexibility.
Creating the secret in Key Vault is easy: it can be done using the GUI, or using the az keyvault secret set command of the Azure command line interface (assuming you've already created a vault).
Accessing the secret from your filter code requires that you add Microsoft's Azure library to your Gallium Data repository:
You'll need two libraries: com.azure/azure-security-keyvault-secrets and com.azure/azure-identity.
In your filter code, you can then authenticate with Azure as a service principal and retrieve the secret, for example with the ClientSecretCredentialBuilder and SecretClientBuilder classes from these libraries.
Note that you may want to externalize some of these Azure authentication bits using Docker secrets or Kubernetes Secrets -- these are not exclusive!
AWS Secrets Manager
A similar service exists in most cloud platforms. Amazon AWS offers a Secrets Manager, which has a lot in common with Azure's Key Vault.
You create a secret either by using the GUI, or by creating a JSON file and loading it with the aws secretsmanager create-secret command of the AWS command line.
Just like with Azure, you then need to install the AWS library in your Gallium Data repository.
You'll need two libraries: software.amazon.awssdk/secretsmanager and software.amazon.awssdk/auth. Gallium Data automatically takes care of the other dependencies.
You can then authenticate with AWS in your filter code and retrieve the secret through the SDK's SecretsManagerClient.
Again, the AWS credentials should probably be externalized, perhaps using Docker secrets or Kubernetes Secrets.
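If the content of the JSON file was stored as the secret's string value, the filter gets back a JSON document rather than a bare password, and still has to pick out the field it needs. Here is a minimal sketch of that step, assuming the secret string has already been retrieved. The helper name extractSecretField and the field name in the usage comment are hypothetical.

```javascript
// Hypothetical helper (not part of Gallium Data or the AWS SDK): extracts
// one field from a secret whose value is a JSON object.
function extractSecretField(secretString, field) {
  let parsed;
  try {
    parsed = JSON.parse(secretString);
  } catch (err) {
    // Deliberately do not include the parser's message: it may quote
    // part of the secret.
    throw new Error("Secret is not valid JSON");
  }
  const isObject = parsed !== null && typeof parsed === "object";
  if (!isObject || !Object.prototype.hasOwnProperty.call(parsed, field)) {
    throw new Error("Secret has no field named " + field);
  }
  return parsed[field];
}

// Usage ("password" is a made-up field name):
//   const password = extractSecretField(secretString, "password");
```

Error messages here never contain the secret itself, so they are safe to log.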
When is a good time to retrieve secrets?
Retrieving secrets from Azure, AWS or similar cloud services can often take some time -- a second or two is not unusual. A good strategy is to retrieve the secret(s) in a connection filter, which is guaranteed to execute before anything else happens. That filter can then cache the result in the project context:
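A minimal sketch of that caching, under the assumption that the project context is available to filter code as an object on which properties can be stored (written here as context.projectContext). The helper name getCachedSecret and the fetch function passed to it are hypothetical.

```javascript
// Hypothetical helper (not part of Gallium Data): returns the cached
// secret if there is one, otherwise fetches it once and caches it.
// cache is any object that outlives a single connection; fetchSecret is
// the slow call to Azure, AWS or a similar service.
function getCachedSecret(cache, name, fetchSecret) {
  if (cache[name] === undefined || cache[name] === null) {
    cache[name] = fetchSecret(name);
  }
  return cache[name];
}

// Usage in a connection filter. ASSUMPTION: context.projectContext is the
// project context object, and fetchFromVault is your own retrieval code:
//   const key = getCachedSecret(context.projectContext, "secret_key_1",
//                               fetchFromVault);
```

Note that this sketch does no locking: if several connections arrive at the same moment before the cache is filled, each of them may fetch the secret once.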
That way, the first connection pays the penalty for retrieving the secret, but all subsequent connections have instant access to it.