This tutorial shows you how to run Gallium Data on your machine. Nothing is installed permanently: everything runs in Docker containers, so you can throw it all away when you’re done.

This tutorial comes in four versions. The concepts are the same, so pick the one you're most familiar with.

This tutorial will take about 10 minutes, depending on your download speed.

You can also watch this tutorial as a video (below, 5 minutes 38 seconds).

Step 1: Docker is required

This tutorial requires Docker. To verify that it is available, run the following command from a command line:

docker version

You should see some output similar to this:

Client: Docker Engine - Community
Cloud integration 1.0.22
Version: 20.10.12
API version: 1.41
etc...

The exact version numbers are not important; what matters is that Docker is running. If you get an error message, you'll need to get Docker up and running before you can continue with this tutorial. Fortunately, there are lots of resources that can help you with that.
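If you prefer a yes/no answer to reading the version output, the check can be wrapped in a small shell function. This is only a sketch: the function name check_docker is made up for this example, and it relies on docker version returning a non-zero exit status when the Docker daemon cannot be reached.

```shell
# Hypothetical helper: report whether Docker is installed and running.
check_docker() {
    # Is the docker command available at all?
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker: command not found"
        return 1
    fi
    # "docker version" also queries the daemon, so it fails if Docker is not running.
    if ! docker version >/dev/null 2>&1; then
        echo "docker: installed, but the daemon is not responding"
        return 1
    fi
    echo "docker: ready"
}
```

Calling check_docker prints one of the three messages, and its exit status can be used in scripts.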

We’ll be starting three Docker containers. There are easier ways of doing this, such as Docker Compose or Kubernetes, but for this tutorial we want every component to be visible and clearly understood.

We’ll be running all containers in their own Docker network, so let’s create that network.

Run the following command:

docker network create gallium-net

The response should be a long string of letters and numbers, which you can safely ignore, something like:

9ef282f6d3cce819a etc...
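If you run this tutorial a second time, docker network create will fail because the network already exists. A possible way around that, sketched below, is to create the network only when it is missing. The function name ensure_network is hypothetical; the Docker subcommands it calls are the standard network inspect and network create.

```shell
# Hypothetical helper: create the Docker network only if it does not exist yet.
ensure_network() {
    net="${1:-gallium-net}"
    # "docker network inspect" fails when there is no network with that name.
    if docker network inspect "$net" >/dev/null 2>&1; then
        echo "network $net already exists"
    else
        docker network create "$net" >/dev/null && echo "network $net created"
    fi
}
```

Running ensure_network with no argument uses gallium-net, the network name used throughout this tutorial.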

Docker is now ready, so let’s move on to the next step.

Step 2: Start the three containers

If this is your first time running this tutorial, note that it will download about 1 GB of container images, which can take a while on slow connections.

1 - Start the MongoDB database server

Gallium Data is a database proxy, which is pointless without a database. So let’s start a database.

If you have a MongoDB instance already running, you could use that, but for this tutorial, we recommend you follow these instructions.

Run the following from a command line:

docker run -d --rm --name mongo-gallium --network gallium-net -e MONGO_INITDB_ROOT_USERNAME=mongoadmin -e MONGO_INITDB_ROOT_PASSWORD=DataGamma -p 27017:27017 galliumdata/gallium-data-demo-mongo:3 --tlsMode preferTLS --tlsCertificateKeyFile /var/certs/mongo-demo.pem

This may take a minute as Docker downloads the image and starts it up. This image is simply the standard MongoDB image, preloaded with a sample collection of companies. There is nothing special about this image: Gallium Data can run with any MongoDB database.


2 - Start Gallium Data

Obviously we also need to start an instance of Gallium Data.

Run the following from a command line:

docker run -d --rm --name gallium-data --network gallium-net -p 8089:8080 -p 27018:27017 -e repository_location=/galliumdata/repo_mongo galliumdata/gallium-data-engine:1.3.0-1217

Again, this may take a minute. This is the standard Gallium Data image, with a demo repository, which is set up for this tutorial. In the real world, you will typically use additional options to create your own repository.


3 - Start the database client

Finally we’ll also need a database client. Here we’ll be using Mongo Express, but any MongoDB client would work equally well.

Run the following from a command line:

docker run -d --rm --name mongoexpress-gallium --network gallium-net -p 8081:8081 -e ME_CONFIG_MONGODB_ADMINUSERNAME=mongoadmin -e ME_CONFIG_MONGODB_ADMINPASSWORD=DataGamma -e ME_CONFIG_MONGODB_SERVER=gallium-data -e ME_CONFIG_MONGODB_PORT=27017 mongo-express

This is the standard Mongo Express image, with options to connect to Gallium Data.

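We now have three Docker containers running, talking to each other as follows: Mongo Express (the database client) connects to Gallium Data (the proxy), which in turn connects to MongoDB (the database server).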
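To confirm that all three containers are actually up before moving on, something like the following sketch may help. The function name missing_containers is made up for this example; the container names and the gallium-net network are the ones used in the commands above.

```shell
# Hypothetical helper: print the tutorial containers that are NOT running.
missing_containers() {
    # Names of all running containers attached to the tutorial network.
    running=$(docker ps --filter "network=gallium-net" --format '{{.Names}}')
    for name in mongo-gallium gallium-data mongoexpress-gallium; do
        # -x matches the whole line, so "gallium-data" does not match a longer name.
        echo "$running" | grep -qx "$name" || echo "not running: $name"
    done
}
```

If missing_containers prints nothing, all three containers are running. If one is reported, docker logs followed by the container name is usually the quickest way to see why it stopped.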

Step 3: Take a look at the database with Mongo Express

⇨ Connect to http://localhost:8081/db/test/companies

You are now querying the MongoDB database using a GUI query tool named Mongo Express, which talks to Gallium Data, which in turn talks to MongoDB.

At this point, Gallium Data does not do anything yet, so it's completely transparent.

You should see a list of companies from the database (at the bottom). You may need to scroll down to see additional companies.
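If the page does not load, the containers may still be starting. One way to check from the command line, assuming curl is installed on your machine, is to ask for the HTTP status code only. The function name http_status is hypothetical.

```shell
# Hypothetical helper: print only the HTTP status code returned by a URL.
http_status() {
    # -s silences progress output, -o discards the body, -w prints the status code.
    curl -s -o /dev/null -w '%{http_code}' "$1"
}
```

For example, http_status http://localhost:8081/db/test/companies should print 200 once Mongo Express is ready, and 000 while nothing is listening on that port yet. Any other value is worth investigating with docker logs mongoexpress-gallium.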

⇨ Click on the company called Wetpaint (you may need to scroll down to find it) to see all its attributes, and notice that it has an attribute named category_code with a value of "web".

⇨ Click the Back button to return to the main screen.

Step 4: Let's hide some data

Let's say we want to hide all the companies that have a category_code of "web". This could be for security reasons, or any other reason. This is easily done with a Gallium Data query filter.

Gallium Data allows you to change requests to MongoDB as well as responses from MongoDB. If we want to hide certain companies, we can do so efficiently by modifying queries coming from the database client before they are sent to MongoDB. That way, MongoDB won't even retrieve those unwanted companies.

⇨ Open Gallium Data at http://localhost:8089/web/index.html
⇨ Log in
⇨ Open the project Simple Demo - MongoDB

You will see a few pre-defined filters, but they are not yet enabled (which is why they are shown in grey).

⇨ Click on the request filter named Filter queries for companies

Note that its parameters specify that it should be applied only to queries against the companies collection in the test database. We also check that the query document has a filter attribute.

⇨ Activate this filter by clicking the Active checkbox

⇨ Click the Publish button (top right) to push it to Gallium Data

⇨ Select the Code tab. This code gets executed every time a request is sent by the database client for the test.companies collection.

The code modifies the query document to add a condition to the filter: the attribute category_code should not be equal to "web". This is using MongoDB's query syntax.

Finally, the modified query document is logged.

This will be applied to all queries for the test.companies collection, therefore companies with category_code=web will always be filtered out.

This is a very simple example -- your code can have much more sophisticated logic based on the query, the data, the current user, and any other relevant factors.

⇨ Go back to Mongo Express and click the Find button, and you will notice that none of the companies returned have a category_code of "web" (scroll down to see all the companies, and click on them to see details). In particular, Wetpaint is no longer there. You have changed how the database works, without changing either the database client or the database server -- that's Gallium Data's superpower.
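The same effect can be observed from the command line by running the same query once directly against MongoDB and once through Gallium Data. The sketch below makes several assumptions that are not part of this tutorial: it uses the public mongo:6 image only for its mongosh shell (an additional download), it authenticates against the admin database, and it assumes both servers accept the unencrypted connection. The function name count_companies is hypothetical.

```shell
# Hypothetical helper: count the documents returned by a find on test.companies.
# $1 = host to connect to (mongo-gallium = direct, gallium-data = through the proxy)
# $2 = optional query filter in MongoDB shell syntax (defaults to {})
count_companies() {
    host="$1"
    filter="$2"
    [ -n "$filter" ] || filter='{}'
    # Run mongosh in a throwaway container on the tutorial network.
    docker run --rm --network gallium-net mongo:6 \
        mongosh --quiet \
        "mongodb://mongoadmin:DataGamma@${host}:27017/test?authSource=admin" \
        --eval "db.companies.find(${filter}).toArray().length"
}
```

With the request filter active, count_companies gallium-data would be expected to print a smaller number than count_companies mongo-gallium. If the filter works as described above, the proxied number should match what MongoDB itself returns for count_companies mongo-gallium '{category_code: {$ne: "web"}}' (note the single quotes, which stop the shell from expanding $ne).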

Step 5: Let's change some data

Let's say we want to change the number of employees of the companies that are in the enterprise space (i.e. category_code = "enterprise").

⇨ Go back to Gallium Data

⇨ Use the crumb trail at the top to go back to the project

⇨ Click the response filter called Hide number of employees for enterprise companies

⇨ Click the Active checkbox

⇨ Click the Publish button at the top right. This will enable this filter.

If you look at its parameters, you will see that the filter will be invoked for any objects in test.companies (test is the database, companies is the collection), which satisfy the JSON expression category_code=enterprise.

When these conditions are met, the code will be executed.

⇨ Select the Code tab and take a look at the code. It's a single line, which sets the number_of_employees attribute to zero. This will only be run against companies satisfying the parameters we just looked at.

You could, for instance, make this conditional on who the current database user is, or on some other data, or any other condition. You could also remove the number_of_employees attribute altogether, or set its value to anything you want (like a string), but that's more likely to break applications that may rely on this data.

⇨ Go back to Mongo Express and click the Find button.

⇨ Scroll down, find AdventNet and click on it. It now has number_of_employees set to zero. Attributes can be changed, added, removed, calculated on the fly, incorporated from other sources, and so on. Your imagination is the limit.
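To see that only the response was changed, and not the stored data, you can fetch the same document directly from MongoDB and through Gallium Data. As before, this is a sketch under assumptions: the public mongo:6 image supplies mongosh, authentication goes through the admin database, and the company name is assumed to be stored in an attribute called name. The function name show_employee_count is hypothetical.

```shell
# Hypothetical helper: print a few attributes of AdventNet as seen from a given host.
# $1 = host to connect to (mongo-gallium = direct, gallium-data = through the proxy)
show_employee_count() {
    host="$1"
    # category_code is kept in the projection because the response filter
    # decides whether to run based on that attribute.
    docker run --rm --network gallium-net mongo:6 \
        mongosh --quiet \
        "mongodb://mongoadmin:DataGamma@${host}:27017/test?authSource=admin" \
        --eval 'printjson(db.companies.findOne({name: "AdventNet"}, {_id: 0, name: 1, category_code: 1, number_of_employees: 1}))'
}
```

With the response filter active, show_employee_count gallium-data would be expected to show number_of_employees as zero, while show_employee_count mongo-gallium still shows the value stored in the database.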

What have we seen?

In this tutorial, you got a glimpse of how Gallium Data can intercept the traffic between database client and database server, and modify this traffic. This enables you to:

  • change the behavior of existing applications and databases without changing either the application or the database

  • catch invalid or inefficient requests and change or reject them

  • tailor database responses to your exact needs, with a level of control and precision that would be almost impossible otherwise

  • monitor all traffic to and from the databases and react to whatever events are relevant to you

  • have a control point in front of your databases, which can run customized logic for your specific needs

Gallium Data has a number of pre-defined filters, but it also makes it easy to create your own filters and be as sophisticated as you want. You saw an example for MongoDB; similar functionality is available for PostgreSQL, and other database systems will be supported in the near future.

Not every database needs Gallium Data, but Gallium Data can solve many business requirements that would otherwise necessitate complex and expensive infrastructure. The solution using Gallium Data is often surprisingly simple.

Now, the question is: how will you use it?

What to do next

We encourage you to take Gallium Data for a spin with your own database(s). It's always more interesting to work with your own data than with demo data.

The tutorial project contains several other filters, but they are not active. You can take a look at them and try to activate them:

  • Time of day connection filter allows you to reject client connections based on IP address and time of day or week

  • Log all requests is a basic JavaScript request filter which simply prints out all requests received from the database client. You can see the logging messages in the Logs page, or from the command line by running:
    docker logs -f gallium-data
    Press Ctrl-C to regain control.

  • Log all responses is a basic JavaScript response filter which simply prints out all responses from the database server.

  • Encrypt twitter_username for secret companies is a more advanced example showing dynamic encryption and decryption of data.
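Once the logging filters are active, the log output can get long. A possible way to narrow it down is to look only at recent entries and search them for a word. The function name search_gallium_logs is made up for this sketch, and what the log lines actually contain depends on the filters you have enabled.

```shell
# Hypothetical helper: search recent Gallium Data log output for a word.
# $1 = text to look for (case-insensitive)
# $2 = how far back to look, as accepted by "docker logs --since" (default: 10m)
search_gallium_logs() {
    pattern="$1"
    since="${2:-10m}"
    # Container logs may be written to stdout or stderr, so merge both before searching.
    docker logs --since "$since" gallium-data 2>&1 | grep -i -- "$pattern"
}
```

For instance, search_gallium_logs companies 5m prints the log lines from the last five minutes that mention companies.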

See the examples page for more examples of how Gallium Data can be used.

Gallium Data is free, so you can use it as much as you want, on your machines, servers, in the cloud, wherever.

Consult the documentation for all the gritty details, such as how to use the debugger, or the API for various types of database packets.

Cleanup

Once you're done and you no longer want to use Gallium Data, you can clean everything out:

⇨ Execute the following commands from a command line:

docker stop gallium-data
docker stop mongoexpress-gallium
docker stop mongo-gallium
docker network rm gallium-net

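This will stop all the Docker containers started during this tutorial and delete the gallium-net network. Because the containers were started with the --rm option, stopping them also removes them.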
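To double-check that nothing from this tutorial is left behind, you can list any remaining container or network whose name contains gallium. This is a sketch, and the function name leftovers is hypothetical.

```shell
# Hypothetical helper: list containers and networks left over from this tutorial.
leftovers() {
    # -a includes stopped containers; the name filter matches partial names.
    docker ps -a --filter "name=gallium" --format '{{.Names}}'
    docker network ls --filter "name=gallium-net" --format '{{.Name}}'
}
```

If leftovers prints nothing, the containers and the network are gone.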

If you also want to remove the Docker images, execute the following commands:

docker rmi -f galliumdata/gallium-data-engine:1.3.0-1217
docker rmi -f galliumdata/gallium-data-demo-mongo:3
docker rmi -f mongo-express

This will remove everything installed, and leave your machine as it was at the beginning of this tutorial.