Column-level encryption with Gallium Data

Introduction

This is an advanced tutorial: if you have never used Gallium Data before, we highly recommend you go through the basic tutorial first.

This is a step-by-step tutorial to implementing column-level encryption using Gallium Data. We'll be using SQL Server, but any other supported database (such as MySQL, PostgreSQL and MongoDB) would work too, with some variations.

But first, we need to talk

Doing column-level encryption with Gallium Data is certainly possible, but it'll never be as easy as using the database's encryption facilities. Therefore, Gallium Data should come into the picture only for those scenarios where there are no alternatives, such as:

  • you cannot change the database schema (common when using a third-party or legacy app)

  • you must encrypt only some rows and not others

  • you need to use an encryption algorithm that is not supported by the database

  • you want to encrypt only parts of some columns

Using Gallium Data requires you to specify which queries will be intercepted for processing, therefore it requires some planning. Make sure you cover all the queries that will read and write the data in question.

There are two basic ways to encrypt/decrypt data with Gallium Data:

  • by rewriting the requests so that they use the database's encryption/decryption capabilities

  • by encrypting and decrypting values in Gallium Data filters

Both are demonstrated below.

Setup

We'll be using three Docker containers:

  • one for SQL Server,

  • one for Gallium Data, the proxy

  • one for mssql-cli, the database client

Start the Docker containers

Run the following from a command line:

docker network create gallium-demo-net

docker run -d --rm --name gallium-mssql --network gallium-demo-net galliumdata/gallium-data-demo-mssql:4

docker run -d --rm --name gallium-data --network gallium-demo-net -p 8089:8080 -e repository_location=/galliumdata/repo_mssql galliumdata/gallium-data-engine:
1.3.0-1217

Wait 5-10 seconds to let SQL Server start up, then

Run the following from a command line:

docker run -it --rm --name gallium-cli --network gallium-demo-net galliumdata/gallium-mssql-cli:1 -S gallium-data -U sa -P Password1 -d GalliumDemoDB

You will get a SQL prompt:

GalliumDemoDB>

Everything is now in place, let's get started.


Create the table

⇨ Run the following in mssql-cli:

create schema gallium_crypt

go

create table gallium_crypt.purchase_orders (

id int not null primary key,

customer_name nvarchar(200) not null,

status char(1) not null,

notes nvarchar(max) not null

)

go

We want to encrypt the notes column, but only for the rows where status='S' (for Secret).

Option 1: rewrite SQL commands

If we're OK with letting the database do the encryption/decryption, but we would like more flexibility than what SQL Server provides, we can rewrite the SQL commands on their way to the database.

This is essentially SQL Server's column-level encryption approach, but the changes to the SQL commands are done by Gallium Data instead of by the database client.

Setup

We'll need to create an encryption key in SQL Server.

Run the following in mssql-cli:

create master key encryption by password = 'Password1'

go

create certificate POCert with subject = 'Encrypt purchase_orders'

go

create symmetric key POKey with algorithm = AES_256

encryption by certificate POCert

go

We can then create filters that will intercept insert, select and update requests, and rewrite them to use encryption and decryption.

Clearly, this can only work if we have a pretty good idea of what these statements are going to look like in advance. If the database requests are not entirely predictable, this approach is impractical.


Encryption on insert

We'll assume that inserts into our table take the form of:

insert into "gallium_crypt"."purchase_orders" values (1, 'Adam', 'n', 'Some notes here')

This could also be a prepared statement or a call to a stored procedure, but for this tutorial we'll just use a simple insert statement.

We can catch this command with a request filter.

In Gallium Data, create a request filter of type Query filter - MSSQL - call it "Encrypt PO on insert"

Set its Query patterns parameter to:

regex:.*insert into ["\[]?gallium_crypt["\]]?\.["\[]?purchase_orders["\]]? values \(.*

That way, this filter will be invoked whenever the insert is received. Note that we're being very specific here: we could make the regular expression more flexible, for instance to accept extra spaces, but to keep it simple, we'll assume that the insert command follows this exact format.

⇨ Set the filter's code to:

var sql = context.packet.sql;
let rx = /(\(\d+,\s*'(?:[^']|(?:''))*',\s*'(\w)',\s*')(([^']|(?:''))*)'\)/g;
let rxres = rx.exec(sql);
if (rxres[2] !== "S") {
return;
}
let encrypted = "crypt:'+convert(varchar(max), EncryptByKey(Key_GUID('POKey'), '" + rxres[3] + "'), 2)";
sql = sql.substring(0, rxres.index + rxres[1].length) + encrypted + ")";
context.packet.sql = "open symmetric key POKey decryption by certificate POCert;" + sql;

Even though it's only 8 lines of code, this is a little dense, so let's go over it line by line.


var sql = context.packet.sql;

We just retrieve the SQL statement from the packet.


let rx = /(\(\d+,\s*'(?:[^']|(?:''))*',\s*'(\w)',\s*')(([^']|(?:''))*)'\)/g;

This is a regular expression that will extract the value for the status and notes columns.


let rxres = rx.exec(sql);

We run the regular expression against the SQL, giving us the values for the status and notes columns.


if (rxres[2] !== "S") {

return;

}

We check whether the value for the status column is S for secret -- if it's not, we just return, which lets the request run normally.


let encrypted = "crypt:'+convert(varchar(max), EncryptByKey(Key_GUID('POKey'), '" + rxres[3] + "'), 2)";

This will replace the value for the notes column. We prefix the value with "crypt:" to indicate that it's encrypted, and then use SQL Server's EncryptByKey function, and encode the result into a hex string.


sql = sql.substring(0, rxres.index + rxres[1].length) + encrypted + ")";

This builds up a new SQL statement that's just like the original, except that the value for notes is replaced with the string we built in the previous line.


packet.sql = "open symmetric key CustomersKey decryption by certificate POCert;" + sql;

In SQL Server, you cannot use a key unless you open it first, so we prefix the SQL with an open statement. We could optimize this so we don't open the key every time, but this is good enough.

Publish to Gallium Data (hint: Ctrl-S on Windows or Cmd-S on Mac)

Run the following in mssql-cli:

insert into "gallium_crypt"."purchase_orders" values (1, 'Adam', 'S', 'Some notes here');
insert into "gallium_crypt"."purchase_orders" values (2, 'Brigitte', 'N', 'Additional notes');
insert into "gallium_crypt"."purchase_orders" values (3, 'Carol', 'S', 'And more notes here');

Our filter will rewrite each command that has status='S' before it forwards it to SQL Server as:

insert into "gallium_crypt"."purchase_orders" values (1, 'Adam', 'S', 'crypt:'+convert(varchar(max), EncryptByKey(Key_GUID('POKey'), 'Some notes here'), 2))

If you look at the purchase_orders table now, you will see that a new row has been inserted, with an encrypted value in the notes column.

Run the following in mssql-cli:

select * from gallium_crypt.purchase_orders

with the result:

+------+-----------------+----------+------------------------------------------------------

| id | customer_name | status | notes

|------+-----------------+----------+------------------------------------------------------

| 1 | Adam | S | crypt:00E398112CB1B44B8B810003C8CF6A0602000000C123AB8

| 2 | Brigitte | N | Additional notes

| 3 | Carol | S | crypt:00E398112CB1B44B8B810003C8CF6A0602000000C681A24

+------+-----------------+----------+------------------------------------------------------

Hint: hit q to regain control

Notice that the notes column in row 2 was not encrypted because the status column is not 'S'.

Decryption on select

Now that we can insert some encrypted data, we need to decrypt it when we read it. All we need is a request filter to catch the kind of SQL queries we're interested in.

⇨ In Gallium Data, create a new request filter of type Query filter - MSSQL - call it "Decrypt PO on select"

Set its Query patterns parameter to:

regex:select.*\bnotes\b.*from ["\[]?gallium_crypt["\]]?\.["\[]?purchase_orders["\]]?.*

⇨ Set the filter's code to:

var sql = context.packet.sql;
sql = sql.replace(/notes from/g,
"case when notes like 'crypt:%' then " +
"convert(varchar(max), DecryptByKey(convert(varbinary(max), substring(notes, 7, 10000), 2))) " +
"else notes end from");
context.packet.sql = "open symmetric key POKey decryption by certificate POCert;" + sql;

We just replace notes with an expression that does the decryption, but only if the value for notes starts with crypt:.

Publish to Gallium Data

Run the following in mssql-cli:

select id, customer_name, status, notes from gallium_crypt.purchase_orders

with the result:

+------+-----------------+----------+------------------------------------------------------

| id | customer_name | status | notes

|------+-----------------+----------+------------------------------------------------------

| 1 | Adam | S | Some notes here

| 2 | Brigitte | N | Additional notes

| 3 | Carol | S | And more notes here

+------+-----------------+----------+------------------------------------------------------

You can convince yourself that the notes are really encrypted in the database by running a query that will not trigger our filter:

select * from gallium_crypt.purchase_orders

with the result:

+------+-----------------+----------+------------------------------------------------------

| id | customer_name | status | notes

|------+-----------------+----------+------------------------------------------------------

| 1 | Adam | S | crypt:00E398112CB1B44B8B810003C8CF6A0602000000C123AB8

| 2 | Brigitte | N | Additional notes

| 3 | Carol | S | crypt:00E398112CB1B44B8B810003C8CF6A0602000000C681A24

+------+-----------------+----------+------------------------------------------------------

Encryption/decryption on update

We also need to think about the case when the status of a purchase_orders row changes from S to something else, or from something else to S.

We'll assume that the update in question takes the form:

update gallium_crypt.purchase_orders set status='X'

⇨ In Gallium Data, create a new request filter of type Query filter - MSSQL - call it "Encrypt/decrypt PO on update"

Set its Query patterns parameter to:

regex:update ["\[]?gallium_crypt["\]]?\.["\[]?purchase_orders["\]]? set status=.*

Set its code to:

let sql = context.packet.sql;
let rx = /(.*set status=)('.')(.*)/g;
let rxres = rx.exec(sql);
let newStatus = rxres[2];

let newSql = rxres[1] + rxres[2];
if (newStatus === "'S'") {
let encrypt = "'crypt:'+convert(varchar(max), EncryptByKey(Key_GUID('POKey'),notes), 2)";
newSql += ", notes=case when notes like 'crypt:%' then notes else " + encrypt + " end";
}
else {
let decrypt = "convert(varchar(max), DecryptByKey(convert(varbinary(max), substring(notes, 7, 10000), 2))) ";
newSql += ", notes=case when notes like 'crypt:%' then " + decrypt + " else notes end";
}
newSql += rxres[3];
context.packet.sql = "open symmetric key POKey decryption by certificate POCert;" + newSql;

The code encrypts the notes column if the status is updated to 'S', and decrypts it if it's set to something else. Note that we avoid double-encrypting if the value is already encrypted.

In other words, it converts:

update gallium_crypt.purchase_orders set status='N' where id=1

into:

update gallium_crypt.purchase_orders set status='N', notes=case when notes like 'crypt:%' then convert(varchar(max), DecryptByKey(convert(varbinary(max), substring(notes, 7, 10000), 2))) else notes end where id=1

and vice-versa.

Publish to Gallium Data

Run the following in mssql-cli:

update gallium_crypt.purchase_orders set status='N' where id=1

Verify that the row has been decrypted with:

select * from gallium_crypt.purchase_orders

with the result:

+------+-----------------+----------+------------------------------------------------------

| id | customer_name | status | notes

|------+-----------------+----------+------------------------------------------------------

| 1 | Adam | N | Some notes here

| 2 | Brigitte | N | Additional notes

| 3 | Carol | S | crypt:00E398112CB1B44B8B810003C8CF6A0602000000C681A24

+------+-----------------+----------+------------------------------------------------------

Re-encrypt the row with:

update gallium_crypt.purchase_orders set status='S' where id=1

Verify that the row has been encrypted with:

select * from gallium_crypt.purchase_orders

with the result:

+------+-----------------+----------+------------------------------------------------------

| id | customer_name | status | notes

|------+-----------------+----------+------------------------------------------------------

| 1 | Adam | S | crypt:00E398112CB1B44B8B810003C8CF6A0602000000C123AB8

| 2 | Brigitte | N | Additional notes

| 3 | Carol | S | crypt:00E398112CB1B44B8B810003C8CF6A0602000000C681A24

+------+-----------------+----------+------------------------------------------------------

Option 2: encrypt/decrypt on the fly

This option allows us to take encryption and decryption out of the database altogether, and do it in the proxy instead.

This is a very different approach, which is somewhat similar to SQL Server's Always Encrypted feature.

The insert requests are handled in a manner similar to the first approach, but the decryption is done in a response filter, as the result set is sent back to the client.

Creating a key

To encrypt and decrypt, we'll need a secret key. Rather than creating it every time, we can create it once in a connection filter, and reuse it from other filters.

Deactivate the three filters we created above (if present)

Run the following in mssql-cli:

delete from gallium_crypt.purchase_orders

This is necessary because we are now going to use a very different encryption method.

⇨ Create a new connection filter of type JavaScript connection filter - call it "Create encryption key"

Set its code to:

if (context.projectContext.crypto) {
return;
}
let keyStr = "BMl5BYx/FSV8rOQF1IcwzDNRju0bZ7BNKTk584s5Dfk";
context.projectContext.crypto = context.utils.createObject();
context.projectContext.crypto.Base64 = Java.type("java.util.Base64");
let keyBytes = context.projectContext.crypto.Base64.getDecoder().decode(keyStr);
let SecretKeySpec = Java.type("javax.crypto.spec.SecretKeySpec");
context.projectContext.crypto.key = new SecretKeySpec(keyBytes, "AES");
context.projectContext.crypto.Cipher = Java.type("javax.crypto.Cipher");

Note that the key is embedded in the code, which is not something you want to do in production. See Using secrets in Gallium Data for pointers on this topic.

Encrypting on insertion

For insertion, we want to catch statements like:

insert into "gallium_crypt"."purchase_orders" values(10, 'Juliet', 'S', 'My notes here')

and encrypt the value for the notes column.

⇨ Create a new query filter of type Query filter - MSSQL - call it "Value encrypt PO notes"

Set its Query patterns parameter to:

regex:insert into ["\[]?gallium_crypt["\]]?\.["\[]?purchase_orders["\]]? values\s*\(.*

Set its code to:

// Extract the values from the SQL command
let rx = /(.*\(\d+,\s*'(?:[^']|(?:''))*',\s*'(\w)',\s*')(([^']|(?:''))*)'\)/g;
let rxres = rx.exec(context.packet.sql);
if (rxres[2] !== "S") {
return;
}

// Encrypt
let cipher = context.projectContext.crypto.Cipher.getInstance("AES");
cipher.init(context.projectContext.crypto.Cipher.ENCRYPT_MODE, context.projectContext.crypto.key);
let notesBytes = context.utils.getUTF8BytesForString(rxres[3]);
let encryptedBytes = cipher.doFinal(notesBytes);
let encryptedStr = context.projectContext.crypto.Base64.getEncoder().encodeToString(encryptedBytes);
context.packet.sql = rxres[1] + "crypt:" + encryptedStr + "')";

This code will take a SQL command like:

insert into "gallium_crypt"."purchase_orders" values(10, 'Juliet', 'S', 'My notes here')

and rewrite it to:

insert into "gallium_crypt"."purchase_orders" values(10, 'Juliet', 'S', 'crypt:vnuYoepe+CKPnGGZ/dO+YQ==')

Publish to Gallium Data

⇨ Exit mssql-cli with Ctrl-D

⇨ Restart mssql-cli with:

docker run -it --rm --name gallium-cli --network gallium-demo-net galliumdata/gallium-mssql-cli:1 -S gallium-data -U sa -P Password1 -d GalliumDemoDB

This is necessary because we need to open a new database connection.

Run the following in mssql-cli:

insert into "gallium_crypt"."purchase_orders" values(1, 'Angela', 'S', 'My notes here');
insert into "gallium_crypt"."purchase_orders" values(2, 'Bobby', 'N', 'More notes here');
insert into "gallium_crypt"."purchase_orders" values(3, 'Chris', 'S', 'Even more notes here');

Run the following in mssql-cli:

select * from gallium_crypt.purchase_orders

with the result:

+------+-----------------+----------+----------------------------------------------------+

| id | customer_name | status | notes |

|------+-----------------+----------+----------------------------------------------------|

| 1 | Angela | S | crypt:vnuYoepe+CKPnGGZ/dO+YQ== |

| 2 | Bobby | N | More notes here |

| 3 | Chris | S | crypt:W37sASN6J6ZNHb1isBql6s+QEyXoVVzo5rrW0y1a3ik= |

+------+-----------------+----------+----------------------------------------------------+

Notice that the notes column in row 2 was not encrypted because the status column is not 'S'.

Decrypting on selection

Create a new response filter of type MSSQL result set filter - name it "Value decrypt PO notes"

Set the Query pattern parameter to:

regex:select.*notes.*from\s+["\[]?gallium_crypt["\]]?\.["\[]?purchase_orders["\]]?.*

This will ensure that this filter gets called any time someone selects the notes column in the purchase_orders table. Notice that this will not cover the case for select * -- this is on purpose so we can still see the raw contents of the table.

Set the Column patterns parameter to:

notes=crypt:.*

That way, we will decrypt the notes column only if it's actually encrypted.

Set the code to:

let encrBytes = context.projectContext.crypto.Base64.getDecoder().
decode(context.packet.notes.substring(6));
let cipher = context.projectContext.crypto.Cipher.getInstance("AES");
cipher.init(context.projectContext.crypto.Cipher.DECRYPT_MODE,
context.projectContext.crypto.key);
let decrBytes = cipher.doFinal(encrBytes);
context.packet.notes = context.utils.stringFromUTF8Bytes(decrBytes);

This will simply decrypt the value of the notes column.

Publish to Gallium Data

Run the following in mssql-cli:

select id, customer_name, status, notes from gallium_crypt.purchase_orders

with the result:

+------+-----------------+----------+----------------------+

| id | customer_name | status | notes |

|------+-----------------+----------+----------------------|

| 1 | Angela | S | My notes here |

| 2 | Bobby | N | More notes here |

| 3 | Chris | S | Even more notes here |

+------+-----------------+----------+----------------------+

You can convince yourself that the notes are really encrypted in the database by running a query that will not trigger our filter:

select * from gallium_crypt.purchase_orders

with the result:

+------+-----------------+----------+----------------------------------------------------+

| id | customer_name | status | notes |

|------+-----------------+----------+----------------------------------------------------|

| 1 | Angela | S | crypt:vnuYoepe+CKPnGGZ/dO+YQ== |

| 2 | Bobby | N | More notes here |

| 3 | Chris | S | crypt:W37sASN6J6ZNHb1isBql6s+QEyXoVVzo5rrW0y1a3ik= |

+------+-----------------+----------+----------------------------------------------------+

Note on updates

In the first part of this tutorial, we were able to encrypt/decrypt the value of the notes column relatively easily.

Doing the encryption and decryption in a filter rather than using SQL makes this much more difficult, because the value of the notes column is typically not available to the filter when an update statement is received. It's not impossible, but it would require you to fetch the value of the notes column, unless it's always provided when status is updated. This is left as an exercise to the reader.

Conclusion

I hope it's clear that this method is not meant to replace SQL Server's column-encryption: it's only meant to be used when you need more flexibility that what SQL Server affords you.

It's usually a good idea to keep track of which key you use when encrypting a value, that can be done by adding something to the prefix of the value, e.g. crypt:123: assuming you have a key named "123". This allows you to use multiple keys.

Cleanup

Once you're done with this tutorial, and you want to remove everything that was installed,

⇨ Exit from mssql-cli with Ctrl-D

⇨ Execute the following commands from a command line:

docker stop gallium-mssql
docker stop gallium-data
docker network rm gallium-demo-net

This will stop all the Docker containers started during this tutorial.

If you also want to remove the Docker images:

docker rmi galliumdata/gallium-mssql-cli:1
docker rmi galliumdata/gallium-data-engine:
1.3.0-1217
docker rmi galliumdata/gallium-data-demo-mssql:4

This will remove everything downloaded by this tutorial.