SQL Server vs. Gallium Data
In SQL Server 2016, Microsoft introduced a new feature called dynamic data masking, which allows you to mask the values of certain columns and keep some data hidden from certain users without having to modify your applications.
Let's take a look at how SQL Server does data masking, and compare it to the way Gallium Data goes about it.
SQL Server data masking has the main advantage of being built-in
Gallium Data has the main advantage of being much more flexible
For this article, we'll use the following trivial database :
SQL Server dynamic data masking
Let's assume you want to partially or completely hide the customers' names. With SQL Server, you can define a dynamic mask with:
Any user (other than the owner of the table and DBAs) will then see the data as masked, e.g.:
You may wish to exempt some users from this type of masking with:
but that's a database-wide permission: that user will now have unmasked access to all masked data.
Overall, this is a nice feature, and it's fairly easy to use, but it does have some limitations. What if we need a mask that depends on the value, or on the user? You can only use a few built-in functions to do masking, and you cannot use your own functions.
How secure is this?
Data masking works well if your users are limited to a fixed set of SQL queries, which is often the case. But if database users can create their own SQL queries, they can easily bypass data masking with queries such as:
which allows the user to determine that the second letter of customer 2's name is 'r'. Obviously it's pretty easy to extend that and make an end-run around the masking. That's fine: data masking is not meant to be a security measure at the database level, it's just meant to help you make your applications more secure.
Gallium Data masking
For data masking, Gallium Data has two options: either modify the SQL command on its way to SQL Server, or filter the result set coming back from SQL Server.
Request filter - change the SQL command
A simple request filter in Gallium Data can look for the statement:
and rewrite it to:
This works well if you know in advance what SQL commands to expect.
You can of course be as fancy as you want in how you rewrite the query:
which will make the mask depend on the country column. Note that this rewriting can change depending on who the user is, the time of day, or any other conditions.
Response filtering - change the result set
The other (often complementary) option is to change the result set as it comes back from SQL Server. This is more expensive because we'll potentially execute code for every row in the result set, but modern computers are amazingly fast.
You now have complete freedom over how or whether to mask the data.
Advantages of SQL Server dynamic masking
vs. Gallium Data
Requires another system
Not as performant as SQL Server
Advantages of Gallium Data masking
Masking is completely up to you
Masking can be depend on the value of any columns
Masking can be different depending on the user, the user's IP address, etc...
Masking can be done on computed columns
Masking/unmasking can be done with surgical precision
vs. SQL Server
Masking can only be done using a limited set of functions
Masking is uniform -- all rows get the same mask
There is only one masking option per column -- everyone gets the same
Masking only works on stored columns
Masking/unmasking is all-or-nothing per user
Conclusion: who does it better?
These two approaches have different merits. They are not exclusive: it's possible to use both to implement different aspects of the requirements.
SQL Server has a simple mechanism that is pretty much all-or-nothing. This is appropriate for many applications that require nothing more.
Gallium Data gives you a lot more flexibility and may be preferable for the more complex cases when the masking requirements are not straightforward. Perhaps you need to mask only some rows for some users, or perhaps the masking depends on the data itself. You get to decide, without limitations.
If you're intrigued, you may be interested to see how Gallium Data can also do row-level security, or work with SQL Server's data classification feature to restrict data access, and how Gallium Data can easily restrict queries based on prior behavior.