Result set filter - Cassandra

A result set filter is invoked (potentially) for every row in a result set. The filter can modify the row, add more rows, drop the row, or return an error. This filter works for all result sets, whether they result from a CQL statement or from the execution of a stored procedure.

Unlike other filter, the result set filter has two important objects predefined:


Parameters

All parameters are optional. If no parameter is specified, then the filter will be invoked for every row of every result set.

User pattern

If specified, a comma-separated (or line break separated) list of database user names or regular expressions for database user names.

Client IPs

A comma-separated (or line-break separated) list of IP4 or IP6 addresses or regular expressions for IP addresses. If specified, this filter will only be invoked when the client's IP address matches.

Keyspace pattern

If specified, the filter will not be invoked unless it contains at least one column from a table in the specified keyspace. Can be a string or a regular expression.

Table pattern

If specified, the filter will not be invoked unless it contains at least one column from the specified table. Can be a string or a regular expression.

Query pattern

A series of strings and/or regular expressions, separated by commas or newlines. If this is specified, the filter will be invoked only if the CQL query that caused this result set matches the pattern.

Column patterns

A series of name=value pairs, separated by commas or newlines. The value is a regular expression, which means special characters like period must be escaped.

And/or

If more than one column pattern is specified, this determines whether they must all be true (and) or any one of them (or).

Action

Can be either code or hide. If it's code, then the code is executed. If it's hide, then the row is dropped from the result set, and the code is not executed.

Example: hide some products

The following examples use the Cassandra tutorial database.

The requirement is that toys over $20 should be invisible to users in France or Germany (i.e. users whose user name starts with "fr_" or "de_")

The filter will be invoked whenever a query of the form:

select * from gallium_demo.products

is executed. We can create a result set filter with the proper parameters, and add a bit of code to hide the data.

The parameters for the filter are:

This will ensure that this filter will be invoked for all toys.

Next we add a bit of code that checks the price and hides the row if necessary:

if (context.row.price > 20) {
    context.row.remove();
}

If we now re-run the query:

select * from gallium_demo.customers

we only see 5 rows. The database is of course untouched -- all we changed was the result set on its way to the client.

We can further qualify this, for instance by making this filter run only for certain users, or make it depend on the query.

Example: mask some values

Let's say there is a query that returns data that should not be seen by some users. The rule is:

For users in France and Germany (i.e. users whose user name starts with "fr_" or "de_"), customers' phone numbers should be hidden.

We can create a result set with the proper parameters, and add a bit of code to hide the data.

The parameters for the filter are:

User pattern: regex:fr_.*,de_.*

Keyspace pattern: gallium_demo

Table pattern: customer

The code for the filter will be invoked when these conditions are satisfied, and can hide the customer's phone numbers with:

if (context.row.phones) {

    context.row.phones = null;

}

We could also mask, say, the home phone number (which is stored in a compound map) with:

if (context.row.phones && context.row.phones.home) {

    context.row.phones.home.phone_number = "XXXXXXX";

}

Example: add a synthetic row to the result set

We can add new rows to the result set by cloning the current row and modifying it. 

We create a filter with the parameters:

Keyspace pattern: gallium_demo

Table pattern: products

Column patterns: id=7

This will ensure that the code for the filter will only be executed for product #7, which we'll duplicate with the following code:

for (var i = 1; i <= 3; i++) {

    let newRow = context.row.clone();

    newRow.id = 1000000 + i;

    newRow.name = 'Made-up product #' + i;

    context.packets.addPacket(newRow);

}