The Cassandra parser service

This is a microservice (as a Docker image) that parses Cassandra CQL commands and returns information about those commands.

This is useful when you need to control which CQL commands to let through, and which to reject, or just to know what kind of command a piece of CQL is.

TLDR;

If you send this service a CQL command, such as:

select col1, col2 from myKs.myTable where col3 = 1 and col4 = 'foo'

it will respond with information about this command:

{

"type":"select",

"keyspace":"myks",

"target":"mytable",

"columns":["col1","col2"],

"whereColumns":["col3","col4"],

"operators": ["=","="],

"parameters": ["1","'foo'"]

}

This allows you to to determine what the command does and what objects it references.

This microservice is built using Cassandra's own parser, so any single valid CQL command is accepted, including comments, blank lines, semicolons, etc... If Cassandra supports it, this service supports it.

Try it in 1 minute

Start the service:

docker run -d --rm --name cqlparser -p 8094:8099 galliumdata/galliumdata-cqlparser:1.0.0-18

Send a CQL command with curl:

curl -d 'select col1, col2 from myKs.myTable where col3 = 1' http://localhost:8094

Or with wget:

wget -nv -O - --post-data 'select col1, col2 from myKs.myTable where col3 = 1' localhost:8094

You will get the response:

{"keyspace":"myks","operators":["="],"columns":["col1","col2"],"type":"select","whereColumns":["col3"],"parameters":["1"],"target":"mytable"}

which tells you:

Stop the service:

docker stop cqlparser

Performance

This microservice can typically process many thousands of CQL statements per second on a small server, though of course that will depend on the complexity of the CQL statements.

Input format

The request can be either a raw CQL command, or a JSON object with a "cql" attribute containing the command, e.g.:

{"cql": "select col1, col2 from myKs.myTable where col3 = 1"}

Output format

The output will be a JSON document whose format depends on the type of command -- see Examples below.

If no keyspace is specified in the CQL command, the keyspace attribute will not be present in the response.

Cassandra converts most names to lowercase, unless they are enclosed in double quotes.

Examples

The following gives examples of all the types of CQL commands, along with their corresponding output.

SELECT

select * 

from myks.mytable 

where col1 = 'foo' and col2 != 5

Response

{

    "type": "select",

    "keyspace": "myks",

    "target": "mytable",

    "columns": [],

    "whereColumns": [

        "col1",

        "col2"

    ],

    "operators": [

        "=",

        "!="

    ],

    "parameters": [

        "'foo'",

        "5"

    ]

}

// Comments are suported

select col1, 

func(col2), 

func(func(col3)),  -- in the CQL

'aaa' + col4,

/* even in the middle of the code */

col5['att1']['att2']

from myks.mytable 

where col5 = 'foo' and col6 != 5;

{

    "type": "select",

    "keyspace": "myks",

    "target": "mytable",

    "columns": [

        "col1",

        "col2",

        "col3",

        "col4",

"col5"

    ],

    "whereColumns": [

        "col5",

        "col6"

    ],

    "operators": [

        "=",

        "!="

    ],

    "parameters": [

        "'foo'",

        "5"

    ]

}

// Aliases are included if there is at least one.

// whereColumns, operators and parameters

// are not included if there is no where clause.

select col1, 

func(col2), 

col3 as "My Col"

from myks.MyTable;

{

    "type": "select",

    "keyspace": "myks",

    "target": "mytable",

    "columns": [

        "col1",

        "col2",

        "col3"

    ],

    "aliases": [

null,

null,

"My Col"

    ]

}

INSERT

INSERT INTO myKs.mytable (col1, col2, col3)

VALUES (

55b6528e-aa58-4846-83e8-f0e1c23d518f, 

'Foo', 

1234)

IF NOT EXISTS

Response

{

"type": "insert",

"keyspace": "myks",

"target": "mytable",

"columns": ["col1", "col2", "col3"],

"parameters": [

   "55b6528e-aa58-4846-83e8-f0e1c23d518f",

   "'Foo'",

   "1234"

]

}

INSERT INTO myks.myTable JSON '{

"col1" : "55b6528e-aa58-4846-83e8-f0e1c23d518f",

"col2" :"Foo",

"col3" : 1234

}';

Response

{

"type": "insert",

"keyspace": "myks",

"target": "mytable",

"columns": ["col1", "col2", "col3"],

"parameters": [

   "55b6528e-aa58-4846-83e8-f0e1c23d518f",

   "Foo",

   "1234"

]

}

UPDATE

UPDATE myKs.myTable 

set col1 = 1, col2 = 'aaa', col3 = true 

WHERE col4 = 11 AND col5 = myfunction('bbb');

Response

{

"type": "update",

"keyspace": "myks",

"target": "mytable",

"columns": ["col1", "col2", "col3"],

"parameters": [

   "1",

   "'aaa'",

   "true"

],

"whereColumns": ["col4", "col5"],

"parameters2": [

   "11",

   "myfunction('bbb')"

]

}

DELETE

DELETE col1, col2[0], col3['att1']

FROM myKs.myTable

USING TIMESTAMP 1318292885055

WHERE col1 = 'foo' and col2[1] = 99;

Response

{

"type": "delete",

"keyspace": "myks",

"target": "mytable",

"columns": ["col1", "col2", "col3"],

"whereColumns": ["col1", "col2"],

"parameters": [

   "'foo'",

   "99"

]

}

The columns array will be empty if no columns are specified.

BATCH

The batch command returns a slightly different format than any other command.

begin batch

 insert into myks.mytable(col1, col2) values (1,2);

 update myks2.mytable2 set col1 = 'aaa' where col2=1 and col3 = 'foo'; 

APPLY BATCH;

Response

{

    "type": "batch",

    "statements": [

        {

            "type": "insert",

            "keyspace": "myks", 

            "target": "mytable", 

            "columns": ["col1", "col'],

            "parameters": ["1", "2"]

      },

        {

            "type": "update", 

            "keyspace": "myks2", 

            "target": "mytable2", 

            "columns": ["col1"],

            "parameters": ["'aaa'"],

            "whereColumns": ["col2", "col3"],

            "operators": ["=", "="],

            "parameters2": ["1", "'foo'"]

        }

    ]

}

ALTER KEYSPACE

ALTER KEYSPACE "MyKeyspace" 

WITH REPLICATION = {

'class' : 'SimpleStrategy',

'replication_factor' : 2

}

Response

{

"type": "alter keyspace", 

"target": "MyKeyspace"

}


ALTER MATERIALIZED VIEW

ALTER MATERIALIZED VIEW MyKeyspace."My View" 

WITH Comment = 'Hello' 

AND bloom_filter_fp_chance = 0.02

Response

{

"type": "alter view", 

"keyspace": "mykeyspace", 

"target": "My View"

}

ALTER ROLE

ALTER ROLE MYROLE WITH PASSWORD 'foo' LOGIN=true SUPERUSER=false OPTIONS={}

Response

{

"type": "alter role", 

"target": "myrole"

}

ALTER TABLE

ALTER TABLE MyKS.MYTABLE RENAME "Col1" to "Col2"

Response

{

"type": "alter table", 

"keyspace": "myks", 

"target": "mytable"

}

ALTER TYPE

ALTER TYPE MyKS.MyType ADD foo text

Response

{

"type": "alter type", 

"keyspace": "myks", 

"target": "mytype"

}

ALTER USER

ALTER USER MyUser WITH PASSWORD 'foo' NOSUPERUSER

Response

{

"type": "alter role", 

"target": "MyUser"

}

This is not a typo: the command ALTER USER is deprecated and is internally translated to an ALTER ROLE command.

Note that the user name's case is kept even though it is not in double quotes.

CREATE AGGREGATE

create aggregate IF NOT EXISTS MyKS.MyAggregate(int)

SFUNC avgState

STYPE tuple<int,bigint>

FINALFUNC avgFinal

INITCOND (0,0);

Response

{

"type": "create aggregate",

"keyspace": "myks", 

"target": "myaggregate"

}

CREATE INDEX

CREATE INDEX IF NOT EXISTS MyIndex

ON myks."My Table" ( KEYS( "col 1", "Col 2" ) )

Response

{

"type": "create index",

"target": "myindex",

"keyspace2": "myks",

"target2": "My Table"

}

CREATE FUNCTION

CREATE OR REPLACE FUNCTION myks.myfunction

(input double)

CALLED ON NULL INPUT 

RETURNS double 

LANGUAGE java AS 'return Double.valueOf(Math.log(input.doubleValue()));';

Response

{

"type": "create function",

"keyspace": "myks", 

"target": "myfunction",

}

CREATE KEYSPACE

CREATE KEYSPACE "My KS"

WITH REPLICATION = {

  'class' : 'NetworkTopologyStrategy',

  'boston'  : 3 , // Datacenter 1

  'seattle' : 2 , // Datacenter 2

  'tokyo'   : 2   // Datacenter 3

};

Response

{

"type": "create keyspace",

"target": "My KS",

}

CREATE MATERIALIZED VIEW

CREATE MATERIALIZED VIEW "MY KS"."My View"

AS SELECT col1, col2, col3

FROM myks2.mytable

WHERE col1 IS NOT NULL AND col2 IS NOT NULL

PRIMARY KEY (col1, col2)

WITH caching = {

  'keys' : 'ALL', 

  'rows_per_partition' : '100' }

AND comment = 'This is a comment' ;

Response

{

"type": "create view",

"keyspace": "MY KS",

"target": "My View",

"keyspace2": "myks2",

"target2": "mytable",

"columns": ["col1","col2","col3"],

"whereColumns": ["col1", "col2"]

}

CREATE ROLE

CREATE ROLE "My Role"

WITH PASSWORD = 'Foo'

AND LOGIN = true;

Response

{

"type": "create role",

"target": "My Role"

}

CREATE TABLE

CREATE TABLE myks.mytable (

col1 text,

col2 int,

col3 text,

PRIMARY KEY (col1, col2)

)

WITH CLUSTERING ORDER BY (col2 DESC);

Response

{

"type": "create view",

"keyspace": "myks",

"target": "mytable",

"columns": ["col1","col2","col3"]

}

CREATE TRIGGER

CREATE TRIGGER IF NOT EXISTS MyTrigger 

ON MyKS.MyTable

USING 'com.galliumdata.MyTrigger'

Response

{

"type": "create trigger",

"target": "mytrigger",

"keyspace2": "myks",

"target2": "mytable"

}

CREATE TYPE

CREATE TYPE "myKS"."My Type" (

col1 timestamp,

"Col 2" text,

col3 int

);

Response

{

"type": "create type",

"keyspace": "myKS",

"target": "My Type",

"columns": ["col1", "Col 2", "col3"]

}

CREATE USER

CREATE USER myUser WITH PASSWORD 'foo' SUPERUSER

Response

{

"type": "create role",

"target": "myUser"

}

The command CREATE USER is deprecated and is internally translated to a CREATE ROLE command.

Note that the user name's case is kept even though it is not in double quotes.

DROP AGGREGATE

DROP AGGREGATE IF EXISTS myks.myaggregate;

Response

{

"type": "drop aggregate",

"keyspace": "myks",

"target": "myaggregate"

}

DROP FUNCTION

DROP FUNCTION IF EXISTS myKS.myFunction;

Response

{

"type": "drop function",

"keyspace": "myks",

"target": "myfunction"

}

DROP INDEX

DROP INDEX IF EXISTS myKS.myIndex

Response

{

"type": "drop index",

"keyspace": "myks",

"target": "myindex"

}

DROP KEYSPACE

DROP keyspace IF EXISTS myKS

Response

{

"type": "drop keyspace",

"target": "myks"

}

DROP MATERIALIZED VIEW

DROP MATERIALIZED VIEW myks.myview

Response

{

"type": "drop view",

"keyspace": "myks",

"target": "myview"

}

DROP ROLE

DROP ROLE IF EXISTS myRole

Response

{

"type": "drop role",

"target": "myrole"

}

DROP TABLE

DROP TABLE IF EXISTS myks.mytable

Response

{

"type": "drop table",

"keyspace": "myks",

"target": "mytable"

}

DROP TRIGGER

DROP TRIGGER IF EXISTS myTrigger ON myKs.myTable

Response

{

"type": "drop trigger",

"target": "mytrigger",

"keyspace2": "myks",

"target2": "mytable"

}

DROP TYPE

DROP TYPE IF EXISTS myKs.myType

Response

{

"type": "drop type",

"keyspace": "myks",

"target": "mytype"

}

DROP USER

DROP USER IF EXISTS myUser

Response

{

"type": "drop role",

"target": "myUser"

}

The command DROP USER is deprecated and is internally translated to a DROP ROLE command.

Note that the user name's case is kept even though it is not in double quotes.

GRANT

GRANT SELECT ON ALL KEYSPACES TO myRole

Response

{

"type": "grant permissions",

"target": "myrole",

"target2": "data",

"parameters": ["SELECT"]

}

GRANT EXECUTE 

ON ALL FUNCTIONS IN KEYSPACE myKs 

TO myRole

Response

{

"type": "grant permissions",

"target": "myrole",

"target2": "functions/myks",

"parameters": ["EXECUTE"]

}

GRANT ALTER 

ON TABLE myKs.MyTable 

TO myRole

Response

{

"type": "grant permissions",

"target": "myrole",

"target2": "data/myks/mytable",

"parameters": ["ALTER"]

}

GRANT SELECT, EXECUTE 

ON MBEAN 'org.apache.cassandra.db:type=StorageService' 

TO myRole

Response

{

"type": "grant permissions",

"target": "myrole",

"target2": "mbean/org.apache.cassandra.db:type=StorageService",

"parameters": ["SELECT","EXECUTE"]

}

GRANT AUTHORIZE ON ALL ROLES TO myRole

Response

{

"type": "grant permissions",

"target": "myrole",

"target2": "roles",

"parameters": ["AUTHORIZE"]

}

All the GRANT commands use a two- or three-tier hierarchy to define the object on which the permission(s) is granted.

The top level can be: data, functions, roles, or mbean.

The second level is the keyspace name for data and functions, or the name of the object for roles and mbeans.

The third level, for data and functions, is the name of the table or of the function.

LIST PERMISSIONS

LIST ALL PERMISSIONS 

ON TABLE myKs.myTable 

OF myRole NORECURSIVE

Response

{

"type": "list permissions",

"target": "data/myks/mytable",

"target2": "myrole",

"parameters": ["CREATE", "ALTER", "DROP", "SELECT", "MODIFY", "AUTHORIZE", "DESCRIBE", "EXECUTE"]

}

LIST ROLES

LIST ROLES OF myRole NORECURSIVE

Response

{

"type": "list roles",

"target": "myrole"

}

LIST USERS

LIST USERS

Response

{

"type": "list roles"

}

The command LIST USERS is deprecated and is internally translated to a LIST ROLES command.

REVOKE PERMISSIONS

REVOKE SELECT, ALTER

ON myKs.myTable

FROM myRole

Response

{

"type": "revoke permissions",

"target": "data/myks/mytable",

"target2": "myrole",

"parameters": ["SELECT", "ALTER"]

}

The target attribute of the result is in the same format as for grants.

TRUNCATE TABLE

TRUNCATE TABLE myks.mytable

Response

{

"type": "truncate",

"keyspace": "myks",

"target": "mytable"

}

USE KEYSPACE

USE myKs

Response

{

"type": "use",

"target": "myks"

}

License

This microservice is freely available to end users, but you may not redistribute it.

If you are an OEM or VAR, please contact Gallium Data (info@galliumdata.com) and we'll be happy to discuss a licensing deal that can be mutually advantageous.