Example: modifying PDFs

Many databases contain PDF documents stored as blobs: contracts, invoices, and so on.

When these documents are retrieved by various database clients and applications, it is sometimes desirable to modify them on the fly. This example shows how to do that with minimal effort.

This example targets SQL Server, but it can easily be adapted to other databases.

Step 1: add the OpenPDF library

In Gallium Data, go to Libraries -> Find, enter Organization: com.github.librepdf and Artifact: openpdf, then add version 1.4.2. Do not install version 2.x, which is not compatible.

Step 2: create a filter

We assume that there is a table called gallium_demo.documents with a column named pdf of type VARBINARY.

Create a result set filter for SQL Server and give it a name, for example "Modify PDFs".

Set the parameters for this new filter so that it gets invoked for the table, e.g.:

Query pattern = regex:select.*from.*documents.*
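To get a feel for which queries this pattern selects, it can be tried outside of Gallium Data. The following is only a rough sketch in plain JavaScript: it assumes that the pattern is applied case-insensitively to the whole query text, which may not be exactly how Gallium Data evaluates it, and the helper name wouldTriggerFilter is made up for illustration.

```javascript
// Assumption: the text after "regex:" is matched case-insensitively
// against the entire query. Check the Gallium Data documentation for
// the exact matching rules.
const queryPattern = new RegExp("^select.*from.*documents.*$", "is");

// Hypothetical helper, for illustration only
function wouldTriggerFilter(sql) {
    return queryPattern.test(sql.trim());
}

const samples = [
    "SELECT id, pdf FROM gallium_demo.documents WHERE id = 1",
    "select * from gallium_demo.customers",
    "update gallium_demo.documents set pdf = null",
    "select name from customers where notes like '%documents%'"
];
for (const sql of samples) {
    console.log(wouldTriggerFilter(sql), sql);
}
```

Under these assumptions, the first and last samples match. The last one shows that the pattern is quite loose: any select that mentions "documents" after its from clause will match, so a stricter pattern may be preferable in a real deployment.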

Step 3: add metadata to the PDF

The code for the filter is:

// Nothing to do if this row has no document
if (context.packet.pdf === null) {
    return;
}

// Open the PDF from the bytes in the pdf column
const PdfReader = Java.type("com.lowagie.text.pdf.PdfReader");
let reader = new PdfReader(context.packet.pdf);

// The modified PDF will be written to this in-memory stream
const ByteArrayOutputStream = Java.type("java.io.ByteArrayOutputStream");
let baos = new ByteArrayOutputStream();

const PdfStamper = Java.type("com.lowagie.text.pdf.PdfStamper");
let stamp = new PdfStamper(reader, baos);

// Set the Author entry in the document's metadata
stamp.setInfoDictionary({Author: "Hello from Gallium Data"});
stamp.close();

// Replace the column value with the modified PDF
context.packet.pdf = baos.toByteArray();
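The filter code assumes that every non-null value in the pdf column really is a PDF. If that is not guaranteed, a small guard can be run before handing the bytes to PdfReader. The following is a minimal sketch, not part of OpenPDF or Gallium Data: the name looksLikePdf is hypothetical, and it assumes that the column value can be indexed like an array of byte values.

```javascript
// "%PDF-" as byte values: a well-formed PDF file starts with this marker
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2D];

// Hypothetical helper: returns true if the bytes start with "%PDF-"
function looksLikePdf(bytes) {
    if (bytes === null || bytes === undefined || bytes.length < PDF_MAGIC.length) {
        return false;
    }
    for (let i = 0; i < PDF_MAGIC.length; i++) {
        if (bytes[i] !== PDF_MAGIC[i]) {
            return false;
        }
    }
    return true;
}

// Possible use at the top of the filter (assumption, shown as a comment
// because context only exists inside Gallium Data):
// if (!looksLikePdf(context.packet.pdf)) {
//     return;
// }
```

With such a guard, rows that contain something other than a PDF would be passed through unchanged instead of causing an error in the filter.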

Step 4: test

When you retrieve a PDF from the database, using a query that triggers your new filter, you will see that the metadata has been added to the document.

Step 5: add a watermark

Now let's add a watermark to each page. Set the filter code to:

// Nothing to do if this row has no document
if (context.packet.pdf === null) {
    return;
}

const PdfReader = Java.type("com.lowagie.text.pdf.PdfReader");
let reader = new PdfReader(context.packet.pdf);

const ByteArrayOutputStream = Java.type("java.io.ByteArrayOutputStream");
let outStr = new ByteArrayOutputStream();

const PdfStamper = Java.type("com.lowagie.text.pdf.PdfStamper");
let stamp = new PdfStamper(reader, outStr);

const BaseFont = Java.type("com.lowagie.text.pdf.BaseFont");
let bf = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.WINANSI, BaseFont.EMBEDDED);
const Element = Java.type("com.lowagie.text.Element");

// Draw the watermark text on top of each page (pages are numbered from 1)
for (let i = 1; i <= reader.getNumberOfPages(); i++) {
    let over = stamp.getOverContent(i);
    over.beginText();
    over.setFontAndSize(bf, 96);
    // Blue, partly transparent (the last argument is the alpha value)
    over.setRGBColorFill(0, 0, 255, 100);
    // Text starts at x=70, y=130 and is rotated by 45 degrees
    over.showTextAligned(Element.ALIGN_LEFT, "GALLIUM DATA", 70, 130, 45);
    over.endText();
}

stamp.close();

// Replace the column value with the watermarked PDF
context.packet.pdf = outStr.toByteArray();

When you retrieve a PDF from the database, using a query that triggers your new filter, you will see that the watermark has been added to the document.
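The position (70, 130) and the rotation (45 degrees) in the watermark code are fixed values that happen to suit a typical portrait page. For documents with other page sizes, the placement could be derived from each page's dimensions instead. The following is a minimal sketch of that calculation: the function name diagonalPlacement is hypothetical, and it assumes unrotated pages whose width and height are obtained from reader.getPageSize(i).

```javascript
// Hypothetical helper: computes where to draw a watermark so that it is
// centered on the page and follows the bottom-left to top-right diagonal.
function diagonalPlacement(pageWidth, pageHeight) {
    return {
        x: pageWidth / 2,
        y: pageHeight / 2,
        // Angle of the page diagonal, converted from radians to degrees
        angle: Math.atan2(pageHeight, pageWidth) * 180 / Math.PI
    };
}

// Possible use inside the loop of the filter (assumption, shown as
// comments because reader and over only exist inside the filter):
// let size = reader.getPageSize(i);
// let p = diagonalPlacement(size.getWidth(), size.getHeight());
// over.showTextAligned(Element.ALIGN_CENTER, "GALLIUM DATA", p.x, p.y, p.angle);

// A US Letter page is 612 x 792 points
console.log(diagonalPlacement(612, 792));
```

Using Element.ALIGN_CENTER instead of Element.ALIGN_LEFT makes the text extend equally on both sides of the computed point, so the watermark stays roughly centered whatever the page size.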


You can easily make the watermark invisible to the naked eye by giving it an alpha value of zero:

over.setRGBColorFill(0, 0, 255, 0);


See the OpenPDF documentation and the API documentation for more details.