Network debugging on Windows
Sometimes, things may not work as you expect. The most common problem is probably a failed attempt to connect to a database through Gallium Data, but many other scenarios can also result in an error.
To debug the problem in depth on Windows, you can capture the network traffic to and from Gallium Data into a file, using Microsoft's pktmon tool, which is normally included in modern versions of Windows. This network capture file will almost always allow Gallium Data support to help you figure out what is going wrong.
1 - Start a Gallium Data instance
Make sure this instance is on a different machine than both the database server and the database client. This is because pktmon cannot capture loopback traffic.
Next, we need to know the two ports we're going to monitor.
In Gallium Data, take a look at the connection you want to debug, and note the incoming and outgoing ports (they might be the same).
Now, take a look at how you map ports in the Docker command you use to run Gallium Data. We're interested in the incoming port, which is mapped in Docker to the port labelled "Local port" in the Gallium Data connection. In the diagram below, that would be port 1432, assuming the Gallium Data instance is run with a command along the lines of:
docker run -d -p 8089:8080 -p 1432:1431 galliumdata/gallium-data-engine:X.X.X
So in this example, the two ports we're interested in are 1432 -- the port to which the database client connects -- and 1433 -- the port that Gallium Data uses to connect to the database server. It's possible for them to be the same number. We'll need these two ports in a minute.
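As an illustration only, the port bookkeeping above can be sketched in a few lines of Python. This is not part of Gallium Data: the helper name host_port_for is hypothetical, and the values 1431 (Local port), 1432 (host port) and 1433 (database server port) are taken from the example above. The sketch assumes the simple host:container form of the -p option, with no bind address or protocol suffix.

```python
def host_port_for(container_port, mappings):
    """Return the host port that Docker maps to the given container port.

    `mappings` holds the values passed to `docker run -p`, in the simple
    "host:container" form only (no bind IP, no protocol suffix).
    """
    for mapping in mappings:
        host, container = mapping.split(":")
        if int(container) == container_port:
            return int(host)
    raise ValueError(f"no -p mapping for container port {container_port}")


# Values from the example above: Local port 1431, database server on 1433.
mappings = ["8089:8080", "1432:1431"]
incoming = host_port_for(1431, mappings)  # port the database client connects to
outgoing = 1433                           # port used to reach the database server

# A set removes the duplicate when both ports happen to be the same number.
ports_to_monitor = sorted({incoming, outgoing})
print(ports_to_monitor)  # prints [1432, 1433]
```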
2 - Make sure you can reproduce the problem
Typically that means logging into the database from a database client, or performing a specific database operation.
Whatever it is, make sure you can reproduce the problem consistently. The less traffic we capture, the easier it will be to debug, so ideally the problem should be reproducible with just one operation, e.g. logging in, or running a specific query.
Once you are satisfied that the problem can be reproduced consistently, set things up so that you can reproduce it one more time, but don't perform the operation that causes an error yet.
3 - Turn on network tracing
In step 1, you took note of which ports are in use. We'll call these <portA> and <portB> -- in real life, they might be 1432 and 1433, for instance.
On the Windows machine that is running Gallium Data in Docker, open an administrator command line and create a directory in a convenient place to hold the network traffic files. You should not typically need much disk space: a few megabytes should be enough. Make this directory the current directory.
Run the following command:
pktmon filter add -p <portA> -t tcp psh fin
If <portA> and <portB> are not the same, you'll also need to run:
pktmon filter add -p <portB> -t tcp psh fin
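For illustration, the filter commands for a given pair of ports could be generated with a short Python sketch such as the following. The function name pktmon_filter_commands is hypothetical; the command text it produces is exactly the one shown above, and the sketch only prints the commands rather than running them.

```python
def pktmon_filter_commands(port_a, port_b):
    """Build the `pktmon filter add` command lines for the two ports.

    When both ports are the same number, a single filter is enough.
    """
    ports = [port_a] if port_a == port_b else [port_a, port_b]
    return [f"pktmon filter add -p {port} -t tcp psh fin" for port in ports]


# Example with the ports used throughout this page.
for command in pktmon_filter_commands(1432, 1433):
    print(command)
```

The printed lines can then be pasted into the administrator command line.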
To make sure these filters are in place, run:
pktmon filter list
The output should be similar to the following (you may have only one filter, and the port numbers may be different, of course):
Packet Filters:
    # Name    Protocol      Port
    - ----    --------      ----
    1 <empty> TCP (FIN PSH) 1432
    2 <empty> TCP (FIN PSH) 1433
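If you see other filters that you did not just create, you may want to remove them; otherwise they will pollute the output with irrelevant traffic. Note that pktmon filter remove clears all filters, so you will need to add your own filters again afterwards.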
We're now ready to start tracing with the following command:
pktmon start -c --pkt-size 9000
From that point forward, all network traffic matching these filters will be saved into a file.
You should see a status message similar to the following:
Logger Parameters:
    Logger name:   PktMon
    Logging mode:  Circular
    Log file:      C:\Users\Administrator\Documents\Capture\PktMon.etl
    Max file size: 512 MB
    Memory used:   128 MB
Collected Data:
    Packet counters, packet capture
Capture Type:
    All packets
Monitored Components:
    All
Packet Filters:
    # Name    Protocol      Port
    - ----    --------      ----
    1 <empty> TCP (FIN PSH) 1432
    2 <empty> TCP (FIN PSH) 1433
Important: there is no need to rush, but you should avoid leaving pktmon active for more than a minute or two; otherwise it may capture a lot of unnecessary network traffic and fill up your disk.
Now reproduce the problem by performing whatever operation causes it, like you did in step 2.
As soon as the problem has been reproduced, run the following commands:
pktmon stop
pktmon pcapng PktMon.etl -o traffic.pcapng
pktmon filter remove
At this point, you should have a file called traffic.pcapng containing the network traffic on the ports you specified. You can zip it if desired; it should typically compress by more than 90%.
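If Python happens to be available on the machine, the compression step could be done with a minimal sketch like the one below. Any zip tool works just as well. The function name zip_capture and the archive name traffic.zip are assumptions made for this example; only the file name traffic.pcapng comes from the steps above.

```python
import zipfile
from pathlib import Path


def zip_capture(capture_path, zip_path):
    """Compress the capture file into a zip archive and return the archive path."""
    capture = Path(capture_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # Store the file under its bare name, without the directory part.
        archive.write(capture, arcname=capture.name)
    return Path(zip_path)


if __name__ == "__main__":
    source = Path("traffic.pcapng")
    if source.exists():
        result = zip_capture(source, "traffic.zip")
        print(f"{source.stat().st_size} bytes -> {result.stat().st_size} bytes")
    else:
        print("traffic.pcapng not found in the current directory")
```

Run it from the directory that holds the capture. It prints the size before and after compression, so you can see how much the file shrank.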