Manipulating DNS Traffic

DNS (the Domain Name System) is one of the foundations of the internet -- it touches almost every aspect of it. In this article, we'll see how to snoop on all DNS traffic on your machine, and how to modify some of this traffic.

The practical applications of this kind of manipulation should be clear: if you can control DNS, you can change how a lot of your existing systems work, without touching them in any way.

I'll be using the Gallium Data DNS connector. If you just want to look at the DNS traffic on your machine, you can use a tool like Wireshark or tcpdump, but to change the traffic, you'll have to use something like Gallium Data.

You can either read along and see what happens, or, if you're in the mood, you can follow along on your own machine. In that case, I strongly recommend you run through the Gallium Data tutorial for DNS first, to get familiar with the basic principles.

A project containing everything shown in this article is available here. If you import this project, remember to turn off other DNS connections in existing projects (if any). Also, make sure that Gallium Data is set to be the default DNS service on your machine, as shown in the tutorial.

Step 1: What's going on down there?

The first thing to do is to get a sense of how much DNS traffic there is on your computer. We can do that with a simple Request logger in Gallium Data.

Once that's in place, if we visit www.huffpost.com, we can see that the following names are looked up in DNS:

3p-geo.yahoo.com, 3p-udc.yahoo.com, 40f86e6ba526349903a3d140af856430.safeframe.googlesyndication.com, a.clickcertain.com, a.teads.tv, a.tribalfusion.com, a2245.casalemedia.com, acdn.adnxs.com, ad.doubleclick.net, ad.turn.com, ads.celtra.com, ads.yap.yahoo.com, alive.github.com, b1sync.zemanta.com, bam-cell.nr-data.net, bcp.crwdcntrl.net, beacon-sjc2.rubiconproject.com, c.amazon-adsystem.com, c.bing.com, c1.adform.net, c2shb.ssp.yahoo.com, c6ac35c022ca6e449f5703263d07be0b.safeframe.googlesyndication.com, ca4-bid.adsrvr.org, cache-ssl.celtra.com, cambria.assets.huffpost.com, cdn.doubleverify.com, cdn.flashtalking.com, cdn.taboola.com, cdn3.doubleverify.com, choices.trustarc.com, choices.truste.com, cm.g.doubleclick.net, connect.facebook.net, cs.emxdgt.com, dpm.demdex.net, dsum-sec.casalemedia.com, dt.adsafeprotected.com, eb2.3lift.com, eus.rubiconproject.com, f.flashtalking.com, fastlane.rubiconproject.com, fdz.flashtalking.com, fonts.googleapis.com, fonts.gstatic.com, fw.adsafeprotected.com, geo.yahoo.com, googleads4.g.doubleclick.net, htlb.casalemedia.com, ib.adnxs.com, idsync.reson8.com, img.huffingtonpost.com, insight.adsrvr.org, js-agent.newrelic.com, js-sec.indexww.com, krk.kargo.com, lciapi.ninthdecimal.com, mapi.huffpost.com, mapi.huffpost.com, maps.googleapis.com, maps.gstatic.com, match.adsrvr.org, modulous.huffpost.com, p.dlx.addthis.com, pagead2.googlesyndication.com, pixel.mathtag.com, pixel.moatads.com, pixiedust.buzzfeed.com, pr-bh.ybp.yahoo.com, px.moatads.com, quantcast.mgr.consensu.org, readmo.yahoo.com, rtb0.doubleverify.com, s.ad.smaato.net, s.amazon-adsystem.com, s.yimg.com, s0.2mdn.net, s3cf.flashtalking.com, sb.scorecardresearch.com, sb.scorecardresearch.com, secure.insightexpressai.com, securepubads.g.doubleclick.net, servedby.flashtalking.com, signaler-pa.clients6.google.com, ssum-sec.casalemedia.com, static.adsafeprotected.com, stats.g.doubleclick.net, sync.mathtag.com, tagan.adlightning.com, tags.mathtag.com, tapestry.tapad.com, 
tlx.3lift.com, tpc.googlesyndication.com, tps.doubleverify.com, tps11028.doubleverify.com, tps11036.doubleverify.com, tps11038.doubleverify.com, tps704.doubleverify.com, track.celtra.com, ups.analytics.yahoo.com, usw-ca2.adsrvr.org, www.buzzfeed.com, www.facebook.com, www.google-analytics.com, www.googletagmanager.com, www.googletagservices.com, www.huffpost.com, www.yahoo.com, z.moatads.com

Good grief! That's 108 names, most of which do not seem to be related to the news [1].

Actually, on my machine, each of these queries is done twice: once as an A query for the IPv4 address, and once as an AAAA query for the IPv6 address. If both addresses are available, I assume that the browser uses whichever comes back first. That's 216 requests, just to load one page. Obviously there is some caching happening, so the actual number may not always be that high, but still -- that's a lot.
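
To get a feel for where these lookups go, the logged names can be grouped by parent domain. The sketch below is plain JavaScript, not Gallium Data filter code. The sample is just a handful of names from the list above, and parentDomain is a hypothetical helper that naively keeps the last two labels (which would be wrong for suffixes such as co.uk). The query estimate assumes one A and one AAAA query per name, as observed on my machine.

```javascript
// A small sample of logged names (a subset of the list above, with one repeat).
const loggedNames = [
  "ad.doubleclick.net",
  "cm.g.doubleclick.net",
  "stats.g.doubleclick.net",
  "cdn.doubleverify.com",
  "tps.doubleverify.com",
  "www.huffpost.com",
  "mapi.huffpost.com",
  "mapi.huffpost.com",
];

// Hypothetical helper: keep the last two labels of a name.
// Naive on purpose; it does not know about suffixes like "co.uk".
function parentDomain(name) {
  return name.split(".").slice(-2).join(".");
}

function summarize(names) {
  const unique = [...new Set(names)]; // drop repeated log entries
  const byDomain = new Map();
  for (const name of unique) {
    const domain = parentDomain(name);
    byDomain.set(domain, (byDomain.get(domain) || 0) + 1);
  }
  return {
    uniqueNames: unique.length,
    // Assumption: each name is queried once as A and once as AAAA.
    estimatedQueries: unique.length * 2,
    byDomain,
  };
}

const summary = summarize(loggedNames);
console.log(summary.uniqueNames, summary.estimatedQueries);
for (const [domain, count] of summary.byDomain) {
  console.log(domain + ": " + count);
}
```

Run against the real log, a summary like this makes it easier to decide which domains are worth blocking.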

Step 2: Blocking queries

Let's assume we don't want anything to do with these advertisers. The most obvious thing to do would be to block those DNS queries, which is easily done with Gallium Data. We can create a simple request filter with the following value for the Question names parameter:

regex:.+\.googlesyndication\.com

regex:.+\.doubleclick\.net

regex:.+\.adnxs\.com

regex:.+\.amazon-adsystem\.com

regex:.+\.scorecardresearch\.com

regex:.+\.moatads\.com

regex:.+\.doubleverify\.com

regex:.+\.flashtalking\.com

regex:.+\.adsrvr\.org

That is a list of regular expressions, which will match some of the domains that look suspiciously advertising-related. We could make this list more complete, but this is a good start.
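
To see what these patterns do and do not catch, here is a small standalone check in plain JavaScript. It assumes that each pattern must match the whole name (I have not verified how Gallium Data anchors its patterns, so treat that as an assumption), and isBlocked is a hypothetical helper, not part of Gallium Data.

```javascript
// The same patterns as above, without the "regex:" prefix.
const patterns = [
  String.raw`.+\.googlesyndication\.com`,
  String.raw`.+\.doubleclick\.net`,
  String.raw`.+\.adnxs\.com`,
  String.raw`.+\.amazon-adsystem\.com`,
  String.raw`.+\.scorecardresearch\.com`,
  String.raw`.+\.moatads\.com`,
  String.raw`.+\.doubleverify\.com`,
  String.raw`.+\.flashtalking\.com`,
  String.raw`.+\.adsrvr\.org`,
];

// Assumption: a pattern has to match the entire name, so we anchor it.
// DNS names are case-insensitive, hence the "i" flag.
const compiled = patterns.map((p) => new RegExp("^(?:" + p + ")$", "i"));

// Hypothetical helper: true if any pattern matches the name.
function isBlocked(name) {
  return compiled.some((re) => re.test(name));
}

for (const name of ["ad.doubleclick.net", "www.huffpost.com", "doubleclick.net"]) {
  console.log(name, isBlocked(name));
}
```

Note that under this assumption the bare domain doubleclick.net is not matched, because each pattern requires at least one label in front of the domain. That is fine for the names in our log, which all have a subdomain.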

Now all we need to do is add a line of code to the filter:

context.result.skip = true;

which instructs Gallium Data to just drop the packet.

Now let's reload the same Huffington Post page (you may need to use a new private browsing window to avoid caching) -- that's better, a good bit of the advertising is gone. But of course, we're just dropping the DNS requests into the bit bucket, so the poor web page is going to keep looking up those names until it gives up, as we can see in the console. We can do better than that.

Step 3: Returning a bogus address

Instead of dropping the DNS requests, why don't we answer them instead? That way, the web page will at least stop asking.

We can do that with the following code in a request filter:

1 let pkt = context.packet;

2 pkt.isQuery = false;

3 pkt.authoritative = true;

4 let q = pkt.questions[0];

5 let ans;

6 if (q.typeName === "A") {

7 ans = pkt.addAnswer("A");

8 ans.ipAddress = "127.0.0.1";

9 }

10 else if (q.typeName === "AAAA") {

11 ans = pkt.addAnswer("AAAA");

12 ans.ipAddress = "::1";

13 }

14 if (ans) ans.name = q.name;

15 context.result.response = pkt;

16 log.info("Blocked request: " +

q.typeName + ":" + q.name);

The code is a bit more involved this time, but it's not that bad. Here's what it's doing:

  • Line 2 - We take the incoming packet and turn it into a response -- it turns out that for DNS, requests and responses have the same general format, so we can do that.

  • Line 3 - We then mark the packet as authoritative -- we're telling the client (the web browser) that we are the final word on these addresses, so stop asking.

  • Lines 6-8 - If the request is of type A (IPv4 address), we add an answer of type A and set its address to 127.0.0.1, the loopback address, so the connection never leaves the local machine.

  • Lines 10-12 - Same thing but with a request of type AAAA (IPv6 address) -- ::1 is the loopback address in IPv6.

  • Line 14 - We set the name of the answer to the name of the question.

  • Line 15 - We set the variable context.result.response to that packet, which tells Gallium Data to return that packet to the DNS client, rather than forward it to the DNS server.
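
If you'd like to try this logic outside of Gallium Data, the steps above can be sketched with a mock packet. Everything here is a stand-in: makeMockQuery only imitates the few properties used in the filter and is not the real Gallium Data packet object, and sinkhole is a hypothetical function name. Unlike the filter, this version returns null for query types other than A and AAAA, so the caller can decide what to do with them.

```javascript
// Hypothetical stand-in for a DNS query packet. It imitates only the
// properties used in the filter above; it is NOT the Gallium Data object.
function makeMockQuery(name, typeName) {
  return {
    isQuery: true,
    authoritative: false,
    questions: [{ name, typeName }],
    answers: [],
    addAnswer(type) {
      const ans = { typeName: type };
      this.answers.push(ans);
      return ans;
    },
  };
}

// Loopback addresses to hand out, keyed by query type.
const LOOPBACK = new Map([
  ["A", "127.0.0.1"],
  ["AAAA", "::1"],
]);

// Turn a query into a response pointing at the loopback address.
// Returns null for query types we do not handle.
function sinkhole(pkt) {
  const q = pkt.questions[0];
  const address = LOOPBACK.get(q.typeName);
  if (address === undefined) {
    return null;
  }
  pkt.isQuery = false;      // it is now a response
  pkt.authoritative = true; // we are the final word on this name
  const ans = pkt.addAnswer(q.typeName);
  ans.ipAddress = address;
  ans.name = q.name;
  return pkt;
}

const reply = sinkhole(makeMockQuery("ad.doubleclick.net", "AAAA"));
console.log(reply.answers[0].name, reply.answers[0].ipAddress);
```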


Now let's reload that Huffington Post page one more time. This time, it feels a bit faster, and the DNS requests are no longer repeated because we returned authoritative answers.

Gallium Data actually has a special filter type to do exactly this without code, but I thought it would be more interesting to do it "by hand", so to speak.

Step 4: Going crazy

So far, we've done more or less the type of thing that ad blockers do. But how far can we push this? Could we go insane and send people to the Washington Post when they ask for the Huffington Post?

Only one way to know -- try it. We'll create a filter in Gallium Data that will intercept requests for www.huffpost.com, change that address to www.washingtonpost.com, send it to the DNS server, then intercept the response and make it look like it's for the Huffington Post. This feels sneaky, doesn't it?

To do this, we'll create a duplex filter (which gets called for both requests and responses), set it to kick in for www.huffpost.com, and give it the following code:

let pkt = context.packet;

if (context.packetType === "query") {

pkt.questions[0].name = "www.washingtonpost.com";

log.debug("Changed HuffPost query to WaPo");

}

else {

log.debug("Changing response back to HuffPost");

pkt.questions[0].name = "www.huffpost.com";

for (let ans of pkt.answers) {

ans.name = "www.huffpost.com";

}

}

The code is straightforward: if we receive a request (a.k.a. a query), we'll change the name being looked up from www.huffpost.com to www.washingtonpost.com, and if it's a response, we'll change that back to www.huffpost.com -- but the response will contain the IP address of the Washington Post. Surely, our evil plan cannot possibly fail!
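
The round trip can be simulated without any network at all. In this sketch, rewrite mirrors the filter logic, while fakeUpstream and its addresses are entirely made up (the 192.0.2.x range is reserved for documentation); the packets are plain objects rather than Gallium Data packets.

```javascript
const FROM = "www.huffpost.com";
const TO = "www.washingtonpost.com";

// Same logic as the duplex filter, written as a plain function.
function rewrite(pkt, packetType) {
  if (packetType === "query") {
    pkt.questions[0].name = TO;
  } else {
    pkt.questions[0].name = FROM;
    for (const ans of pkt.answers) {
      ans.name = FROM;
    }
  }
  return pkt;
}

// Hypothetical upstream DNS server with made-up addresses.
const fakeZone = new Map([
  [FROM, "192.0.2.20"],
  [TO, "192.0.2.10"],
]);

function fakeUpstream(query) {
  const name = query.questions[0].name;
  return {
    questions: [{ name, typeName: "A" }],
    answers: [{ name, typeName: "A", ipAddress: fakeZone.get(name) }],
  };
}

// Client asks for HuffPost; the query is rewritten on the way out
// and the response is rewritten on the way back.
const query = { questions: [{ name: FROM, typeName: "A" }], answers: [] };
const response = rewrite(fakeUpstream(rewrite(query, "query")), "response");
console.log(response.answers[0].name, response.answers[0].ipAddress);
```

The client sees an answer labeled www.huffpost.com that carries the other site's (made-up) address. Real responses may also contain CNAME chains, in which case renaming every answer would probably need more care than this.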

Let's reload that Huffington Post page one last time (again, a new private browsing window might be helpful), and... Well, it doesn't work, and we get a certificate error in the browser. What happened?

Actually, our evil plan did work -- the browser asked for the address for www.huffpost.com, Gallium Data intercepted that and ran our code, which substituted www.washingtonpost.com in the request, and back again when the response came back. So our web browser was in fact directed to the Washington Post.

But there are at least two reasons why the page refused to load.

First, the browser expected to talk to www.huffpost.com but, when it established the TLS (SSL) connection to the Washington Post server, it did not receive a certificate for that name. That's a huge red flag, and the browser refused to go any further.

Second, even if there were no TLS and we were back in the innocent days of plain HTTP, the page still would not have loaded because the browser would have asked for the site www.huffpost.com in its Host header, and the Washington Post server would have responded with something like "I think you're very confused -- goodbye".
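
The first of these checks can be illustrated with a very simplified version of certificate name matching. This is only a sketch: real browsers follow considerably more detailed rules, the list of certificate names below is hypothetical (I have not inspected the actual certificate), and both function names are invented for this example.

```javascript
// Simplified name matching: labels must match one by one, and a "*"
// is only accepted as the entire leftmost label of the pattern.
function matchesName(hostname, pattern) {
  const hostLabels = hostname.toLowerCase().split(".");
  const patternLabels = pattern.toLowerCase().split(".");
  if (hostLabels.length !== patternLabels.length) {
    return false;
  }
  return patternLabels.every(
    (label, i) => (i === 0 && label === "*") || label === hostLabels[i]
  );
}

// True if any name listed in the certificate covers the hostname.
function certificateCovers(hostname, certificateNames) {
  return certificateNames.some((name) => matchesName(hostname, name));
}

// Hypothetical names for the server we were redirected to.
const hypotheticalNames = ["*.washingtonpost.com", "washingtonpost.com"];

console.log(certificateCovers("www.washingtonpost.com", hypotheticalNames));
console.log(certificateCovers("www.huffpost.com", hypotheticalNames));
```

The browser performs this kind of check using the name the user asked for, not the name we smuggled into the DNS query, which is why the redirect gets caught.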

So, to exactly nobody's surprise, this whole scenario has clearly been foreseen by the very smart people who think these things up. That's a relief.

This kind of shenanigans, by the way, is why DNSSEC has evolved and is slowly being rolled out. It's pretty clear that this stuff is too easy to meddle with.

Conclusion

DNS is a foundation of the internet, so taking control of it opens up all kinds of possibilities. We've only scratched the surface here.

For instance, DNS is often used for service discovery, to figure out what is available nearby: printers, scanners, web services, etc. Imagine the possibilities.

It's also intriguing to think of resolving names differently depending on any number of factors: the client address, the time of day, or whatever makes sense for you. The same name doesn't have to mean the same thing to everyone. A lot of load balancers use this mechanism, so why not generalize it?
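
As a rough idea of what such a policy could look like, here is a standalone sketch that picks an address based on the client's IPv4 address and the hour of the day. All addresses, the 10.0.0.0/8 "internal" range, the night-time rule, and the function names are made up for illustration; this is not Gallium Data code.

```javascript
// Convert a dotted IPv4 address to a number.
function ipv4ToInt(ip) {
  return ip.split(".").reduce((acc, octet) => acc * 256 + Number(octet), 0);
}

// True if the IPv4 address falls inside the CIDR block, e.g. "10.0.0.0/8".
function inCidr(ip, cidr) {
  const [base, prefix] = cidr.split("/");
  const size = 2 ** (32 - Number(prefix));
  const start = Math.floor(ipv4ToInt(base) / size) * size;
  const value = ipv4ToInt(ip);
  return value >= start && value < start + size;
}

// Hypothetical policy: internal clients get an internal server,
// everyone else gets a different answer at night than during the day.
function resolveFor(clientIp, hour) {
  if (inCidr(clientIp, "10.0.0.0/8")) {
    return "10.1.1.5"; // made-up internal address
  }
  if (hour >= 22 || hour < 6) {
    return "192.0.2.99"; // made-up night-time address
  }
  return "192.0.2.10"; // made-up default address
}

console.log(resolveFor("10.2.3.4", 12));
console.log(resolveFor("203.0.113.7", 23));
console.log(resolveFor("203.0.113.7", 12));
```

In a real filter, the returned address would go into an answer on the response packet, much like the loopback addresses in Step 3.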

Also, Gallium Data is usually used as a proxy for databases, which means that database clients need to be directed to the proxy instead of talking directly to the database. This DNS proxy can come in handy for this, or for any other situation that requires smart name resolution.

If you enjoyed this exercise, you may want to check out Mess with DNS -- a fun and easy way to play with DNS on the server side.

Intercepting network traffic and modifying it on the fly can be a lot of fun. I encourage you to play with it, and if you use PostgreSQL, MongoDB, MySQL or SQL Server, you can have even more fun messing around with their network traffic.

[1] - I'm not picking on Huffington Post, this is (sadly) fairly normal these days. For instance, the Gallium Data home page looks up 17 addresses (we have no say in that). That's better than 108, but still. Fox News looks up 132 names.