Introduction
Every morning, the infosec field is greeted with an onslaught of freshly registered malicious domains. These domains are used to host phishing sites, maintain botnet command and control, harvest stolen information, and more.
Having the complete list of registered domains day-by-day offers substantial visibility that can be used for intel and response. Fortunately, such lists not only exist, but are available (usually for free!) with little effort involved. This post will introduce TLD zone files, how to access them, and how they can be used to your benefit.
Zone Files
Before being swamped with domains, let’s talk a little about how these lists of domains are organized. Someone has to keep track of all the domains for a certain TLD (.com, .net, .ninja, etc.). The organizations that do this are called registries. Each registry maintains a master list of all the domains it is responsible for. This master list is called a “zone file”.
It’s the registry’s responsibility to maintain this zone file. As you can imagine, a TLD’s zone file is updated many times a day as new domains are registered, other domains expire, and nameserver records are changed.
I Just Want to Download the Data!
So we know what zone files are for, but how do we access them? As mentioned before, each registry is responsible for maintaining the zone file for its TLD, but it is also responsible for providing access to that zone file. This means that in some cases we’ll need to go directly to the registry, but there are some helpful exceptions.
.COM, .NET, and .NAME
Let’s start with the most obvious ones: .com, .net, and .name (since it’s bundled). These are maintained by Verisign. Access to these zone files consists of downloading a Zone Access Form and emailing the completed form to [email protected].
It took a couple of weeks for this access to be granted. After your form is approved, you will receive FTP credentials that can be used to download the zone files daily.
root@tld:~# ftp rz.verisign-grs.com
Connected to rz.verisign-grs.com.
220-**** Welcome to the VeriSign Global Registry Services gTLD Zone FTP Server ****
220-***
220-*** This computer system is owned and operated by VeriSign, Inc.
220-*** All software or information that you access or download from this
220-*** server is being licensed to you under the terms of our Registrar
220-*** License and Agreement. Unauthorized access to this system may
220-*** result in criminal prosecution.
220-***
220-*** All sessions established with this server are monitored and logged.
220-*** Disconnect now if you do not consent to having your actions monitored
220-*** and logged.
220-***
220-******!
220
Name: [redacted]
331 Please specify the password.
Password:
230 Login successful.
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> ls
200 PORT command successful. Consider using PASV.
150 Here comes the directory listing.
<snip>
-rw-r--r-- 1 ftp ftp 2497503218 Sep 29 15:20 com.zone.gz
-rw-r--r-- 1 ftp ftp 321976673 Sep 29 15:12 net.zone.gz
226 Directory send OK.
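A daily download like the session above could be scripted with Python’s standard ftplib module. This is only a minimal sketch: the host and file names are taken from the session above, while the function names (fetch_zone, fetch_all) and the destination directory are hypothetical choices made for illustration. The credentials are whatever Verisign issues to you.

```python
import ftplib
from pathlib import Path

ZONE_HOST = "rz.verisign-grs.com"  # host shown in the session above


def fetch_zone(ftp, remote_name, dest_dir):
    """Download one file from an already-open FTP session into dest_dir."""
    dest = Path(dest_dir) / remote_name
    with open(dest, "wb") as fh:
        # retrbinary streams the file in chunks and passes each chunk to fh.write
        ftp.retrbinary("RETR " + remote_name, fh.write)
    return dest


def fetch_all(user, password, names=("com.zone.gz", "net.zone.gz"), dest_dir="."):
    """Log in and fetch each zone file (hypothetical helper, not Verisign tooling)."""
    with ftplib.FTP(ZONE_HOST) as ftp:
        ftp.login(user, password)
        return [fetch_zone(ftp, name, dest_dir) for name in names]
```

Running fetch_all from a daily cron job would leave a fresh com.zone.gz and net.zone.gz in the chosen directory. The .com file is a couple of gigabytes compressed, so plan disk space accordingly.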
We’ll take a look at what we can do with these soon. First, let’s talk about how we can catch all that malware on .ninja.
The Centralized Zone Data Service (CZDS)
You’ve likely noticed that there are a ton of new gTLDs appearing. At the time of this writing, there are 1070 valid and sponsored TLDs approved by the IANA - a department of ICANN.
Since each registry maintains its own zone file, it’s overwhelming to try to get access to all of them separately. Fortunately, ICANN solved this problem by creating the Centralized Zone Data Service (CZDS).
CZDS “provides a centralized access point… to the Zone Files provided by participating Top Level Domains”. This means that, by registering with CZDS, we can request access to the zone files for most TLDs (including those new gTLDs) in one place.
When you get access to a particular zone file, you’re able to download it via ICANN’s API. They even provide a Python client that can be used to bulk download all the zone files you have access to.
Unfortunately, you may not get access to all zone files. In fact, according to the most recent report released by ICANN, TLDs such as .aaa have only 3 people authorized to use CZDS to download the zone file. We’ll work with what we have, I suppose.
Ok, we have enough data. Let’s start parsing.
Parsing Zone Files
If you want to parse everything about the zone files, you can read about the full format in RFC 1035, but this post is only interested in the domain -> nameserver mappings. So, let’s just start by taking a look at the contents of the file.
root@tld:~# head com.zone -n 50
; The use of the Data contained in Verisign Inc.'s aggregated
; .com, and .net top-level domain zone files (including the checksum
; files) is subject to the restrictions described in the access Agreement
; with Verisign Inc.
$ORIGIN COM.
$TTL 900
@ IN SOA a.gtld-servers.net. nstld.verisign-grs.com. (
1443370544 ;serial
1800 ;refresh every 30 min
900 ;retry every 15 min
604800 ;expire after a week
86400 ;minimum of a day
)
$TTL 172800
NS A.GTLD-SERVERS.NET.
NS G.GTLD-SERVERS.NET.
NS H.GTLD-SERVERS.NET.
NS C.GTLD-SERVERS.NET.
NS I.GTLD-SERVERS.NET.
NS B.GTLD-SERVERS.NET.
NS D.GTLD-SERVERS.NET.
NS L.GTLD-SERVERS.NET.
NS F.GTLD-SERVERS.NET.
NS J.GTLD-SERVERS.NET.
NS K.GTLD-SERVERS.NET.
NS E.GTLD-SERVERS.NET.
NS M.GTLD-SERVERS.NET.
COM. 86400 DNSKEY 257 3 8 AQPD<snip>
COM. 86400 DNSKEY 256 3 8 AQOp<snip>
COM. 86400 NSEC3PARAM 1 0 0 -
COM. 900 RRSIG SOA 8 1 900 20151004161544 20150927150544 35864 COM. MpW<snip>
COM. RRSIG NS 8 1 172800 20151003045209 20150926034209 35864 COM. mcxl<snip>
COM. 86400 RRSIG NSEC3PARAM 8 1 86400 20151003045209 20150926034209 35864 COM. SLk71<snip>
COM. 86400 RRSIG DNSKEY 8 1 86400 20150930182533 20150923182033 30909 COM. pDtt<snip>
KITCHENEROKTOBERFEST NS NS1.HOSTINGNET
KITCHENEROKTOBERFEST NS NS2.HOSTINGNET
KITCHENFLOORTILE NS NS1.HOSTINGNET
The first 35 lines of the file include some information about the zone itself: the SOA record, the authoritative name servers for .com (the gTLD servers), DNSSEC records, etc. The actual meat of the file starts on line 36.
Typically, zone files contain lines that have the following format:
- Name
- TTL
- Record Class
- Record Type
- Record Data
In our example, the TTL and record class are omitted (they fall back to defaults), and we see that KITCHENEROKTOBERFEST.COM (the .com is understood) points to the name servers at NS1.HOSTINGNET.COM and NS2.HOSTINGNET.COM.
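To make the field layout concrete, here is a rough Python sketch that splits a single-line record into those five fields. It assumes the optional TTL and class may be missing, that names without a trailing dot are relative to the origin (COM. here), and that the owner name is present on the line. It does not handle multi-line records such as the SOA, or lines that inherit the owner name from the previous record. Record and parse_record are hypothetical names used only for this example.

```python
from typing import NamedTuple, Optional

CLASSES = {"IN", "CH", "HS"}  # record classes defined for DNS


class Record(NamedTuple):
    name: str
    ttl: Optional[int]
    rclass: Optional[str]
    rtype: str
    data: str


def parse_record(line, origin="COM."):
    """Split one single-line zone record into name, TTL, class, type, data."""
    tokens = line.split()
    name = tokens.pop(0)
    ttl = rclass = None
    # TTL and class are both optional and may come in either order
    while tokens and (tokens[0].isdigit() or tokens[0].upper() in CLASSES):
        tok = tokens.pop(0)
        if tok.isdigit():
            ttl = int(tok)
        else:
            rclass = tok.upper()
    rtype = tokens.pop(0).upper()
    data = " ".join(tokens)
    # "@" means the origin itself; other names without a trailing dot are relative
    if name == "@":
        name = origin
    elif not name.endswith("."):
        name = name + "." + origin
    if rtype == "NS" and not data.endswith("."):
        data = data + "." + origin
    return Record(name, ttl, rclass, rtype, data)


print(parse_record("KITCHENEROKTOBERFEST NS NS1.HOSTINGNET"))
# Record(name='KITCHENEROKTOBERFEST.COM.', ttl=None, rclass=None, rtype='NS', data='NS1.HOSTINGNET.COM.')
```

A line such as COM. 86400 DNSKEY 257 3 8 ... parses the same way, with ttl set to 86400 and the class left as None.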
In addition to lines showing how domain names map to nameservers, at the bottom of the file we have the A records (IP addresses) for each name server in the file. But what if we want to remove all the “fluff” and only keep the lines showing which domains map to which nameservers?
We can grab the “interesting” lines using a simple grep -E "^[a-zA-Z0-9-]+ NS ." com.zone, which will give us only the lines with domains pointing to nameservers.
root@tld:~# grep -E "^[a-zA-Z0-9-]+ NS ." com.zone | head -n 10
KITCHENEROKTOBERFEST NS NS1.HOSTINGNET
KITCHENEROKTOBERFEST NS NS2.HOSTINGNET
KITCHENFLOORTILE NS NS1.HOSTINGNET
KITCHENFLOORTILE NS NS2.HOSTINGNET
KITCHENTABLESET NS NS1.HOSTINGNET
KITCHENTABLESET NS NS2.HOSTINGNET
KITEPICTURES NS NS1.HOSTINGNET
KITEPICTURES NS NS2.HOSTINGNET
BOYSBOXERS NS NS1.HOSTINGNET
BOYSBOXERS NS NS2.HOSTINGNET
root@tld:~# grep -E "^[a-zA-Z0-9-]+ NS ." com.zone | wc -l
281899907
Awesome. We can parse this output to do anything we want with our list of domains.
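As one possible starting point, the same filter can be applied in Python directly against the compressed file, building a domain -> nameserver mapping along the way. This is a sketch, not a tuned implementation: the regular expression mirrors the grep pattern above, names are lowercased for easier comparison, and iter_delegations and load_delegations are hypothetical names. Note that holding every .com delegation in one dictionary takes a lot of memory, so for the largest zones you may prefer to consume iter_delegations as a stream.

```python
import gzip
import re
from collections import defaultdict

# Same pattern as the grep above, with groups for the domain and the nameserver
NS_LINE = re.compile(r"^([a-zA-Z0-9-]+) NS (\S+)")


def iter_delegations(lines):
    """Yield (domain, nameserver) pairs from an iterable of zone-file lines."""
    for line in lines:
        match = NS_LINE.match(line)
        if match:
            yield match.group(1).lower(), match.group(2).lower()


def load_delegations(path):
    """Build {domain: set(nameservers)} from a gzipped zone file."""
    mapping = defaultdict(set)
    with gzip.open(path, "rt", encoding="ascii", errors="replace") as fh:
        for domain, nameserver in iter_delegations(fh):
            mapping[domain].add(nameserver)
    return mapping
```

Calling load_delegations("com.zone.gz") would then give a dictionary keyed by domain label (without the .com suffix, as in the zone file).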
Conclusion
There are a ton of use cases for this data in terms of information security, such as typo-squat monitoring, DGA monitoring, bit-flip monitoring, etc. However, while this is an infosec blog, zone files can be used for far more than that. They could be used to detect naming trends, watch for certain keywords, and more.
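To give a flavor of the bit-flip case, here is a small sketch that generates every name differing from a watched name by a single flipped bit and checks which of them appear in a set of registered names. The function names are hypothetical, and it assumes the registered names have been lowercased and stripped of the TLD, the way they appear in the zone file.

```python
import string

# Characters allowed in a hostname label
VALID = set(string.ascii_lowercase + string.digits + "-")


def bitflip_variants(label):
    """Return valid labels that differ from `label` by one flipped bit in one character."""
    label = label.lower()
    variants = set()
    for i, ch in enumerate(label):
        for bit in range(8):
            flipped = chr(ord(ch) ^ (1 << bit)).lower()
            # skip invalid characters and flips that only change letter case
            if flipped in VALID and flipped != ch:
                candidate = label[:i] + flipped + label[i + 1:]
                # a label may not start or end with a hyphen
                if not (candidate.startswith("-") or candidate.endswith("-")):
                    variants.add(candidate)
    return variants


def registered_bitflips(label, registered):
    """Return the bit-flip variants of `label` found in `registered`, sorted."""
    return sorted(bitflip_variants(label) & set(registered))
```

For example, registered_bitflips("example", names) would report a registration of dxample, since "e" and "d" differ by a single bit.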
Now, consider what would happen if you kept a version-controlled diff of this data every day. That would allow you to see trends over time or, for infosec, watch how domains change. Did a domain move behind Cloudflare or Akamai? You’ll have a record of what nameserver it pointed to before the move.
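A daily comparison does not need much machinery. The sketch below assumes each day’s data has already been reduced to a dictionary that maps a domain to its set of nameservers; that layout and the name diff_snapshots are assumptions for illustration, not a prescribed format.

```python
def diff_snapshots(old, new):
    """Compare two {domain: set(nameservers)} snapshots.

    Returns (added, removed, changed), where changed maps each domain
    to a (old_nameservers, new_nameservers) tuple.
    """
    old_names, new_names = set(old), set(new)
    added = sorted(new_names - old_names)
    removed = sorted(old_names - new_names)
    changed = {
        domain: (old[domain], new[domain])
        for domain in old_names & new_names
        if old[domain] != new[domain]
    }
    return added, removed, changed
```

The added list is the set of newly delegated domains for the day, and changed captures moves such as a domain switching to a CDN’s nameservers.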
I hope this sheds some light on not only how useful zone files are, but also how accessible they are. ICANN, Verisign, and other registries deserve credit for making this data available to the public.
As always, let me know if you have questions/comments.
-Jordan (@jw_sec)