More efficient information gathering from DNS servers
7th Jul 2021
author: Ladislav Lhotka
translation: Petra Raszková
Since the end of January 2021, all authoritative DNS servers operated by the CZ.NIC association have been collecting information about DNS transactions (queries and responses) in the new standard format Compacted-DNS (C-DNS).
The format is specified in RFC 8618. The data is gathered by the DNS Probe software, developed at CZ.NIC Labs in cooperation with FIT VUT in Brno. This concluded a transition from the previously used PCAP format that took roughly half a year. During the transition we tested the performance and stability of DNS Probe and then compared the results obtained from the two formats.
Unlike PCAP, C-DNS was created specifically for storing and transferring large numbers of DNS transactions. It was designed to be as efficient and flexible as possible, which comes at the cost of relative complexity. A more detailed description is beyond the scope of this article, but its essential characteristics include:
- binary encoding in CBOR (RFC 8949)
- configurable timestamp precision
- not all available data has to be stored; conversely, entries not defined in the standard can be added
- data about DNS queries and their corresponding responses are stored together, and shared entries are not duplicated
- large numbers of transactions are combined into blocks of variable length (the recommended block size is 10,000 transactions)
- blocks also contain specific metadata and statistics, and further data compression techniques are applied within blocks
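The idea of storing shared entries only once per block can be illustrated with a minimal Python sketch. It is not the actual C-DNS schema from RFC 8618: the `Transaction` fields, the table layout and all function names are assumptions chosen for illustration. Each distinct value is kept once in a per-block table, and every transaction row holds only small integer indexes into those tables.

```python
from typing import NamedTuple


class Transaction(NamedTuple):
    # Hypothetical, heavily simplified record; real C-DNS stores far more.
    client_ip: str
    qname: str
    qtype: str
    rcode: str


def build_block(transactions):
    """Pack transactions into one block with per-field lookup tables."""
    tables = {field: [] for field in Transaction._fields}
    index = {field: {} for field in Transaction._fields}
    rows = []
    for tx in transactions:
        row = []
        for field, value in zip(Transaction._fields, tx):
            # Store each distinct value once; later rows reuse its index.
            if value not in index[field]:
                index[field][value] = len(tables[field])
                tables[field].append(value)
            row.append(index[field][value])
        rows.append(tuple(row))
    return {"tables": tables, "rows": rows}


def split_into_blocks(transactions, block_size=10_000):
    """Cut the input into blocks of at most block_size transactions."""
    return [
        build_block(transactions[start:start + block_size])
        for start in range(0, len(transactions), block_size)
    ]


def unpack_block(block):
    """Restore the original transactions from a block."""
    tables = block["tables"]
    return [
        Transaction(*(tables[f][i] for f, i in zip(Transaction._fields, row)))
        for row in block["rows"]
    ]
```

Because many transactions in a block typically share the same client address or query name, the tables stay short compared to the number of rows. The lookup tables are local to a block, so each block can be decoded on its own.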
The task of the DNS Probe software is thus to extract DNS client queries and server responses from network traffic (both UDP and TCP), pair those that belong together, and generate C-DNS data according to its configuration. The output can either be stored on a local disk or sent for further processing over an encrypted TLS connection. For this purpose, DNS Probe also includes a data receiver (the dns-probe-collector package). In our setup, the data received this way is loaded into and stored in an Apache Hadoop database.
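The pairing step can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the algorithm DNS Probe actually uses: the `Packet` type, the matching key (client address, client port, DNS message ID and lower-cased query name) and the `timeout` parameter are all hypothetical.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    # Hypothetical, already-parsed DNS message with only the fields we need.
    timestamp: float
    client_ip: str
    client_port: int
    dns_id: int
    qname: str
    is_response: bool


def pair_transactions(packets, timeout=5.0):
    """Return (query, response) pairs; a missing side is None."""
    pending = {}  # matching key -> query still waiting for a response
    paired = []
    # Process in time order so that a query is seen before its response.
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        key = (pkt.client_ip, pkt.client_port, pkt.dns_id, pkt.qname.lower())
        if not pkt.is_response:
            # A repeated query with the same key leaves the old one unanswered.
            old = pending.pop(key, None)
            if old is not None:
                paired.append((old, None))
            pending[key] = pkt
        else:
            query = pending.pop(key, None)
            if query is not None and pkt.timestamp - query.timestamp > timeout:
                # Too late to count as the answer to this query.
                paired.append((query, None))
                query = None
            paired.append((query, pkt))
    # Queries that never received a response.
    paired.extend((query, None) for query in pending.values())
    return paired
```

Unmatched queries and responses are kept as half-empty pairs rather than dropped, so the output accounts for every message in the input.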
The main advantage of C-DNS over PCAP is considered to be a significant reduction in the volume of both stored and transmitted data. In our case this is clearly evident. The following table shows the amount of data gathered in both formats during December on nine selected production DNS servers (in both cases further compressed with xz). The median of the last column of the table (the ratio of C-DNS to PCAP size) is 15.6%.
* "podíl" means "ratio"
On top of that, by comparing the data we found that DNS Probe systematically reports transactions that we do not obtain from the PCAP format. The difference for UDP traffic is negligible (no more than 1%), but for TCP it is 5-10%, and for some servers even more. The cause may be either known problems of the PCAP format (for example, a response can be saved before the corresponding query) or possible deficiencies in the software used for finding and pairing queries and responses in PCAP.
We also plan to make use of the extensibility of C-DNS mentioned above and add our own optional entries related to clients (resolvers): the autonomous system number (ASN) and the country code obtained from a GeoIP database.
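As a rough illustration of such enrichment, the following Python sketch attaches an ASN and a country code to a transaction record by longest-prefix match. It does not use a real GeoIP database or the DNS Probe configuration: `PREFIX_TABLE`, its prefixes (taken from address ranges reserved for documentation), the ASNs and country codes, and the field names `asn` and `country_code` are all made up for the example.

```python
import ipaddress

# Hypothetical stand-in for a GeoIP/ASN database: (prefix, ASN, country code).
PREFIX_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), 64496, "CZ"),
    (ipaddress.ip_network("198.51.100.0/24"), 64497, "DE"),
    (ipaddress.ip_network("2001:db8::/32"), 64498, "SK"),
]


def lookup_client(ip_text):
    """Return (asn, country_code) for the longest matching prefix."""
    addr = ipaddress.ip_address(ip_text)
    best = None
    for network, asn, country in PREFIX_TABLE:
        # Compare only within the same address family (IPv4 or IPv6).
        if addr.version == network.version and addr in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, asn, country)
    if best is None:
        return None, None  # client address not covered by the table
    return best[1], best[2]


def enrich(record):
    """Return a copy of the record with the two optional client entries."""
    asn, country = lookup_client(record["client_ip"])
    return {**record, "asn": asn, "country_code": country}
```

For example, `enrich({"client_ip": "192.0.2.53", "qname": "nic.cz."})` returns the same record extended with `"asn": 64496` and `"country_code": "CZ"`. A linear scan is enough for a sketch; a production lookup would more likely use a prefix tree or the database's own reader.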