If exploited, the tsuNAME vulnerability had the potential to overwhelm and bring down authoritative servers, rendering entire DNS zones unreachable. In this article, the SIDN Labs team shares lessons learned along the winding road to public vulnerability disclosure for tsuNAME.
"In practice, the theory is different" – an Electricity and Magnetism professor I had in college used to say this during our lab experiments. If the laws of physics guaranteed one could touch a certain electrified object without being electrocuted, he would never demonstrate this principle with his own hands, as unforeseen conditions could occur in practice.
This reminds me of a similar experience we had with vulnerability disclosure. In theory, the whole process should be rather straightforward: notify --> fix --> disclose. In practice, we had a vastly different experience.
Vulnerability disclosure? Responsible disclosure?
It turns out that we as a community cannot even agree on the right terminology: 'private', 'public', 'responsible', 'full', and 'coordinated disclosure' are terms that have no generally accepted meaning in academia or industry.
Coordinated vulnerability disclosure (CVD) is the preferred term, given that 'responsible' in 'responsible disclosure' implies a moral duty on the disclosing party – whereas in reality the burden is on whoever caused the vulnerability, and not whoever found it.
The tsuNAME vulnerability
The tsuNAME vulnerability consisted of clients and/or recursive resolvers (the loop in the figure below) sending non-stop queries to authoritative servers. If exploited carefully, it could be used to overwhelm and bring down authoritative servers, rendering entire DNS zones unreachable. To start looping, a resolver or client had to encounter a cyclic dependency: zone files on separate servers whose NS records point at each other.
For example, consider an authoritative server for the fictitious dog.com zone:
dog.com NS ns.cat.org
Now, consider the cat.org zone file:
cat.org NS ns.dog.com
We can see a loop in the two zones: cat.org <-> dog.com. Because some resolvers did not cache such responses, they would loop indefinitely, as would some clients. We discuss the vulnerability in detail in our scientific paper, where we show how New Zealand's .nz country-code top-level domain (ccTLD) suffered a 50% traffic increase because of the vulnerability.
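To make the dog.com/cat.org loop concrete, here is a minimal Python sketch of how an operator might scan NS records for cyclic dependencies. It is an illustration only, not the tooling described in the paper: the function names (`parent_zone`, `find_cycles`) and the dictionary-based input format are assumptions made for this example, and the extra bird.net zone is invented as a non-looping control.

```python
from typing import Dict, List, Optional


def parent_zone(hostname: str, zones: Dict[str, List[str]]) -> Optional[str]:
    """Return the longest known zone that `hostname` falls under, if any."""
    labels = hostname.rstrip(".").lower().split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in zones:
            return candidate
    return None


def find_cycles(zones: Dict[str, List[str]]) -> List[List[str]]:
    """Find NS dependency cycles among the given zones.

    `zones` maps a zone name (lowercase, no trailing dot) to the host
    names in its NS records. This input format is hypothetical.
    """
    # Edge A -> B means: zone A has an NS record whose name lives in zone B.
    graph: Dict[str, List[str]] = {}
    for zone, ns_names in zones.items():
        deps = set()
        for ns in ns_names:
            parent = parent_zone(ns, zones)
            if parent is not None and parent != zone:
                deps.add(parent)
        graph[zone] = sorted(deps)

    cycles: List[List[str]] = []
    seen = set()

    def visit(zone: str, path: List[str]) -> None:
        if zone in path:
            cycle = path[path.index(zone):]
            key = frozenset(cycle)
            if key not in seen:  # report each set of looping zones once
                seen.add(key)
                cycles.append(cycle)
            return
        for nxt in graph[zone]:
            visit(nxt, path + [zone])

    for zone in sorted(zones):
        visit(zone, [])
    return cycles


if __name__ == "__main__":
    example = {
        "dog.com": ["ns.cat.org"],
        "cat.org": ["ns.dog.com"],
        "bird.net": ["ns.bird.net"],  # invented control zone, no loop
    }
    print(find_cycles(example))  # prints [['cat.org', 'dog.com']]
```

The sketch flags any cycle it finds, which is deliberately conservative: a zone with at least one NS record outside the loop can still be resolved, so a real check would also need to confirm that every NS record of the zone is caught in the cycle.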
The disclosure process
When we first found tsuNAME, we realised that, in our datasets, 99% of the looping queries we saw were from Google Public DNS (GDNS). Because we know some GDNS operators personally, we decided to first notify them of the issue in the hope that that would speed up the repair time. (It did not, and turned out to be a mistake on our part – more on that later.)
The figure below summarises our disclosure. We first notified our contacts at GDNS in September 2020. After months of waiting, in November 2020 we notified Google using their official CVD channel. Again, a couple of months passed with no resolution, so we decided to privately notify a group of DNS operators, who could have become victims of tsuNAME-like attacks. We scheduled an online disclosure session with help from DNS-OARC, to present our findings at their online OARC34 meeting. To keep the GDNS folk in the loop, we informed them we would be making this group disclosure.
In the meantime, our GDNS colleagues reached out to us and quickly resolved the issue – officially one day before our private disclosure at OARC34.
After that, we disclosed to multiple parties privately (yellow area in the figure above). Cisco also fixed its OpenDNS service in April, and ultimately we publicly disclosed the issue in May 2021. Overall, it took 8 months from the first contact to public disclosure.
Lessons learnt
1. Public disclosure is worth pursuing - it benefits everyone
The potential damage that could be caused by tsuNAME attacks that disrupt a top-level domain operator was a source of concern. Still, we wondered why there were no public reports about such attacks – after all, the vulnerability was probably not new. Was it because attackers had not yet discovered it, or were there other, more accessible and effective methods available? We faced an ethical dilemma: whether or not to disclose the vulnerability. Despite the risk of being perceived as alarmist, we ultimately decided to proceed with group and public disclosure.
Our decision to disclose the vulnerability ultimately contributed to having tsuNAME fixed, so no other operators could fall victim to such attacks. It just takes one party to disclose an issue to trigger a chain that ultimately benefits everyone. We therefore recommend that researchers do disclose vulnerabilities.
2. Disclosures have ethical implications
When disclosing a vulnerability, a researcher may have the best intentions, but must be aware that choices must be made, and each decision may have consequences for others.
We believe that we made the right choice in disclosing tsuNAME. The vendors were able to fix the vulnerability, preventing it from being used in amplification attacks.
In retrospect, we realise that we made an error when we chose to initially notify only GDNS on a private basis. We now see that we should have treated all DNS resolver vendors equally and notified them simultaneously. Our mistake resulted in a delay in the mitigation of the vulnerability. Moreover, our private disclosures did not yield the desired outcome, so we really needed to set a public disclosure date, and to inform vendors accordingly.
3. Ask for help to reduce the burden
Disclosing tsuNAME required more time and energy than we initially expected. In addition to preparing presentations, we also created guides for operators and developers outlining the steps needed to reproduce tsuNAME.
Many operators and developers may be discouraged from disclosing vulnerabilities when it is not part of their daily duties, as they may not have the time and energy to do so. To manage the communication process, a researcher may seek help from a vulnerability disclosure coordinator. This coordinator can take responsibility for contacting vendors, relieving the researcher of this burden and the associated exposure. Organisations such as CISA, for example, offer assistance of that kind.
4. You do not have the complete picture
During the Q&A session at the group disclosure at OARC34, two ccTLD operators confirmed that they had previously experienced DNS events involving GDNS. The first operator, a European ccTLD, kindly shared the traffic statistics for their DNS event. In contrast to the .nz incident, which saw a 50% increase in traffic, this operator experienced a 10-fold increase.
Figure 3 shows the operator's aggregate traffic, with each colour representing the traffic to each authoritative server they operate. We can see a sharp increase in traffic beginning at 19:00 UTC and reaching a peak of 10 times their normal traffic, before drastically reducing after 11:00 UTC the following day when they manually removed the cyclic dependency from their zone.
A second ccTLD operator in the Americas contacted us by e-mail after the presentation, saying that they had been affected by similar events multiple times. They had also disclosed the matter privately to their contacts at Google, but the issue persisted for years, causing frustration. Although we cannot verify their claims, their account illustrates that private disclosure may not be effective.
5. Prepare for stressful responses
We received two types of reaction to our disclosure:
- Positive: mostly from vendors – GDNS, Unbound/NLnet Labs, BIND/ISC, Cisco/OpenDNS. These were the folk with the power to fix vulnerable software, so our main target audience.
- Negative: some operators accused us of fearmongering, while another stated that the problem was already known. While it is true that previous RFCs had addressed the issue of cyclic dependencies (RFC1034, RFC1035, RFC1536), they did not fully cover it, which is why the vulnerability was still present. We wrote an Internet Draft covering it, which was later incorporated into another IETF draft.
When disclosing a vulnerability, it is important to be prepared for the consequent exposure and potentially adverse reactions. Feedback or criticism can escalate quickly on social media platforms such as Twitter, where it is easily amplified, and not all of it will be presented in a constructive manner. It is also important to recognise that the process of vulnerability disclosure can be emotionally taxing, and researchers may not always have the capacity or desire to handle it.
We understand that not all researchers may be comfortable with the attention and potential stress that can come with publicly disclosing a vulnerability. To avoid this, researchers may choose to disclose vulnerabilities anonymously by using new e-mail accounts, aliases and anonymising tools. Alternatively, researchers can seek assistance from a vulnerability disclosure coordinator.
Improving the disclosure process
We can draw two lessons from our disclosure experience:
Clarifying vendor roles and timeframes
Existing documents focus primarily on the disclosure itself, and have little to say about what is expected of vendors. After a vulnerability is disclosed to the vendor, it becomes their responsibility to determine when and how it will be addressed. However, what happens if the vendor refuses to fix the issue or indefinitely delays its resolution?
In our case with Google, it took over 60 days for them to address the problem after we utilised their official notification system. Although Google's bug tracking system provided us with updates at every stage, the timeline for bug resolution remained unclear. Their statement was only an imprecise "[P2 issues] need to be addressed on a reasonable timescale".
We were deeply concerned about the potential for other operators to become targets for DDoS attackers while we were waiting for a fix, potentially making us complicit if attacks occurred.
We would like to see a clearer timeframe for resolution in vendor issue handling, in part to control the risks (and stress) of the bug leaking or being discovered in parallel during an otherwise indefinite window.
Updating and endorsing CVD guidelines
The community would benefit from well-defined, succinct guidelines for vulnerability disclosure. Such guidelines should protect the individuals who report the vulnerabilities and address the ethical considerations involved at each stage. Furthermore, they should define the behaviours expected of vendors, and the associated timeframes. As things stand, the absence of regularly updated, succinct and widely endorsed documents leads to confusion, as we have personally encountered.
Further ahead
We have written a peer-reviewed scientific paper on our experience of disclosing the tsuNAME vulnerability.
Comments 2
Niels Dettenbach •
"Given that resolvers would not cache such responses, some of them would loop indefinitely " Just wondering why such resolvers are not handled by any rate limiting or similiar to request when requesting same records within their TTL?
Giovane Moura •
Please refer to section 4.4 in https://ant.isi.edu/~johnh/PAPERS/Moura21b.pdf TL;DR: some resolvers would just do it, and some stub resolvers are not bound by TTLs, as they are just minimalist. And GDNS was not caching such looping records. The fix: https://datatracker.ietf.org/doc/draft-ietf-dnsop-caching-resolution-failures/