Let's Take Down the Internet
In the late 1990s I read a brilliant article, which seems to have since disappeared, called “10 ways to take down the Internet”. Needless to say, the Internet is now so much more vast and complex that most of those strategies would not work today. They would still be capable of creating a great deal of chaos, but not of taking down any massive percentage of the world's Internet.
However, there is one human failing that leaves the Internet vulnerable, and that is complacency: that natural human instinct that tells us, if it hasn't happened before, there's no need to worry about it. No matter how much the daily news tries to encourage us to worry about all sorts of different issues, mostly the only things we care about are the problems we've seen before.
But in August 2016 we saw a game changer – the Mirai Botnet. A package of software that could seek out, detect and take over Internet-connected “smart” devices like webcams, TVs, printers, scanners, doorbells and even children's toys – collectively referred to as IoT devices. Once a device is detected, Mirai tries a range of known default logins until it finds one that works, infects the device and turns it into a bot on its control network.
What makes Mirai botnets a game changer is their numbers. They are vast. Typically 10 to 100 times the size of traditional Windows/PC botnets. This is partly necessary because, when compared to a PC, each of these devices may only have a relatively small processing capability – but with the average connection speed in over 27 nations now topping 10Mb/s, a botnet of 100,000 infected devices can have a massive amount of firepower at its disposal.
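As a back-of-envelope sketch of that firepower (all figures here are assumptions for illustration, not measurements):

```python
# Rough estimate of aggregate botnet bandwidth. The device count and
# per-device rates are illustrative assumptions, not measured values.
def aggregate_gbps(devices: int, per_device_mbps: float) -> float:
    """Total attack bandwidth in Gb/s for `devices` bots each pushing
    `per_device_mbps` megabits per second."""
    return devices * per_device_mbps / 1000.0

# Even if each of 100,000 devices contributes just 1 Mb/s of its uplink,
# the combined flood is 100 Gb/s; at a full 10 Mb/s it reaches 1 Tb/s.
print(aggregate_gbps(100_000, 1.0))   # 100.0 Gb/s
print(aggregate_gbps(100_000, 10.0))  # 1000.0 Gb/s
```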
The strength of the Mirai Botnet is its size, and that causes a number of problems. Firstly, it is effectively impossible to block the attack at the source, as there are so many sources spread across all countries, ISPs and connection types. Secondly, it may even be impossible to detect the attack at the source, as each source generates such a small amount of traffic that the attack disappears into the background noise – but when totalled together, at the destination, it can be devastating.
This firepower was first on large-scale public display in October 2016, when Mirai Botnets took out the DNS provider Dyn, which in turn took out big-name sites like GitHub, Twitter, Reddit, Netflix and Airbnb. Then again in November 2016, they took out the country of Liberia.
It's thought likely that the same actor was responsible for both attacks, and that the Liberia attack was probably an attempt to measure their firepower. As Liberia is primarily connected to the Internet over a single cable, if you can choke that link, you have a tangible measure of your firepower.
Who do you target?
Complacency creates choke points – bottlenecks on the Internet which carry a disproportionate amount of significance. One such choke point is the Registry Operator Verisign.
Verisign is responsible, amongst other things, for the DNS data for the COM and NET domains. Not only is the number of names in the COM domain larger than the rest of the Internet put together, but COM and NET name servers are used exclusively by somewhere between 65% and 90% of the domains in any country's domain list.
Verisign are definitely not complacent about this responsibility, but putting all those eggs into one basket seems highly irresponsible, as it is highly unlikely they could withstand a concerted attack from a large-scale, well-organised Mirai Botnet.
If COM and NET were unavailable for an extended period of time, not only would all COM & NET domains start to falter, but between 65% and 90% of all other sites would too. This is probably as close to taking down the Internet as you could reasonably get.
The most obvious way to attack Verisign would be to launch a Mirai Botnet D/DoS attack against the DNS servers for COM and NET. DNS data is always cached (e.g. by your ISP), but to save memory and improve performance, ISPs will often limit the time they hold DNS information in cache, rather than holding it for the maximum amount of time allowed (known as the TTL). This prevents the cache from filling up with information that is rarely used more than once, and so improves performance for more frequently used data.
There is nothing wrong with doing this. But the shorter the time the data is held in the cache, the less time the D/DoS attack has to last before it really starts to bite.
One of the problems with a D/DoS attack is that the disruption only lasts as long as the attack is kept going. However, these imposed cache limits, along with the natural timeout of DNS (which is almost always 24 hours), mean that if you can make an attack last more than 24 hours, you have a good chance of almost everything shutting down.
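A simple model shows why those shortened cache limits matter. Assuming cache entries were last refreshed at uniformly random times before an outage begins (an idealised assumption for illustration), the fraction of entries that have expired grows linearly with the length of the outage:

```python
# Sketch: fraction of cached delegations that have expired after an outage
# of `outage_hours`, given resolvers cap cached lifetimes at
# `cache_cap_hours` instead of honouring the full TTL. Illustrative model:
# refresh times are assumed uniformly spread across the cap window.
def expired_fraction(outage_hours: float, cache_cap_hours: float) -> float:
    return min(1.0, outage_hours / cache_cap_hours)

# With the full 24-hour lifetime, a 6-hour attack kills a quarter of
# cached entries; with a 4-hour ISP cap, the same attack kills them all.
print(expired_fraction(6, 24))  # 0.25
print(expired_fraction(6, 4))   # 1.0
```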
The most common way to achieve this would be with a UDP Flood. Basically, if you can send somebody more traffic than their Internet connection can cope with, you will have prevented any legitimate traffic from getting through.
There are two ways you could approach this.
Option-1 … Sheer Volume by Bandwidth
The most common type of D/DoS attack is sheer volume of traffic. It doesn't matter what filtering you put in front of your server; if the server's line is full, no legitimate traffic can get there anyway.
A good way to achieve this is with a DNS Amplification Attack. In this attack you send a question to a third party's DNS server but, instead of getting the much bigger answer sent back to you, you get the answer sent to your attack target. It's like signing your friend up to thousands of junk mailing lists – you've made the request, but they get snowed under with garbage.
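The “much bigger answer” is what gives the attack its leverage. The packet sizes below are typical ballpark figures, not exact protocol constants:

```python
# Illustrative amplification factor: a small spoofed query elicits a much
# larger response aimed at the victim. Sizes are representative examples.
def amplification_factor(query_bytes: int, response_bytes: int) -> float:
    """How many bytes the victim receives per byte the attacker sends."""
    return response_bytes / query_bytes

# A ~60-byte query can draw a multi-kilobyte response when the answer
# carries large record sets (e.g. with DNSSEC signatures attached).
print(amplification_factor(60, 3000))  # 50.0 — attacker's traffic is multiplied 50x
```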
By taking advantage of as many of the DNS Anycast networks as you can, you draw as many different innocent servers as possible into helping with the attack. These days there are many different Anycast DNS networks out there, and using as many as possible would give you as much variety in your attack as possible.
The problem with a traditional DNS Reflection attack is that it can be relatively trivial to block the attack at the innocent DNS server, that is being used as the reflector. However, by using the nodes of an Anycast network you spread the attack across a wide range of different servers, managed by different suppliers, in different locations using different routers and paths.
Anycast networks are particularly susceptible to being abused in this way, so long as the sources of the attack are spread widely enough round the Internet. This makes Mirai an ideal source for a distributed Anycast DNS Reflection / Amplification attack.
Option-2 … Sheer Volume by Query
However, as with any reflection attack, an Anycast reflection would still be susceptible to being throttled or blocked at the reflectors. So sending query traffic directly from the Mirai Botnet to the targets would also be worth considering.
There are two choke points you could try to target. First, the routers. Routers are generally designed to handle large volumes, but are often optimised for streaming traffic in one direction at a time and are commonly extremely poor at switching packets left-right-left-right. So if you can send queries to the Verisign DNS servers that elicit an answer (i.e. are not filtered), it is highly likely that a router will throw in the towel first, not a server.
It's also possible that, if you can get the query volume high enough, you would be able to overload the servers themselves. Answering DNS queries is actually a relatively expensive operation for a modern operating system.
This is because each query will cause a number of context-switches. A context switch is when the CPU is required to switch from running one program to running another. With a DNS application, the CPU will have to be running in the kernel to accept the packet from the LAN, but then context-switch to the DNS code for the DNS server to accept, process & answer the query, and then switch back to the kernel to transmit the response packet onto the LAN.
More often than not, in DNS, the throttle on queries per second answered is down to the rate at which the CPU can context-switch. A processor can report itself as 90% idle but still be unable to service more queries if it has reached its limit on context switching – and this limit can be incredibly hard to quantify. Typically Intel CPUs can sustain a lot more context switches than AMD ones.
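The arithmetic behind that throttle is simple. The switch rate and switches-per-query below are assumed round numbers for illustration, not benchmarks of any particular CPU:

```python
# Rough model: if each DNS query forces several kernel<->userland context
# switches, the sustainable switch rate — not raw CPU load — caps
# throughput. Both inputs are illustrative assumptions.
def max_qps(switches_per_sec: int, switches_per_query: int) -> int:
    """Ceiling on queries/sec imposed by the context-switch rate."""
    return switches_per_sec // switches_per_query

# e.g. a core sustaining 1,000,000 switches/sec, with 4 switches per
# query (receive, hand to DNS process, answer, transmit):
print(max_qps(1_000_000, 4))  # 250000 queries/sec, however idle the CPU looks
```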
At this point you are starting to get more into what is called a Level-7 attack, where instead of taking out a service by sheer volume of traffic, you take it out by sheer weight of workload. You ask the same sort of questions a legitimate customer would ask, so as to be indistinguishable from a legitimate customer. The big problem with a Level-7 attack is that a well-crafted one is much more difficult to detect.
This is where a Mirai Botnet is an ideal source – each bot doesn't need much bandwidth or processing power, as the attack reaches a much deeper level, but you need far more bots, as each individual request takes much longer to process.
Instead of attacking the outer reaches of the system, you are hitting the inner core. You're hitting the internal networks, routers, databases and internal bandwidth – all of which may not have been rated to cope with the workload a Level-7 attack can trigger.
By examining the application, in this case DNS, you would work out what sort of query would trigger the highest workload and attack with that – ensuring enough variability that a simple outer-layer pattern filter couldn't trap it.
In my experience, with all UDP Query Flood attacks, even those that attempt to cause a Level-7 bottleneck, it ends up being possible to block them based on a commonality within the questions they ask. Yes, each individual query looks like a legitimate customer query, but when you get a flood of queries with something in common – bingo! You can filter.
If you walk down the street and see a guy wearing a red shirt and cargo slacks, it's unlikely you will think much of it, but when you see a thousand guys dressed like that, you quickly twig that something is up.
The unfortunate result might be that some poor unrelated guy, who just happened to wear a red shirt and cargo slacks that day, gets pulled over – he's the “false positive” – but if it saves your servers from falling over, and keeps normal service for 99% of your customers, it's worth the loss.
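That commonality detection can be sketched in a few lines. The query fields and traffic mix below are hypothetical examples, not a real DNS server's data model:

```python
# Sketch of commonality filtering: individually each query looks
# legitimate, but a trait shared across most of a flood gives us a filter
# key. Fields and sample traffic are hypothetical.
from collections import Counter

def flood_signature(queries, threshold=0.8):
    """Return a (field, value) pair shared by at least `threshold` of the
    queries, or None if no single trait dominates."""
    n = len(queries)
    for field in ("qtype", "label_len"):
        value, count = Counter(q[field] for q in queries).most_common(1)[0]
        if count / n >= threshold:
            return field, value
    return None

# 95 flood queries sharing a query type, mixed with 5 normal ones:
attack = [{"qtype": "TXT", "label_len": 12} for _ in range(95)]
normal = [{"qtype": "A", "label_len": 7} for _ in range(5)]
print(flood_signature(attack + normal))  # ('qtype', 'TXT') — the red shirt
```

A filter built on that signature will occasionally drop a legitimate TXT query – the “false positive” in the red-shirt analogy – which is the accepted cost.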
Other mitigations are technically relatively easy, but some issues have been shied away from for far too long, or require co-operation that would be trivial to put in place but simply doesn't exist.
1 … Nearly all UDP flood attacks, and especially UDP reflection attacks, rely on the almost universal ability for anybody to send out UDP packets with almost any spoofed source IP Address. No ISP should really allow this, but the issue has never been properly tackled.
This can also be addressed at the Exchange Point. At an exchange point, each member of the exchange (called a “peer”) tells the others what subnets they have connected at that IX. There is usually little or no reason to accept traffic from an individual IX peer with a source IP Address for which they have not given you a route back.
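The per-peer check is conceptually tiny. The subnets and addresses below are drawn from the RFC 5737 documentation ranges purely for illustration:

```python
# Sketch of IX ingress filtering: accept a packet from a peer only if its
# source address falls inside a subnet that peer has announced to us.
# Subnets/addresses are illustrative (RFC 5737 documentation ranges).
import ipaddress

def accept_from_peer(src_ip: str, announced_subnets: list) -> bool:
    """True if `src_ip` is covered by one of the peer's announced routes."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in ipaddress.ip_network(net) for net in announced_subnets)

peer_routes = ["198.51.100.0/24", "203.0.113.0/24"]
print(accept_from_peer("198.51.100.42", peer_routes))  # True  — announced
print(accept_from_peer("192.0.2.1", peer_routes))      # False — likely spoofed
```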
An RFC was published on this subject many years ago (BCP 38, RFC 2827, on network ingress filtering), but it has never been widely implemented – often simply because not implementing it allows sloppy address management and bad routing practices to be better tolerated.
2 … It is a very high-risk strategy to have both COM & NET under the same management. Not only does this mean there is no diversity of application software, but there is also a huge amount of commonality at every level of the services, from bandwidth & connectivity to servers, operating systems, staff, and processes & procedures.
The risk this adds to the DNS is beyond what should be considered acceptable – however, the main reason this has never been changed is that it's never been a problem before, so it's assumed it will never be a problem in the future. Now that is complacency!
3 … There should be a much more concerted and universal policy within all levels of infrastructure providers to protect the IP Address of these sorts of infrastructure service.
For example, as a minimum, all Anycast DNS providers and all Top Level Domain Name Servers should, without fail, block any attempt to use their equipment to attack the ROOT or COM/NET Name Servers. After all, we're only talking about a list of 26 IPv4 & 26 IPv6 addresses to ensure the protection of all ROOT, COM & NET Name Servers.
It's also not a big leap to say that all ISPs should have filters for protected infrastructure IP Addresses as well.
4 … Verisign should have far more copies of the COM & NET Authoritative zone data around the Internet. It's not that hard to design a custom DNS server that can hold COM & NET in less than 6GB of RAM and serve data at up to 250,000 queries per second using cheap commodity hardware – I know this because I've done it. In fact, we tested the DNS server up to 500 million names – not just the relatively small 145 million names of COM+NET.
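A quick feasibility check makes the 6GB claim plausible. The per-name byte cost here is an assumed figure for a compact in-memory delegation record, not the actual layout of any real server:

```python
# Rough feasibility check: can 145 million COM+NET names fit in under
# 6 GB of RAM? The bytes-per-name figure is an illustrative assumption
# for a tightly packed delegation record (name, NS pointers, indexes).
def zone_memory_gb(names: int, bytes_per_name: int) -> float:
    """Approximate memory footprint in GB for the whole zone."""
    return names * bytes_per_name / 1e9

print(zone_memory_gb(145_000_000, 40))  # 5.8 GB — within a 6GB budget
```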
Servers holding this Authoritative data should be placed all over the Internet, in the same way that the ROOT servers have proliferated in recent years. Every large ISP, IX and hosting facility should have the option of hosting a node that holds all the COM & NET Authoritative zone data.
Soon dot-WEB may be launching. Many people think dot-WEB is the one new Generic Top Level Domain (new-gTLD) with the best chance of getting anywhere near the popularity of dot-COM – and it looks like Verisign will be the back-end provider & DNS publisher for dot-WEB.
Having two vitally important eggs in one basket is unwise – adding a third seems downright foolish.