It’s hard to believe that in 2016, almost 2017, major sites such as Twitter, GitHub and Spotify, along with many other large websites, went down for several hours last month. One of the challenges of enterprise computing is architecting robust systems that can sustain the loss of resources at different points. Some areas of those systems have been studied extensively: take databases, for example. There are dozens of strategies for making data redundant and distributing it across locations and across systems with different characteristics, so that if one fails, another can take over quickly. The same goes for servers. Cloud instances and physical servers can now die and, for the most part, users will notice nothing. They are simply re-routed elsewhere by a load balancer.
Making DNS robust
DNS appears to have been less studied. Yet last month showed that, in its current state, it can be a single point of failure. The sites that went down were using Dyn, one of the largest DNS vendors, and Dyn’s service became unavailable after a DDoS attack. There was no plan B that could re-route traffic to another DNS provider fast enough. Here, “fast enough” is the nuance to keep in mind. There is a way to update DNS, but the change takes time to set up and to propagate. It was also unclear how long the DDoS attack would last. There is no doubt that this will prompt architecture updates at the DNS level to replicate what databases and compute instances do now: offering a fast failover mechanism in the event of an attack. At the same time, the Internet’s network providers need to find better ways to identify sources of DDoS traffic and stop them before they hit providers. Some throttling has to occur, and the fact that DDoS attacks continue shows that the issue is largely unaddressed.
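To make the “single point of failure” idea concrete, here is a minimal sketch in Python that checks whether a zone’s nameservers all belong to the same DNS provider. It is an illustration under stated assumptions, not a monitoring tool: the function names `provider_of` and `dns_provider_report` and the `.example` nameserver hostnames are made up for this post, the nameserver list is passed in by hand rather than looked up over the network, and the “provider” is crudely guessed from the last two labels of each hostname.

```python
from collections import Counter
from typing import Dict, List


def provider_of(nameserver: str) -> str:
    """Crude provider key: the last two labels of the nameserver hostname.

    Assumption for illustration only; it would misgroup hostnames under
    multi-label suffixes such as "co.uk".
    """
    labels = nameserver.rstrip(".").lower().split(".")
    return ".".join(labels[-2:])


def dns_provider_report(nameservers: List[str]) -> Dict[str, object]:
    """Count nameservers per provider and flag single-provider zones."""
    counts = Counter(provider_of(ns) for ns in nameservers)
    return {
        "providers": dict(counts),
        # Fewer than two distinct providers means one outage takes DNS down.
        "single_point_of_failure": len(counts) < 2,
    }


# Hypothetical nameserver sets (not real providers).
single_provider = ["ns1.provider-a.example.", "ns2.provider-a.example."]
two_providers = ["ns1.provider-a.example.", "ns1.provider-b.example."]

print(dns_provider_report(single_provider)["single_point_of_failure"])  # True
print(dns_provider_report(two_providers)["single_point_of_failure"])    # False
```

A zone that lists nameservers from two independent providers ahead of time does not have to wait for a DNS change to propagate in the middle of an attack, which is the “fast enough” part of the problem.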
It’s no surprise that Oracle, moving from on-premises to cloud markets as fast as it can, just acquired Dyn this week.