It's Always DNS: How Could the AWS DNS Outage Have Been Avoided?
"It's always DNS" is a phrase you hear from sysadmins and DevOps engineers alike.
There are reasons for this common saying. According to the Uptime Institute's 2022 Outage Analysis Report, the most common causes of network-related outages are a tie between configuration/change management errors and third-party network provider failures. DNS failures often fall into both categories. A DNS misconfiguration can make services completely unreachable, leading to a major outage.
This was the case in the most recent AWS us-east-1 outage on 20 October. A DNS issue prevented applications from finding the correct address for the API of DynamoDB, an AWS cloud database that stores user information and other critical data. This happened to an infrastructure giant like AWS, and frankly it could happen to any of us. So are there ways to make our systems resilient against it?
