
On October 20, 2025, Amazon Web Services (AWS) experienced a major outage: a single DNS error in the US-EAST-1 region slowed down or completely stopped services worldwide for hours. Thousands of companies, both large and small, discovered how the failure of a single component can ripple through the digital ecosystem.

It made me reflect on how resilience is often taken for granted today. Many cloud environments still operate with a single-region, or worse, single-provider mindset. As long as everything works, it is convenient. But when something breaks, and it inevitably will, we realize redundancy was not a priority, backups were “somewhere out there,” and recovery processes had never really been tested.

From a technical standpoint, the October 20 event was a perfect example of systemic dependency:

  • a DNS issue made DynamoDB endpoints unreachable

  • services relying on DynamoDB (Lambda, SQS, EC2) started failing in a cascade

  • upstream clients and applications reacted with massive retries, amplifying the impact
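
The retry amplification in the last point is commonly softened on the client side with capped exponential backoff plus random jitter, so that failing callers spread out instead of hammering the endpoint in lockstep. What follows is only a minimal sketch of that idea, not a description of what any AWS SDK does internally: the function name `call_with_retries`, its parameters, and the choice of `ConnectionError` as the retryable error are all assumptions made for illustration.

```python
import random
import time


def call_with_retries(operation, max_attempts=5, base=0.5, cap=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call operation(), retrying on ConnectionError with 'full jitter' backoff.

    Hypothetical helper: the wait before retry n is a random value between
    0 and min(cap, base * 2**n) seconds. sleep and rng are injectable so the
    behaviour can be checked without real waiting.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            # Out of attempts: surface the error instead of retrying forever.
            if attempt == max_attempts - 1:
                raise
            ceiling = min(cap, base * 2 ** attempt)
            sleep(rng() * ceiling)
```

The cap keeps waits bounded, the attempt limit keeps a client from retrying indefinitely, and the jitter is what prevents thousands of clients from retrying at the same instant.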

A classic lesson in distributed architecture: reliability is not a property of the provider, but of the design.

So the real question we should be asking is not “How reliable is AWS?” but rather,

How resilient is my architecture when AWS is not?

Multi-region, multi-cloud, circuit breakers, strategic caching, fallback to secondary providers: these are not academic concepts; they are what keep operations running when the cloud stumbles.
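
To make two of these ideas concrete, here is a rough sketch of a circuit breaker combined with a fallback. After a few consecutive failures it stops calling the primary dependency for a while and serves the fallback instead (a cached value, a secondary region, another provider). The class name `CircuitBreaker`, its thresholds, and the `primary`/`fallback` callables are hypothetical and chosen only for illustration; a production breaker would need thread safety, metrics, and a narrower definition of what counts as a failure.

```python
import time


class CircuitBreaker:
    """Minimal, single-threaded circuit breaker sketch (illustrative only)."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so time can be faked
        self.failures = 0           # consecutive primary failures
        self.opened_at = None       # None means the circuit is closed

    def call(self, primary, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: do not touch the primary at all.
                return fallback()
            # Timeout elapsed: half-open, allow one trial call below.
        try:
            result = primary()
        except Exception:  # assumption: any exception counts as a failure
            self.failures += 1
            half_open = self.opened_at is not None
            if half_open or self.failures >= self.failure_threshold:
                # Open (or re-open) the circuit and restart the timer.
                self.opened_at = self.clock()
            return fallback()
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller might wrap a read as `breaker.call(read_from_primary_region, read_from_cache)`, where both function names are placeholders. The point of the design is that an unhealthy dependency gets breathing room while users still receive a degraded but useful answer.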

Perhaps the October 20 outage was not just an incident, but a collective reminder: the cloud is powerful, but not infallible, and the responsibility for resilience always remains ours.
