Something Bad Happened
For the first time in I really don't know how long, one of my AWS EC2 instances blew a disk. The one some of my consulting clients' stuff was on. Just... poof. Gone. That's something that hasn't happened to me in, I'd guess, 20 years? I honestly didn't think it happened anymore. Guess what? It does.
I didn't freak out. Well, I freaked out a little at first. And then more again a bit later (that's called 'foreshadowing'). But, in the meantime, I thought "no sweat, I've got all the code in GitHub, assets on S3, a data snapshot back-up every day going back a week. I'll just...
wait...
where are my..."
A Little History
OK, so back story. Some months ago, I decided to move to a reserved instance for that server. Bigger, cheaper. I migrated everything off my older on-demand instance to this new "box".
And every month or so, I do a quick review of my infrastructure. Make sure back-ups are running. No one's mining $72K NothingCoins on my plan. And, as expected, everything was cool. Snapshots being generated. Costs where I expect them. Nothing in GuardDuty.
The Hubris of Continued Success
Only, my casual confidence from a long history of Nothing Going Wrong led me to ignore the fact that I was backing up the old volume, not the new one. Because I mostly craft my own environments, rather than spreading things across managed AWS services, a lot of data, configuration, and .env files had accumulated on that particular machine, at least temporarily. My blog database was on there, for instance (more foreshadowing).
And even when you have most of the pieces, it's no fun at all to go figure out how to replace and reconfigure everything, particularly if your data back-ups are also no good and the configuration has drifted from your terraform/ansible/what-have-you.
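For what it's worth, here's roughly the kind of check that would have caught my mistake. It's just a sketch, and the 36-hour window, the "warn on anything stale" behavior, and the lack of pagination are my assumptions, not gospel, but the idea is simple: walk every EBS volume attached to a running instance and confirm it has a recent snapshot.

```python
# Sketch: flag attached EBS volumes that have no recent snapshot.
# Assumes default AWS credentials/region; ignores pagination for brevity.
from datetime import datetime, timedelta, timezone

import boto3

ec2 = boto3.client("ec2")
cutoff = datetime.now(timezone.utc) - timedelta(hours=36)  # arbitrary "recent" window

# Every volume currently attached to a running instance.
attached = []
for reservation in ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)["Reservations"]:
    for instance in reservation["Instances"]:
        for mapping in instance.get("BlockDeviceMappings", []):
            if "Ebs" in mapping:
                attached.append((instance["InstanceId"], mapping["Ebs"]["VolumeId"]))

# Complain about any attached volume whose newest snapshot is too old (or missing).
for instance_id, volume_id in attached:
    snapshots = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )["Snapshots"]
    latest = max((s["StartTime"] for s in snapshots), default=None)
    if latest is None or latest < cutoff:
        print(f"WARNING: {volume_id} on {instance_id} has no snapshot since {cutoff}")
```

Stick something like that in a cron job or your monthly review, and it will yell when the shiny new volume quietly isn't in the back-up rotation.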
Still, I'm a CTO, after all. I've been doing this for <mumbles> years. Surely, I had a belt and suspenders? Well, yes and no.
More no.
No, I didn't.
Thankfully, I had grabbed a few data dumps over the months on a more or less random "how paranoid am I feeling today" schedule. I also had fairly copious notes that I'd taken about the various projects I was working on.
Surprise! I have a new blog!
My blog, my personal pet project, of course got the least attention of all. That one is going to take a minute to fix, so I hope you're REALLY enjoying this article. Maybe read it again, while I just real quick... import all my content back in from the Google cache.
But in the end, I was able to replace most everything after a couple of late evening hours. We lost a few records. We had to re-enter a few things that'd been published. It was not nearly the End of the World disaster that I thought it was as it all unfolded.
An Important Life Lesson or Two
First, double-check your backups and maybe build in a little more redundancy, Rian.
But most importantly: it's not that bad. It's almost never that bad. Even when it's this kind of almost unbelievable perfect storm of bad luck, mistaken identities, lackluster planning, and a cobbler with pretty bad shoes. When this happened, I genuinely lost my cool. I pride myself on my cool, and it was nowhere to be seen. I may have even sworn a little.
But whether I'm accidentally spending $50K because I set the bids wrong (sorry, New York Times), getting hacked by the Russians (OK, that was a close one), or just apologizing to a couple of clients and typing my blog back in, it's usually not nearly as bad as you think it is at first.
So... when disaster strikes, repeat after me:
- Keep your cool.
- Panic kills.
- Remember what's important.
- All that stuff.