I have a load-bearing raspberry pi on my network - it runs a DNS server, zigbee2mqtt, unifi controller, and a restic rest server. This raspberry pi, as is tradition, boots from a microSD card. As we all know, microSD cards suck a little bit and die pretty often; I’ve personally had this happen not all that long ago.
I’d like to keep a reasonably up-to-date hot spare ready, so when it does give up the ghost I can just swap them out and move on with my life. I can think of a few ways to accomplish this, but I’m not really sure what’s the best:
- The simplest is probably cron + dd, but I’m worried about filesystem corruption from imaging a running system and could this also wear out the spare card?
- recreate partition structure, create an fstab with new UUIDs, rsync everything else. Backups are incremental and we won’t get filesystem corruption, but we still aren’t taking a point-in-time backup which means data files could be inconsistent with each other. (honestly unlikely with the services I’m running.)
- Migrate to BTRFS or ZFS, send/receive snapshots. This would be annoying to set up because I’d need to switch the rpi’s filesystem, but once done I think this might be the best option? We get incremental updates, point-in-time backups, and even rollback on the original card if I want it.
I’m thinking out loud a little bit here, but do y’all have any thoughts? I think I’m leaning towards ZFS or BTRFS.
Do you need a backup image?
For my NAS, all I do is:
Any full-disk backups just make the restore process easier, they’re hardly the primary plan. If you want that, just take a manual backup like once a year, and maybe swap them out every 2-3 years (or however long you think the SD card should last). If you keep writes down, it should last quite a while (and nothing in your use-case seems write-heavy).
But honestly, you should always have a manual backup strategy in case something terrible happens (e.g. your house burns down). Make that your primary strategy, and hot spares would just be a time-saver for the more common case where HW fails.
Well, this is my DNS server which means if it’s down the internet is down and I can’t resolve hostnames to ssh into. I know that can be worked around, but I’d really like a quick and easy fix that I could even talk someone through over the phone if I had to.
My real backups are squared away, no worries. Nightly automatic restic snapshots, one to an external drive on this very pi and another to a NAS at my parents’ house.
I ended up making my router my DNS server, so if my router goes down, the internet is down anyway. I have static routes for things on my LAN, so if I hit mydomain.com, I can route it to an internal address instead of going over the internet. So far it works pretty well.
That said, I don’t have a PiHole setup, so I don’t know if that complicates things (I’m guessing pointing the router at the PiHole with a fallback to external DNS would just show ads or whatever if the PiHole is down).
But yeah, having a quick fallback is important. I think that should be as automatic as possible.
I like the DNS on the router idea, I’ll look into it. I do have some split DNS set up as well as adblocking lists (technitium). Not sure what my router can do.
Edit: autocorrect got me
I think most can do it (esp. if you flash something like OpenWRT), but I have an entry-level enterprise router from Mikrotik and that’s a pretty standard feature on that tier.