Pi NAS for multi-location backups
from mnemonicmonkeys@sh.itjust.works to selfhosted@lemmy.world on 31 Aug 15:20
https://sh.itjust.works/post/45148627
Over the past few months I’ve been thinking about what would be the best way to help me and my parents improve privacy and data storage.
With all the posts about cluster PCs recently, I’m wondering if the best option is to build a couple of NASes from Raspberry Pis with RAID, keep one at my place and another at my parents’ house, and sync two private folders between them: one for myself and one for my parents.
But that opens up a few more questions. How should I sync the data so it matches? Syncthing? Kubernetes? Should I go ahead and add Nextcloud to the Pis? Should I make the Pis expandable so other services can be added later, or plan to hook up a separate Pi to handle that? What else could I be missing?
threaded - newest
Stock Raspberry Pi OS and Syncthing sound like the easiest way to do this.
Can confirm, have done it this way for years.
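For reference, the setup is about two commands on stock Raspberry Pi OS — a sketch, assuming the Debian-packaged Syncthing and the default `pi` username (substitute your own):

```shell
# Syncthing is in the Raspberry Pi OS (Debian) repos:
sudo apt install syncthing

# The package ships a per-user systemd unit; run it as your user
# ('pi' here is an assumption) and start it on boot:
sudo systemctl enable --now syncthing@pi.service
```

After that the web GUI is on port 8384 on localhost, where you pair the two Pis and share folders.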
There is plenty of backup management software. You want one that will not only keep a copy of your data, but also save you in the event you accidentally delete one or more files.
Could you provide any examples to start looking into?
Sync is not backup.
Let’s repeat that - sync is not backup.
If your sync job syncs an unintentional deletion, the file is deleted, everywhere.
Backup stores versions of files based on the definitions you provide. A common backup schedule for a home system may be monthly full, daily incremental. This way you have multiple versions of any file that’s changed.
With sync you only have replicas of one file, which can be lost through the sync.
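The full-plus-incremental cycle above can be sketched with GNU tar’s snapshot-file support (paths here are illustrative, and in a real monthly cycle you’d keep a copy of the snapshot file per full backup):

```shell
# Demo data (illustrative paths only)
rm -rf /tmp/demo-data /tmp/demo-backups
mkdir -p /tmp/demo-data /tmp/demo-backups
echo "v1" > /tmp/demo-data/notes.txt

# Monthly full: start from a fresh snapshot file; tar records file state in it
rm -f /tmp/demo-backups/state.snar
tar --listed-incremental=/tmp/demo-backups/state.snar \
    -czf /tmp/demo-backups/full.tar.gz -C /tmp demo-data

# Daily incremental: only files changed since the snapshot get archived
echo "v2" > /tmp/demo-data/notes.txt
tar --listed-incremental=/tmp/demo-backups/state.snar \
    -czf /tmp/demo-backups/incr-day1.tar.gz -C /tmp demo-data
```

Restoring means extracting the full archive, then each incremental in order — which is exactly the multi-version property plain sync doesn’t give you.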
Now, you could use backup software to a given location, and have that synchronized to remote systems. Syncthing could do this, with the additional safety of “send only” configured, so if a remote destination gets corrupted, it won’t sync back to the source.
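In Syncthing the “send only” safety is a per-folder setting (also available in the web GUI); in `config.xml` it looks roughly like this, with the folder id and path being examples:

```xml
<!-- Send-only folder: this node pushes changes out but never accepts
     remote changes back, so a corrupted destination can't overwrite it.
     id/path are illustrative. -->
<folder id="backups" label="Backups" path="/srv/backups" type="sendonly">
</folder>
```

The other folder types are `sendreceive` (the default) and `receiveonly`, which is the matching safety setting on the destination side.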
Edit: as for a Pi NAS, I’ve found small-form-factor desktops to be a better value. They don’t have much physical space for drives, but I’ve been able to fit two 3.5" drives or four 2.5" drives in one. My current one idles at under 15 W.
Or a mini PC with one drive. Since you’re replicating this data to multiple locations, having local redundancy (e.g. mirroring) isn’t really necessary.
Of course this assumes your net backup requirements are under about 12 TB (or whatever the latest single-drive size is).
You seem to be missing/ignoring that sync will protect against data loss from lost/broken devices. When that happens, those connections are severed with no deletions propagating through them. Not only that, you can configure syncthing to retain older versions for over a year to avoid issues of unwanted edits.
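The retention mentioned above is Syncthing’s file versioning, set per folder; a staggered-versioning fragment from `config.xml` might look like this (the `maxAge` value is in seconds — 31536000 is one year, and both parameters are examples you’d tune):

```xml
<!-- Staggered versioning: keeps progressively sparser old versions,
     cleaning up hourly, discarding versions older than maxAge seconds. -->
<versioning type="staggered">
    <param key="cleanInterval" val="3600"/>
    <param key="maxAge" val="31536000"/>
</versioning>
```

The same options are exposed in the web GUI under the folder’s “File Versioning” tab, which is the easier place to set them.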
You have to be joking with this. There is no way I’m letting that tracker-filled ransomware near any of my computers.
Simple mirroring doesn’t protect against bitrot. RAID 6 does.
You’re clearly not suited for giving out advice, so you’re getting ignored and blocked. Don’t let the door hit you on the way out.
Only if you very carefully architect things to protect against it. I have absolutely seen instances where a drive had a fault and wouldn’t mount on the source, and a few hours later a poorly designed backup script saw the empty mount location on the source and deleted the entire backup. You have to be VERY careful when using a sync system as a backup. I don’t use Syncthing, but if it can be configured to do incremental backups with versioning then you should absolutely choose that option.
I believe he was talking about a mini PC with a single drive, not Microsoft’s “One Drive”.
Lots wrong with this statement. The way you protect against bitrot is with block-level checksumming, such as what you get natively with ZFS. You can get bitrot protection with a single drive that way. It can’t auto-recover, but it’ll catch the error and flag the affected file so you can replace it with a clean copy from another source at your earliest convenience. If you do want it to auto-recover, you simply need any level of redundancy. Mirror, RAIDZ1, RAIDZ2, etc. would all be able to clean the error automatically.
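For the curious, detecting (and with redundancy, repairing) bitrot in ZFS boils down to a scrub; a sketch, with device and pool names as examples and root plus the ZFS utilities assumed:

```shell
# Single disk: checksums detect corruption but can't self-heal
zpool create tank /dev/sda
# With redundancy, a scrub repairs bad blocks from the good copy:
#   zpool create tank mirror /dev/sda /dev/sdb

# Read every block and verify checksums
zpool scrub tank

# Reports checksum errors and lists any files with unrecoverable damage
zpool status -v tank
```

Running the scrub on a schedule (e.g. monthly from cron) is what turns the checksumming into ongoing bitrot protection.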
Thank you. Now can you please explain this to my IT department that thinks force syncing everything on our computers to OneDrive is a solution to our lack of backups?
Well, I mean it kind of is a solution. It’s a cloud backup solution. OneDrive doesn’t just keep a single version of your file, there’s versioning, retention policies, etc.
Cloud makes a lot of sense for businesses with small IT staff and a lot of users because while it’s not fully in your control, it comes with all the things being discussed here “out of the box” and scales infinitely.
For self hosters there’s some fun and power in doing everything yourself, but even then adding cloud as part of your backup (if done securely) is usually a pretty good idea.
This would be a great use case for ZFS. You can use dataset replication to sync the data. I’ve been very happy with TrueNAS, but I think it only recently became available for the Raspberry Pi, so I’m not sure I’d recommend it there yet; plain ZFS should still work, though. You can use RAID-Z too if you have multiple drives. The only limitation you might encounter is memory, since ZFS is a bit of a hog. On the plus side, ZFS should protect against bitrot if you’re worried about that.
If you go the ZFS route, you could check out syncoid.
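Syncoid (from the sanoid project) wraps `zfs send`/`zfs receive` over SSH and handles incremental snapshot replication for you; a one-liner sketch, with dataset and host names being hypothetical:

```shell
# Incrementally replicate tank/data and its snapshots to the other Pi.
# 'pi@parents-nas' and the dataset names are examples.
syncoid tank/data pi@parents-nas:tank/backup/data
```

Put that in a cron job and each run only transfers the blocks changed since the last common snapshot.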
Some other tools that might be worth considering (that aren’t related to ZFS) are borg and restic.
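As a taste of borg, the basic repo lifecycle looks like this (paths are illustrative, and prune retention numbers are just an example policy):

```shell
# One-time: create an encrypted repository (key stored in the repo)
borg init --encryption=repokey /srv/backups/repo

# Each run: create a deduplicated, dated archive of the data
borg create --stats /srv/backups/repo::'home-{now:%Y-%m-%d}' ~/important

# Periodically: thin out old archives per a retention policy
borg prune --keep-daily=7 --keep-weekly=4 --keep-monthly=6 /srv/backups/repo
```

restic follows a very similar init/backup/forget pattern, so the choice mostly comes down to which ecosystem you prefer.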
Hope that helps.
ZFS doesn’t really hog memory, rather it consumes almost all available memory as cache. But it frees it as soon as it’s needed.
What I do is a local backup on a different disk with BorgBackup, then a copy of that local backup to a Pi at a friend’s place, with rsync.
I came here to say the same thing, except that I have a Pi locally and one at a relative’s house. I back up to the local Pi, and a nightly cron on the remote Pi starts rsync to pull my local copy.
I chose this so that I could control the rsync start time, bandwidth, and stop time, but also so I could leave the remote network vanilla with no open ports, etc. With bandwidth limiting it may take a few days to catch up from a full backup, but a differential is same-day.
Be sure to use a read-only filesystem or an overlay FS on the Pi’s SD card. I’ve had them get corrupted.
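On Raspberry Pi OS the overlay file system can be enabled from `raspi-config` (Performance Options → Overlay File System); the non-interactive form, as I understand it (verify against your raspi-config version), is:

```shell
# 0 = enable the overlay FS; a reboot is required afterwards
sudo raspi-config nonint do_overlayfs 0
```

With the overlay active, all writes go to RAM and the card itself stays read-only, which is what protects it from corruption on power loss.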