I’m planning on setting up a NAS/home server (primarily storage with some jellyfin and nextcloud and such mixed in) and since it is primarily for data storage I’d like to follow the data preservation rules of 3-2-1 backups. 3 copies on 2 mediums with 1 offsite - well actually I’m more trying to go for a 2-1 with 2 copies and one offsite, but that’s beside the point. Now I’m wondering how to do the offsite backup properly.
My main goal would be to have an automatic system that does full system backups at a reasonable rate (I assume daily would be a bit much considering it’s gonna be a few TB worth of HDDs which aren’t exactly fast, but maybe weekly?) and then have 2-3 of those backups offsite at once as a sort of version control, if possible.
This has two components, the local upload system and the offsite storage provider. First the local system:
What is good software to encrypt the data before/while it’s uploaded?
While I’d preferably upload the data to a provider I trust, accidents happen, and since they don’t need to access the data, I’d prefer them not being able to, maliciously or not, so what is a good way to encrypt the data before it leaves my system?
What is a good way to upload the data?
After it has been encrypted, it needs to be sent. Is there any good software that can upload backups automatically on regular intervals? Maybe something that also handles the encryption part on the way?
Then there’s the offsite storage provider. Personally I’d appreciate as many suggestions as possible, as there is of course no one size fits all, so if you’ve got good experiences with any, please do send their names. I’m basically just looking for network attached drives. I send my data to them, I leave it there and trust it stays there, and if more drives in my system fail than RAID-Z can handle (so 2), I’d like to be able to get the data back from there after I’ve replaced my drives. That’s all I really need from them.
For reference, this is gonna be my first NAS/Server/Anything of this sort. I realize it’s mostly a regular computer and am familiar enough with Linux, so I can handle that basic stuff, but for the things you wouldn’t do with a normal computer I am quite unfamiliar, so if any questions here seem dumb, I apologize. Thank you in advance for any information!
External drives that I keep in my office at work. Also cloud storage.
so if any questions here seem dumb
Not dumb. I say the same, but I have a severe inferiority complex and imposter syndrome. Most artists do.
1 local backup, 1 cloud backup, 1 offsite backup at my tiny house at the lake.
I use Syncthing.
+1 syncthing
Weird down votes?
I use rsync.net
It’s not the lowest price, but I like the flexibility of access.
For instance, I was able to run rclone on their servers to do a direct copy of 400 GB from OneDrive to rsync.net, without having to go through my connection.
I can mount backups with sshfs if I want to, including the daily zfs snapshots.
As others have said, use tools like borg and restic.
Shop around for cloud storage with good pricing for your use-case. Many charge differently for different usage patterns, like restoring data or uploading.
Check out storj.io, I like their pricing - they charge for downloading/restore (IIRC), and I figure that’s a cost I can live with if I need to restore.
My ratchet way of doing it is Backblaze. There is a docker container that lets you run the unlimited personal plan on Linux by emulating a windows environment. They let you set an encryption key so that they can’t access your data.
I’m sure there are a lot more professional and secure ways to do it, but my way is cheap, easy, and works.
I use backblaze as well, got a link to the docker container? That may save me a few dollar bucks a week and thus keep SWMBO happier.
Probably a me problem but kept having problems with that docker on unraid, it’s just in the community apps ‘store’. The vm seemed to just crash randomly.
I switched over to their B2 storage and just use rclone to an encrypted bucket and it’s ~<$5/mo which I’m good with. Biggest cost is if I let it run too often and it spends a bunch of their compute time listing files to see if it needs to update them.
What’s the container’s name? I was about to get backblaze and then was frustrated at the cost difference between the desktop personal plan and the one for deploying on my server
I just use restic. I’m pretty sure it uses checksums to verify data on the backup target, so it doesn’t need to copy all of the data there.
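To illustrate the idea of checksum-based deduplication, here is a minimal Python sketch. It is not restic's implementation: restic splits files with content-defined chunking, while this uses tiny fixed-size chunks and a plain dict as the "remote" store to keep things readable. The names `backup`, `restore` and `CHUNK_SIZE` are made up for the example.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny so the example is easy to follow (assumption)


def backup(data: bytes, store: dict) -> tuple:
    """Split data into chunks, store only unseen chunks.

    Returns (list of chunk ids, number of chunks actually uploaded).
    """
    ids = []
    uploaded = 0
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        chunk_id = hashlib.sha256(chunk).hexdigest()
        if chunk_id not in store:  # content already on the target is skipped
            store[chunk_id] = chunk
            uploaded += 1
        ids.append(chunk_id)
    return ids, uploaded


def restore(ids: list, store: dict) -> bytes:
    """Rebuild the original data from its list of chunk ids."""
    return b"".join(store[chunk_id] for chunk_id in ids)
```

Backing up `b"aaaabbbbcccc"` and then `b"aaaabbbbdddd"` into the same store uploads three chunks the first time and only one the second time, which is why a second backup of a mostly unchanged library is quick.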
I use Linux, so encryption is easy with LUKS, and Free File Sync to drives that rotate to a safety deposit box at the bank in case of a catastrophic event, such as a house fire. Usually anything from the last few months is still on my mobile devices.
I assume daily would be a bit much considering it’s gonna be a few TB worth of HDDs which aren’t exactly fast
What is the concern here?
Syncthing to a pi at my parents’ place.
A pi with multiple terabytes of storage?
My most critical data is only ~2-3TB, including backups of all my documents and family photos, so I have a 4TB ssd attached which the pi also boots from. I have ~40TB of other Linux isos that have 2-drive redundancy, but no backups. If I lose those, I can always redownload.
Low power server in a friend’s basement running syncthing
Using a mesh VPN like tailscale or netbird would be another option as well. It would allow you to use proper backup software like restic or whatever, and with tailscale on both devices, restic would be able to find the pi even if the other person moved to a new house. (Although a pi with ethernet would be preferable, so all they have to do is plug it in to their new network and everything would be good. If it was a pi zero then someone would have to update the wifi password.)
Funny you mention it. This is exactly what I do. Don’t use the relay servers for syncthing, just my tailnet for device to device networking.
But doesn’t that sync in real-time? Making it not a true backup?
Have it sync the backup files from the -2- part. You can then copy them out of the syncthing folder to a local one with a cron to rotate them. That way you get the sync offsite and you can keep them out of the rotation as long as you want.
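A rough Python sketch of that copy-and-rotate step, something a cron job could call. The function name, the folder layout and the `keep` count are all assumptions for illustration, not anything Syncthing provides.

```python
import shutil
from pathlib import Path


def archive_and_rotate(sync_dir: Path, archive_dir: Path, stamp: str, keep: int = 3) -> list:
    """Copy the synced folder to archive_dir/<stamp>, then keep only the newest `keep` copies.

    `stamp` should sort chronologically, e.g. time.strftime("%Y%m%d-%H%M%S").
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Copy out of the Syncthing folder so later syncs cannot touch this version.
    shutil.copytree(sync_dir, archive_dir / stamp)
    snapshots = sorted(p for p in archive_dir.iterdir() if p.is_dir())
    for old in snapshots[:-keep]:  # everything except the newest `keep`
        shutil.rmtree(old)
    return snapshots[-keep:]
```

Called weekly with `keep=3`, this leaves the three most recent copies outside the synced folder, so a bad sync or deletion on the source side does not wipe them.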
You could use scheduled snapshots to provide the backup portion.
In theory you could set up a cron job with a docker compose to fire up a container, sync, and shut it down once all endpoints are synced.
As it seemingly has an API it should be possible.
Agreed. I have it configured on a delay and with multiple file versions. I also have another pi running rsnapshot (rsync tool).
How’d you do that?
For the delay, I just reduce how often it checks for new files instead of instantaneously.
Edit the share, enable file versioning, choose which flavor.
Cloud is kind of the default these days but given you’re on this community, I’m guessing you want to keep third parties out of it.
Traditionally, at least in the video editing world, we would keep LTO or some other format offsite and pay for housing it or if you have multiple locations available to you just have those drives shipped back-and-forth as they are updated at regular intervals.
I don’t know what you really have access to or what you’re willing to compromise on so it’s kind of hard to answer the question to be honest. Lots of ways to do it
I use borg backup. It, and another tool called restic, are meant for creating encrypted backups. Further, it can create backups regularly and only back up differences. This means you could take a daily backup without making new copies of your entire library. They also allow you to, as part of compressing and encrypting, make a backup to a remote machine over ssh. I think you should start with either of those.
One provider that’s built for being a cloud backup is borgbase. It can be a location you back up a borg (or restic I think) repository to. There are others that are made to be easily accessed with these backup tools.
Lastly, I’ll mention that borg handles making a backup, but doesn’t handle the scheduling. Borgmatic is another tool that, given a yml configuration file, will perform the borgbackup commands on a schedule with the defined arguments. You could also use something like systemd/cron to run a schedule.
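If you go the systemd/cron route, the scheduled job can be a small wrapper script. Below is a minimal Python sketch that builds the borg commands, assuming borg 1.x command syntax (borg 2 changed how the repository is passed) and a repository that was already initialised with encryption via `borg init`. The repository URL, paths and retention numbers are placeholders.

```python
import subprocess


def borg_create_cmd(repo: str, paths: list, name: str = "{hostname}-{now:%Y-%m-%d}") -> list:
    """Build a `borg create` command; borg itself expands the {hostname}/{now} placeholders."""
    return ["borg", "create", "--stats", "--compression", "zstd", f"{repo}::{name}", *paths]


def borg_prune_cmd(repo: str, daily: int = 7, weekly: int = 4) -> list:
    """Build a `borg prune` command that thins out old archives."""
    return ["borg", "prune", "--keep-daily", str(daily), "--keep-weekly", str(weekly), repo]


def run_backup(repo: str, paths: list) -> None:
    """Run create, then prune; raises CalledProcessError if borg fails."""
    for cmd in (borg_create_cmd(repo, paths), borg_prune_cmd(repo)):
        subprocess.run(cmd, check=True)
```

A cron entry or systemd timer would then call something like `run_backup("ssh://user@example.com/./repo", ["/srv/data"])`, with the passphrase supplied through the `BORG_PASSPHRASE` environment variable. Borgmatic does essentially this for you from its config file.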
Personally, I use borgbackup configured in NixOS (which makes the systemd units for making daily backups) and I back up to a different computer in my house and to borgbase. I have 3 copies, 1 cloud and 2 in my home.
I rsync a copy of it to a friend’s house every night. It’s straightforward, simple and free.
I rsync a copy to mom’s
Same — rsync to a pi 3 with a (single) ZFS drive at family’s house. Retain some daily/weekly/monthly snapshots.
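For anyone wondering how a daily/weekly/monthly retention rule decides what to keep, here is a small Python sketch. It is an illustration of the general idea, not the logic of any particular snapshot tool, and the default counts are arbitrary.

```python
from datetime import date


def snapshots_to_keep(snapshots, daily=7, weekly=4, monthly=6):
    """Return the set of snapshot dates to retain.

    For each rule, walk from newest to oldest and keep the newest snapshot
    in each of the most recent N days / ISO weeks / calendar months.
    """
    rules = [
        (daily, lambda d: d.isoformat()),
        (weekly, lambda d: tuple(d.isocalendar())[:2]),  # (ISO year, ISO week)
        (monthly, lambda d: (d.year, d.month)),
    ]
    newest_first = sorted(snapshots, reverse=True)
    keep = set()
    for limit, bucket_of in rules:
        seen = []
        for snap in newest_first:
            bucket = bucket_of(snap)
            if bucket in seen:
                continue  # a newer snapshot already represents this period
            if len(seen) >= limit:
                break  # enough periods covered for this rule
            seen.append(bucket)
            keep.add(snap)
    return keep
```

Anything not in the returned set can be deleted. The same snapshot often satisfies several rules at once, so the total kept is usually smaller than the sum of the three counts.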
I have a (free) VPS with static IPv4 which is how I connect everything.
Both the VPS and the remote site have limited network speed (I think 50Mbps for VPS), so the initial sync was done sneakernet (well…“airplane net”). Nightly rsync is no problem bandwidth-wise, and is mostly just any new videos I’ve uploaded to my local Immich instance.
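A quick back-of-the-envelope calculation shows why the first sync is the painful one. The 50 Mbps figure is from the comment above; the 2 TB initial size and 20 GB nightly change are made-up numbers for illustration, and protocol overhead is ignored.

```python
def transfer_hours(size_bytes: float, link_mbps: float) -> float:
    """Hours needed to move size_bytes over a link of link_mbps (megabits per second)."""
    bits = size_bytes * 8
    seconds = bits / (link_mbps * 1_000_000)
    return seconds / 3600


initial = transfer_hours(2e12, 50)   # hypothetical 2 TB initial sync
nightly = transfer_hours(20e9, 50)   # hypothetical 20 GB of nightly changes
print(f"initial sync: {initial:.1f} h, nightly sync: {nightly:.2f} h")
```

With those assumptions the initial sync takes roughly 89 hours (almost four days of saturating the link), while the nightly delta finishes in under an hour.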
A huge tape archive in a mountain. It’s pretty standard for geophysical data. I have some (encrypted) personal stuff on a few tapes there.
I have an rpi4 with an external hdd at my parents’ house. I connect to it via a wireguard vpn, mount and decrypt the external hdd, and then trigger a restic backup to a restic rest-server in append-only mode.
The whole thing is done via a python script
I chose the rest-server because it allows “append only”, so the data can’t be deleted easily from my side of the vpn.
RClone to a cloud storage (hetzner in my case). Rclone is easy to configure and offers full encryption, even for the file names.
As the data is only uploaded once, a daily backup uploads only the added or changed files.
Just as a side note: make sure you can retrieve your data even in case your main system fails. Make sure you have all the passwords/crypto keys available.
I do the same using rclone, partly encrypted partly just dump.
I use batch scripts in cron.daily to start this