Nicias (Guru)
Joined: 06 Dec 2005; Posts: 446
Posted: Mon Feb 21, 2022 3:30 pm; Post subject: Backup Best Practices
Hello,
I have a home media server that I would like to be backing up regularly. My plan is to back it up to a veracrypt encrypted drive which I store at my workplace. I have a total of about 3TB of data to store, and I'll be using a 5TB drive. All of the data has par2 checkfiles.
My plan for a procedure for doing my backups was something like this:
1. Connect drive to server.
2. Run smartctl -t conveyance
3. Mount veracrypt container
4. Run fsck on /dev/mapper/veracrypt1 (unmounting before and remounting after)?
5. rsync over changes
6. Run fsck again?
7. Run dd if=/dev/sdX of=/dev/null to force the drive to read all sectors?
8. Run smartctl -t long?
But now I think that seems like overkill.
Do I need steps 4, 6, 7, 8? Is there something else I should be including?
Thanks!
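For what it's worth, the numbered steps above can be sketched as a single script. This is a minimal sketch, not the poster's actual setup: the device node, mount point, and source path are placeholders, the veracrypt invocation is indicative of its text-mode CLI rather than copied from docs, and the questionable steps (4, 6, 7) are left out. By default it only prints each command instead of running it.

```shell
#!/bin/sh
# Sketch of the backup procedure above. DRYRUN=1 (the default) only
# prints each command; set DRYRUN=0 to actually execute them.
DEV=/dev/sdX                 # placeholder: the backup drive
MNT=/mnt/backup              # placeholder: where the container is mounted
SRC=/srv/media/              # placeholder: the data being backed up

run() {
    echo "+ $*"
    if [ "${DRYRUN:-1}" = "0" ]; then
        "$@"
    fi
}

run smartctl -t conveyance "$DEV"          # step 2: quick transport test
run veracrypt -t "$DEV" "$MNT"             # step 3: decrypt and mount
run rsync -aHv --delete "$SRC" "$MNT"/     # step 5: sync changes over
run veracrypt -t -d "$DEV"                 # cleanly dismount the container
run smartctl -t long "$DEV"                # step 8: better done monthly
```

Running it with DRYRUN=0 executes the steps for real; leaving the default lets you review the plan first.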
szatox (Advocate)
Joined: 27 Aug 2013; Posts: 3188
Posted: Mon Feb 21, 2022 5:05 pm
It absolutely is overkill in terms of the work you're doing, and sub-par in terms of results.
Decrypt, mount, rsync, unmount is OK; there is no need for the other steps.
Also, consider running rsync with hardlinks ( --link-dest=<previous_backup_path> ) or using snapshots* for incremental backups, so you can keep multiple versions of your files.
Finally, clone your backup to another drive.
Whether or not this step is necessary depends on how important the data is to you. You're clearly planning to go out of your way here, so I assume you want to be very safe; so just keep a second disk with the same data.
Preferably of a different brand or age (so they don't fail at the same time due to manufacturing flaws), in another location (theft, fire, floods, power surges, meteors, etc.), synchronized over the internet.
Running the long test on a monthly schedule or so is fine; there is no need to do it every time. The long test attempts to read the whole disk, so dd is not necessary either. Rsync reads destination files too.
* Both options have their flaws and can fail spectacularly under certain conditions; know the limitations of your chosen solution so you don't shoot yourself in the foot. Also, having a clone helps recover from mistakes as well as hardware failures.
Nicias (Guru)
Joined: 06 Dec 2005; Posts: 446
Posted: Mon Feb 21, 2022 5:19 pm
Thanks for the quick reply!
I actually was going to have two backups, in two different locations (I work in two different buildings).
I will look into --link-dest.
figueroa (Advocate)
Joined: 14 Aug 2005; Posts: 2971; Location: Edge of marsh USA
Posted: Tue Feb 22, 2022 6:23 am
In my opinion, all the checking is overkill, and it's not very efficient. I make automatic nightly backups (via crontab) to a secondary hard drive. The backups are made using tar with compression but no encryption.
When I make my off-site backups, I use tar with gpg encryption but no compression: since the archives are already compressed, I don't compress them again. It would be easy to add compression where the script reads "--compress-algo none". Compression reduces I/O and writes to the media.
Here is one of my actual scripts just for user data with USERNAME changed to protect the guilty.
Code:
$ cat bin/targpgflash3.scr
#!/bin/sh
#Encrypt and store data and OS backups on external media.
#If backup partition(s) already mounted, comment out mount and umount commands.
# For backups made Saturday
date > /run/media/USERNAME/SP256/date0.txt
cp -pu /home/USERNAME/bin/targpgflash1.scr /run/media/USERNAME/SP256/
cp -pu /home/USERNAME/bin/targpgflash2.scr /run/media/USERNAME/SP256/
cp -pu /home/USERNAME/bin/targpgflash3.scr /run/media/USERNAME/SP256/
mount /mnt/backup3
cd /mnt/backup3
date > /run/media/USERNAME/SP256/date1.txt
tar cvf - data/* | gpg -c --batch --yes --passphrase-file /scratch/bin/.passrc --compress-algo none -o /run/media/USERNAME/SP256/databackup.tar.gpg
date >> /run/media/USERNAME/SP256/date1.txt
#cd /
#umount /mnt/backup3
#mount /mnt/backup3
#cd /mnt/backup3
date > /run/media/USERNAME/SP256/date2.txt
tar cvf - janbak/* | gpg -c --batch --yes --passphrase-file /scratch/bin/.passrc --compress-algo none -o /run/media/USERNAME/SP256/janbackup.tar.gpg
date >> /run/media/USERNAME/SP256/date2.txt
cd /
sync
umount /mnt/backup3
date >> /run/media/USERNAME/SP256/date0.txt
##Decrypt and un-tar examples
#gpg -d myarchive.tar.gpg | tar xvf -
##Just decrypt to tar archive examples
#gpg -o backup.tar -d backup.tar.gpg
#gpg -d -o backup.tar /run/media/USERNAME/SP256/backup.tar.gpg
#gpg -o janbak/janbak.tar -d /run/media/USERNAME/SP256/janbackup.tar.gpg
/mnt/backup3 is one of my backup partitions containing the compressed tar archives that were made automatically.
SP256 is one of my external media that is auto-mounted to that label. External media is in a rotation of three: weekly1, weekly2, monthly. This is for data. There is another set used for the operating system, for a total of six devices.
There is no checking of the results. Any errors go to standard output (the screen).
If you have any questions, do ask.
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi
szatox (Advocate)
Joined: 27 Aug 2013; Posts: 3188
Posted: Tue Feb 22, 2022 10:31 am
Oh, one more thing:
Quote:
1. Connect drive to server.
If it requires a manual action, it's not a very good backup. You will forget to do it, you'll feel lazy, something will distract you on the way, you will "have not made any progress today", or you'll get bored of the whole process.
If it's not running unattended, it's not running at all.
And if you mean connecting the drive over the network: rsync works better when it _knows_ it's running over a network. NFS/sshfs will take its toll on performance. Connecting to the remote server with rsync directly makes it work smarter, reducing the transferred data volume. And if you connect with rsync directly, there is no need for the "connect" step.
pietinger (Moderator)
Joined: 17 Oct 2006; Posts: 4402; Location: Bavaria
Posted: Tue Feb 22, 2022 10:45 am
szatox wrote:
If it requires a manual action, it's not a very good backup. [...] If it's not running unattended, it's not running at all. [...] Connecting to a remote server [...]
szatox, you are right, a good backup strategy is one that runs automatically, but don't forget the most important point: only the backup server should initiate connections to the working servers. Never do it the other way around. Why?
We have had some break-ins via online attacks. All data was encrypted ... and all data on the backup server ALSO, because there was a daemon running on the backup server ...
A secure path is to have no daemon running on the backup server (to minimize the attack surface); the daemons run only on the working servers, AND the backup server initiates the backup and says: "Hello, I want to back you up".
psycho (Guru)
Joined: 22 Jun 2007; Posts: 534; Location: New Zealand
Posted: Wed Feb 23, 2022 3:36 am
Nicias, your "two different locations" plan is sensible. You've achieved nothing, backing up your data to your NAS or whatever, if a burglar steals the desktop *and* the NAS. Some people might see it as overkill, but if your data is important to you, yes, you need at least one backup off-site so that even if the building burns down, your data isn't lost. Of course cloud backups can achieve that (and you can copy a whole encrypted volume to them, so they're pretty safe with a strong enough password even if the provider doesn't offer end-to-end encryption, though that's obviously better as another layer), but even an occasional manual physical copy to a USB drive or something else that's not stored on-site is much better than having all your eggs in one basket. Keeping a laptop somewhere else and syncing it whenever it's on-site can be another casual solution, so you're not counting entirely on your local backup.
You can add the local backup to your shutdown script so you don't even have to think about it. I use a process similar to the one you're outlining: I decrypt a volume on a NAS and rsync my local encrypted data to that volume. It has enough redundancy that I could lose a drive (on the NAS) without losing my data... and whenever I think of it (which, admittedly, is not often enough... sometimes I'll forget for a week or more) I back it up to a USB stick on my keyring too. The keyring drive not only stores the encrypted data, but also the tools (via a live USB OS) to access it. Then in addition to the regular stuff I've got old backups in other weird places. As everyone learns the hard way eventually, overkill is better than underkill when it comes to backups.
figueroa (Advocate)
Joined: 14 Aug 2005; Posts: 2971; Location: Edge of marsh USA
Posted: Wed Feb 23, 2022 4:55 am
Yes, I also have two large (but physically small; Samsung Bar Plus) USB flash drives on my keyring with a copy of my monthly backup set. I also get email reminders from my calendar to make the weekly and monthly off-site backups. On-site backups are automatic and made while I sleep, with the desktop and server running 24/7.
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi
psycho (Guru)
Joined: 22 Jun 2007; Posts: 534; Location: New Zealand
Posted: Wed Feb 23, 2022 9:55 am
figueroa wrote:
Yes, I also have two large (but physically small; Samsung Bar Plus) USB flash drives on my keyring
Exactly, large but small. I like the minuscule ones (like the SanDisk Ultra Fit) that only just stick out of the port: they're much smaller and lighter than any of the keys on my ring, so they're no hassle to carry around, and with a 256GB capacity they can back up quite a lot. Realistically the cloud is even more convenient, and I can't remember the last time there was really *no* Internet access (not even via mobile networks) around here... but still, for data I really don't want to lose, having it physically in my pocket just feels safer than having it on disks I don't own (and can't access without their owners' permission), thousands of miles away.
user (Apprentice)
Joined: 08 Feb 2004; Posts: 204
Posted: Wed Feb 23, 2022 12:24 pm
3-2-1 Backup Strategy
1) rdiff-backup hourly for non-binary content, daily for binary content (one copy of the data locally)
daily incremental tar archive (incremental state reset monthly)
asymmetric gpg encryption of the daily tar archive (no online access to the secret gpg key)
2) daily encrypted archive copy to a local NAS (second copy of the data on-site, but on different media)
3) daily encrypted archive copy to differently geolocated remote destinations with write-only access (N copies of the data off-site)
- content is always encrypted outside of the local host (NAS or off-site)
- a restore needs the secret gpg key plus the initial monthly tar archive and the latest incremental tar archives
- not usable for petabytes of content
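The daily-incremental-tar part of the strategy above can be sketched with GNU tar's --listed-incremental (the gpg encryption step is left out here, and all filenames are arbitrary demo values):

```shell
#!/bin/sh
# GNU tar keeps incremental state in a snapshot file (.snar). The first
# run with a fresh .snar produces a full archive; later runs with the
# same .snar archive only what changed since. "Resetting monthly" means
# deleting the .snar so the next run is full again.
set -e
work=$(mktemp -d); cd "$work"
mkdir data
echo "january report" > data/jan.txt
sleep 1    # ensure distinct timestamps for the demo

tar --listed-incremental=state.snar -cf full.tar data    # monthly full
sleep 1
echo "february report" > data/feb.txt
tar --listed-incremental=state.snar -cf incr.tar data    # daily incremental

tar -tf incr.tar    # should list the directory and only the new file
```

A restore replays the full archive first, then each incremental in order, which is exactly why the post notes that both the monthly and the latest incrementals are needed.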
figueroa (Advocate)
Joined: 14 Aug 2005; Posts: 2971; Location: Edge of marsh USA
Posted: Thu Feb 24, 2022 5:42 am
Since the subject is "Best Practices," I'll toss this out for consideration. For users and administrators this is a burning issue, and apparently one size does not fit all.
Best Practices
1. Do make backups.
2. Make backups often.
3. Make backups automatically.
4. Store some backups off-site.
Considerations
1. Location
2. Compression
3. User data versus operating system
4. Tolerance for loss
5. Physical security
6. Network security
7. Encryption
8. Automation
9. Scheduling
10. Recovery/Restoration
Discussion Starters
1. Tolerance for loss drives consideration of using real-time/near-real-time backups, versus hourly/nightly/weekly/monthly.
2. Some users back up none of their operating system, saying they will just re-install if they lose the system. Others back up a few critical operating system files to help them duplicate the OS upon re-installation, e.g. /var/lib/portage/world, /etc/portage/, /etc/fstab, /etc/hosts, /boot/config~. Still others back up their entire OS with third-party programs (e.g. backintime) or built-in tools (e.g. rsync, tar, dar, gzip, gpg) for a stage4 or equivalent.
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi
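The "few critical files" option from the discussion starters is a one-liner in practice. The sketch below runs against a throwaway directory tree so it is safe to try anywhere; on a real Gentoo system you would use -C / with the actual paths listed in the post.

```shell
#!/bin/sh
# Archive just the files needed to reproduce an installation.
# A fake root is built for the demo; swap in -C / and real paths.
set -e
root=$(mktemp -d)
mkdir -p "$root/etc/portage" "$root/var/lib/portage"
echo "app-editors/vim" > "$root/var/lib/portage/world"
echo "# static filesystem info" > "$root/etc/fstab"
echo "# portage config" > "$root/etc/portage/make.conf"

tar -C "$root" -czf "$root/os-essentials.tgz" \
    var/lib/portage/world etc/portage etc/fstab

tar -tzf "$root/os-essentials.tgz"
```

After a fresh install, restoring this archive plus `emerge` against the saved world file rebuilds the package set.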
pa4wdh (l33t)
Joined: 16 Dec 2005; Posts: 815
Posted: Thu Feb 24, 2022 7:09 am
To chime in on the "best practices": you may have heard of the 3-2-1 rule: keep 3 copies of your data on 2 different media types and store 1 of them at a different location.
For me personally:
I make automated daily backups of unique and changing data: things like my home directory, mail, etc. This backup is encrypted with gpg and stored on a VPS. The decryption key is stored on an (encrypted) USB stick and on paper.
I make quarterly full backups of all systems, usually after I've updated them. This data doesn't change much, so I don't lose much if disaster strikes just before my next backup. It is stored on a USB-attached HDD which is only attached while I make the backups and is encrypted with LUKS.
_________________
The gentoo way of bringing peace to the world:
USE="-war" emerge --newuse @world
My shared code repository: https://code.pa4wdh.nl.eu.org
Music, Free as in Freedom: https://www.jamendo.com
psycho (Guru)
Joined: 22 Jun 2007; Posts: 534; Location: New Zealand
Posted: Mon Mar 07, 2022 9:59 pm
Re "Some users backup none of their operating system, saying they will just re-install", yes, this is an interesting discussion point, and a question of how much additional time this will take. The time to "just re-install" is always going to be longer than the time to simply restore the fully installed and configured system from a backup...but even with a very quickly installed binary distro, how much longer it takes depends upon how much customisation is done, and whether the customisation itself has been backed up somehow.
I used to invest many hours in customising my installations (as root, i.e. to parts of the system that would not be backed up with the normal user customisations of GUI setups and so on), all of which I did manually without taking the time to document it in the form of shell scripts...so this made "just re-install" an unreasonable option, as to re-install from scratch would have cost me many, many hours: backing up user data alone was certainly not viable. Eventually though, I made the effort to track all of these manual tweaks in shell scripts, so that I can transform a fresh out-of-the-box installation into a fully configured and customised system very quickly and easily...theoretically with a single instruction, although for ease of managing the process I prefer to do it in stages with a few different scripts. So, the clean-install-plus-data-restore approach is viable now and takes maybe half an hour longer (much of which is the actual installation) on e.g. systems using apt with binary repositories.
Gentoo is different though: "just install" is a comical notion if you're in the middle of something important and need to "just install" Gentoo from scratch to the point where you can get back to restoring and using your data! Unless you've got a binary package server to accelerate things, reinstalling Gentoo is a very time-consuming process (even assuming you've meticulously backed up or scripted all your edits to /etc/* and your kernel .config and so on, so that none of that needs to be re-done)...I would never consider a Gentoo system with only its user data backed up to be properly backed up. Some kind of full system backup or snapshot is essential, unless you have lots of time on your hands and are only concerned about being able *eventually* to restore things, and not concerned about the time involved in doing so.
pietinger (Moderator)
Joined: 17 Oct 2006; Posts: 4402; Location: Bavaria
Posted: Mon Mar 07, 2022 10:49 pm
psycho wrote:
Re "Some users backup none of their operating system, saying they will just re-install", yes, this is an interesting discussion point, and a question of how much additional time this will take.
Yes, it would take a very long time to install a new system, even if /etc is backed up (I do that). But think about how often you have had to restore a complete system because of hard disk failure (I have had none in the last 20 years; OK, I buy a new computer every 5 or 6 years) versus how often you have to restore a DATA file because it was accidentally deleted (I do this often). Yes, I back up only /home, /etc and my kernel configuration. Also consider a nice side effect of a fresh installation: all the dead files left over from things I once tested are gone too ...
figueroa (Advocate)
Joined: 14 Aug 2005; Posts: 2971; Location: Edge of marsh USA
Posted: Tue Mar 08, 2022 3:40 am
On the few occasions I've had a primary hard drive suddenly die on a Gentoo system, I've been very happy to be able to restore a full stage4 tarball or stage4-equivalent backup set that was no more than a week old. I make these automatically on the following crontab schedule.
Code:
01 6 7 * * /home/USERNAME/bin/stage4.scr 2>&1
01 6 17 * * /home/USERNAME/bin/stage4b.scr 2>&1
01 6 27 * * /home/USERNAME/bin/stage4c.scr 2>&1
On the same primary desktop, I also make the stage4-equivalent backup on this schedule:
Code:
01 6 2,16 * * /home/USERNAME/bin/gentoo2bak.scr 2>&1
01 6 9,23 * * /home/USERNAME/bin/gentoo2bak2.scr 2>&1
01 6 28 2 * /home/USERNAME/bin/gentoo2bak3.scr 2>&1
01 6 30 1,3-12 * /home/USERNAME/bin/gentoo2bak3.scr 2>&1
Each of these zstd-compressed tarballs (or sets of tarballs) is 3.8 GB, takes about 19 minutes to create, and contains no user files. Yes, they are somewhat redundant, since my backup plan and design is always a work in progress.
These backups are tested from time to time to ensure they restore cleanly and actually work. The actual stage4s are made on an every-10-days (more or less) schedule, whereas the "gentoo2bak" scripts alternate weekly, with the third version running once a month. This way I can always go back a decent amount of time, so I'm not dependent on just the most recent backup. Some of these backups (more or less weekly) are also encrypted on the fly to external media for off-site storage, also on a 3x rotating basis.
I've shared my scripts several times, most recently in these forums. The main point of this post is that the backups are:
1. automated
2. tested
3. at least 3x redundant
I'm as close as possible to being 100% confident in these backups. Personal files are similarly backed up, but every night. I am mindful that I don't have real-time protection. Should I be working on something I would fret over losing forever, I'll manually duplicate it to another drive or over NFS to another machine.
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi
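One cheap way to cover the "tested" point between full restore drills (a sketch on throwaway demo data, not the poster's actual scripts): tar's -d/--compare checks an archive against the filesystem and reports any drift since the backup was made.

```shell
#!/bin/sh
# Create an archive, verify it matches the on-disk tree, then show
# that --compare notices when a file has changed since the backup.
work=$(mktemp -d); cd "$work"
mkdir data
echo "version 1" > data/notes.txt
tar -cf backup.tar data

tar -df backup.tar && echo "backup matches disk"

echo "version 2" > data/notes.txt          # simulate later edits
if tar -df backup.tar >/dev/null 2>&1; then
    echo "no drift detected"
else
    echo "drift detected since backup"
fi
```

This only proves the archive is readable and consistent; an occasional real restore to scratch space remains the stronger test.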
A.S. Pushkin (Guru)
Joined: 09 Nov 2002; Posts: 418; Location: dx/dt, dy/dt, dz/dt, t
Posted: Tue Mar 15, 2022 5:17 pm; Post subject: System backups
One more thank you for all your suggestions.
It's more evidence of why Gentoo is such a great distro.
I have gotten such a great education using Linux.
Real democracy for computer users.
_________________
ASPushkin
"In a time of universal deceit - telling the truth is a revolutionary act." -- George Orwell