Gentoo Forums :: Gentoo Chat
Backup Best Practices

Nicias
Guru

Joined: 06 Dec 2005
Posts: 446

Posted: Mon Feb 21, 2022 3:30 pm    Post subject: Backup Best Practices

Hello,

I have a home media server that I would like to be backing up regularly. My plan is to back it up to a veracrypt encrypted drive which I store at my workplace. I have a total of about 3TB of data to store, and I'll be using a 5TB drive. All of the data has par2 checkfiles.

My plan for a procedure for doing my backups was something like this:

1. Connect drive to server.
2. Run smartctl -t conveyance
3. Mount veracrypt container
4. Run fsck on /dev/mapper/veracrypt1 (unmounting before and remounting after)?
5. rsync over changes
6. Run fsck again?
7. Run dd if=/dev/sdX of=/dev/null to force the drive to read all sectors.?
8. Run smartctl -t long?
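
Roughly, as a script (the drive letter, the partition and the paths below are just placeholders):

Code:
#!/bin/sh
# Rough sketch of the plan above; /dev/sdX, the partition and the paths
# are placeholders.
set -e

DISK=/dev/sdX            # external backup drive
VOLUME=/dev/sdX1         # veracrypt-encrypted partition on it
MNT=/mnt/backup

smartctl -t conveyance "$DISK"            # step 2 (test runs in the background)
sleep 300                                 # give it a few minutes, then...
smartctl -l selftest "$DISK"              # ...check the result

veracrypt --text "$VOLUME" "$MNT"         # step 3: mount (asks for the password)
rsync -aHAX --delete /srv/media/ "$MNT"/  # step 5: sync changes over
veracrypt --text -d "$VOLUME"             # close the volume again

smartctl -t long "$DISK"                  # step 8: full read self-test (maybe overkill)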

But now I think that seems like overkill.

Do I need steps 4, 6, 7, 8? Is there something else I should be including?

Thanks!

szatox
Advocate

Joined: 27 Aug 2013
Posts: 3188

Posted: Mon Feb 21, 2022 5:05 pm

It absolutely is overkill in terms of the work you're doing, and sub-par in terms of results.

Mount, decrypt, rsync, umount is OK; there is no need for the other steps.
Also, consider running rsync with hardlinks (--link-dest=<previous_backup_path>) or using snapshots* for incremental backups, so you can keep multiple versions of your files.
Finally, clone your backup to another drive.

Whether or not this step is necessary depends on how important that data is to you. You're really planning to go out of your way here, so I assume you want to be very safe. So just keep a second disk with the same data.
Preferably of a different brand or age (so they don't fail at the same time due to manufacturing flaws), in another location (theft, fire, floods, power surges, meteors, etc), synchronized over the internet.

Running a long test on a monthly schedule or so is fine; there is no need to do that every time. The long test attempts to read the whole disk, so dd is not necessary either, and rsync reads the destination files too.


* Both options have their flaws and can fail spectacularly under certain conditions; know the limitations of your chosen solution so you don't shoot yourself in the foot. Also, having a clone helps you recover from mistakes as well as from hardware failures.
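
A minimal sketch of the --link-dest variant (paths are made up):

Code:
#!/bin/sh
# Each run creates a dated snapshot; files unchanged since the previous
# snapshot are hardlinked instead of copied again.
SRC=/srv/media/
DEST=/mnt/backup
TODAY=$(date +%Y-%m-%d)

rsync -aH --delete --link-dest="$DEST/latest" "$SRC" "$DEST/$TODAY/"
rm -f "$DEST/latest"
ln -s "$TODAY" "$DEST/latest"   # next run will hardlink against this one
# (on the very first run the "latest" link doesn't exist yet; rsync just
# warns and copies everything)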

Nicias
Guru

Joined: 06 Dec 2005
Posts: 446

Posted: Mon Feb 21, 2022 5:19 pm

Thanks for the quick reply!

I actually was going to have two backups, in two different locations (I work in two different buildings).

I will look into --link-dest.

figueroa
Advocate

Joined: 14 Aug 2005
Posts: 2971
Location: Edge of marsh USA

Posted: Tue Feb 22, 2022 6:23 am

In my opinion, all the checking is overkill, and it's not very efficient. I make automatic backups nightly (crontab) to a secondary hard drive. The backups are made using tar with compression but not encryption.

When I make my off-site backups, I use tar with gpg encryption but no compression. Since the nightly archives are already compressed, I don't compress them again, but it would be easy to add compression where the script reads "--compress-algo none". Compression reduces I/O and writes to the media.

Here is one of my actual scripts just for user data with USERNAME changed to protect the guilty.

Code:
$ cat bin/targpgflash3.scr
#!/bin/sh
#Encrypt and store data and OS backups on external media.
#If backup partition(s) already mounted, comment out mount and umount commands.

# For backups made Saturday
date > /run/media/USERNAME/SP256/date0.txt

cp -pu /home/USERNAME/bin/targpgflash1.scr /run/media/USERNAME/SP256/
cp -pu /home/USERNAME/bin/targpgflash2.scr /run/media/USERNAME/SP256/
cp -pu /home/USERNAME/bin/targpgflash3.scr /run/media/USERNAME/SP256/

mount /mnt/backup3
cd /mnt/backup3
date > /run/media/USERNAME/SP256/date1.txt
tar cvf - data/* | gpg -c --batch --yes --passphrase-file /scratch/bin/.passrc --compress-algo none -o /run/media/USERNAME/SP256/databackup.tar.gpg
date >> /run/media/USERNAME/SP256/date1.txt
#cd /
#umount /mnt/backup3

#mount /mnt/backup3
#cd /mnt/backup3
date > /run/media/USERNAME/SP256/date2.txt
tar cvf - janbak/* | gpg -c --batch --yes --passphrase-file /scratch/bin/.passrc --compress-algo none -o /run/media/USERNAME/SP256/janbackup.tar.gpg
date >> /run/media/USERNAME/SP256/date2.txt
cd /
sync
umount /mnt/backup3

date >> /run/media/USERNAME/SP256/date0.txt

##Decrypt and un-tar examples
#gpg -d myarchive.tar.gpg | tar xvf -
##Just decrypt to tar archive examples
#gpg -o backup.tar -d backup.tar.gpg
#gpg -d -o backup.tar /run/media/USERNAME/SP256/backup.tar.gpg
#gpg -o janbak/janbak.tar -d /run/media/USERNAME/SP256/janbackup.tar.gpg

/mnt/backup3 is one of my backup partitions containing the compressed tar archives that were made automatically.

SP256 is one of my external media that is auto-mounted to that label. External media is in a rotation of three: weekly1, weekly2, monthly. This is for data. There is another set used for the operating system, for a total of six devices.

There is no checking of the results; any errors go to standard output (the screen).

If you have any questions, do ask.
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi

szatox
Advocate

Joined: 27 Aug 2013
Posts: 3188

Posted: Tue Feb 22, 2022 10:31 am

Oh, one more thing:
Quote:
1. Connect drive to server.

If it requires a manual action, it's not a very good backup. You will forget to do it, you'll feel lazy, something will distract you on the way, you will "have not made any progress today", or you'll get bored of the whole process.
If it's not running unattended, it's not running at all.

And if you mean connecting the drive over the network, rsync works better when it _knows_ it's running over a network. NFS/sshfs will take its toll on performance. Connecting to the remote server with rsync directly makes it work smarter, reducing the volume of data transferred. And if you connect with rsync directly, there is no need for the "connect" step at all.
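
For example (hostname and paths made up):

Code:
# rsync that knows it's remote: an rsync runs on the other end and only
# the differences of changed files travel over ssh
rsync -aH --delete -e ssh /srv/media/ backupbox:/srv/backup/media/

# rsync onto an NFS/sshfs mount: it's treated as a local copy, so changed
# files are rewritten in full across the network and every file check
# (stat) goes over the wire too
rsync -aH --delete /srv/media/ /mnt/backupbox/media/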

pietinger
Moderator

Joined: 17 Oct 2006
Posts: 4402
Location: Bavaria

Posted: Tue Feb 22, 2022 10:45 am

szatox wrote:
If it requires a manual action, it's not a very good backup. [...] If it's not running unattended, it's not running at all.

[...] Connecting to a remote server [...]


szatox, you are right, a good backup strategy is one that runs automatically, but don't forget the most important point: only the backup server should initiate connections to the working servers. Never do it the other way around. Why?

We have had break-ins via online attacks. All the data got encrypted ... and all the data on the backup server as well, because there was a daemon running on the backup server ... :-(

A more secure setup has no daemon running on the backup server (to minimize the attack surface); the daemon(s) run only on the working servers, AND the backup server initiates the backup and says: "Hello, I want to back you up".
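
A minimal sketch of this pull model (hostname, key and paths are only examples):

Code:
#!/bin/sh
# Runs on the backup server, e.g. from cron.  It logs in to the working
# server and pulls the data; the working server holds no credentials for
# the backup server, so a compromised working server cannot reach the backups.
rsync -aH --delete -e "ssh -i /root/.ssh/backup_key" \
    root@mediaserver:/srv/media/ /backups/mediaserver/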

psycho
Guru

Joined: 22 Jun 2007
Posts: 534
Location: New Zealand

Posted: Wed Feb 23, 2022 3:36 am

Nicias, your "two different locations" plan is sensible. You've achieved nothing by backing up your data to your NAS or whatever if a burglar steals the desktop *and* the NAS. Some people might see it as overkill, but if your data's important to you, yes, you need at least one backup off-site so that even if the building burns down, your data isn't lost. Of course cloud backups can achieve that (you can copy a whole encrypted volume to them, so they're pretty safe with a strong enough password even if the provider doesn't offer end-to-end encryption, although that's obviously better as another layer), but even an occasional manual physical copy to a USB drive or something else that isn't stored on-site is much better than having all your eggs in one basket. Keeping a laptop somewhere else and syncing it whenever it's on-site can be another casual solution, so you're not counting entirely on your local backup.

You can add the local backup to your shutdown script so you don't even have to think about it. I use a process similar to the one you're outlining: I decrypt a volume on a NAS and rsync my local encrypted data to that volume. It's got enough redundancy that I could lose a drive (on the NAS) without losing my data...and whenever I think of it (which, admittedly, is not often enough...sometimes I'll forget for a week or more) I back it up to a USB stick on my keyring too. The keyring drive not only stores the encrypted data, but also the tools (via live USB OS) to access it. Then in addition to the regular stuff I've got old backups in other weird places. As everyone learns the hard way eventually, overkill is better than underkill when it comes to backups.
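
On OpenRC, for example, one way to hook that into shutdown is a /etc/local.d stop script (paths are just an illustration, and this assumes the backup volume is already mounted and decrypted):

Code:
#!/bin/sh
# /etc/local.d/backup.stop -- OpenRC's "local" service runs *.stop scripts
# on shutdown.  Make it executable.  Only sync if the backup volume is
# actually mounted, so shutdown isn't held up when it isn't.
if mountpoint -q /mnt/nas-backup; then
    rsync -aH --delete /home/ /mnt/nas-backup/home/
fi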

figueroa
Advocate

Joined: 14 Aug 2005
Posts: 2971
Location: Edge of marsh USA

Posted: Wed Feb 23, 2022 4:55 am

Yes, I also have two large (but physically small - Samsung Bar Plus) USB flash drives on my keyring with a copy of my monthly backup set. I also get email reminders from my calendar to make the weekly and monthly off-site backups. On-site backups are automatic and made while I sleep; the desktop and server run 24/7.
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi

psycho
Guru

Joined: 22 Jun 2007
Posts: 534
Location: New Zealand

Posted: Wed Feb 23, 2022 9:55 am

figueroa wrote:
Yes, I also have two large (but physically small - Samsung Bar Plus) USB flash drives on my keyring

Exactly, large but small. I like the minuscule ones (like the SanDisk Ultra Fit) that only just stick out of the port...they're much smaller and lighter than any of the keys on my ring, so they're no hassle to carry around, and with a 256GB capacity they can back up quite a lot. Realistically the cloud is even more convenient, and I can't remember the last time there was really *no* Internet access (not even via mobile networks) around here...but still, for data I really don't want to lose, having it physically in my pocket just feels safer than having it on disks I don't own (and can't access without their owners' permission), thousands of miles away.

Hund
Apprentice

Joined: 18 Jul 2016
Posts: 218
Location: Sweden

Posted: Wed Feb 23, 2022 11:25 am

I can highly recommend Back In Time for backups. It's based on rsync and it simplifies everything a lot.

https://packages.gentoo.org/packages/app-backup/backintime
_________________
Collect memories, not things.

user
Apprentice

Joined: 08 Feb 2004
Posts: 204

Posted: Wed Feb 23, 2022 12:24 pm

3-2-1 Backup Strategy

1) rdiff-backup hourly for non-binary content, daily for binary content (one copy of the data locally)
daily incremental tar archive (incremental state reset monthly; see the sketch below)
asymmetric gpg encryption of the daily tar archive (no online access to the secret gpg key)
2) daily encrypted archive copy to a local NAS (second copy of the data on-site, but on different media)
3) daily encrypted archive copy to remote destinations in different geographic locations, with write-only access (N copies of the data off-site)

- content is always encrypted outside of the local host (NAS or off-site)
- a restore needs the secret gpg key plus the initial monthly tar archive and the latest incremental tar archive
- not usable for petabytes of content :)
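
A minimal sketch of the daily incremental tar + asymmetric gpg step (recipient and paths are examples only):

Code:
#!/bin/sh
# GNU tar incremental against a per-month snapshot file; a new month means a
# new (empty) snapshot file, i.e. a fresh full dump -- the "monthly reset".
# Encryption is to a public key, so no secret key is needed on this host.
SNAR=/var/backups/home-$(date +%Y-%m).snar
OUT=/var/backups/home-$(date +%Y-%m-%d).tar.gpg

tar --listed-incremental="$SNAR" -cf - /home \
  | gpg --encrypt --recipient backup@example.org --compress-algo none -o "$OUT"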

figueroa
Advocate

Joined: 14 Aug 2005
Posts: 2971
Location: Edge of marsh USA

Posted: Thu Feb 24, 2022 5:42 am

Since the subject is "Best Practices," I'll toss this out for consideration. For users and administrators this is a burning issue, and apparently one size does not fit all.

Best Practices
1. Do make backups.
2. Make backups often.
3. Make backups automatically.
4. Store some backups off-site.

Considerations
1. Location
2. Compression
3. User data versus operating system
4. Tolerance for loss
5. Physical security
6. Network security
7. Encryption
8. Automation
9. Scheduling
10. Recovery/Restoration

Discussion Starters
1. Tolerance for loss drives the choice between real-time/near-real-time backups and hourly/nightly/weekly/monthly ones.
2. Some users back up none of their operating system, saying they will just re-install if they lose the system. Others back up a few critical operating system files to help them duplicate the OS upon re-installation, e.g. /var/lib/portage/world, /etc/portage/, /etc/fstab, /etc/hosts, /boot/config~ (a minimal sketch follows below). Still others back up their entire OS, either with third-party programs (e.g. backintime) or with the usual built-in tools (rsync, tar, dar, gzip, gpg, etc.) for a stage4 or equivalent.
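
For that middle option, a minimal sketch (the kernel config path is a guess; the rest are the files listed above):

Code:
#!/bin/sh
# Grab just enough metadata to rebuild the OS from a fresh install.
tar czf /var/backups/sysconfig-$(date +%Y%m%d).tar.gz \
    /var/lib/portage/world \
    /etc/portage \
    /etc/fstab /etc/hosts \
    /usr/src/linux/.config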
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi

pa4wdh
l33t

Joined: 16 Dec 2005
Posts: 815

Posted: Thu Feb 24, 2022 7:09 am

To chime in on the "best practices": you may have heard of the 3-2-1 rule: make 3 backups, on 2 different media types, and store 1 of them in a different location.

For me personally:
I make (automated) daily backups of unique and changing data: things like my homedir, mail, etc. This backup is encrypted with gpg and stored on a VPS. The decryption key is stored on an (encrypted) USB stick and on paper.
I make quarterly full backups of all systems, usually right after I have updated them. This data doesn't change much, so I don't lose much if disaster strikes just before I make my next backup. These backups are stored on a USB-attached HDD which is only attached when I make the backups and is encrypted with LUKS.
_________________
The gentoo way of bringing peace to the world:
USE="-war" emerge --newuse @world

My shared code repository: https://code.pa4wdh.nl.eu.org
Music, Free as in Freedom: https://www.jamendo.com

psycho
Guru

Joined: 22 Jun 2007
Posts: 534
Location: New Zealand

Posted: Mon Mar 07, 2022 9:59 pm
PostPosted: Mon Mar 07, 2022 9:59 pm    Post subject: Reply with quote

Re "Some users backup none of their operating system, saying they will just re-install", yes, this is an interesting discussion point, and a question of how much additional time this will take. The time to "just re-install" is always going to be longer than the time to simply restore the fully installed and configured system from a backup...but even with a very quickly installed binary distro, how much longer it takes depends upon how much customisation is done, and whether the customisation itself has been backed up somehow.

I used to invest many hours in customising my installations (as root, i.e. to parts of the system that would not be backed up with the normal user customisations of GUI setups and so on), all of which I did manually without taking the time to document it in the form of shell scripts...so this made "just re-install" an unreasonable option, as to re-install from scratch would have cost me many, many hours: backing up user data alone was certainly not viable. Eventually though, I made the effort to track all of these manual tweaks in shell scripts, so that I can transform a fresh out-of-the-box installation into a fully configured and customised system very quickly and easily...theoretically with a single instruction, although for ease of managing the process I prefer to do it in stages with a few different scripts. So, the clean-install-plus-data-restore approach is viable now and takes maybe half an hour longer (much of which is the actual installation) on e.g. systems using apt with binary repositories.

Gentoo is different though: "just install" is a comical notion if you're in the middle of something important and need to "just install" Gentoo from scratch to the point where you can get back to restoring and using your data! Unless you've got a binary package server to accelerate things, reinstalling Gentoo is a very time-consuming process (even assuming you've meticulously backed up or scripted all your edits to /etc/* and your kernel .config and so on, so that none of that needs to be re-done)...I would never consider a Gentoo system with only its user data backed up to be properly backed up. Some kind of full system backup or snapshot is essential, unless you have lots of time on your hands and are only concerned about being able *eventually* to restore things, and not concerned about the time involved in doing so.

pietinger
Moderator

Joined: 17 Oct 2006
Posts: 4402
Location: Bavaria

Posted: Mon Mar 07, 2022 10:49 pm

psycho wrote:
Re "Some users backup none of their operating system, saying they will just re-install", yes, this is an interesting discussion point, and a question of how much additional time this will take.

Yes, it would take a very long time to install a new system, even if /etc is backed up (which I do). But think about how often you have had to restore a complete system because of a hard disk failure (I have had none in the last 20 years; OK, I buy a new computer every 5 or 6 years) versus how often you have to restore a DATA file because it was accidentally deleted (I do this often). Yes, I do backups only of /home, /etc and my kernel configuration. Also think of a nice side effect of a fresh installation: all the dead files left over from testing something are gone too ... ;-)

figueroa
Advocate

Joined: 14 Aug 2005
Posts: 2971
Location: Edge of marsh USA

Posted: Tue Mar 08, 2022 3:40 am

On the few occasions I've had a primary hard drive suddenly die on a Gentoo system, I've been very happy to be able to restore from a full stage4 tarball or a stage4-equivalent backup set that was no more than a week old. I make these automatically on the following crontab schedule.
Code:
01      6       7   *       *   /home/USERNAME/bin/stage4.scr 2>&1
01      6       17   *       *   /home/USERNAME/bin/stage4b.scr 2>&1
01      6       27   *       *   /home/USERNAME/bin/stage4c.scr 2>&1

On the same primary desktop, I also make the stage4-equivalent backup on this schedule:
Code:
01      6       2,16   *       *   /home/USERNAME/bin/gentoo2bak.scr 2>&1
01      6       9,23   *       *   /home/USERNAME/bin/gentoo2bak2.scr 2>&1
01      6       28   2       *   /home/USERNAME/bin/gentoo2bak3.scr 2>&1
01      6       30   1,3-12   *   /home/USERNAME/bin/gentoo2bak3.scr 2>&1

Each of these zstd-compressed tarballs (or sets of tarballs) is 3.8 GB, takes about 19 minutes to create, and contains no user files. Yes, they are actually redundant, since my backup plan and design is always a work in progress.
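
The scripts themselves aren't reproduced here, but the general shape of a stage4-style OS backup is roughly this (excludes and output path are illustrative, not my actual script):

Code:
#!/bin/sh
# Whole operating system, no user data, zstd-compressed.
tar --xattrs --acls -cpf - \
    --exclude=/home --exclude=/proc --exclude=/sys --exclude=/dev \
    --exclude=/run --exclude=/tmp --exclude=/mnt --exclude=/media \
    / | zstd -T0 -o /mnt/backup3/stage4-$(date +%Y%m%d).tar.zst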

These backups are tested from time to time to ensure they restore cleanly and actually work. The actual stage4s are on a roughly 10-day schedule, whereas the "gentoo2bak" scripts alternate weekly, with the third version running once a month. This way I can always go back a decent amount of time, so I'm not dependent on just the most recent backup. Some of these backups (more or less weekly) are also encrypted on the fly to external media for off-site storage, also on a 3x rotating basis.

I've shared my scripts several times, and recently, in these forums. The main point about this post is that the backups are:
1. automated
2. tested
3. at least 3x redundant

I'm as close as possible to being 100% confident in these backups. Personal files are similarly backed up, but every night. I am mindful that I don't have real-time protection. Should I be working on something I would fret about losing forever, I'll manually duplicate it to another drive or over NFS to another machine.
_________________
Andy Figueroa
hp pavilion hpe h8-1260t/2AB5; spinning rust x3
i7-2600 @ 3.40GHz; 16 gb; Radeon HD 7570
amd64/23.0/split-usr/desktop (stable), OpenRC, -systemd -pulseaudio -uefi

A.S. Pushkin
Guru

Joined: 09 Nov 2002
Posts: 418
Location: dx/dt, dy/dt, dz/dt, t

Posted: Tue Mar 15, 2022 5:17 pm    Post subject: System backups

One more thank you for all your suggestions.

It's more evidence of why Gentoo is such a great distro.

I have gotten such a great education using Linux.


Real democracy for computer users.
_________________
ASPushkin

"In a time of universal deceit - telling the truth is a revolutionary act." -- George Orwell