glitch13 Apprentice
Joined: 02 Oct 2002 Posts: 213 Location: New Orleans, LA
Posted: Thu Jun 30, 2005 3:02 pm Post subject: Backing up with rsync [SOLVED]
My company has a web-based application that our employees use to track insurance claims, and the system can "attach" files to the claims. These files are all stored in this fashion:
Code: | /documents/<claim number>/<attachedfile.ext> |
Currently there are roughly 25 gigs of PDFs and office documents in /documents/, and we have started feeling the need for an off-site backup. Since we only get around 5-20 megs of new data a day, I naturally thought of running rsync every night (or maybe every other night) to keep our backup server in sync with the production one.
Well, I set up an rsync server on the production machine, and on the backup machine I executed this command:
Code: | rsync -rvP rsync://ourdomain.com/attachments /documents |
25 gigs later, I have the /documents directory duplicated on our backup machine (it finished sometime last night). I've checked the two directories (production and backup), and there's currently only a 1 meg difference in total size between the two, reflecting the files added since it finished last night.
My problem is, I did a dry run with the same command again on the backup server to see what new files it would fetch:
Code: | rsync -rvPn rsync://ourdomain.com/attachments /documents |
and it's wanting to transfer ALL the files AGAIN! What did I do wrong? Isn't rsync only supposed to transfer newer/changed files?
PS: here's the /etc/rsync/rsyncd.conf on the production machine:
Code: | uid = root
gid = root
use chroot = no
max connections = 2
log file = /var/log/rsync.log
pid file = /var/run/rsyncd.pid
motd file = /etc/rsync/motd.txt
hosts allow = <backup server ip>
hosts deny = *
[attachments]
path = /documents
comment = Claim File Attachments
read only = yes
|
Last edited by glitch13 on Fri Jul 01, 2005 1:41 am; edited 1 time in total |
neilhwatson l33t
Joined: 06 Feb 2003 Posts: 719 Location: Canada
Posted: Thu Jun 30, 2005 3:26 pm
Check the time stamps on the source and destination files. _________________ The true guru is a teacher.
Neil Watson |
limn l33t
Joined: 13 May 2005 Posts: 997
Posted: Thu Jun 30, 2005 3:40 pm
Your rsync command tells rsync just to recursively copy the files. You will need to include at least one option telling it to do a comparison between the files. For a backup '-a' is often used.
BTW, what you are doing does not require a rsync server. You might want to consider initiating the transfer from the server to the backup. |
neilhwatson l33t
Joined: 06 Feb 2003 Posts: 719 Location: Canada
Posted: Thu Jun 30, 2005 3:49 pm
Typically I use a command like this to perform backups from the backup server:
Code: |
rsync -utrxlogp --stats -e ssh --delete --exclude-from /usr/local/sbin/rsync-exclude client:/home/ /home >> /var/log/rsync.log
|
_________________ The true guru is a teacher.
Neil Watson |
Reverend n00b
Joined: 19 Mar 2003 Posts: 22
Posted: Thu Jun 30, 2005 3:55 pm
I have always used rsync with the -av options. According to "man rsync":
Code: | -a, --archive archive mode, equivalent to -rlptgoD |
By specifying only -rvP, you leave out the -l, -p, -t, -g, -o, and -D options:
Code: | -l, --links copy symlinks as symlinks
-p, --perms preserve permissions
-t, --times preserve times
-g, --group preserve group
-o, --owner preserve owner (root only)
-D, --devices preserve devices (root only)
|
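Thus, every time it runs, rsync sees the files as out of sync. The missing -t matters most here: by default rsync compares size and modification time, and without -t the copies get the time of the transfer as their modification time.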
I'd recommend, if possible, bring the backup disk to your site -- maybe even put it in the same computer as the source. That way you can do the backup very quickly and can test it out. Once the data has been replicated, you can then move it off-site and do your incrementals. |
glitch13 Apprentice
Joined: 02 Oct 2002 Posts: 213 Location: New Orleans, LA
Posted: Thu Jun 30, 2005 4:27 pm
Reverend wrote: | I'd recommend, if possible, bring the backup disk to your site -- maybe even put it in the same computer as the source. That way you can do the backup very quickly and can test it out. Once the data has been replicated, you can then move it off-site and do your incrementals. |
I'm actually doing this on two local machines for now; I just left that out of the post for brevity and clarity. Thanks for the tip anyway.
I tried:
Code: | rsync -avPn rsync://server/attachments /documents |
And it's still trying to pull it all down. Is there anything I can do to massage the current archive I have on the backup machine to allow rsync to do its magic (so I wouldn't have to re-pull the 25 gigs)? |
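One possible way to repair an existing copy without re-pulling the data, sketched with throwaway local directories rather than the real rsync://server/attachments module: a pass with -a plus --size-only treats any file whose size already matches as up to date, so rsync only fixes its time stamp and other attributes. The directory names and the report.pdf file below are made up for the demonstration. --size-only would miss a file that changed without changing size, so this is meant as a one-time repair under that assumption, not as a routine backup command.

```shell
#!/bin/sh
# Hypothetical scratch directories standing in for production (src) and backup (dst).
work=$(mktemp -d)
mkdir -p "$work/src/100230" "$work/dst"
printf 'claim attachment\n' > "$work/src/100230/report.pdf"
# Give the source file an old modification time (Nov 11 2004, 12:00).
touch -t 200411111200 "$work/src/100230/report.pdf"

if command -v rsync >/dev/null 2>&1; then
    # First copy without -t: the destination file gets the transfer time as its mtime.
    rsync -r "$work/src/" "$work/dst/"

    # A dry run with -a now lists the file again because the time stamps differ.
    before=$(rsync -an --itemize-changes "$work/src/" "$work/dst/" | grep -c 'report.pdf')

    # Repair pass: sizes match, so only attributes such as mtime are brought into line.
    rsync -a --size-only "$work/src/" "$work/dst/"

    # The same dry run should no longer list the file.
    after=$(rsync -an --itemize-changes "$work/src/" "$work/dst/" | grep -c 'report.pdf')

    echo "files pending before repair: $before, after repair: $after"
else
    echo "rsync is not installed; nothing to demonstrate"
fi

rm -rf "$work"
```

A slower alternative is --checksum, which compares file contents instead of trusting sizes.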
limn l33t
Joined: 13 May 2005 Posts: 997
Posted: Thu Jun 30, 2005 5:01 pm
My previous post was incorrect.
rsync -r does compare the source and target files, but it does not preserve time stamps. If the size and time stamp of the target file match those of the source file, rsync skips it. If they do not match, rsync updates the file, and without -t the target ends up with the time of the transfer as its time stamp.
When you ran rsync -r, the target files were written with the time of the transfer as their time stamp. Now that you are running rsync -a, it is bringing the time stamps, and everything else covered by -a, into line between the source and target.
It is not re-copying the files entire. Try it out on a subset, and look at the actual bytes sent compared to the total size of the files. Because it is changing the files, it lists them in the verbose output. |
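The behaviour described above can be mimicked without rsync at all. The sketch below is a simplified model of the default size-and-time-stamp comparison, not rsync's actual code: cp stands in for a copy made without -t, cp -p for one made with -t, and quick_check_skips is a made-up helper name. It relies on the -nt test operator, which bash, dash and ksh provide.

```shell
#!/bin/sh
# Simplified model of the default comparison: a file is skipped only when
# the target exists and both size and modification time match the source.
quick_check_skips() {
    src=$1
    dst=$2
    [ -e "$dst" ] || return 1
    [ "$(wc -c < "$src")" -eq "$(wc -c < "$dst")" ] || return 1
    # If neither file is newer than the other, the mtimes are equal.
    [ "$src" -nt "$dst" ] && return 1
    [ "$dst" -nt "$src" ] && return 1
    return 0
}

work=$(mktemp -d)
printf 'claim attachment\n' > "$work/source.pdf"
touch -t 200411111200 "$work/source.pdf"          # old mtime, like the 2004 file in the thread

cp "$work/source.pdf" "$work/copy_plain.pdf"      # like -r alone: mtime becomes the copy time
cp -p "$work/source.pdf" "$work/copy_times.pdf"   # like -rt or -a: mtime is preserved

if quick_check_skips "$work/source.pdf" "$work/copy_plain.pdf"; then
    plain=skip
else
    plain=transfer
fi
if quick_check_skips "$work/source.pdf" "$work/copy_times.pdf"; then
    times=skip
else
    times=transfer
fi

echo "copy without preserved times: $plain"
echo "copy with preserved times:    $times"
rm -rf "$work"
```

The copy made without preserved times is flagged on every later run, which matches the dry-run output reported earlier in the thread.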
glitch13 Apprentice
Joined: 02 Oct 2002 Posts: 213 Location: New Orleans, LA
Posted: Thu Jun 30, 2005 6:12 pm
limn wrote: | It is not re-copying the files entire. Try it out on a subset, and look at the actual bytes sent compared to the total size of the files. Because it is changing the files, it lists them in the verbose output. |
It looks like it's doing a straight recopy. Here's one of the random files I checked from rsync's output the second time I ran it. Here's the source file:
Code: | -rw------- 1 apache apache 734853 Nov 11 2004 1711adj-est and photos 11-11-04.pdf |
As you can see, it was added Nov 11, 2004, so it's not a new one. Here's its output from rsync:
Code: | 100230/1711adj-est and photos 11-11-04.pdf
734853 100% 43.80MB/s 0:00:00
|
So it looks like they're getting recopied in bulk. I'm trying it again with these options:
Code: | rsync -v --archive --delete --delete-after --stats --progress rsync://server/attachments /documents |
Then I'll try it again to see if it works... |
limn l33t
Joined: 13 May 2005 Posts: 997
Posted: Thu Jun 30, 2005 6:54 pm
Look at the line at the end of your rsync, like this one:
Quote: | sent 714 bytes received 418 bytes 452.80 bytes/sec |
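To put a number on it, the summary line can be split into its byte counts and compared against the size of the files listed. A small sketch using the sample line quoted above; the variable names are arbitrary, and the tr step is only there because some rsync versions print thousands separators:

```shell
#!/bin/sh
# Sample summary line taken from the quote above.
summary='sent 714 bytes received 418 bytes 452.80 bytes/sec'

# Drop any thousands separators, then pick out the two byte counts.
counts=$(printf '%s\n' "$summary" | tr -d ',' |
    awk '$1 == "sent" && $4 == "received" { print $2, $5 }')

sent=${counts% *}        # text before the space
received=${counts#* }    # text after the space
total=$((sent + received))

echo "bytes over the wire: $total"
```

Here that comes to 1132 bytes. When the total is far smaller than the combined size of the files listed in the verbose output, the file contents themselves were not resent.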
glitch13 Apprentice
Joined: 02 Oct 2002 Posts: 213 Location: New Orleans, LA
Posted: Fri Jul 01, 2005 1:41 am
The command I tried did the trick, thanks all.
Code: | rsync -v --archive --delete --delete-after --stats --progress rsync://server/attachments /documents |