Question about the safety of data

depontius · Advocate Joined: 05 May 2004 Posts: 3509

This is all interesting reading, especially the last stuff about ext4 defaults.

I've seen the most important word a few times in the thread, but not that often considering the topic - backups.

A few years back, my venerable and ancient 2x40GB RAID1 was finally getting overstuffed, so I bought a pair of 2TB drives for the new RAID1. Somehow I had gotten the impression that btrfs had actually managed to mature, at least enough for a home setup - so I used it. At the same time, I didn't quite trust either these gigantic new drives or btrfs, so I finally put in place a real backup plan. I have 2TB portable drives: at the time I had two, now I have three. Every night I ping-pong between the two, rsync'ing my RAID to the least-recently written portable. I also implemented offsite backup: one drive is sitting in my cabinet at work. (That's why I'm up to three drives, two to ping-pong and one at work.)
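For anyone wanting to copy the ping-pong idea, here is a minimal Python sketch of picking the least-recently written drive. It is not the actual script from this setup. The stamp file name `.last_backup` and both function names are made up for illustration, and it assumes each portable drive gets a stamp file touched after a successful run.

```python
from pathlib import Path


def least_recently_written(destinations, stamp_name=".last_backup"):
    """Return the destination whose stamp file is oldest.

    A destination with no stamp file has never been written, so it is
    treated as the oldest of all. The stamp file name is hypothetical.
    """
    def stamp_time(dest):
        try:
            return (Path(dest) / stamp_name).stat().st_mtime
        except FileNotFoundError:
            return 0.0  # never backed up to: sorts before everything else

    return min(destinations, key=stamp_time)


def mark_written(dest, stamp_name=".last_backup"):
    """Touch the stamp file after a backup to this destination succeeds."""
    Path(dest, stamp_name).touch()
```

Calling `mark_written()` only after rsync exits cleanly means a failed run does not push that drive to the back of the queue.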

Anyway, I've had a UPS before, but didn't have one during that timeframe, and had a power fault. My ext4 partitions came back OK, my XFS (MythTV) partition came back. My btrfs partition utterly died. Not just corrupted - GONE. Running btrfs-scan, or some command like that, couldn't even tell that btrfs had ever existed on the drives. What's worse, I didn't catch it immediately upon powering back up - I don't remember why - I think it was evening and I didn't want to power up the client machines since it was bedtime - I just got the servers back up. Anyway, that night's backup backed up nothing onto the portable drive at home. Wiped.

Saved by offsite backup.

After a bit of fussing I reformatted as ext4, restored from the drive at work, and have never looked back.

As for remedial actions... My backup script now makes sure that the source drive is properly mounted, not just the destination drive. It will never back up an empty mount point again. I'm up to three portable drives, so that except for the work-day Mondays and Fridays, when I'm actually transporting drives, there are always two drives plugged in and one at work. As mentioned, I'm back to ext4. Oh, and UPS.
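A rough Python sketch of that kind of source check, for illustration only; it is not the real script. The function names and the optional sentinel file are hypothetical. The rsync flags (`-a --delete`) are also an assumption: something like `--delete` would explain how syncing from an empty mount point wiped the destination.

```python
import os
import subprocess


def source_is_usable(source, sentinel=None):
    """True only if `source` is a real mount point with content in it.

    If `sentinel` is given (a hypothetical marker file kept on the real
    filesystem), require that file; otherwise require a non-empty directory.
    """
    if not os.path.ismount(source):
        return False  # bare mount point directory, filesystem not mounted
    if sentinel is not None:
        return os.path.exists(os.path.join(source, sentinel))
    with os.scandir(source) as entries:
        return any(True for _ in entries)


def run_backup(source, dest):
    """Refuse to run rsync unless both ends look properly mounted."""
    if not source_is_usable(source):
        raise RuntimeError(f"source {source!r} is not mounted or is empty")
    if not os.path.ismount(dest):
        raise RuntimeError(f"destination {dest!r} is not mounted")
    # Flags are an assumption, not taken from the original script.
    subprocess.run(
        ["rsync", "-a", "--delete", source.rstrip("/") + "/", dest],
        check=True,
    )
```

The sentinel variant is the stricter one: a freshly created, empty filesystem mounted at the right place would pass the mount check but would not contain the marker file.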

In retrospect, I feel some guilt about wiping the btrfs. Had I the extra space sitting around, I really should have somehow gotten that data to the btrfs developers.

EDIT - to add one other thing, I run the recommended weekly mdadm sync.
_________________
.sigs waste space and bandwidth

axl · Veteran Joined: 11 Oct 2002 Posts: 1144 Location: Romania

there's a strange mix of things.

dealloc. tco. systemd. i know the drama behind systemd. but never heard of this tco or dealloc thing. could anyone please elaborate ?

meanwhile, i kinda hate btrfs.

someone once told me that my posts are like blog posts. and this is exactly what I aimed when i started the thread. talking about my stuff. catching news. maybe some drama. so ... i just want to understand what people are talking about. seems things escalated but i dont understand the fight.

anyway. going back to btrfs. blog style. I mentioned yesterday i was on my way back to raid/xfs. first you got to get rid of btrfs.

so first step was to convert btrfs from raid1 back to single. which took 10+ hours.

I say +10 because i know these drives do a complete run of each other in 8 hours. i mentioned that before. but again. i caught what they were doing, while they were doing it.

so this is the convert from raid1 to single. side by side drive A and drive B.

https://www.youtube.com/watch?v=X_jpViD6wKE

also http://dale.ro/~axl/sda1.png and http://dale.ro/~axl/sdb1.png

then came the step to remove drive b from array. which is weird. btrfs is weird. talk about a device that is a mount point.

anyway.

the end result of removing drive b from mount point took another few hours.

https://www.youtube.com/watch?v=DqG7Quv8o0Q

http://dale.ro/~axl/sda2.png

http://dale.ro/~axl/sdb2.png

there needs to be like a legend. what am i looking at. again.

we are talking about 2 WD reds of 4TB that were put in a raid1 btrfs configuration. first animation shows the btrfs being reconfigured as a single. second shows drive B being pulled out of the btrfs. i took the long way around. i mentioned that yesterday.

just to be safe.

tomorrow. btrfs 2 xfs. moving data around. and maybe, just maybe, how raid syncs data. meanwhile... what's all the drama about?

Hu · Administrator Joined: 06 Mar 2007 Posts: 21709

Goverp · Advocate Joined: 07 Mar 2007 Posts: 2014

Tso's delalloc article is interesting. IIUC, one way of approaching the problem is to consider a (decent) filesystem as composed of two parts: (a) a journal of all update transactions; and (b) a cache of the results of applying all the journalled transactions (the actual files). The problem arises because the cache (file system) has been updated ahead of a syncpoint in the journal, which of course is madness. Many mad things are done in the name of performance.

Which raises a few questions: a) would configuring data journalling in ext4 mitigate the described problem; b) can logging file systems such as f2fs and logfs handle such situations better; and c) is there an excessive performance hit ;-)
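Whatever the answers, an application can defend itself against this sort of ordering problem by forcing its data to disk before renaming a new file into place. A minimal Python sketch of that write, fsync, rename pattern follows. The function name and the `.tmp` suffix are made up for illustration, and it assumes a POSIX system where a directory can be opened and fsync'ed.

```python
import os


def durable_replace(path, data):
    """Replace `path` with `data` (bytes) so that a crash leaves either
    the old contents or the new contents, not an empty file."""
    tmp = path + ".tmp"  # hypothetical temp name, same directory as target
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to push the data to disk
    os.replace(tmp, path)     # atomic rename over the old file
    # Also sync the directory so the rename itself is durable.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The cost is one or two extra fsync calls per file, which is the same performance trade-off the questions above are about.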

_________________
Greybeard