Gentoo Forums
Redundant local external storage that's not always on

lars_the_bear
Guru


Joined: 05 Jun 2024
Posts: 496

PostPosted: Fri Sep 20, 2024 9:20 am

pjp wrote:

One thing I've considered using one for is the "generally on and available" stuff so I don't have to keep a PC running 24/7. I'm sure it would do most of what I'd want very well, but then I have yet another environment I have to maintain, so it simply hasn't been compelling enough.


I don't really feel my Pi NAS has to be maintained. If it were exposed to a hostile network environment, that might be a different story. But, sitting in my house running rsyncd, I'm not bothered.

Also, these are commodity devices. If I cook one of them -- and I do this fairly frequently in my embedded applications -- I just throw it away and buy another on eBay. The entire operating system is on an SD card, so I just plug the card into a new board. And if I blow up the card, I have a backup I can flash on a new card in a minute or two.
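The card imaging itself is nothing clever, by the way -- something like this, where /dev/sdX is just a placeholder for whatever the card reader shows up as (check with lsblk before writing anything):

Code:
# image the NAS card with the Pi powered off and the card in a reader on another machine
dd if=/dev/sdX of=pi-nas.img bs=4M status=progress conv=fsync
# restoring to a fresh card is the same thing in reverse
dd if=pi-nas.img of=/dev/sdX bs=4M status=progress conv=fsync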

Of all the computer stuff I have in my house, the Pi-based stuff is the only part of it that is completely reliable.

But, as I've said, my applications are undemanding. I don't care whether my daily backup takes one minute or two.

I haven't been brave enough to install Gentoo on them yet, though. For my embedded applications, I built the operating system from scratch using scripts. I've re-implemented Portage, badly. I'd like to try Gentoo on the Pi, but it's not come to the top of my to-do list. I think this will be a job for my retirement. For the NAS I just use DietPi, with no X or desktop stuff.

There are specific "NAS" images for RPi, but I haven't tried any of them.

BR, Lars.
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9807
Location: almost Mile High in the USA

PostPosted: Fri Sep 20, 2024 12:44 pm

I back up my PVR and my NAS/Shell/VM/HTTPD/SMTPD/etc. boxes. Both boxes take about half an hour to a full hour to rsync -- the PVR has a fairly high churn rate, and the shell box VMs take a while to diff, so depending on how much has changed it can take more than an hour. In both cases I think the bottleneck is seek speed and read rate of the hard drive sets, but the PVR backup is very close to being CPU-limited. Even when it's not CPU-limited, it still gets really slow when diffing two directory trees with a lot of directories and small files.
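For what it's worth, the rsync side of it is nothing exotic -- roughly this shape, with the paths and hostname as placeholders:

Code:
# -H/-A/-X only matter if hardlinks/ACLs/xattrs need preserving
rsync -aHAX --delete --numeric-ids /srv/pvr/ backuphost:/backup/pvr/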

The PVR has a single 2TB disk; the shell box is only 4TB (3x2TB RAID5). My rsync backups go to two different machines with encrypted RAID5. I now back up the PVR to a 5-disk 500GB encrypted RAID5, since those are the hard drives I had spare from upgrades or acquired as e-waste, and the shell box gets backed up to more 2TB disks.
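For reference, "encrypted RAID5" can be as simple as md underneath with LUKS layered on top -- roughly like this, device names being examples:

Code:
mdadm --create /dev/md0 --level=5 --raid-devices=5 /dev/sd[b-f]1
cryptsetup luksFormat /dev/md0
cryptsetup open /dev/md0 backup_crypt
mkfs.ext4 /dev/mapper/backup_crypt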

Currently thinking about changing my backup strategy after acquiring more e-waste HDDs, instead of running so many 500G drives... though the RAID works just fine, and I now have plenty of 500G disks as cold spares for that array. At least the 500G array takes only an hour and a bit to rebuild; my 2T array takes 5 hours. Ouch.

I don't think I'll go to 6 disks even though that's the limit, mostly to leave one port open for a hot spare if needed on the 6-port motherboards. Need to find SATA PCIe cards...
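The hot spare part is trivial with mdadm anyway -- a disk added to a healthy array just sits there as a spare until something fails:

Code:
mdadm --add /dev/md0 /dev/sdg1    # on a healthy array the new disk becomes a spare
cat /proc/mdstat                  # spares show up with (S); rebuild progress appears here too
mdadm --detail /dev/md0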
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?
Bob P
Advocate


Joined: 20 Oct 2004
Posts: 3364
Location: USA

PostPosted: Sun Oct 20, 2024 5:37 pm    Post subject: Re: Redundant local external storage that's not always on

maiku wrote:
I need a local configuration where I can store data with the following considerations.

1) It's really just local.
2) I can easily use it on another machine (at least another Linux machine).
3) Has redundancy so it can suffer at least partial failure.
4) Is encrypted.
5) Can be powered on only when I need it (which will not be enough times to justify a NAS in my mind).
6) Does not require an infinite money budget.

In my primitive mind the only thing I know that serves this purpose is USB so I default to getting a dumb USB enclosure that supports 2 drives and buying two rust platter drives and putting them in a ZFS RAID, etc.

Is there a better way? Seems like the pitfalls of ZFS RAID over USB might make things interesting.


Long, long ago I fell victim to the allure of mounting a USB hard drive on a Pi while I convinced myself it was a viable backup solution. I thought it was great to have cheap/disposable commodity hardware running low-power backup on my LAN. Everything was great until the weak links started emerging... things like transient power losses killing the SD card, limited SD card lifespans killing the Pi, RPis that lacked real disk interfaces forcing me to use less-than-robust USB data connections that routinely failed in use, and other-flavor Pis that offered SATA connections but still resulted in a less-than-robust overall solution. Whenever something went wrong it was always a major asspain fixing it. The power and component savings were illusory and were never justified in light of the repair/downtime frustration they caused.

Between then and now, many other solutions were attempted on a temporary basis.

Years later the homelab had evolved to a dedicated multicore ZFS server on BSD with lots of ECC RAM, multiple vdevs comprised of redundant Z2 or Z3 arrays, long-duration UPS power and automated shutdowns in the event of a power outage. Today is years after that, and I don't need BSD anymore b/c I'm able to run ZoL. When a disk fails, I slide in a replacement and issue a command to resilver the drive and I'm done. No headaches.
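And the "one command" really is about this much -- pool and device names here are just examples:

Code:
zpool replace tank old-disk-id new-disk-id
zpool status tank    # watch the resilver progress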

There's just no way I'd go back to a Pi. If you're serious about your data, then ZFS is the answer. You can power it down intermittently if you want to, but doing that creates its own problems. Why reduce the robustness of your archive system by only allowing it to take intermittent snapshots of your data? Me? I burn the power.

(first post here in 20 years)
pietinger
Moderator


Joined: 17 Oct 2006
Posts: 4989
Location: Bavaria

PostPosted: Sun Oct 20, 2024 11:41 pm    Post subject: Re: Redundant local external storage that's not always on

Bob P wrote:
(first post here in 20 years)

... 18 years ... :P

Welcome back! :D
_________________
https://wiki.gentoo.org/wiki/User:Pietinger
lars_the_bear
Guru


Joined: 05 Jun 2024
Posts: 496

PostPosted: Mon Oct 21, 2024 7:40 am    Post subject: Re: Redundant local external storage that's not always on

Bob P wrote:
Everything was great until the weak links started emerging... things like transient power losses killing the SD card, limited SD card lifespans killing the Pi, RPis that lacked real disk interfaces forcing me to use less-than-robust USB data connections that routinely failed in use, and other-flavor Pis that offered SATA connections but still resulted in a less-than-robust overall solution. Whenever something went wrong it was always a major asspain fixing it.


FWIW I'd like to point out that in all my years of running Raspberry Pis as NAS systems, I've never, not once, experienced any of these problems. Maybe I've just been lucky -- I don't really have a representative sample of other people's experiences. Still, I live in a region where power failures are not uncommon. I had three last week. The Pis are fine. I wish I could say the same about everything else.

Bob P wrote:

The power and component savings were illusory and were never justified in light of the repair/downtime frustration they caused.


The energy savings are easy to calculate if you use a Kill-a-Watt or similar. In my case, the difference between an average power consumption of about 2W for the Pi and ~60W for the old PC is worthwhile. The cost savings in components are modest, and depend on how much stuff you already have in the junk pile. Since I've never had to carry out any repairs or experienced any downtime, I can't comment on how frustrating these things are ;)
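If anyone wants the arithmetic: a steady ~58W difference works out to roughly 508 kWh a year.

Code:
echo $(( 58 * 24 * 365 )) Wh per year    # 508080 Wh, i.e. about 508 kWh/year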

Whether any of the cost savings are significant depends on your priorities. For me, minimizing energy usage is a matter of principle, regardless of cost.

BR, Lars.
maiku
Guru


Joined: 24 Mar 2004
Posts: 593
Location: Escaping from NY

PostPosted: Mon Oct 21, 2024 1:33 pm    Post subject: Re: Redundant local external storage that's not always on

Bob P wrote:
Long, long ago I fell victim to the allure of mounting a USB hard drive on a Pi while I convinced myself it was a viable backup solution. I thought it was great to have cheap/disposable commodity hardware running low-power backup on my LAN. Everything was great until the weak links started emerging... things like transient power losses killing the SD card, limited SD card lifespans killing the Pi, RPis that lacked real disk interfaces forcing me to use less-than-robust USB data connections that routinely failed in use, and other-flavor Pis that offered SATA connections but still resulted in a less-than-robust overall solution. Whenever something went wrong it was always a major asspain fixing it. The power and component savings were illusory and were never justified in light of the repair/downtime frustration they caused.

Between then and now, many other solutions were attempted on a temporary basis.

Years later the homelab had evolved to a dedicated multicore ZFS server on BSD with lots of ECC RAM, multiple vdevs comprised of redundant Z2 or Z3 arrays, long-duration UPS power and automated shutdowns in the event of a power outage. Today is years after that, and I don't need BSD anymore b/c I'm able to run ZoL. When a disk fails, I slide in a replacement and issue a command to resilver the drive and I'm done. No headaches.

There's just no way I'd go back to a Pi. If you're serious about your data, then ZFS is the answer. You can power it down intermittently if you want to, but doing that creates its own problems. Why reduce the robustness of your archive system by only allowing it to take intermittent snapshots of your data? Me? I burn the power.

(first post here in 20 years)


Glad to see you back! Thanks for your experience.

Not to jump on the "how dare you flame RPi!" bandwagon, but I use RPis pretty consistently for a side gig I do. I have about 300 deployed in production. I must say, uptime issues and SD card corruption have not been a big problem over the 10 years I've been doing it. Sure, some have had to be replaced, but that has been the exception rather than the rule in my case.

That's why I had mentioned using an RPi in my post. I don't have any problem with them specifically. However, it just seems to me like such a complicated and heavy sledgehammer for this problem.

100% agree, ZFS is awesome.
_________________
Michael
Bob P
Advocate


Joined: 20 Oct 2004
Posts: 3364
Location: USA

PostPosted: Mon Oct 21, 2024 8:30 pm

Oh, I'm not a pi-basher by any stretch of the imagination, but based on the amount of time between my posts I think you can get the idea how old some of my Pi experiences were. ;)

I'm sure that the RPi ecosystem has improved a lot since the early adoption period. I jumped on the bandwagon early. Back then we had to learn the hard way (through failures) about SD card problems that would take down the system, and how to design the system to avoid the Pi's inherent weaknesses. Back then that problem wasn't even recognized or discussed on the Pi site; we were blazing the trail, trying to press the platform to do more than it was originally envisioned to do as a "teaching system for school children." Those of us who tried to deploy them as constant-on computing platforms were running into new problems that took some time to figure out.

I'm not saying that the Pi is a bad system, but way back when, things weren't very well documented, and eventually I got tired of beating my head against the wall solving problems. Today things are a lot better, as anyone who wants to use the system can benefit from the solutions provided by all of the people who toughed it out in the early period and documented the problems and workarounds. When I was active the system just wasn't as robust as it sounds like it is today, so I moved along.

For me the lack of a proper disk interface forced me into kludge adaptations that weren't sufficiently robust for my needs, so I looked for other solutions. I just couldn't justify trying to press the system to do something it was never designed to do... with what was probably an unreasonable expectation of fault tolerance on my part. I mean, things were really buggy back in the beginning. I spent a ridiculous amount of time trying to make the Wolfson Audio card work to build a Pi renderer. It turned out that I wasn't doing anything wrong; the problem was unsolvable because the card's drivers didn't work as promised.

Fast forward many years and the ecosystem has matured quite a bit, to the point that the weaknesses are well known, the workarounds are well documented, and new features exist that have improved the platform. But in the old days the SBC concept just wasn't all that I needed for the jobs I was doing, so I moved on. I tried getting back into the ecosystem when the later versions were released, but their unobtainium status at the time killed my joy. I haven't been back since.

I've never understood why they continue to design those things so that all of the IO ports stick out in four different directions instead of using an inline configuration. That design compromise has always baffled me.

I still have all of my cards, hats, relay boards, and all kinds of other stuff sitting in a box, and I keep promising myself that I'll do something with them someday. For me those things will be one-off projects rather than production deployments. My problem is that the amount of time it takes to work out a one-off solution has been cost-prohibitive for me; I just don't have enough time. I could see where it would be worth the time to develop a system that you could replicate and put into production. That would make the time investment worthwhile, but time has been my worst enemy in that regard.

If you don't mind me asking, what kind of deployments have you put into production on the Pi?
Bob P
Advocate


Joined: 20 Oct 2004
Posts: 3364
Location: USA

PostPosted: Mon Oct 21, 2024 9:20 pm

maiku wrote:
However, it just seems to me like such a complicated and heavy sledgehammer for this problem.


Well, to be fair, the criteria that you mentioned in your first post sounded like they were pretty demanding when I read them. They don't seem to me like an easy problem to solve with minimal hardware that's only on intermittently. I guess it would help if you could explicitly state your objectives when it comes to things like total TB of storage needed, expected system responsiveness, IO bandwidth, fault tolerance, etc. When you said "ZFS" that made me wonder how high your expectations really are.

If you want redundancy and fault tolerance then ZFS is a good solution, but deploying ZFS sets the bar pretty high, and it may not be necessary for your needs. If you want ZFS levels of redundancy and fault tolerance you won't get there with a Pi, or any sort of minimal hardware. ZFS requires the "sledgehammer" approach when it comes to hardware: it expects obscene amounts of ECC RAM and expects the OS to be on 24/7 so that it can do its thing (like automatic scrubs). If you really want/need ZFS levels of performance we can have that discussion, but I don't want to take the conversation down that road if it's not really what you want.
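To be concrete, "doing its thing" mostly means periodic scrubs, which people typically just schedule from cron or a timer -- "tank" below is a placeholder pool name:

Code:
zpool scrub tank     # typically run weekly or monthly from cron
zpool status tank    # shows scrub progress and the result of the last scrub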

If you want disk encryption on the fly then you're really best off with a recent CPU that has the built-in encryption instruction set, which is going to require some budget. IMO, on-the-fly encryption/decryption on an old CPU without that instruction set will be poor -- likely intolerable.
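Checking whether a CPU has the instructions is easy enough, and cryptsetup can tell you what the ciphers actually do on a given machine:

Code:
grep -m1 -o aes /proc/cpuinfo    # prints "aes" if an x86 CPU has AES-NI
cryptsetup benchmark             # rough cipher throughput on this machine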

I'm not a fan of USB3 disk interfaces for backups, but that has to do with the duty cycle you'll ask the disk to perform. I've just had too many experiences with USB chipsets that drop connections during long data transfers (rsync reporting broken pipes). If you want to fill a BIG drive over USB, the interface just isn't sufficiently robust IMO for sustained transfers that could take a whole day or more. I've also had USB-enclosed enterprise-level drives burn up in their fanned enclosures when trying to fill them with data; a 100% duty cycle for a solid week is just too stressful for a portable USB enclosure. I'll buy USB drives, shuck them, and put them on their native interface in a rack enclosure. Of course this is because I run the disks pretty hard -- it can take over a week to populate a Zpool over a gigabit LAN, and duty cycle is something that needs serious consideration. Drive temperature monitoring becomes important. If you're talking about a system with small storage needs that doesn't require drive bays in a dedicated rack enclosure, then duty cycle may not matter as much. It all depends on your needs.
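Temperature monitoring doesn't need anything fancy either; smartmontools is enough, with /dev/sdX as a placeholder:

Code:
smartctl -A /dev/sdX | grep -i temperature    # SMART attribute 194 on most spinning disks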

Tell us more about what your objectives are. It would help to know how many TB of storage you need, what levels of redundancy/fault tolerance you require, what kind of read/write speed you require (do you want to saturate a high speed LAN interface?), etc. I've done very simple setups that just sync a couple of disks for non-real-time redundancy, and I've done very complex high availability external storage setups running dozens of drives.
eccerr0r
Watchman


Joined: 01 Jul 2004
Posts: 9807
Location: almost Mile High in the USA

PostPosted: Mon Oct 21, 2024 10:36 pm

IMHO if one filesystem requires ECC RAM to get the same reliability as another without ECC RAM, the former is lacking.

If both filesystems benefit equally from ECC RAM, then neither really benefits.

If one filesystem corrupts less than another both without ECC RAM, that filesystem is better...

Sigh. I'm running ext3/ext4 without ECC RAM. I've been thinking about "downgrading" my Core2Quad server to an Atom that has ECC RAM -- but that Atom is something like half to a third the speed of the C2Q in GCC, and it's expected to be part of my distcc farm. The Atom is faster than the C2Q at AES, however...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?