Gentoo Forums
LUKS on RAID weird behavior... any experiences?
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Wed Oct 19, 2022 2:51 pm    Post subject: LUKS on RAID weird behavior... any experiences?

Due to layering I would have expected my ext4fs over LUKS cryptsetup over MDRAID5 to "just work"... but something strange is happening.

One of my disks is coughing up tons of SATA errors including bad sectors. However MDRAID has not yet kicked it from the array...but somehow I got massive corruption on the filesystem?? Theoretically if a sector can't be read it errors out and tries to recalculate it from the remaining disks...

but somehow, sometimes I get a blank (zeroed) sector. Which then LUKS decrypts... and returns garbage to ext4fs instead of a blank sector?

Is this possible? This seems farfetched due to the theoretical "don't return garbage without saying so" and ideally each of the layers should know when garbage is being handled and taken with a grain of salt... but I'm not sure how this corruption is happening...

Perhaps it's not a good idea to run LUKS over MDRAID yet? I'm sure a lot of people are doing this but anyone had disks fail in this setup yet?
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

araxon
Tux's lil' helper

Joined: 25 May 2011
Posts: 85

Posted: Fri Oct 21, 2022 11:03 am

That sounds like a faulty drive or cable. I have been using ext4 on top of LUKS on top of mdadm RAID1 on multiple machines for 10 years now and have never had a single problem.

If the drive returns zeros and tells the system that this is the requested data, then it makes sense that LUKS "decrypts" the zeros to some garbage and then lets ext4 try to make sense of it. I see no problem with what the higher layers do; the problem is the drive.
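To make the point above concrete, here is a minimal sketch of why a zeroed sector does not come back as zeros after decryption. It does not use the real LUKS cipher (typically aes-xts-plain64); the SHA-256 based keystream, the key, and the sector number are hypothetical stand-ins chosen only so the example runs with the standard library. The behavior it illustrates is the same, though: without sector authentication, the encryption layer has no way to tell that the ciphertext it was handed is wrong.

```python
import hashlib

SECTOR = 512  # bytes per sector in this toy model


def keystream(key: bytes, sector_no: int, length: int) -> bytes:
    """Toy per-sector keystream built from SHA-256 in counter mode (NOT the LUKS cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(
            key + sector_no.to_bytes(8, "little") + counter.to_bytes(4, "little")
        ).digest())
        counter += 1
    return bytes(out[:length])


def xor_sector(key: bytes, sector_no: int, data: bytes) -> bytes:
    """Encrypt or decrypt one sector (XOR with the keystream is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, sector_no, len(data))))


key = b"hypothetical volume key"
plaintext = b"ext4 metadata".ljust(SECTOR, b"\0")

# Normal path: what was written is what is read back.
ciphertext = xor_sector(key, 7, plaintext)
roundtrip = xor_sector(key, 7, ciphertext)

# Failure path: the drive hands back a zeroed sector and claims success.
zeroed = bytes(SECTOR)
garbage = xor_sector(key, 7, zeroed)

print("round trip ok:", roundtrip == plaintext)
print("decrypted zeros start with:", garbage[:16].hex())
```

The filesystem above receives `garbage`, not zeros and not an I/O error, which matches the kind of corruption described in the first post.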

What drive is it? What do the errors look like in the log?

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Fri Oct 21, 2022 2:26 pm

"Never had a single problem" meaning you never had a hard drive fail either and thus never had to go through the motions of swapping a disk?

BTW this is a 4-disk MDRAID5 (in which XORs of faulty data would also generate some weirdness), and I ended up with a lot of corruption in the filesystem. When emerge crashed and the filesystem was flagged for fsck, I got hundreds of thousands of errors that I needed to fix during the subsequent fsck.

Since this is a backup server, no big loss but I may need to re-image this machine...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Fri Oct 21, 2022 5:26 pm

eccerr0r,

I have a drive that reads correctly but writes rubbish.
I found that out by accident one day. It was flagged for a fsck but I imaged it first, as fsck is well known for making a bad situation worse.
Then I compared the image with the original - so I had two identical reads from the drive.

Now fsck was allowed to do its stuff. It was happy but the partition still wouldn't mount.
Fsck changed more on the next run.

Giving the drive up for dead (I still had my image), I wrote a few blocks of random data, repeating the same data at several locations on the drive.
Reads were consistent but I didn't get the data that I wrote.
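The write-then-read-back test described above could be sketched roughly as follows. This is an illustration only, not the procedure actually used: the block size, offsets, and the helper name `write_read_check` are assumptions, and the demo runs against a temporary file. Pointing it at a real block device would destroy data there, and reads may be served from the page cache rather than the platters unless caches are dropped or direct I/O is used.

```python
import os
import tempfile

BLOCK = 4096  # assumed test block size


def write_read_check(path: str, offsets: list[int], pattern: bytes) -> list[int]:
    """Write `pattern` at every offset, flush, read back, return the offsets that differ."""
    bad = []
    fd = os.open(path, os.O_RDWR)
    try:
        for off in offsets:
            os.pwrite(fd, pattern, off)
        os.fsync(fd)  # push the writes out of the OS buffers
        for off in offsets:
            if os.pread(fd, len(pattern), off) != pattern:
                bad.append(off)
    finally:
        os.close(fd)
    return bad


# Demo on a scratch file so nothing real is overwritten.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.truncate(64 * BLOCK)
    demo_path = tmp.name

pattern = os.urandom(BLOCK)  # the same random block is written at every location
mismatches = write_read_check(demo_path, [0, 10 * BLOCK, 50 * BLOCK], pattern)
os.unlink(demo_path)

print("offsets that did not read back as written:", mismatches)
```

On a drive that "reads correctly but writes rubbish", repeated reads would agree with each other while the returned list would be non-empty.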

Have you got one of those?
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Fri Oct 21, 2022 6:20 pm

Interesting, I might have to consider that possibility. Usually it's a bad cable, but the link tends to be CRC/parity checked and will reject writes that fail the CRC/parity check. But if the on-drive buffer RAM is no good... that may very well be the issue...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Zucca
Moderator

Joined: 14 Jun 2007
Posts: 3896
Location: Rasi, Finland

Posted: Sat Oct 22, 2022 5:59 am

SMR drive?

I've heard that one should not use SMR drives on some RAID arrays.
I don't know which combination of SMR hard drive model, hardware or software raid and raid level is bad (since I mostly use SSDs and lower capacity HDDs) but I've heard it can cause problems.
_________________
..: Zucca :..

My gentoo installs:
init=/sbin/openrc-init
-systemd -logind -elogind seatd

Quote:
I am NaN! I am a man!

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Sat Oct 22, 2022 6:54 pm

Unless there were 500G 3.5" SMR disks, not that I know of...

Anyway, there were some pending sectors, but I just did a repair on the array and now...
Code:
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0

Hmm... go figure...
Anyway, the only suspect that makes sense at this point is that the drive generates random data. That would be the only thing that would pass garbage data up the stack, I would think, unless there's a bug in the code that passes bad "poisoned" data, possibly due to a race condition. That would be a software bug, and no hardware change, such as swapping to different (e.g. non-SMR) disks, could fix it.

I've been using RAIDs without LUKS encryption for years, including many drive replacements, and this is the worst corruption I've seen so far. While there's still a possibility of other hardware issues (RAM checks good, CPU is *assumed* good - which is a problem), the problem seems to stick to one drive. Yes, I could be fighting two different issues at this point, but it's still too suspicious: they seem closely related, as I got corruption soon after a disk reported bad sectors.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

araxon
Tux's lil' helper

Joined: 25 May 2011
Posts: 85

Posted: Mon Oct 24, 2022 7:03 pm

eccerr0r wrote:
"Never had a single problem" meaning you never had a hard drive fail either and thus never had to go through the motions of swapping a disk?

No, that's not what I meant. I have been using Gentoo since about 2005, nearly exclusively, on dozens of servers over time, mostly with MD RAID 1 or 5. I have had to replace many failed drives and mdadm always saved the day. And yes, even in combination with LUKS, I had a few failed drives. I guess I was just lucky, and my drives just died instead of spewing random data.

Zucca
Moderator

Joined: 14 Jun 2007
Posts: 3896
Location: Rasi, Finland

Posted: Tue Oct 25, 2022 7:57 pm

eccerr0r wrote:
Anyway, the only suspect that makes sense at this point is that the drive generates random data. That would be the only thing that would pass garbage data up the stack, I would think
Does mdraid5 actually use parity data when reading? I mean if the drive does not report it's faulty, then what?
I can't find the source at the moment, but mdraid "isn't good" at checking the integrity of the data it serves. If a drive has failed it will act for sure, but what if a drive tells "I'm ok"?

EDIT: https://www.youtube.com/watch?v=l55GfAwa8RI&t=340s

And some discussions on stack exchange: https://unix.stackexchange.com/questions/105337/bit-rot-detection-and-correction-with-mdadm
(Bit rot is the worst. I like btrfs and its ability to do almost every action online, but every action on btrfs is slow. That's why I choose mdraid+lvm+xfs_or_ext4 most of the time.)

Anyway, I have a suspicion that your drive thinks it's ok.
_________________
..: Zucca :..

My gentoo installs:
init=/sbin/openrc-init
-systemd -logind -elogind seatd

Quote:
I am NaN! I am a man!


Last edited by Zucca on Tue Oct 25, 2022 8:12 pm; edited 1 time in total

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Tue Oct 25, 2022 8:10 pm

Zucca,

Quote:
If a drive has failed it will act for sure, but what if a drive tells "I'm ok"?

If a drive fails, it gets kicked out of the array and you no longer have any parity data to check.

It's actually quite rare that it's that black and white.
When an unreadable block is encountered, mdadm uses the other N drives to get at the data.
I'm not sure if it tries to fix it at that time or not. The drive with the failed read is not always kicked out of the array. In part, that's determined by how long the error handler takes.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

szatox
Advocate

Joined: 27 Aug 2013
Posts: 3489

Posted: Tue Oct 25, 2022 9:51 pm

Zucca wrote:
Does mdraid5 actually use parity data when reading? I mean if the drive does not report it's faulty, then what?
I can't find the source at the moment, but mdraid "isn't good" at checking the integrity of the data it serves. If a drive has failed it will act for sure, but what if a drive tells "I'm ok"?

I have tested whether mdraid6 checks parity on read, and it doesn't. If the data chunks are available, it assumes they hold correct data, even though double parity does allow the disks to outvote the corrupt strip. There might be some option to enable it, but it's definitely disabled by default (afair for performance reasons).
I haven't tested raid5, but I don't think it would be enabled by default either... In case of data corruption it could only report the fact anyway, there is not enough redundancy to recover without out-of-band hint pointing at the failed block.
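A small sketch of the RAID5 limitation described above: with a single XOR parity chunk, a stripe check can reveal that something is inconsistent, but every chunk is an equally valid "repair" candidate. The chunk contents and helper names are made up for illustration; this models the arithmetic only, not how the md driver is implemented.

```python
from functools import reduce


def xor_chunks(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def stripe_consistent(data_chunks: list[bytes], parity_chunk: bytes) -> bool:
    """True if the parity matches the XOR of the data chunks."""
    return xor_chunks(data_chunks) == parity_chunk


# One stripe of a 4-disk RAID5: three data chunks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_chunks(data)

# Disk 1 silently returns a corrupted chunk without reporting any error.
corrupted = [data[0], b"BXBB", data[2]]
detected = not stripe_consistent(corrupted, parity)

# Try "repairing" each data chunk in turn from the others plus parity.
candidates = []
for i in range(len(corrupted)):
    others = [c for j, c in enumerate(corrupted) if j != i]
    candidates.append(xor_chunks(others + [parity]))

# Every candidate makes the stripe consistent, so parity alone cannot pick one.
all_consistent = all(
    stripe_consistent(corrupted[:i] + [cand] + corrupted[i + 1:], parity)
    for i, cand in enumerate(candidates)
)

print("mismatch detected:", detected)
print("every single-chunk 'repair' yields a consistent stripe:", all_consistent)
print("only the repair of chunk 1 restores the original:", candidates[1] == data[1])
```

Only an out-of-band hint, such as the drive reporting a read error for that chunk, tells the array which candidate is the right one.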

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Tue Oct 25, 2022 10:48 pm

Indeed, for performance purposes, MDRAID5 only reads primary data blocks. The parity blocks are read only during degraded mode, recovery, or integrity checks.
Which makes it even weirder: if my RAID couldn't read a sector, it should have computed the block from the other disks... so one of the other disks is coughing up random data...
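The reasoning above can be sketched as follows, again as a model of the XOR arithmetic rather than of the actual md code. The stripe contents and the `degraded_read` helper are hypothetical. It shows both cases: a reported read error with honest neighbors rebuilds correctly, while a reported read error combined with one silently wrong neighbor rebuilds garbage.

```python
from functools import reduce


def xor_chunks(chunks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def degraded_read(stripe: list[bytes], failed: int) -> bytes:
    """Rebuild the chunk at index `failed` from all other chunks (data and parity)."""
    return xor_chunks([c for i, c in enumerate(stripe) if i != failed])


good = [b"disk0---", b"disk1---", b"disk2---"]
stripe = good + [xor_chunks(good)]  # 4-disk RAID5 stripe, parity stored last

# Case 1: disk 1 reports a read error, every other disk returns what was written.
rebuilt_ok = degraded_read(stripe, failed=1)

# Case 2: disk 1 reports a read error AND disk 2 silently returns zeros.
lying = list(stripe)
lying[2] = bytes(len(good[2]))
rebuilt_bad = degraded_read(lying, failed=1)

print("honest neighbors rebuild the chunk:", rebuilt_ok == good[1])
print("one lying neighbor poisons the rebuild:", rebuilt_bad != good[1])
```

So reconstruction is only as trustworthy as the chunks it is computed from, which is consistent with the suspicion that more than one source of bad data may be involved.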

??!

Or is it just that one disk coughing up random data and a series of blocks have issues...

hmm...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Zucca
Moderator

Joined: 14 Jun 2007
Posts: 3896
Location: Rasi, Finland

Posted: Fri Oct 28, 2022 4:51 pm

Interestingly (at least back in 2016) linux md will assume data blocks to be correct, even on raid6, and then rewrite (recalculate) parity blocks during scrub.
See this answer and the comments below it.
Md raid cache will help some at least.
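For anyone wanting to look at scrub results themselves, here is a rough sketch of driving a scrub through the md sysfs interface (`sync_action` and `mismatch_cnt` under `/sys/block/<array>/md/`). The array name `md0` and the helper names are assumptions, and the demo runs against a fake directory tree so it works without root or a real array. Writing `check` only counts mismatches, while `repair` rewrites them, which on parity RAID means recalculating parity from the data blocks as described above.

```python
import tempfile
from pathlib import Path


def md_dir(array: str, sysfs_root: str = "/sys/block") -> Path:
    """Directory holding the md attributes for one array, e.g. /sys/block/md0/md."""
    return Path(sysfs_root) / array / "md"


def start_scrub(array: str, action: str = "check", sysfs_root: str = "/sys/block") -> None:
    """Request a scrub; 'check' only counts mismatches, 'repair' also rewrites them."""
    if action not in ("check", "repair"):
        raise ValueError(f"unsupported action: {action}")
    (md_dir(array, sysfs_root) / "sync_action").write_text(action + "\n")


def mismatch_count(array: str, sysfs_root: str = "/sys/block") -> int:
    """Mismatch counter reported by the most recent check or repair."""
    return int((md_dir(array, sysfs_root) / "mismatch_cnt").read_text().strip())


# Demo against a fake sysfs tree (hypothetical array name and values).
with tempfile.TemporaryDirectory() as fake_root:
    fake_md = md_dir("md0", fake_root)
    fake_md.mkdir(parents=True)
    (fake_md / "sync_action").write_text("idle\n")
    (fake_md / "mismatch_cnt").write_text("128\n")

    start_scrub("md0", "check", sysfs_root=fake_root)
    requested = (fake_md / "sync_action").read_text().strip()
    found = mismatch_count("md0", sysfs_root=fake_root)

print("requested action:", requested)
print("mismatch count:", found)
```

On a real system the same functions would be called with the default `sysfs_root` as root, and the count read once the scrub has finished.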

I'd like to conduct some tests on vm... But until I have time...
_________________
..: Zucca :..

My gentoo installs:
init=/sbin/openrc-init
-systemd -logind -elogind seatd

Quote:
I am NaN! I am a man!

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Fri Oct 28, 2022 5:34 pm

Now I don't know about SSDs, but all HDDs (and floppy drives too!) since antiquity use at least CRC, if not ECC, to error check/correct the data written.
However if the hard drive electronics/microcontroller is broken and lies about data being correct or not, then we're screwed no matter what.
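A minimal sketch of both halves of that argument, using `zlib.crc32` as a stand-in for whatever CRC/ECC a real drive applies internally (the drive's actual code is different and not visible to the host). A bit flip after the check value has been computed is caught; data that was already mangled before the check value was computed passes verification.

```python
import zlib


def store(payload: bytes) -> tuple[bytes, int]:
    """Model of the write path: keep the payload together with its CRC-32."""
    return payload, zlib.crc32(payload)


def verify(payload: bytes, crc: int) -> bool:
    """Model of the read path: recompute the CRC-32 and compare."""
    return zlib.crc32(payload) == crc


sector = bytes(range(256)) * 2  # 512 bytes of sample data

# Case 1: corruption on the medium, after the CRC was computed.
stored, crc = store(sector)
flipped = bytearray(stored)
flipped[100] ^= 0x01  # single bit flip
caught = not verify(bytes(flipped), crc)

# Case 2: broken electronics mangle the data BEFORE the CRC is computed.
mangled = bytearray(sector)
mangled[100] ^= 0x01
stored_bad, crc_bad = store(bytes(mangled))
passes_anyway = verify(stored_bad, crc_bad)

print("bit flip after CRC is detected:", caught)
print("rubbish written 'correctly' verifies fine:", passes_anyway)
print("yet it differs from what the host sent:", stored_bad != sector)
```

The second case is the "we're screwed no matter what" scenario: every check inside the drive agrees, and only a check computed above the drive could notice.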

One would hope this is quite rare, since hard drive manufacturers test their firmware thoroughly, but perhaps I got unlucky with a bad drive.

Incidentally, after jiggling the SATA data/power cables and scrubbing the array, the errors have quieted down, and so far the RAID has been behaving correctly and returning the data as written...
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

Zucca
Moderator

Joined: 14 Jun 2007
Posts: 3896
Location: Rasi, Finland

Posted: Sun Oct 30, 2022 8:53 am

eccerr0r wrote:
Now I don't know about SSDs, but all HDDs (and floppy drives too!) since antiquity use at least CRC, if not ECC, to error check/correct the data written.
I have the impression ECC needs those magical 520-byte sectors.
eccerr0r wrote:
However if the hard drive electronics/microcontroller is broken and lies about data being correct or not, then we're screwed no matter what.
Checksums to save the day? Right?
eccerr0r wrote:
Incidentally, after jiggling the SATA data/power cables and scrubbing the array, the errors have quieted down, and so far the RAID has been behaving correctly and returning the data as written...
Hm. Signaling issue? But surely kernel should have noticed that?
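On the checksum idea raised above, a rough sketch of what an end-to-end check looks like: the checksum is computed by the consumer and kept separately from the data, so a lower layer that returns wrong data without an error is still caught on read. The `ChecksummedStore` class is purely illustrative and in-memory; checksumming filesystems such as btrfs apply the same principle on disk.

```python
import hashlib


def checksum(block: bytes) -> str:
    """SHA-256 digest of a block, used as the end-to-end check value."""
    return hashlib.sha256(block).hexdigest()


class ChecksummedStore:
    """Toy block store that keeps a checksum next to (but separate from) each block."""

    def __init__(self) -> None:
        self._blocks: dict[int, bytes] = {}
        self._sums: dict[int, str] = {}

    def write(self, number: int, data: bytes) -> None:
        self._blocks[number] = data
        self._sums[number] = checksum(data)

    def read(self, number: int) -> bytes:
        data = self._blocks[number]
        if checksum(data) != self._sums[number]:
            raise OSError(f"checksum mismatch in block {number}")
        return data


store = ChecksummedStore()
original = b"important backup data"
store.write(0, original)
clean_read = store.read(0)

# Simulate the lower layers silently handing back zeros for this block.
store._blocks[0] = bytes(len(original))
try:
    store.read(0)
    detected = False
except OSError:
    detected = True

print("clean read matches:", clean_read == original)
print("silent corruption detected on read:", detected)
```

Detection alone does not bring the data back; recovery still needs a good copy or parity to rebuild from.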
_________________
..: Zucca :..

My gentoo installs:
init=/sbin/openrc-init
-systemd -logind -elogind seatd

Quote:
I am NaN! I am a man!

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Sun Oct 30, 2022 3:53 pm

Not sure what is "magical" about it, but there are a LOT of bits on disks/platters that aren't and don't need to be accessible by the OS, including ECC/CRC bits and tracking information. Also not sure what's magic about 520: with more bytes, say 600 bytes per 512-byte sector, there would be more redundancy to repair bad reads. But ultimately this is overhead. Why not take a 1TB disk and do intra-track RAID1 mirroring (i.e. on a 300 sector track, RAID1 sectors 1-150 onto sectors 151-300) and get a 500GB disk?

In any case, any hardware that fails to detect errors to spec (hard drive manufacturers do specify the expected error rate, and it's not zero) is, simply put, faulty. Whether trusting the drive to get it right or trusting the computer to get it right is immaterial - both really need to get it right.

And yes I don't get it, I expect the OS to know what is poisoned by cables since SATA (and even UDMA/PATA) is CRC checked...so MDRAID should have detected there was poison and read off the remaining disks to reconstruct... Still very weird.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

NeddySeagoon
Administrator

Joined: 05 Jul 2003
Posts: 54799
Location: 56N 3W

Posted: Mon Oct 31, 2022 8:09 pm

eccerr0r,

.... unless it reads correctly but writes rubbish, correctly.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 9883
Location: almost Mile High in the USA

Posted: Mon Oct 31, 2022 9:52 pm

Yeah, that would fall under the bad hard drive category and would be SDC (silent data corruption), but since an error was detected at the SATA/UDMA/cable level, the error should be handled all the way up the chain until the consumer knows it's bad...

There is a saying in computer hardware, sometimes it's better to report nothing than to report garbage... this apparently is not being honored.
_________________
Intel Core i7 2700K/Radeon R7 250/24GB DDR3/256GB SSD
What am I supposed watching?

All times are GMT