[Solved - HPA] GDisk - Corrupted partition table recovery

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

I have two 4 TB 860 EVO SSDs that I've copied my spinning-rust data onto as I migrate from my old Q9400 server to my new i5-8500T server. After getting a full Gentoo install and DE functioning, I forgot to unplug one of them while I confirmed my overclock limit with this RAM & CPU. The plugged-in one suffered partition table corruption. Both SSDs have a single partition with a protective MBR & GPT layout, F2FS formatted. I *could* just wipe, copy, and MD5-verify everything on this single corrupted SSD, but I want to learn a bit more about how to properly recover the partition table. Here's the output of gdisk:

NeddySeagoon · Posted: Sat Jan 30, 2021 2:46 pm Post subject:

DigitalCorpus,

Maybe the filesystem is damaged?
How did you do the copy?
The wrong answer is dd, unless the drives are *exactly* the same size. If you used dd and the output drive is smaller than the input drive, the end of the input drive is truncated.
That's not recoverable.
So, is

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

I did rsync -a for the vast majority of the data copy from spinning rust to SSD. I did cp -a for a few outstanding files. sort-ed, md5deep -rel for verification.
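For anyone wanting to repeat the verification step without md5deep, here is a minimal Python sketch of the same idea: hash every file under the source and the copy, then compare by relative path. The function names `md5_tree` and `compare_trees` are made up for illustration, and this is not the exact procedure used above.

```python
import hashlib
import os


def md5_tree(root):
    """Map each file's path (relative to root) to its MD5 hex digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            md5 = hashlib.md5()
            with open(full, "rb") as fh:
                # Read in 1 MiB chunks so large files do not fill RAM
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    md5.update(chunk)
            digests[os.path.relpath(full, root)] = md5.hexdigest()
    return digests


def compare_trees(src, dst):
    """Return (missing, extra, mismatched) relative paths, each sorted."""
    a, b = md5_tree(src), md5_tree(dst)
    missing = sorted(set(a) - set(b))      # in src but not in dst
    extra = sorted(set(b) - set(a))        # in dst but not in src
    mismatched = sorted(p for p in set(a) & set(b) if a[p] != b[p])
    return missing, extra, mismatched
```

Three empty lists from `compare_trees(src, dst)` mean every file present in the source exists in the copy with the same MD5. It only compares file contents, not ownership, permissions, or timestamps.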

Sorry, I skipped a part: I first noticed a problem when the disk presented but no partitions mounted, and cat /proc/partitions showed that the partition, e.g. /dev/sdb*1*, was no longer "detected" by the system. Opening the disk in gdisk then told me the backup structures were corrupted etc.
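Some background on the "backup structures" message: GPT keeps a backup header in the last sector of the disk, and the primary header (LBA 1) records where that backup should be. Below is a rough Python sketch that decodes a primary header and checks whether the recorded backup LBA is inside the number of sectors the OS can see. The field layout follows the UEFI GPT header definition; the helper names and the idea of passing in a `visible_sectors` count (for example read from /sys/block/sdX/size) are my own assumptions, not something gdisk does this way.

```python
import struct
import zlib

# GPT header fields up to the partition-array CRC: 92 bytes, little-endian
GPT_HEADER = struct.Struct("<8sIIIIQQQQ16sQIII")


def parse_gpt_header(sector):
    """Decode the first 92 bytes of a GPT header sector into a dict."""
    (sig, _revision, header_size, header_crc, _reserved,
     current_lba, backup_lba, _first_usable, last_usable,
     _disk_guid, _entries_lba, _num_entries, _entry_size,
     _entries_crc) = GPT_HEADER.unpack_from(sector)
    if sig != b"EFI PART":
        raise ValueError("not a GPT header")
    # The header CRC32 covers header_size bytes with the CRC field zeroed
    zeroed = sector[:16] + b"\x00\x00\x00\x00" + sector[20:header_size]
    return {
        "current_lba": current_lba,
        "backup_lba": backup_lba,
        "last_usable_lba": last_usable,
        "crc_ok": (zlib.crc32(zeroed) & 0xFFFFFFFF) == header_crc,
    }


def backup_reachable(header, visible_sectors):
    """True if the backup header's LBA lies inside the visible disk."""
    return header["backup_lba"] < visible_sectors
```

If `backup_reachable` comes back False, the primary header points past the end of what the kernel exposes, so the backup header cannot be read even though nothing on the disk was overwritten.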

When verifying the OC limits, I did have a few kernel panics or oopses after POST, during kernel loading, rc init, and/or logging into XFCE4. Disk corruption was expected /shrug

non-corrupted SSD:

NeddySeagoon · Posted: Sat Jan 30, 2021 11:26 pm Post subject:

DigitalCorpus,

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

Yeah, it's odd that the pair of SSDs differ in size. I don't know which mechanisms are used for reporting from the various tools, but from /proc/partitions to smartctl, they differ. At the same time, the FTL is a layer of obfuscation and I *assume* that if NAND cells were damaged, the firmware blacklists them and, if it was possible, rewrites the data elsewhere from reconstructed parity info, else the data is viewed as corrupted in software.

Should say nothing... FWIW, this system is an Intel P965 (Intel 82801HB) w/ AHCI in case we get more into the weeds and things look atypical.

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

I used gdisk to backup the GPT to disk for both SSDs, sdb = good & sdc = bad, and here is the non-zero hex from them using dhex w/ a little padding:

Bad, sdc:

NeddySeagoon · Posted: Sun Jan 31, 2021 9:53 am Post subject:

DigitalCorpus,

Why do you have a HPA on that drive?

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

I haven't heard of HPA until you had me grep my boot log. Don't know how it got "turned on"

I tried the mount as root and got that result, which is why I posted USE flags
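For anyone else who hasn't met it before: an HPA (Host Protected Area) hides sectors at the end of a drive, so the drive reports fewer sectors than it natively has. A small Python sketch for pulling the numbers out of a boot-log line follows. The message format in the regex is what I'd expect libata to print and should be treated as an assumption, and the sample line in the usage note is invented, not taken from my log.

```python
import re

# Assumed libata message format: "HPA detected: current N, native M"
HPA_RE = re.compile(r"HPA detected: current (\d+), native (\d+)")


def hidden_sectors(log_line):
    """Return native minus current sector count, or None if no HPA message."""
    m = HPA_RE.search(log_line)
    if m is None:
        return None
    current, native = int(m.group(1)), int(m.group(2))
    return native - current
```

For example, `hidden_sectors("ata3.00: HPA detected: current 7814035055, native 7814037168")` gives 2113, meaning 2113 sectors at the end of the drive are hidden from the OS. A line without the message gives None.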

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

If I'm understanding the man page correctly, the following command should remove HPA from the affected drive, currently sdc.

NeddySeagoon · Posted: Sun Jan 31, 2021 7:30 pm Post subject:

DigitalCorpus,

I don't understand the

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

If this data was not the backup, I'd have taken it a little slower and tested the non-once-per-instance version of the command. I did a little more searching and netted this writeup someone did that advised overprovisioning an SSD with HPA. I mean, I get it, but that's a rather hardcore limit compared with just creating a smaller partition.

https://www.thomas-krenn.com/en/wiki/SSD_Over-provisioning_using_hdparm

Edit:
Possibly the mount command has a HPA check that prevented operation from continuing???

Edit 2:
fsck-ing checked out fine. All systems go.

NeddySeagoon · Posted: Sun Jan 31, 2021 9:01 pm Post subject:

DigitalCorpus,

SSDs are over provisioned anyway. There is no need to 'footer' with a HPA.
Look at smartctl -a to see the over provisioning.

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

Neddy,

I'd say the overprovisioning is dependent upon model and manufacturer. The MX500 and the two 860 EVOs I have show nothing under ID 180 or any similarly named ID listed by smartctl.

NeddySeagoon · Posted: Sun Jan 31, 2021 9:25 pm Post subject:

DigitalCorpus,

It will still be there. It's the only way to cope with manufacturing defects and through-life sector failures.
It's been on rotating rust HDDs for over 20 years now, hence you never find any bad sectors on a HDD until its end of life.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

DigitalCorpus · Apprentice Joined: 30 Jul 2007 Posts: 283

Neddy,

Yes, it is there, but as AnandTech has pointed out over the years, it’s largely an unpublished spec. So if this were a write-heavy drive, I’d be adding 10-20% on top of the unknown for performance consistency’s sake.

NeddySeagoon · Posted: Mon Feb 01, 2021 9:57 am Post subject:

DigitalCorpus,

Agreed. It will be 10% or less though. Silicon storage is made in chunks of 2^n.
The 10% comes from the approximate difference between 1000 and 1024. That's kB or kiB.
Any more would need an extra address bit on the controller, which pushes up the cost. Any less is cost savings to the manufacturer.
Reducing it too far will upset buyers and push up the costs due to warranty returns.
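To put numbers on the 1000 versus 1024 point, here is a quick Python sketch. It assumes the raw flash is sized in binary units and the advertised capacity in decimal units of the same prefix, and it expresses the leftover as a fraction of the advertised capacity. The function name `spare_fraction` is just for illustration.

```python
def spare_fraction(power):
    """Spare flash as a fraction of advertised capacity.

    Assumes raw capacity of 1024**power bytes sold as 1000**power bytes.
    """
    return (1024 ** power - 1000 ** power) / (1000 ** power)


# One line per prefix step: kilo, mega, giga, tera
for name, power in [("kB/KiB", 1), ("MB/MiB", 2), ("GB/GiB", 3), ("TB/TiB", 4)]:
    print(f"{name}: {spare_fraction(power):.2%}")
```

The gap is about 2.4% at a single prefix step and compounds with each step, reaching just under 10% at the TB/TiB scale, which is where the "10% or less" figure fits for multi-terabyte drives.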