aCOSwt Bodhisattva
Joined: 19 Oct 2007 Posts: 2537 Location: Hilbert space
|
Posted: Wed Oct 24, 2012 4:57 pm Post subject: If EXT4, don't use 3.5.7 / 3.6.2 kernels [OBSOLETE] |
|
|
ext4 data corruption regression _________________
Last edited by aCOSwt on Wed Oct 31, 2012 8:51 am; edited 1 time in total |
|
gerard27 Advocate
Joined: 04 Jan 2004 Posts: 2377 Location: Netherlands
|
Posted: Wed Oct 24, 2012 5:47 pm Post subject: |
|
|
Thanks aCOSwt.
I installed 3.5.7 two days ago.
No problems so far but to be on the safe side I switched back to 3.4.9.
Gerard. _________________ To install Gentoo I use sysrescuecd. Based on Gentoo, it has firefox to browse Gentoo docs and mc to browse (and edit) files.
The same disk can be used for 32 and 64 bit installs.
You can follow the Handbook verbatim.
http://www.sysresccd.org/Download |
|
bandreabis Advocate
Joined: 18 Feb 2005 Posts: 2495 Location: イタリアのロディで
|
Posted: Thu Oct 25, 2012 6:17 am Post subject: |
|
|
Affected kernels have been Hard Masked!
At least yesterday they were. |
|
aCOSwt Bodhisattva
Joined: 19 Oct 2007 Posts: 2537 Location: Hilbert space
|
Posted: Thu Oct 25, 2012 6:41 am Post subject: |
|
|
Could be the fault of some poorly tested, specific mount option:
Quote: | > the full set of options for all my ext4 filesystems are:
>
> rw,nosuid,nodev,relatime,journal_checksum,journal_async_commit,nobarrier,quota,
> usrquota,grpquota,commit=30,stripe=16,data=ordered,usrquota,grpquota
ok journal_async_commit is off the reservation a bit; that's really not
tested, and Jan had serious reservations about its safety.
* Can you reproduce this w/o journal_async_commit? |
_________________
|
|
c00l.wave Apprentice
Joined: 24 Aug 2003 Posts: 268
|
Posted: Sat Oct 27, 2012 12:01 pm Post subject: |
|
|
Note that there's a null-pointer dereference occurring when large files are deleted from ext4 filesystems in 3.4.9, which was my main reason to upgrade to and stay on 3.5.7 on the systems I maintain (I actually hit the null-pointer bug when moving backups of many tens of gigabytes across disks). You are much more likely to hit that bug than the journal bug that led to the masking panic. From what I have read, the current bug is considered to occur only under very specific circumstances that require the filesystem to have been created and mounted with uncommon options.
The null-pointer bug seems to have been missed by Gentoo devs but now has a bug report as well (at least it reads like the same one I encountered).
So I would add "don't use 3.4.9 either" or "but run 3.5.7/3.6.2 anyway if you run default filesystems" (without warranty, you'd better have backups either way). _________________ nohup nice -n -20 cp /dev/urandom /dev/null & |
|
platojones Veteran
Joined: 23 Oct 2002 Posts: 1602 Location: Just over the horizon
|
Posted: Sat Oct 27, 2012 12:51 pm Post subject: |
|
|
Mask has been lifted for 3.6.2. |
|
ppurka Advocate
Joined: 26 Dec 2004 Posts: 3256
|
Posted: Sat Oct 27, 2012 4:41 pm Post subject: |
|
|
platojones wrote: | Mask has been lifted for 3.6.2. |
Not surprised. It was an esoteric bug reproducible only on an esoteric configuration. _________________ emerge --quiet redefined | E17 vids: I, II | Now using kde5 | e is unstable :-/ |
|
platojones Veteran
Joined: 23 Oct 2002 Posts: 1602 Location: Just over the horizon
|
Posted: Sat Oct 27, 2012 11:37 pm Post subject: |
|
|
ppurka wrote: | platojones wrote: | Mask has been lifted for 3.6.2. | Not surprised. It was an esoteric bug reproducible only on an esoteric configuration. |
Not sure it's been reproduced at all. So far only the original reporter on the thread and supposedly one other person (a second-hand report) have hit it. Ts'o has yet to reproduce it. |
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Fri Nov 02, 2012 8:15 am Post subject: |
|
|
As a user, I'm thoroughly confused. I run a stable system as much as possible. I was on 3.4.9 when a routine update installed 3.5.7. I rebuilt the kernel and removed 3.4.9. Then 3.5.7 was masked, so I re-emerged 3.4.9, rebuilt the kernel, and removed 3.5.7. Now emerge -auvND world wants to re-install 3.5.7.
I don't understand the mount option problem. My /etc/fstab has the following line:
Code: | /dev/sda2 / ext4 noatime 0 1 |
What is the latest stable, safe kernel to run? Should I mask both 3.4.9 and 3.5.7?
What mount options are safe to use? I don't remember the full line I used when I created the file system years ago. How can I display this?
Should I tar off the system and reformat the drive with ext3? Random data loss is a scary thing. I wholeheartedly applaud those individuals who take the risk to test these kernels, but I don't want risk on my personal system. |
|
c00l.wave Apprentice
Joined: 24 Aug 2003 Posts: 268
|
Posted: Fri Nov 02, 2012 8:46 am Post subject: |
|
|
I don't think the 3.4.9 bug causes random data loss - loss should happen only to files that were being written/deleted at that time. After the null-pointer dereference occurred, I found that the backup files I had copied were randomly 0 bytes in size on either the target or the source, and one file was cut off. Having upgraded to 3.5.7, I compared file sizes and copied the larger file to the destination drive, but I haven't checked whether the data is still OK (those were only old backups moved to make space for newer ones). I'm not a kernel developer, but the effect of the 3.4.9 bug does not appear to be worse than simply cutting power while writing to disk - the journal will revert any pending transactions and fsck will check for structural consistency.
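For anyone who wants to repeat that kind of size comparison, here is a rough sketch, not a definitive tool. `compare_sizes` is a made-up helper name, the two directory arguments are placeholders, and it only looks at regular files directly inside the source directory (no recursion).

```shell
#!/bin/sh
# Sketch: report files that are empty on the source, missing on the target,
# or whose size differs between the two directories.
# compare_sizes is a hypothetical helper, not part of any package.
compare_sizes() {
    src=$1
    dst=$2
    for f in "$src"/*; do
        # Skip directories, and the unexpanded pattern if src is empty.
        [ -f "$f" ] || continue
        name=${f##*/}
        # The arithmetic expansion strips padding some wc versions add.
        s=$(($(wc -c < "$f")))
        if [ ! -f "$dst/$name" ]; then
            echo "$name: missing on target (source $s bytes)"
            continue
        fi
        d=$(($(wc -c < "$dst/$name")))
        if [ "$s" -ne "$d" ] || [ "$s" -eq 0 ]; then
            echo "$name: source $s bytes, target $d bytes"
        fi
    done
}

# Example call (paths are placeholders):
#   compare_sizes /mnt/old_backups /mnt/new_disk/backups
```

Matching sizes say nothing about the content. To confirm that two copies are really identical, something like `cmp` or a checksum over both files would still be needed.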
If you don't remember having set any fancy options for your ext4 partitions, I wouldn't worry about the bug in 3.5.7. However, it would be much more severe if it struck. It's your own choice, but I have stayed with 3.5.7 so far.
To be completely safe, you could also choose a kernel older than 3.4. I wouldn't want to "downgrade" to ext3, though.
(Your "noatime" mount option is nothing special, it just disables the usually unnecessary "access time" logging.) _________________ nohup nice -n -20 cp /dev/urandom /dev/null & |
|
aCOSwt Bodhisattva
Joined: 19 Oct 2007 Posts: 2537 Location: Hilbert space
|
Posted: Fri Nov 02, 2012 8:51 am Post subject: |
|
|
Tony0945 wrote: | As a user, I'm thoroughly confused. I run a stable system as much as possible. I was on 3.4.9 when a routine update installed 3.5.7. I rebuilt the kernel and removed 3.4.9. Then 3.5.7 was masked, so I re-emerged 3.4.9, rebuilt the kernel, and removed 3.5.7. Now emerge -auvND world wants to re-install 3.5.7.
I don't understand the mount option problem. My /etc/fstab has the following line:
Code: | /dev/sda2 / ext4 noatime 0 1 |
What is the latest stable, safe kernel to run? Should I mask both 3.4.9 and 3.5.7?
What mount options are safe to use? I don't remember the full line I used when I created the file system years ago. How can I display this?
Should I tar off the system and reformat the drive with ext3? Random data loss is a scary thing. I wholeheartedly applaud those individuals who take the risk to test these kernels, but I don't want risk on my personal system. |
You observed the 3.5.7 -> 3.4.9 -> 3.5.7 flip-flop because:
1/ 3.5.7 was flagged stable.
2/ As a precaution, following the problem that is the subject of this thread, 3.5.7 was reflagged ~arch => 3.4.9 became the latest stable.
3/ The problem that is the subject of this thread is now believed to be marginal => 3.5.7 came back as stable.
The latest stable Gentoo kernel for x86_64 today is 3.5.7.
You do not have to worry about the mount options that probably triggered this problem as long as you use the default mount options.
Safe mount options are the default mount options; that is why... they are the defaults...
This is what I have, for example, for an ext4 filesystem on my system.
Code: | LABEL=M_1_G64_VAR /var ext4 defaults,noatime,nodiratime 0 2 |
The user having the problem was *not* using default mount options.
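To see which options your ext4 filesystems are actually mounted with, you can read /proc/mounts, which uses the same column order as /etc/fstab. The following is only a sketch under stated assumptions: `risky_ext4_mounts` is a made-up helper name, and the option list inside it is simply taken from the report quoted earlier in this thread, not an official list of unsafe options.

```shell
#!/bin/sh
# Sketch: list ext4 mounts that use options named in the quoted report.
# Assumption: the three option names below come from that report only;
# they are not an authoritative list. risky_ext4_mounts is hypothetical.
# Input on stdin: lines in /proc/mounts or /etc/fstab column order
#   device mountpoint fstype options dump pass
risky_ext4_mounts() {
    awk '
        $3 == "ext4" {
            n = split($4, opts, ",")
            hits = ""
            for (i = 1; i <= n; i++) {
                if (opts[i] == "journal_async_commit" ||
                    opts[i] == "journal_checksum" ||
                    opts[i] == "nobarrier") {
                    if (hits != "")
                        hits = hits ","
                    hits = hits opts[i]
                }
            }
            if (hits != "")
                print $2 ": " hits
        }
    '
}

# Example calls:
#   risky_ext4_mounts < /proc/mounts
#   risky_ext4_mounts < /etc/fstab
```

Fed the fstab line `/dev/sda2 / ext4 noatime 0 1`, it prints nothing, because none of the listed options is present. For the defaults stored in the filesystem itself, `tune2fs -l /dev/sda2` (from e2fsprogs, run as root) lists the superblock contents, including the default mount options.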
(BTW, there is no problem with noatime, nodiratime either, even if they are not default) _________________
|
|
Tony0945 Watchman
Joined: 25 Jul 2006 Posts: 5127 Location: Illinois, USA
|
Posted: Fri Nov 02, 2012 8:35 pm Post subject: |
|
|
Thanks for the prompt response. |
|