Gentoo Forums
D state process stack trace
sayusi
n00b


Joined: 10 Aug 2006
Posts: 26
Location: Hungary

Posted: Wed Jun 29, 2011 5:25 pm    Post subject: D state process stack trace

Hi Guys,

I have a rather unusual problem. First I will describe what happens, then what I have tried, and finally my questions.

I have an amd64 system (ACCEPT_KEYWORDS="~amd64") and I would like to use Chromium, which is my favorite browser. But Chromium sometimes leaves a process in D state, as you can see in this picture [1]. The side effects are the following:
- High load average: more than 10, caused by just 1 or 2 D state processes!
- Killing Chromium is not an option, because it will not start again afterwards: the D state process blocks it.
- I have to restart my machine.
- Once Chromium's D state process has appeared, htop and grep stop working: they cannot write to the console. It does not matter whether "console" means Konsole (the KDE application) or a plain console (outside X). htop goes into D state as well.

I wrote an email to the gentoo-users mailing list and they said that the root cause is Flash itself. I uninstalled Flash and nspluginwrapper as well, but Chromium still randomly leaves a D state process behind and takes the whole machine down. So I think this problem is more than a Linux vs. Flash issue. I had heard that Chromium might ship with a bundled Flash, so I contacted the Gentoo Chromium group, and they said that Flash is not bundled with Chromium. They also suggested that this forum could be the proper place to discuss my problem.

A little later I asked some friends who are experienced *nix administrators. They suggested that I examine the call trace (echo t > /proc/sysrq-trigger), but the kernel log did not contain the call trace of the D state process, even though I could see that process in the output of top. I changed the syslog-ng settings, but only a small part of the process tree ended up in the log. By the way, when htop goes into D state, I can find the call trace of the D state htop process.

I tried strace, but once the D state process exists, strace is not able to write to the console.

This Chromium issue does not depend on the version. And yes, I checked on the "about:plugins" page whether Flash is disabled. I did this with every version currently in the portage tree. Every other browser works fine, with and without Flash. I don't have any other problem with my machine. The problem is always the process that deals with "--extra-plugin-dir".

So, I have a few questions:
- Has anybody else experienced something like this with Chromium or with anything else?
- How can I log the D state process so that I can debug it and figure out what the hell is happening?
- How can I kill a D state process? I saw that Opera scans the plugins directory as well, and if Chromium has already produced a D state process, Opera's scanning process also goes into D state for a while and then disappears a little later. How does Opera do that?

What I should do now:
- run a memtest to check whether my machine is dying (I hope not!)

Thank you in advance for your time and your suggestions!

András


[1] - http://sayusi.hu/blog/picture_about_htop
_________________
- -
-- Csanyi Andras/Sayusi Ando -- http://sayusi.hu -- http://facebook.com/andras.csanyi
-- "Trust in God and keep your gunpowder dry!!" - Cromwell
x22
Apprentice


Joined: 24 Apr 2006
Posts: 208

Posted: Wed Jun 29, 2011 6:59 pm    Post subject:

Logging stuck processes:
Magic sysrq should work (if /proc/sysrq-trigger exists):
Code:
echo w > /proc/sysrq-trigger
dmesg | less

("t" logs all processes, "w" only those which are blocked.)

(Or press Alt+SysRq+W and then run the dmesg command.)
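
A note on the keyboard variant: as far as I know, writing to /proc/sysrq-trigger as root always works, while the Alt+SysRq key combination is filtered by the value in /proc/sys/kernel/sysrq (1 enables everything, otherwise it is a bitmask in which 8 covers the process dumps such as "t" and "w"). A minimal sketch for checking that; the function name sysrq_allows_dump is made up for this example:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given kernel.sysrq value lets the
# keyboard combination produce process dumps (SysRq "t" and "w").
# Assumes the documented encoding: 1 = everything, bit 8 = debugging dumps.
sysrq_allows_dump() {
    value=$1
    [ "$value" -eq 1 ] && return 0
    [ $(( value & 8 )) -ne 0 ]
}

# Usage on a live system:
#   if sysrq_allows_dump "$(cat /proc/sys/kernel/sysrq)"; then
#       echo "Alt+SysRq+W is allowed"
#   else
#       echo "as root, try: echo 1 > /proc/sys/kernel/sysrq"
#   fi
```

If the key combination does nothing, this setting is the first thing I would check; the echo into /proc/sysrq-trigger should not be affected by it.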

Processes can get stuck in D state because of a kernel bug, a suddenly disconnected NFS (or similar) server, or some hardware failure. D means uninterruptible sleep, and it really is uninterruptible: there is no way to kill a process stuck in D state (if it is really stuck and not just very busy doing disk I/O).
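
To see which processes are in D state and where in the kernel they are sleeping, without dumping everything into dmesg, ps can print the wait channel. A rough sketch: the function name d_state_only is made up, and the /proc/<pid>/stack file mentioned in the comments only exists if the kernel was built with CONFIG_STACKTRACE (and is readable by root only).

```shell
#!/bin/sh
# Hypothetical filter: keep the header line plus every row whose STAT
# column (2nd field) starts with "D" (uninterruptible sleep).
d_state_only() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Usage on a live system (procps ps):
#   ps -eo pid,stat,wchan:32,comm | d_state_only
#
# For a single stuck PID, the kernel-side stack can be read directly
# (root only, needs CONFIG_STACKTRACE):
#   cat /proc/<pid>/stack
```

The WCHAN column names the kernel function the process is waiting in, which is often enough to tell a disk wait from something else.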
manlius
n00b


Joined: 27 Jul 2011
Posts: 1

Posted: Wed Jul 27, 2011 3:56 am    Post subject: maybe disk suspend bug

I had a problem similar to yours: Chromium fell into D state at random (well, not exactly at random) and a reboot was needed to recover.
At first I suspected the notorious Flash plugin, but that led nowhere.
"Maybe Chromium is buggy?" So I tried "emerge -uDN world", re-emerging Chromium and emerging Firefox.
I left for work, and when I came back I found gcc frozen, with just a blinking cursor on the console.
At some point during the compile, gcc had gone into D state!
"Oh, this must be a disk-related problem!" I converted the file system from ext3 to ext4 and ran fsck, also to no avail.
But then I realized the problem has a pattern: it only occurs after a certain amount of time.

So I turned off the hdparm service (rc-config delete hdparm boot), which had a spin-down switch set, namely "-S96".
And the problem has "almost" gone. I now seldom see a D state Chromium.
hdparm was causing disk spin-down without any regard to disk activity!
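
For reference, going by the hdparm man page, the -S value is an encoded standby timeout rather than a number of seconds: 1 to 240 are multiples of 5 seconds and 241 to 251 are units of 30 minutes. A small sketch of that arithmetic; the helper name hdparm_s_to_seconds is made up, and it deliberately leaves the special values 252 to 255 undecoded:

```shell
#!/bin/sh
# Hypothetical helper: convert an "hdparm -S" value into seconds, following
# the encoding described in the hdparm man page. Prints 0 for "disabled" and
# returns non-zero for the special values 252-255, which it does not decode.
hdparm_s_to_seconds() {
    n=$1
    if [ "$n" -eq 0 ]; then
        echo 0                          # timeout disabled
    elif [ "$n" -le 240 ]; then
        echo $(( n * 5 ))               # 1..240: multiples of 5 seconds
    elif [ "$n" -le 251 ]; then
        echo $(( (n - 240) * 1800 ))    # 241..251: units of 30 minutes
    else
        return 1                        # 252..255: special cases, see man page
    fi
}

hdparm_s_to_seconds 96    # the value from this post: prints 480 (8 minutes)
```

So "-S96" asks the drive to spin down after 8 minutes. To check what the drive is doing right now, "hdparm -C /dev/sda" reports its current power mode, and "hdparm -S 0 /dev/sda" should disable the timeout without touching the service.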

But yesterday the kernel process [khugepaged] got stuck in D state.
And again almost every new process, including Chromium, Firefox, XeTeX, etc., went into D state.

I suspect this is a kernel bug related to disk power saving.
Maybe the kernel changed how it handles suspending the disk.
Or maybe the kernel loses track of the disk activity state.
I am just guessing here with my limited knowledge.
It could also have been a mistake to use hdparm rather than sdparm on the /dev/sda disk;
I later found sdparm recommended for the SATA disk I'm using.

But I still get the problem after some time (several hours or so; whether AFK or not, I'm not sure) even though hdparm/sdparm is off/not installed.

I hope this helps.
Cheers.

UPDATE: the khugepaged mentioned above is a kernel process related to the memory compaction feature,
and turning that feature off in the kernel configuration really helped this time.
I have seen no more D state processes since I booted a kernel with that feature off.
Googling "khugepaged" turns up a memory compaction bug that seems to have been fixed, but maybe the fix is not enough yet.
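
A side note for anyone who wants to try this without rebuilding the kernel: to my knowledge khugepaged is the daemon behind transparent hugepages, which in turn rely on memory compaction, and that feature can be inspected and switched off at runtime through sysfs. A sketch, assuming the file /sys/kernel/mm/transparent_hugepage/enabled exists on your kernel; thp_mode is a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: the sysfs file lists all modes and marks the active
# one with brackets, e.g. "[always] madvise never". Print only that one.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

# Usage on a live system:
#   thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
# To switch it off until the next reboot (as root):
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

This is not the same thing as removing the option from the kernel configuration as described above, so treat it as a quick experiment rather than a confirmed fix.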