Gentoo Forums
Root processes fail to end
Skinjob2707
n00b
Joined: 07 Aug 2013
Posts: 57

Posted: Thu Jun 22, 2017 6:13 pm    Post subject: Root processes fail to end

I am having a problem and I'm not sure of the right place to begin troubleshooting. After the system has been running for a while (anywhere from 45 minutes to overnight), root processes fail to exit. If I run emerge --newuse --update --ask @world, the first item in the list is installed, but emerge stalls at the end and never moves on to the next package in the list. If I run gvim as root and click the x in the upper right-hand corner to close it, it sits for a little while before the "process will not end" dialog comes up. If I then click terminate, the gvim window has everything grayed out, but it will not go away without a reboot.

The output from ps aux | grep gvim reads:
Code:
root      9967  0.0  0.1 172616 24908 ?        Ds   13:57   0:00 gvim nvidia-cudnn-bin-8.0-r6.ebuild

The only way to get it to go away is to reboot. On restarting, the shutdown process often hangs because there is an open file on either the /boot or the root partition.
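
For reference, the D in the Ds STAT column of the ps output above means uninterruptible sleep: the process is blocked inside the kernel, usually on I/O, and generally cannot act on signals until that wait finishes. That would be consistent with terminate having no effect. Below is a minimal sketch for picking such processes out of ps aux output. The function name find_uninterruptible is made up for illustration, and the code assumes the usual 11-column ps aux layout.

```python
def find_uninterruptible(ps_output):
    """Return (pid, command) for each process whose STAT starts with 'D'."""
    hits = []
    for line in ps_output.splitlines():
        # Assumed columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        if fields[7].startswith("D"):
            hits.append((int(fields[1]), fields[10]))
    return hits


# The line quoted in this post, used as sample input.
sample = (
    "root      9967  0.0  0.1 172616 24908 ?        Ds   13:57   0:00 "
    "gvim nvidia-cudnn-bin-8.0-r6.ebuild"
)
print(find_uninterruptible(sample))
```

On a live system the same function could be fed the text returned by subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout.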

When I look in journalctl --pager-end I don't see any error messages that look relevant.

My current kernel config is at:
https://pastebin.com/VEMNFAvv

Thanks in advance for your assistance!
Skinjob2707
n00b
Joined: 07 Aug 2013
Posts: 57

Posted: Thu Jun 22, 2017 6:23 pm

Looking through the logs, I see the following, but it isn't clear whether it is the cause of the problem:
Code:
Jun 22 13:14:51 bluemeanie kernel: ------------[ cut here ]------------
Jun 22 13:14:51 bluemeanie kernel: kernel BUG at fs/f2fs/gc.c:899!
Jun 22 13:14:51 bluemeanie kernel: invalid opcode: 0000 [#1] PREEMPT SMP
Jun 22 13:14:51 bluemeanie kernel: Modules linked in: nvidia_drm(PO) arc4 ath9k ath9k_common ath9k_hw mac80211 input_le
Jun 22 13:14:51 bluemeanie kernel: CPU: 0 PID: 1044 Comm: f2fs_gc-8:3 Tainted: P           O    4.11.6-gentoo #1
Jun 22 13:14:51 bluemeanie kernel: Hardware name: Micro-Star International Co., Ltd MS-7A34/B350 PC MATE(MS-7A34), BIOS
Jun 22 13:14:51 bluemeanie kernel: task: ffff88040de8cf40 task.stack: ffffc90000468000
Jun 22 13:14:51 bluemeanie kernel: RIP: 0010:do_garbage_collect+0x9e1/0xb00
Jun 22 13:14:51 bluemeanie kernel: RSP: 0018:ffffc9000046bcb0 EFLAGS: 00010297
Jun 22 13:14:51 bluemeanie kernel: RAX: ffff8801944aa000 RBX: 0000000000000000 RCX: 0000000000000000
Jun 22 13:14:51 bluemeanie kernel: RDX: ffff880000000000 RSI: 0000000000000003 RDI: ffffea0005870530
Jun 22 13:14:51 bluemeanie kernel: RBP: ffffc9000046bdb0 R08: ffff880207505b90 R09: ffffea000587054c
Jun 22 13:14:51 bluemeanie kernel: R10: ffffc9000046bc38 R11: 0000000000000040 R12: 0000000000000006
Jun 22 13:14:51 bluemeanie kernel: R13: ffff88040d033708 R14: ffff88040dc0e800 R15: ffffea0005870530
Jun 22 13:14:51 bluemeanie kernel: FS:  0000000000000000(0000) GS:ffff88041ec00000(0000) knlGS:0000000000000000
Jun 22 13:14:51 bluemeanie kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 22 13:14:51 bluemeanie kernel: CR2: 00007ffac2da9010 CR3: 00000001a9e2c000 CR4: 00000000003406f0
Jun 22 13:14:51 bluemeanie kernel: Call Trace:
Jun 22 13:14:51 bluemeanie kernel:  ? _raw_spin_lock+0x12/0x40
Jun 22 13:14:51 bluemeanie kernel:  ? pick_next_task_fair+0x585/0x9c0
Jun 22 13:14:51 bluemeanie kernel:  ? find_next_bit+0xb/0x10
Jun 22 13:14:51 bluemeanie kernel:  f2fs_gc+0x19f/0x470
Jun 22 13:14:51 bluemeanie kernel:  ? f2fs_gc+0x19f/0x470
Jun 22 13:14:51 bluemeanie kernel:  ? del_timer_sync+0x30/0x50
Jun 22 13:14:51 bluemeanie kernel:  ? preempt_count_add+0xa3/0xc0
Jun 22 13:14:51 bluemeanie kernel:  gc_thread_func+0x2eb/0x340
Jun 22 13:14:51 bluemeanie kernel:  ? gc_thread_func+0x2eb/0x340
Jun 22 13:14:51 bluemeanie kernel:  ? wake_atomic_t_function+0x50/0x50
Jun 22 13:14:51 bluemeanie kernel:  kthread+0xff/0x140
Jun 22 13:14:51 bluemeanie kernel:  ? f2fs_gc+0x470/0x470
Jun 22 13:14:51 bluemeanie kernel:  ? kthread_create_on_node+0x40/0x40
Jun 22 13:14:51 bluemeanie kernel:  ? umh_complete+0x40/0x40
Jun 22 13:14:51 bluemeanie kernel:  ? call_usermodehelper_exec_async+0x137/0x140
Jun 22 13:14:51 bluemeanie kernel:  ret_from_fork+0x29/0x40
Jun 22 13:14:51 bluemeanie kernel: Code: ff e9 d3 fd ff ff 8b 55 8c 44 89 f1 4c 89 ef e8 b6 ee ff ff e9 78 fe ff ff 8b
Jun 22 13:14:51 bluemeanie kernel: RIP: do_garbage_collect+0x9e1/0xb00 RSP: ffffc9000046bcb0
Jun 22 13:14:51 bluemeanie kernel: ---[ end trace 71b180caf0c5dabb ]---
Hu
Administrator
Joined: 06 Mar 2007
Posts: 23062

Posted: Fri Jun 23, 2017 1:34 am

This is:
bdac9683a6553b04e7456ed959162fe6bc696dde:fs/f2fs/gc.c:
 860 static int do_garbage_collect(struct f2fs_sb_info *sbi,
 865   struct f2fs_summary_block *sum;
 870   unsigned char type = IS_DATASEG(get_seg_entry(sbi, segno)->type) ?
 871                  SUM_TYPE_DATA : SUM_TYPE_NODE;
 898      sum = page_address(sum_page);
 899      f2fs_bug_on(sbi, type != GET_SUM_TYPE((&sum->footer)));
Normally, I would suggest that you try to reproduce the problem with an untainted kernel. I doubt that will matter here, but it's still good practice, if for no other reason than that you are unlikely to get much support upstream for a tainted kernel.
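
The taint status is printed on the CPU: line of the oops. The trace above shows "Tainted: P           O", where P marks a proprietary module and O an out-of-tree module; the module list shows nvidia_drm(PO). Below is a rough sketch for pulling the BUG location and taint flags out of such a log, assuming the oops format shown in this thread. The name summarize_oops and the two-entry flag table are illustrative only, not a complete decoder.

```python
import re

# Only the two flags that appear in this thread; the kernel defines more.
TAINT_FLAGS = {
    "P": "proprietary module loaded",
    "O": "out-of-tree module loaded",
}

BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
# Assumes the command name contains no spaces and is followed by the taint field.
CPU_RE = re.compile(r"Comm: (\S+) (?:Not tainted|Tainted: ([A-Z ]+?))\s+\d")


def summarize_oops(log):
    """Collect BUG file/line, command name, and taint flags from oops text."""
    summary = {"bug": None, "comm": None, "taint": []}
    for line in log.splitlines():
        bug = BUG_RE.search(line)
        if bug:
            summary["bug"] = (bug.group(1), int(bug.group(2)))
        cpu = CPU_RE.search(line)
        if cpu:
            summary["comm"] = cpu.group(1)
            # group(2) is None for "Not tainted"; flag letters are space-padded.
            flags = (cpu.group(2) or "").replace(" ", "")
            summary["taint"] = [TAINT_FLAGS.get(f, "flag " + f) for f in flags]
    return summary


tainted = """kernel BUG at fs/f2fs/gc.c:899!
CPU: 0 PID: 1044 Comm: f2fs_gc-8:3 Tainted: P           O    4.11.6-gentoo #1"""
clean = """[  494.543223] kernel BUG at fs/f2fs/gc.c:899!
[  494.545753] CPU: 0 PID: 1120 Comm: f2fs_gc-8:3 Not tainted 4.11.6-gentoo #1"""

print(summarize_oops(tainted))
print(summarize_oops(clean))
```

A running kernel also reports its taint state as a number in /proc/sys/kernel/tainted, where 0 means untainted.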
Skinjob2707
n00b
Joined: 07 Aug 2013
Posts: 57

Posted: Sat Jun 24, 2017 7:15 pm

You make a good point about the tainted kernel. Here is the same problem happening with an untainted kernel:

Code:
[  494.543223] kernel BUG at fs/f2fs/gc.c:899!
[  494.543812] invalid opcode: 0000 [#1] PREEMPT SMP
[  494.544396] Modules linked in: arc4 ath9k ath9k_common ath9k_hw mac80211 input_leds ath cfg80211 nouveau snd_hda_codec_realtek snd_hda_codec_generic video led_class i2c_algo_bit hwmon drm_kms_helper syscopyarea sysfillrect snd_hda_intel sysimgblt fb_sys_fops snd_hda_codec ttm kvm snd_hwdep snd_hda_core irqbypass snd_pcm drm snd_timer pcspkr i2c_piix4 snd wmi 8250 8250_base serial_core button acpi_cpufreq r8169 mii efivarfs
[  494.545753] CPU: 0 PID: 1120 Comm: f2fs_gc-8:3 Not tainted 4.11.6-gentoo #1
[  494.546424] Hardware name: Micro-Star International Co., Ltd MS-7A34/B350 PC MATE(MS-7A34), BIOS A.30 04/19/2017
[  494.547107] task: ffff88040e48ae80 task.stack: ffffc90000418000
[  494.547798] RIP: 0010:do_garbage_collect+0x9e1/0xb00
[  494.548483] RSP: 0018:ffffc9000041bcb0 EFLAGS: 00010297
[  494.549162] RAX: ffff880255dc8000 RBX: 0000000000000000 RCX: 0000000000000000
[  494.549845] RDX: ffff880000000000 RSI: 0000000000000003 RDI: ffffea00082c83c0
[  494.550524] RBP: ffffc9000041bdb0 R08: ffff88040767b4d8 R09: ffffea00082c83dc
[  494.551201] R10: ffffc9000041bc38 R11: 0000000000000040 R12: 0000000000000007
[  494.551877] R13: ffff88040e3b04c8 R14: ffff88040d1d9800 R15: ffffea00082c83c0
[  494.552552] FS:  0000000000000000(0000) GS:ffff88041ec00000(0000) knlGS:0000000000000000
[  494.553230] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  494.553906] CR2: 00007f41d4caf038 CR3: 0000000407be6000 CR4: 00000000003406f0
[  494.554592] Call Trace:
[  494.555275]  ? find_next_bit+0xb/0x10
[  494.555953]  f2fs_gc+0x19f/0x470
[  494.556626]  ? f2fs_gc+0x19f/0x470
[  494.557296]  ? del_timer_sync+0x20/0x50
[  494.557960]  ? preempt_count_add+0xa3/0xc0
[  494.558620]  gc_thread_func+0x2eb/0x340
[  494.559278]  ? gc_thread_func+0x2eb/0x340
[  494.559939]  ? wake_atomic_t_function+0x50/0x50
[  494.560603]  kthread+0xff/0x140
[  494.561258]  ? f2fs_gc+0x470/0x470
[  494.561916]  ? kthread_create_on_node+0x40/0x40
[  494.562569]  ret_from_fork+0x29/0x40
[  494.563216] Code: ff e9 d3 fd ff ff 8b 55 8c 44 89 f1 4c 89 ef e8 b6 ee ff ff e9 78 fe ff ff 8b 75 94 4c 89 ff e8 56 7f 00 00 e9 06 fc ff ff 0f 0b <0f> 0b 44 89 fe 48 89 cf e8 52 8e 00 00 e9 db f8 ff ff 0f b6 b5
[  494.563935] RIP: do_garbage_collect+0x9e1/0xb00 RSP: ffffc9000041bcb0
[  494.570544] ---[ end trace bb6a511b13617ddd ]---


A Google search turned up this mailing list entry, which has at least the same source code line in common: https://www.mail-archive.com/linux-f2fs-devel@lists.sourceforge.net/msg06152.html.

Is the f2fs-devel mailing list the right place to inquire about this issue? Or should I try to apply the patch using Portage's patch facility?

Thanks for your help!
Hu
Administrator
Joined: 06 Mar 2007
Posts: 23062

Posted: Sat Jun 24, 2017 9:02 pm

I am not sufficiently familiar with f2fs to know whether that patch is useful. I suggest contacting the f2fs mailing list, explaining your problem, and asking their opinion on the patch.
Skinjob2707
n00b
Joined: 07 Aug 2013
Posts: 57

Posted: Tue Jun 27, 2017 4:14 pm

I upgraded on Sunday morning to kernel 4.11.7 and haven't had the problem since. Unless it recurs, I'm going to consider the problem solved.

Thanks for your help!
All times are GMT