Gentoo Forums
Xen Dom0 crashes regularly since Kernel 2.6.18-xen-r12
rfolkerts
n00b

Joined: 24 Jan 2008
Posts: 6

Posted: Fri Jan 30, 2009 4:52 pm    Post subject: Xen Dom0 crashes regularly since Kernel 2.6.18-xen-r12

Hi,

in December we updated our Gentoo Xen Dom0 machine; among the updates was the latest Xen, 3.3.0.

After this update we booted the "old" 2.6.21-xen kernel. It booted (and still boots) fine, but after running for a few minutes the system loses its network connection: there is no message in /var/log/messages or dmesg, but neither the hypervisor nor its DomUs can be reached (ping, ssh). We were also unable to recompile 2.6.21, as there seems to be a problem with the installed header files.

No problem: as 2.6.21 was "deprecated" anyway, we chose 2.6.18 (r12), which compiled and booted flawlessly.

Unfortunately, since then this machine crashes every few days (see below for stack trace).

We tried to update world to a more recent GCC, following the Gentoo Documentation (from 3.4.6 to i686-pc-linux-gnu-4.1.2) -- but that didn't help.

Next we searched the net for the "swiotlb_map_sg" crash point and found several references. The hints there were to add a "swiotlb=n" kernel parameter.

From what we understand, this parameter controls the size of a table that the code uses as a DMA buffer.

However, we didn't find a "rule" explaining what value to use under which conditions.

So, we added "swiotlb=512".

(A Novell site had hints suggesting a value of 2; in some forums and mailing lists people reported setting it as high as 4096.)
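
For anyone trying to make sense of the differing values: how big the buffer ends up depends on the unit the kernel applies to "swiotlb=n". The sketch below is only arithmetic under two assumptions that are not confirmed in this thread: mainline kernels of that era document the value as a number of slabs of 2 KiB each (IO_TLB_SHIFT = 11 in lib/swiotlb.c), while the Xen-patched i386 code is believed to read it as megabytes. Check the source of the kernel actually running before relying on either reading. The function names are made up for illustration.

```python
# Assumption: mainline lib/swiotlb.c uses IO_TLB_SHIFT = 11, i.e. 2 KiB per slab.
IO_TLB_SHIFT = 11


def size_if_slabs(n: int) -> int:
    """Buffer size in bytes if swiotlb=n counts slabs (mainline reading)."""
    return n << IO_TLB_SHIFT


def size_if_megabytes(n: int) -> int:
    """Buffer size in bytes if swiotlb=n counts megabytes (assumed Xen reading)."""
    return n << 20


# The three values mentioned in this thread.
for n in (2, 512, 4096):
    slabs_kib = size_if_slabs(n) // 1024
    mb_kib = size_if_megabytes(n) // 1024
    print(f"swiotlb={n}: {slabs_kib} KiB as slabs, {mb_kib} KiB as megabytes")
```

The two readings differ by a factor of 512, which may explain why recommendations range from 2 to 4096.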

However, the Problem still occurs.

Now, before we blindly set "swiotlb" to some unrealistic values, does someone have a hint as to what might be going on here? The system ran rock-solid with kernel 2.6.21, so I hope to get it reasonably stable again...

The system is a dual Xeon server with 16 GB of RAM and an Intel mainboard, (unfortunately) still running x86 with PAE (we haven't changed that since we started with Xen 3.0 several years ago; it definitely should be updated to x64, but before that step it should simply run stably again). Disks are connected via a 3ware 9550SX controller. The system hosts 14 DomUs running x86/PAE Linux. It also runs an NFS server, which the DomUs mount, for sharing data between them.

We didn't change anything regarding this setup, i.e. this system "as is", just with an older "world", ran flawlessly with kernel 2.6.21.

Any hint would be really great!

Cheers,
_ralf_

Code:

Oops: 0000 [#1]
SMP
Modules linked in: uhci_hcd ehci_hcd usbcore e1000
CPU:    0
EIP:    0061:[<c0109f45>]    Not tainted VLI
EFLAGS: 00010002   (2.6.18-xen-r12 #5)
EIP is at range_straddles_page_boundary+0x30/0xee
eax: c04e0000   ebx: eb126000   ecx: 000eb126   edx: 000eb126
esi: 00000000   edi: 00002000   ebp: 00000003   esp: ecf25a9c
ds: 007b   es: 007b   ss: 0069
Process nfsd (pid: 3610, ti=ecf24000 task=ec09f550 task.ti=ecf24000)
Stack: 00000030 e27e9ec0 eb126000 00000000 0ff26000 00000003 c022ee00 00001000
       00000000 00002000 00000000 00000002 e27e9ec0 ed744048 00000000 00000000
       00000000 ed744048 00000002 da1c71c0 ed2ee880 c02cc553 00000000 00000000
Call Trace:
 [<c022ee00>] swiotlb_map_sg+0x13c/0x26c
 [<c02cc553>] twa_scsiop_execute_scsi+0x3a5/0x6e1
 [<c02c0a8f>] scsi_done+0x0/0x16
 [<c02cc8f7>] twa_scsi_queue+0x68/0xe3
 [<c02c0ec4>] scsi_dispatch_cmd+0x130/0x210
 [<c02c4e42>] scsi_request_fn+0x183/0x346
 [<c021d15b>] __generic_unplug_device+0x1f/0x25
 [<c021e100>] __make_request+0xee/0x370
 [<c014014b>] mempool_alloc+0x1f/0xcb
 [<c021c555>] generic_make_request+0xea/0x156
 [<c0164d45>] bio_clone+0x28/0x2d
 [<c02e99b3>] __map_bio+0x2e/0x73
 [<c02ea357>] __split_bio+0x284/0x358
 [<c02c0939>] scsi_finish_command+0x3c/0x40
 [<c02ea5f1>] dm_request+0xbc/0xf2
 [<c021c555>] generic_make_request+0xea/0x156
 [<c0298f35>] evtchn_do_upcall+0xc7/0x1e2
 [<c015c33c>] kmem_cache_alloc+0xb4/0xba
 [<c021e57d>] submit_bio+0x6b/0x109
 [<c016400c>] bio_alloc_bioset+0x78/0x134
 [<c0160c11>] submit_bh+0xc0/0x10d
 [<c0162425>] __block_write_full_page+0x1b0/0x328
 [<c019ab2e>] ext3_get_block+0x0/0xcb
 [<c0162859>] block_write_full_page+0xf8/0x100
 [<c019ab2e>] ext3_get_block+0x0/0xcb
 [<c019c37c>] ext3_ordered_writepage+0xe5/0x1ad
 [<c019921d>] bget_one+0x0/0x7
 [<c0147827>] dec_zone_page_state+0x30/0x5f
 [<c017ff7c>] mpage_writepages+0x149/0x3a2
 [<c019c297>] ext3_ordered_writepage+0x0/0x1ad
 [<c0142b30>] do_writepages+0x35/0x37
 [<c013df99>] __filemap_fdatawrite_range+0x66/0x72
 [<c013e1cb>] filemap_fdatawrite+0x23/0x27
 [<c01dfb1a>] nfsd_sync+0x3e/0x96
 [<c01e0284>] nfsd_open+0xe4/0x132
 [<c01e043e>] nfsd_commit+0x93/0xa7
 [<c01e6ce1>] nfsd3_proc_commit+0xde/0xf7
 [<c01dc6b2>] nfsd_dispatch+0x82/0x1b9
 [<c0366e0a>] _spin_lock_bh+0x8/0x18
 [<c0356e31>] svc_process+0x3de/0x6ba
 [<c0366e0a>] _spin_lock_bh+0x8/0x18
 [<c0359732>] svc_recv+0x3d7/0x4ad
 [<c01dcc42>] nfsd+0x19e/0x32c
 [<c01dcaa4>] nfsd+0x0/0x32c
 [<c0102ac5>] kernel_thread_helper+0x5/0xb
Code: ec 08 89 c3 25 ff 0f 00 00 8d 3c 08 81 ff 00 10 00 00 77 0a 31 c0 83 c4 08
 5b 5e 5f 5d c3 89 d9 0f ac d1 0c 89 ca a1 20 96 47 c0 <0f> a3 08 19 c0 85 c0 75
 e0 0f b6 05 22 49 43 c0 88 44 24 07 89
EIP: [<c0109f45>] range_straddles_page_boundary+0x30/0xee SS:ESP 0069:ecf25a9c

trikolon
Apprentice

Joined: 04 Dec 2004
Posts: 297
Location: Erlangen

Posted: Sat Jan 31, 2009 9:34 am

did you try this kernel too: http://code.google.com/p/gentoo-xen-kernel/downloads/list

rfolkerts
n00b

Joined: 24 Jan 2008
Posts: 6

Posted: Sat Jan 31, 2009 10:39 am

trikolon wrote:
did you try this kernel too: http://code.google.com/p/gentoo-xen-kernel/downloads/list


Hi,

wow, no! I was not aware of that project; I had only looked for more up-to-date Xen kernels in Portage (and checked the Kernel-Log on Heise - OpenSource).

Will check that and give it a try! Will post my experience ;-)

Thanks!
_ralf_

rfolkerts
n00b

Joined: 24 Jan 2008
Posts: 6

Posted: Sun Mar 01, 2009 4:56 pm

Hi,

just a short update:

I did have a look at the gentoo-xen-kernel project on Google Code but was a bit reluctant to give it a try.

However, I remembered that with the update to Xen 3.3 I had removed the "dom0_mem" option from Xen's GRUB config.

So I added that option back, using the "old" value, and the machine has not crashed since (while it used to crash at least once a week without that parameter, it has now been running for a few weeks).

The parameter was (and now again is) set to: dom0_mem=262144

Without that parameter, Dom0 had ~1.8G of RAM available.

I'm just writing this down here in case someone else runs into the same problem!

Cheers,
_ralf_

linuxtuxhellsinki
l33t

Joined: 15 Nov 2004
Posts: 700
Location: Hellsinki

Posted: Wed Mar 04, 2009 7:22 pm

I used to have some problems with "dynamic" memory in dom0 and an e1000 NIC, but they went away with static memory allocation. You can also write dom0_mem=256M with xen-3.* versions, and it's easy to increase dom0's memory with xm if needed.
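
To see that dom0_mem=262144 from the earlier post and dom0_mem=256M describe the same amount of memory, here is a minimal sketch. It assumes, as the Xen 3.x documentation describes, that a plain size may carry a B, K, M or G suffix and that a bare number is read as kilobytes. The helper name dom0_mem_to_bytes is made up for illustration and is not part of any Xen tool.

```python
# Multipliers for the size suffixes assumed to be accepted by dom0_mem.
SUFFIX_BYTES = {"B": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def dom0_mem_to_bytes(value: str) -> int:
    """Convert a plain dom0_mem size such as '262144' or '256M' to bytes."""
    text = value.strip().upper()
    if text and text[-1] in SUFFIX_BYTES:
        return int(text[:-1]) * SUFFIX_BYTES[text[-1]]
    # Assumption: a bare number is taken as kilobytes.
    return int(text) * 1024


for setting in ("262144", "256M"):
    size = dom0_mem_to_bytes(setting)
    print(f"dom0_mem={setting}: {size // 1024 ** 2} MiB")
```

Both settings come out at 256 MiB, so the suffixed form is just the more readable spelling of the old value.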
_________________
1st use 'Search' & lastly add [Solved] to
the subject of your first post in the thread.

rfolkerts
n00b

Joined: 24 Jan 2008
Posts: 6

Posted: Wed Mar 04, 2009 8:02 pm

Hi,

thanks for the reply!

Well, I had the "M" suffix in mind but was too lazy to look it up (and as the machine kept crashing ~once a week and I would not have bet that the "solution" would help at all, I just put in the old entry quickly). Nevertheless, thanks for pointing that out!

Cheers,
_ralf_
(Much more relaxed now that the hypervisor is running rock-solid again.)
All times are GMT