Gentoo Forums
[WORKAROUND] kernel BUG at drivers/pci/intel-iommu.c:1373!
JohnBlbec
Guru


Joined: 08 Feb 2003
Posts: 306

Posted: Sun Dec 14, 2008 12:46 am    Post subject: [WORKAROUND] kernel BUG at drivers/pci/intel-iommu.c:1373!

hi everybody,

I am able to reproduce this bug every time I run bonnie++ or when I copy or move files larger than about 4GB. I am not able to get a kernel core dump or any text output onto my disk, because pressing the hard reset button on my computer is the only thing I can do once the bug hits. well, I have taken a photo of my LCD showing the bug. I do not know what other information the kernel gurus want, so I will update this topic and add whatever you ask for. the bug is really very annoying :o(

the bug photo

my kernel config

$ uname -a
Code:

Linux rpc-linux 2.6.26-gentoo-r4 #1 SMP Sat Dec 13 23:50:20 CET 2008 x86_64 Intel(R) Core(TM)2 Quad CPU Q9450 @ 2.66GHz GenuineIntel GNU/Linux


$ free -m
Code:

             total       used       free     shared    buffers     cached
Mem:          4011       1503       2508          0         18       1004
-/+ buffers/cache:        480       3531
Swap:         4095          0       4095


# lspci
Code:

00:00.0 Host bridge: Intel Corporation DRAM Controller (rev 01)
00:01.0 PCI bridge: Intel Corporation Host-Primary PCI Express Bridge (rev 01)
00:19.0 Ethernet controller: Intel Corporation 82566DC-2 Gigabit Network Connection (rev 02)
00:1a.0 USB Controller: Intel Corporation USB UHCI Controller #4 (rev 02)
00:1a.1 USB Controller: Intel Corporation USB UHCI Controller #5 (rev 02)
00:1a.2 USB Controller: Intel Corporation USB UHCI Controller #6 (rev 02)
00:1a.7 USB Controller: Intel Corporation USB2 EHCI Controller #2 (rev 02)
00:1b.0 Audio device: Intel Corporation HD Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation PCI Express Port 1 (rev 02)
00:1c.4 PCI bridge: Intel Corporation PCI Express Port 5 (rev 02)
00:1d.0 USB Controller: Intel Corporation USB UHCI Controller #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation USB UHCI Controller #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation USB UHCI Controller #3 (rev 02)
00:1d.7 USB Controller: Intel Corporation USB2 EHCI Controller #1 (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92)
00:1f.0 ISA bridge: Intel Corporation LPC Interface Controller (rev 02)
00:1f.2 SATA controller: Intel Corporation 6 port SATA AHCI Controller (rev 02)
00:1f.3 SMBus: Intel Corporation SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation Device 05e2 (rev a1)
02:00.0 RAID bus controller: 3ware Inc 9650SE SATA-II RAID (rev 01)
03:00.0 IDE interface: Marvell Technology Group Ltd. Device 6121 (rev b2)
04:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22/A IEEE-1394a-2000 Controller (PHY/Link)


Last edited by JohnBlbec on Sun Dec 14, 2008 10:35 pm; edited 1 time in total
Neo2
Apprentice


Joined: 25 Sep 2006
Posts: 224
Location: Italy

Posted: Sun Dec 14, 2008 10:54 am

Before requesting anything from the kernel devs, remove any proprietary/binary module that taints the kernel ("Tainted: P" at the end of the module listing), or else you'll get no support at all. In this case it seems you're using the "nvidia" module. Remove this module and re-run bonnie++.
If the system hangs again, then to produce a good bug report you should at least enable CONFIG_KALLSYMS ("CONFIG_KALLSYMS=y") in your kernel config and recompile. That way the long list of addresses below the "Call trace:" line gets annotated with symbol names, and you can provide the name of the function that caused the bug.
If removing the module does help, then you should consider updating the module or (since the problem would probably lie in the graphics rendering module) switching to another DRM system (like the open-source DRI).
Maybe updating the kernel would help too, but I can't say much more without additional information.
_________________
Neo2
Unofficial minimal liveCD for x86/amd64 w/reiser4+truecrypt
JohnBlbec
Guru


Joined: 08 Feb 2003
Posts: 306

Posted: Sun Dec 14, 2008 12:01 pm

thanks for the advice, neo2. I have done everything you wrote and the result is the same, the kernel crashes again...

new kernel crash photo

lshw output

Note: I am able to emerge the whole world (emerge -e world) without any trouble, but cp or mv of big files hangs my linux.
Neo2
Apprentice


Joined: 25 Sep 2006
Posts: 224
Location: Italy

Posted: Sun Dec 14, 2008 4:38 pm

Well, I've dug a little into the kernel sources. Apparently the domain_page_mapping function, which gets invoked from the intel_map_sg function (both are in drivers/pci/intel-iommu.c), fails, and this leads to the hang. I've searched through the kernel ChangeLog and there seems to be nothing regarding those two functions from 2.6.26.4 to 2.6.27.9, so it is probably an unknown bug. I think this is the code leading to the hang (especially the BUG_ON macro; extracted from intel-iommu.c):

Code:
static int
domain_page_mapping(struct dmar_domain *domain, dma_addr_t iova,
                        u64 hpa, size_t size, int prot)
{
        u64 start_pfn, end_pfn;
        struct dma_pte *pte;
        int index;
        int addr_width = agaw_to_width(domain->agaw);

        hpa &= (((u64)1) << addr_width) - 1;

        if ((prot & (DMA_PTE_READ|DMA_PTE_WRITE)) == 0)
                return -EINVAL;
        iova &= PAGE_MASK;
        start_pfn = ((u64)hpa) >> VTD_PAGE_SHIFT;
        end_pfn = (VTD_PAGE_ALIGN(((u64)hpa) + size)) >> VTD_PAGE_SHIFT;
        index = 0;
        while (start_pfn < end_pfn) {
                pte = addr_to_dma_pte(domain, iova + VTD_PAGE_SIZE * index);
                if (!pte)
                        return -ENOMEM;
                /* We don't need lock here, nobody else
                 * touches the iova range
                 */
                BUG_ON(dma_pte_addr(*pte));
                dma_set_pte_addr(*pte, start_pfn << VTD_PAGE_SHIFT);
                dma_set_pte_prot(*pte, prot);
                __iommu_flush_cache(domain->iommu, pte, sizeof(*pte));
                start_pfn++;
                index++;
        }
        return 0;
}

Honestly, I don't know where to patch the source to fix your problem, and at this point you may want to file a bug with the kernel devs. This page has useful info on where to start: http://www.kernel.org/pub/linux/docs/lkml/reporting-bugs.html
Anyway, if your system runs fine for everyday use (browsing, mail, etc.), I guess this can be considered a minor issue (until you need to copy a >4GB file, of course).
Hope this helps :wink:

Cheers,
Neo2
JohnBlbec
Guru


Joined: 08 Feb 2003
Posts: 306

Posted: Sun Dec 14, 2008 5:10 pm

hi neo2. I am going to report a kernel bug as you advised, thanks. unfortunately, it is not a minor issue because I am not able to clone my 1TB db. fucking business. well, thanks once again, and I will let the gentoo forum users know when I find a patch...

the bug has been submitted: Kernel Bug Tracker Bug 12222

workaround: use "intel_iommu=off" as a linux kernel boot parameter, but I do not know what performance impact we should expect...