Gentoo Forums
SCSI issues in 2.6.6 - unbootable - worked previously

Soltis
n00b

Joined: 26 May 2004
Posts: 3

Posted: Tue Jun 01, 2004 4:48 am    Post subject: SCSI issues in 2.6.6 - unbootable - worked previously

Note that I've been able to boot into other 2.6 kernels in the past, even very recent ones, so I'm convinced it's something specific to this version. I haven't installed any new hardware or done anything drastic to the kernel config (I can't get to it at the moment, but I can probably answer any questions about it from memory), and every bit of the hardware still works in older versions of Linux as well as in Windows.

Anyway, here's my hardware:
Athlon-XP 2500
1024 MB DDR RAM
A7N8X 2.0
Onboard sound and ethernet
Geforce FX 5900
LSIU160 SCSI controller
Onboard SiI3112 SATA controller
WD Raptor 10K 70 GB HDD
Atlas 10K II SCSI HDD

Here's my issue.

I installed Gentoo a few days ago. This was a bumpy business, for several reasons, but I got it done. I installed the 2.6.5 gentoo-dev-sources kernel, since I wanted a stability patch for my mobo that I thought was included in recent revisions of 2.6.

I was able to boot up fine, but the patch was not included, and my computer kept locking up as a result.

I then tried to install the 2.6.6-rc1 kernel, which I am pretty sure has the fix I need. I compiled it with the same options as 2.6.5 (well, almost: there are some new options in the 2.6.6 kernel related to POSIX compliance, which I enabled, but apart from that everything was the same) and rebooted.

I did enable Fusion MPT, as recommended by LSI Logic, and I'm using the SYM53C8XX_2 driver (Y in kernel config). I also tried without it, to no avail.

I had to write the boot output down on paper, but here is the last (hopefully pertinent) section:

sym0:<1010-33> rev 0x1 PCI 0000:01:0a.0 irq 16
sym0: using 64 bit DMA addressing
sym0: Symbios NVRAM ID7, fast 80, LVD, parity checking
sym0: Open drain IRQ driver, using on-chip SRAM
sym0: using LOAD/STORE based firmware
sym0: handling phase mismatch from SCRIPTS
sym0: scan at boot disabled for targets 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
sym0: scan for LUNS disabled for targets 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
scsi0: Sym-2.1.18j
Vendor: QUANTUM Model: ATLAS10K2-TY734J
Type: Direct-access ANSI SCSI revision: 03
scsi(0:0:0:0): Beginning Domain Validation
sym0:0:Wide asynchronous
sym0:0: ABORT operation started
sym0:0: ABORT operation timed out
sym0:0: DEVICE RESET operation started
sym0:0: DEVICE RESET operation completed
sym0: SCSI bus has been reset
sym0:0: ABORT operation started
sym0:0: ABORT operation timed out
sym0:0: DEVICE RESET operation started
sym0:0: DEVICE RESET operation completed
sym0: SCSI bus has been reset



After this, it just hangs for about three minutes, then the screen goes blank. Ctrl-Alt-Del does nothing. I had to power off.


Any help would be greatly appreciated.

Soltis
n00b

Joined: 26 May 2004
Posts: 3

Posted: Thu Jun 03, 2004 8:06 am

Umm, hello? Help, please?

MK
Tux's lil' helper

Joined: 27 Nov 2002
Posts: 97
Location: Bærum, Norway

Posted: Thu Jun 03, 2004 12:06 pm

Hello

I tested the 2.6.6 kernel earlier with the same SCSI card, and I didn't have that problem, but I had some other problems so I'm running 2.6.5 (gentoo dev sources).

What did you enable in the kernel config? (Run grep SCSI .config in the kernel source directory.) Also, about Fusion MPT: do you know what it is? I've never seen it before, and I don't use it.
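A minimal sketch of that check, assuming the kernel source lives in /usr/src/linux (adjust the path if yours differs). The function name show_scsi_config is made up for illustration, and BLK_DEV_SD (SCSI disk support) is added to the pattern as an assumption, since a plain grep SCSI would not show it:

```shell
# Hypothetical helper: list the enabled SCSI-related options in a kernel .config.
# The default path is an assumption; pass your own .config as the first argument.
show_scsi_config() {
    config="${1:-/usr/src/linux/.config}"
    # Only lines of the form CONFIG_...=value match, so the
    # "# CONFIG_... is not set" comment lines are skipped.
    grep -E '^CONFIG_(SCSI|FUSION|BLK_DEV_SD)[A-Z0-9_]*=' "$config"
}
```

Calling show_scsi_config with no argument reads /usr/src/linux/.config. The output should include CONFIG_SCSI_SYM53C8XX_2 if the driver mentioned above is built in or built as a module.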

ronmon
Veteran

Joined: 15 Apr 2002
Posts: 1043
Location: Key West, FL

Posted: Thu Jun 03, 2004 1:23 pm

Yeah, I'm having a similar problem. My Tekram DC390U3W with an LSI 53c1010 chipset has been running under Gentoo for over two years with the same driver that you are using. It first ran on a VP6 with a pair of 1 GHz P3s, and since August 2002 it has run on an A7M266-D with two MP1800s. The latest kernel that will boot for me is 2.6.5-rc2-mm5. Any newer kernel (both mm and vanilla) initializes my two U160 HDDs on bus 0 but hangs on bus 1 while trying to set up the SCSI CD-RW.

I've used my old working .config for the most part and I have experimented with some other options. Boot options acpi=off, pci=noacpi and noapic haven't helped. I tried those since some of the things I found seemed to point to interrupt mapping and/or IO_APIC problems. Intense Googling spanning more than a month has turned up others with the same problem here and here for example. There are plenty more, but none that I found had a solution.

phpsysinfo
boot error
running .config
new .config
current dmesg
lspci -vvx
_________________
Ask Questions the Smart Way - by ESR

ronmon
Veteran

Joined: 15 Apr 2002
Posts: 1043
Location: Key West, FL

Posted: Thu Jun 03, 2004 8:38 pm

I did some more experimenting and came up with a workaround. What I did was copy over all the files in drivers/scsi/sym53c8xx_2/ from an older source (I used 2.6.4) to the new source. It worked for 2.6.7-rc1-mm1 and 2.6.7-rc2 with bootsplash. They both build and run with no problem.

It may not be the prettiest solution, but it does work and shows that whatever they have changed in the sym53c8xx_2 drivers has caused the problem.
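For anyone who wants to try the same thing, here is a rough sketch of the copy step. The function name swap_sym_driver and the backup directory name are made up for illustration, and the tree paths in the usage line are assumptions, so substitute wherever your sources actually live:

```shell
# Hypothetical helper: replace the sym53c8xx_2 driver in a newer kernel tree
# with the copy from an older tree, keeping the newer files as a backup.
swap_sym_driver() {
    old_tree="$1"
    new_tree="$2"
    sub="drivers/scsi/sym53c8xx_2"

    [ -d "$old_tree/$sub" ] || { echo "missing $old_tree/$sub" >&2; return 1; }
    [ -d "$new_tree/$sub" ] || { echo "missing $new_tree/$sub" >&2; return 1; }
    # Refuse to run twice, so an earlier backup is never overwritten.
    [ ! -e "$new_tree/$sub.orig" ] || { echo "backup already exists" >&2; return 1; }

    # Move the newer driver aside, then copy the older one into its place.
    mv "$new_tree/$sub" "$new_tree/$sub.orig" || return 1
    cp -Rp "$old_tree/$sub" "$new_tree/$sub"
}
```

Usage would look something like swap_sym_driver /usr/src/linux-2.6.4 /usr/src/linux-2.6.7-rc2, followed by a normal kernel rebuild. To undo it, delete the copied directory and rename the .orig directory back.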

Soltis
n00b

Joined: 26 May 2004
Posts: 3

Posted: Thu Jun 03, 2004 10:18 pm

Thanks for the feedback, folks. I'm going to e-mail a few kernel dev mailing lists about this, see what they can figure out.

Boston_Mike
n00b

Joined: 17 Aug 2003
Posts: 12
Location: Boston, MA

Posted: Wed Jun 09, 2004 12:59 am

I can confirm this problem. I have the LSI Logic U160 card and several Atlas 10k II disk drives. I experienced the exact same error as the original poster. Dropping a revision to 2.6.5 fixed the problem.

EDIT:
A search on the internet revealed that if you install the Fusion-MPT driver, the controller card will start working and the situation will be resolved.

Flandry
n00b

Joined: 27 Feb 2004
Posts: 52
Location: Boston, MA

Posted: Mon Jun 21, 2004 2:22 am

Did anyone ever notify the driver-dev people, so it will be fixed next release? I'm yet another Tekram U160 owner with problems like this.

camillo
n00b

Joined: 30 Sep 2003
Posts: 45
Location: Torino, Italy

Posted: Tue Jun 22, 2004 5:24 pm

I have the same problem with an old FirePort 40 (Symbios Logic 53c875J).
I will try kernel 2.6.5 now.

Mazumoto
n00b

Joined: 28 Apr 2004
Posts: 25

Posted: Mon Jun 28, 2004 3:31 pm

Well, I also have problems with recent kernels and my SCSI card (a Symbios Logic 53c875 (rev 26)), but I get slightly different errors. The driver seems to get stuck in an infinite loop. I cannot even read most of the messages; it just says something about "validation fault, dropping back". Ctrl-Alt-Del doesn't work.

A working kernel is the gentoo-dev-sources-2.6.3-r1.

Not working: gentoo-dev-sources-2.6.7, 2.6.7-r5 and 2.6.7-r6.

I also noticed that the driver version changed from sym-2.1.18f to sym-2.1.18j, though I don't know what changed or whether it is related to the problem.
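One way to compare driver versions between two source trees, as a sketch only: it assumes the version string (something like 2.1.18j) appears literally somewhere in the driver directory, and the function name sym_driver_version is made up for illustration. The -r, -h and -o options used here are available in GNU grep.

```shell
# Hypothetical helper: print every sym driver version string (e.g. 2.1.18j)
# found in the sym53c8xx_2 directory of a kernel tree, one per line.
sym_driver_version() {
    dir="${1:-/usr/src/linux}/drivers/scsi/sym53c8xx_2"
    # -r: search recursively, -h: omit file names, -o: print only the match.
    grep -rhoE '2\.1\.[0-9]+[a-z]' "$dir" | sort -u
}
```

Running it once on a working tree and once on a failing tree shows whether the driver version differs between them. Changelog comments in the source may add older version strings to the output.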


Right now I'm compiling a kernel with Fusion-MPT enabled. I don't really like it, but I hope it works ...


UPDATE
The Fusion-MPT support doesn't change anything for me. Also, I'm no longer sure that the problem originally described in this thread and mine are caused by the same bug/whatever.
The one error line I can read at boot, because it keeps repeating, is exactly:
Code:
scsi(0:0:2:0): Domain validation detected failure, dropping back

In between are lines with changing content, something like sym0:2 FAST-## SCSI ...

Oh, and I got it up and running one single time, when I entered the card's scsi-boot-detection but didn't change anything. This was not reproducible.


EDIT:
I think this post shows the same messages I get:
http://lkml.org/lkml/2004/6/17/170

camillo
n00b

Joined: 30 Sep 2003
Posts: 45
Location: Torino, Italy

Posted: Mon Jun 28, 2004 4:10 pm

I had the same problem with kernel 2.6.7.
The only solution I found was to use kernel 2.6.5.
I hope the problem will be fixed in the next kernel.

ronmon
Veteran

Joined: 15 Apr 2002
Posts: 1043
Location: Key West, FL

Posted: Mon Jun 28, 2004 11:20 pm

The new kernels work just fine if you overwrite the sym2.1.18j drivers with the sym2.1.18i drivers. The change was apparently made right after 2.6.5, so I copied the driver directory from the older source and dumped it into its place within the new source tree. No problems at all.

For convenience you can grab a clean set here. They go into: linux/drivers/scsi/sym53c8xx_2/

Mazumoto
n00b

Joined: 28 Apr 2004
Posts: 25

Posted: Tue Jun 29, 2004 6:24 pm

Just a short report about kernel 2.6.7-bk12 (vanilla):
The loop is gone (the domain validation fallback, stepping down from 20MB/s to 6MB/s or something, now runs only once), but reiserfsck fails during boot (I think mounting the HD fails).

UPDATE:
sym-2.1.18i doesn't work either. There is no loop or domain validation failure, but reiserfsck fails (no device in /dev/).

AngelKnight
Tux's lil' helper

Joined: 14 Jan 2003
Posts: 127

Posted: Mon Jul 19, 2004 6:21 pm    Post subject: sym53c8xx_2 driver and 53c1010 cards

Just a "me too" on the problems.

It turns out that the sym-2.1.18j driver does not handle domain validation properly with the 53c1010-based cards (duhh).

My "workaround" if you need to keep using a kernel later than 2.6.5 is to stub out the domain validation function. The stub-out is a one-liner in $KERNELSRC/drivers/scsi/scsi_transport_spi.c.
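The exact one-line change isn't shown here, so the sketch below only helps locate the relevant code. It searches that file for the "Domain Validation" text that appears in the boot messages quoted earlier in the thread. The function name find_domain_validation and the /usr/src/linux fallback path are assumptions for illustration:

```shell
# Hypothetical helper: show where scsi_transport_spi.c mentions domain
# validation, as a starting point for finding the function to stub out.
find_domain_validation() {
    src="${1:-${KERNELSRC:-/usr/src/linux}}/drivers/scsi/scsi_transport_spi.c"
    # -n prints line numbers; -i ignores case, since the boot messages
    # use both "Domain Validation" and "Domain validation".
    grep -n -i 'domain validation' "$src"
}
```

With KERNELSRC set, calling find_domain_validation with no argument searches that tree. The line numbers it prints point at the messages, and the function surrounding them is the place to start reading.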

All times are GMT