commonloon Tux's lil' helper
Joined: 13 Apr 2004 Posts: 133 Location: Marin, CA
|
Posted: Thu Sep 29, 2005 4:42 pm Post subject: Kernel sbp2 module croaking while writing firewire drive |
|
|
I have a server that we use to back up large amounts of GIS-related data, on the order of 300-400GB at a time. The backup is a simple rsync run from cron. We are attempting to back up this data to LaCie BigDisk Extreme 500GB firewire drives for offsite storage. What happens is that after some time the "SCSI bus," i.e., the firewire controller, goes offline. I have tried limiting rsync's bandwidth, thinking the machine might be pushing data into the firewire bus faster than it can handle.
I'm wondering if anyone else has run into this, and if someone has found a solution. I've found some similar-sounding problems on LKML, but no solutions yet...
dmesg output:
Code: |
ieee1394: sbp2: Logged into SBP-2 device
ieee1394: Node 0-00:1023: Max speed [S400] - Max payload [2048]
Vendor: LaCie Model: BigDisk Extreme Rev: 912
Type: Direct-Access ANSI SCSI revision: 06
SCSI device sdd: 980469376 512-byte hdwr sectors (502000 MB)
sdd: asking for cache data failed
sdd: assuming drive cache: write through
SCSI device sdd: 980469376 512-byte hdwr sectors (502000 MB)
sdd: asking for cache data failed
sdd: assuming drive cache: write through
/dev/scsi/host7/bus0/target0/lun0: p1
Attached scsi disk sdd at scsi7, channel 0, id 0, lun 0
Attached scsi generic sg3 at scsi7, channel 0, id 0, lun 0, type 0
XFS mounting filesystem sdd1
Ending clean XFS mount for filesystem: sdd1
|
dmesg error output:
Code: |
ieee1394: unsolicited response packet received - no tlabel match
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x2a 2a 00 25 4c a9 b7 00 00 f8 00
ieee1394: unsolicited response packet received - no tlabel match
ieee1394: Error parsing configrom for node 0-01:1023
ieee1394: Error parsing configrom for node 0-02:1023
ieee1394: Node suspended: ID:BUS[0-01:1023] GUID[00d04b4b1d04eda6]
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x0 00 00 00 00 00 00
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x2a 2a 00 25 4c aa af 00 00 f8 00
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x2a 2a 00 25 4c ab a7 00 00 f8 00
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x2a 2a 00 25 4c ac 9f 00 00 f8 00
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x2a 2a 00 25 4c ad 97 00 00 f8 00
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x2a 2a 00 25 4c ae 8f 00 00 f8 00
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x2a 2a 00 25 4c af 87 00 00 f8 00
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: aborting sbp2 command
cdb[0]=0x2a 2a 00 25 4c b0 7f 00 00 f8 00
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: reset requested
ieee1394: sbp2: Generating sbp2 fetch agent reset
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: reset requested
ieee1394: sbp2: Generating sbp2 fetch agent reset
ieee1394: sbp2: hpsb_node_write failed.
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
ieee1394: sbp2: Bus reset in progress - rejecting command
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 5 channel 0 id 0 lun 0
|
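A note for reading the aborted commands in the log above: each cdb line is the raw SCSI command block. Opcode 0x2a is WRITE(10); bytes 2-5 hold the big-endian logical block address and bytes 7-8 the transfer length in blocks. The following is only a minimal sketch of a decoder. The function name decode_write10 is made up for illustration, and the 512-byte block size is an assumption taken from the sector size reported for sdd.

```shell
#!/bin/sh
# Decode a WRITE(10) CDB given as ten hex bytes (as printed after "cdb[0]=0x2a").
# Assumption: 512-byte blocks, as dmesg reports for sdd.
decode_write10() {
    if [ "$#" -ne 10 ] || [ "$1" != "2a" ]; then
        echo "not a 10-byte WRITE(10) CDB" >&2
        return 1
    fi
    lba=$(( 0x$3$4$5$6 ))      # bytes 2-5: logical block address
    blocks=$(( 0x$8$9 ))       # bytes 7-8: transfer length in blocks
    echo "lba=$lba blocks=$blocks bytes=$(( blocks * 512 ))"
}

# First aborted command from the log
decode_write10 2a 00 25 4c a9 b7 00 00 f8 00
# prints: lba=625781175 blocks=248 bytes=126976
```

Each later CDB in the log starts 0xf8 blocks after the previous one, so the aborted commands look like consecutive 124 KiB chunks of a single sequential write stream.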
Related stuff:
Code: |
jack ~ # uname -a
Linux jack 2.6.11-gentoo-r7 #1 SMP Sat May 28 19:43:39 PDT 2005 x86_64 AMD Opteron(tm) Processor 242 AuthenticAMD GNU/Linux
jack ~ # lsmod
Module Size Used by
sbp2 27080 2
ohci1394 35908 0
ieee1394 120472 2 sbp2,ohci1394
uhci_hcd 34784 0
ehci_hcd 36296 0
ohci_hcd 23304 0
i2c_amd8111 7296 0
cyclades 185560 0
tg3 89412 0
amd8111e 24968 0
e100 38656 0
mii 6016 2 amd8111e,e100
jack ~ # lspci
0000:00:06.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8111 PCI (rev 07)
0000:00:07.0 ISA bridge: Advanced Micro Devices [AMD] AMD-8111 LPC (rev 05)
0000:00:07.1 IDE interface: Advanced Micro Devices [AMD] AMD-8111 IDE (rev 03)
0000:00:07.2 SMBus: Advanced Micro Devices [AMD] AMD-8111 SMBus 2.0 (rev 02)
0000:00:07.3 Bridge: Advanced Micro Devices [AMD] AMD-8111 ACPI (rev 05)
0000:00:0a.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
0000:00:0a.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
0000:00:0b.0 PCI bridge: Advanced Micro Devices [AMD] AMD-8131 PCI-X Bridge (rev 12)
0000:00:0b.1 PIC: Advanced Micro Devices [AMD] AMD-8131 PCI-X IOAPIC (rev 01)
0000:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
0000:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
0000:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
0000:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
0000:00:19.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
0000:00:19.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
0000:00:19.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
0000:00:19.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
0000:01:03.0 RAID bus controller: 3ware Inc 3ware Inc 3ware 7xxx/8xxx-series PATA/SATA-RAID (rev 01)
0000:02:09.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
0000:02:09.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 03)
0000:03:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
0000:03:00.1 USB Controller: Advanced Micro Devices [AMD] AMD-8111 USB (rev 0b)
0000:03:04.0 FireWire (IEEE 1394): VIA Technologies, Inc. IEEE 1394 Host Controller (rev 46)
0000:03:05.0 Unknown mass storage controller: Silicon Image, Inc. SiI 3114 [SATALink/SATARaid] Serial ATA Controller (rev 02)
0000:03:06.0 VGA compatible controller: ATI Technologies Inc Rage XL (rev 27)
0000:03:08.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 10)
|
crontab lines used:
Code: |
# Copy over from /backups to the firewire drive
# I think the LaCie supports: 100, 200, 400 and 800 Mbits/s
# 200 Mbits/s
#KBPS=25600
# 400 Mbits/s
KBPS=51200
TMP=/backups/tmp
# choose which bigdisk
BIGDISK=5
30 2 * * 2,4,6 nice rsync -r --temp-dir=$TMP --bwlimit=$KBPS --exclude 'tmp' /backups /mnt/bigdisk/extreme$BIGDISK
|
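For reference, the KBPS values in the crontab follow from Mbit/s * 1024 / 8, and rsync reads --bwlimit in units of 1024 bytes per second. A limit equal to the nominal bus speed probably never throttles anything, since the bus cannot deliver its full nominal rate as payload. Below is a rough sketch of the arithmetic; the helper names mbit_to_bwlimit and bwlimit_percent are hypothetical and not part of rsync.

```shell
#!/bin/sh
# Convert a nominal bus speed in Mbit/s to an rsync --bwlimit value,
# using the same arithmetic as the crontab comments (Mbit/s * 1024 / 8).
mbit_to_bwlimit() {
    echo $(( $1 * 1024 / 8 ))
}

# Hypothetical helper: a percentage of the nominal rate, for actual throttling.
# $1 = bus speed in Mbit/s, $2 = percentage (integer)
bwlimit_percent() {
    echo $(( $1 * 1024 / 8 * $2 / 100 ))
}

mbit_to_bwlimit 400      # prints 51200
bwlimit_percent 400 50   # prints 25600
```

Which fraction, if any, avoids the bus resets is something only testing on the affected machine can show.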
_________________ Calling fly fishing a hobbie is like calling brain surgery a job. |
|
|
|
frostschutz Advocate
Joined: 22 Feb 2005 Posts: 2977 Location: Germany
|
Posted: Fri Sep 30, 2005 4:01 pm Post subject: |
|
|
Check the linux1394 mailing list for recent postings concerning 'aborting sbp2 command'. Use the latest kernel (2.6.13 or 2.6.14) and the latest linux1394 drivers (SVN from www.linux1394.org), and try your luck with the disable_irm=1 (ieee1394) and serialize_io=1 (sbp2) kernel / module parameters. |
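One way to try these parameters without rebooting is to unload and reload the FireWire modules. This is only a sketch: it assumes the drive is unmounted, that the drivers are built as modules (as in the lsmod output earlier in the thread), and that it is run as root. The run wrapper and the DRY_RUN variable are illustrative helpers, not part of modprobe.

```shell
#!/bin/sh
# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload the FireWire stack, then reload it with the suggested parameters.
reload_firewire() {
    run modprobe -r sbp2 &&
    run modprobe -r ohci1394 &&
    run modprobe -r ieee1394 &&
    run modprobe ieee1394 disable_irm=1 &&
    run modprobe ohci1394 &&
    run modprobe sbp2 serialize_io=1
}
```

Calling reload_firewire with DRY_RUN=1 set first shows the commands without touching the modules. To make the settings persistent, the same parameters can go into the distribution's module options file.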
|
|
|
commonloon Tux's lil' helper
Joined: 13 Apr 2004 Posts: 133 Location: Marin, CA
|
Posted: Fri Sep 30, 2005 4:56 pm Post subject: |
|
|
Thanks much...
I just changed the scheduler to cfq, and added disable_irm=1 and serialize_io=1 to ieee1394 and sbp2, respectively. I'm going to try those changes first, and then update the kernel/module later. I'm about to go on vacation. |
|
|
|
frostschutz Advocate
Joined: 22 Feb 2005 Posts: 2977 Location: Germany
|
Posted: Sat Oct 01, 2005 12:15 am Post subject: |
|
|
Actually, the linux1394 developers are recommending deadline over cfq as scheduler... |
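On 2.6 kernels with runtime-switchable I/O schedulers, the scheduler can be changed per device through sysfs. The following is a small sketch, assuming the drive shows up as sdd as in the dmesg output. The function name set_io_scheduler and the SYSFS_ROOT override are made up for illustration, the override being there only so the function can be tried against a scratch directory.

```shell
#!/bin/sh
# Select an I/O scheduler for one block device via sysfs.
# $1 = device name (e.g. sdd), $2 = scheduler name (e.g. deadline)
# SYSFS_ROOT is a hypothetical override for testing; it defaults to /sys.
set_io_scheduler() {
    dev=$1
    sched=$2
    file="${SYSFS_ROOT:-/sys}/block/$dev/queue/scheduler"
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi
    echo "$sched" > "$file"
}
```

Run as root, `set_io_scheduler sdd deadline` would switch only the firewire disk; booting with elevator=deadline instead changes the default for all devices.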
|
|
|
|
|
|
|