[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Problemy z kontrolerami Adaptec



 Witam!

Może ktoś już się natknął na podobny problem.
Jest sobie macierz INFORTREND A08U-G2421 (SATA-to-SCSI) oraz trzy kontrolery:
01:07.0 SCSI storage controller: Adaptec ASC-29320A U320 (rev 10)
04:04.0 SCSI storage controller: Adaptec ASC-29320ALP U320 (rev 10)
02:03.0 SCSI storage controller: Adaptec ASC-29320A U320 (rev 10)

Na każdym z nich objaw jest ten sam - w trybie Ultra-SCSI 320 urządzenie blokowe znika z systemu przy próbie dostępu do niego:

/[ 141.010512] sd 5:0:2:0: [sde] Attempting to queue an ABORT message:CDB: 0x28 0x0 0x0 0x0 0xe 0x0 0x0 0x1 0x0 0x0
[  141.012180] scsi5: At time of recovery, card was not paused
[  141.012190] >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
[  141.012193] scsi5: Dumping Card State at program address 0x20 Mode 0x22
[  141.012197] Card was paused
[  141.012202] INTSTAT[0x0] SELOID[0x2] SELID[0x20]
[  141.012216] HS_MAILBOX[0x0] INTCTL[0x80]:(SWTMINTMASK)
[  141.012228] SEQINTSTAT[0x0] SAVED_MODE[0x11]
[  141.012237] DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|FIFO1FREE)
[  141.012247] SCSISIGI[0x25]:(P_DATAOUT_DT|ACKI|BSYI)
[ 141.012258] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE)
[  141.012273] SCSISEQ0[0x40]:(ENSELO) SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI)
[  141.012285] SEQCTL0[0x0] SEQINTCTL[0x0] SEQ_FLAGS[0x0]
[  141.012297] SEQ_FLAGS2[0x4]:(SELECTOUT_QFROZEN)
[  141.012306] QFREEZE_COUNT[0x1] KERNEL_QFREEZE_COUNT[0x1]
[  141.012317] MK_MESSAGE_SCB[0xff00] MK_MESSAGE_SCSIID[0xff]
[  141.012326] SSTAT0[0x10]:(SELINGO) SSTAT1[0x0]
[  141.012336] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc0]:(HIPERR|HIZERO)
[  141.012351] SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
[  141.012362] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x80]:(PACKETIZED)
[  141.012375] LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x40]
[  141.012396]
[ 141.012397] SCB Count = 4 CMDS_PENDING = 2 LASTSCB 0xffff CURRSCB 0x2 NEXTSCB 0xff80
[  141.012407] qinstart = 159 qinfifonext = 159
[  141.012409] QINFIFO:
[  141.012413] WAITING_TID_QUEUES:
[  141.012426]        2 ( 0x2 0x3 )
[  141.012442] Pending list:
[  141.012446]   3 FIFO_USE[0x0] SCB_CONTROL[0x60]:(TAG_ENB|DISCENB)
[  141.012459] SCB_SCSIID[0x27]
[  141.012464]   2 FIFO_USE[0x0] SCB_CONTROL[0x60]:(TAG_ENB|DISCENB)
[  141.012475] SCB_SCSIID[0x27]
[  141.012479] Total 2
[  141.012481] Kernel Free SCB list: 1 0
[  141.012487] Sequencer Complete DMA-inprog list:
[  141.012493] Sequencer Complete list:
[  141.012499] Sequencer DMA-Up and Complete list:
[  141.012505] Sequencer On QFreeze and Complete list:
[  141.012519]
[  141.012520]
[  141.012521] scsi5: FIFO0 Free, LONGJMP == 0x8254, SCB 0x3
[ 141.012526] SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) [ 141.012540] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
[  141.012556] SG_CACHE_SHADOW[0x2]:(LAST_SEG)
[  141.012563] SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0]
[  141.012574] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
[  141.012607] HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
[  141.012621]
[  141.012622] scsi5: FIFO1 Free, LONGJMP == 0x8063, SCB 0x3
[ 141.012627] SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) [ 141.012640] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
[  141.012655] SG_CACHE_SHADOW[0x2]:(LAST_SEG)
[  141.012662] SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0]
[  141.012673] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
[  141.012705] HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
[ 141.012712] LQIN: 0x8 0x0 0x0 0x3 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
[  141.012758] scsi5: LQISTATE = 0x1, LQOSTATE = 0x1a, OPTIONMODE = 0x52
[  141.012765] scsi5: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
[  141.012771] scsi5: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
[  141.012775]
[  141.012778] SIMODE0[0xc]:(ENOVERRUN|ENIOERR)
[  141.012786] CCSCBCTL[0x4]:(CCSCBDIR)
[  141.012799] scsi5: REG0 == 0x3, SINDEX = 0x106, DINDEX = 0x106
[  141.012810] scsi5: SCBPTR == 0x2, SCB_NEXT == 0x3, SCB_NEXT2 == 0xff80
[  141.012821] CDB 28 0 0 0 e 0
[  141.012824] STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
[  141.012852] <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
[  141.012891] scsi5:0:2:0: Cmd aborted from QINFIFO
[ 151.010164] sd 5:0:2:0: [sde] Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
[  151.010758] scsi5: At time of recovery, card was not paused
[  151.010768] >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
[  151.010771] scsi5: Dumping Card State at program address 0x20 Mode 0x22
[  151.010776] Card was paused
[  151.010780] INTSTAT[0x0] SELOID[0x2] SELID[0x20]
[  151.010795] HS_MAILBOX[0x0] INTCTL[0x80]:(SWTMINTMASK)
[  151.010806] SEQINTSTAT[0x0] SAVED_MODE[0x11]
[  151.010815] DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|FIFO1FREE)
[  151.010826] SCSISIGI[0x25]:(P_DATAOUT_DT|ACKI|BSYI)
[ 151.010836] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE)
[  151.010851] SCSISEQ0[0x40]:(ENSELO) SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI)
[  151.010864] SEQCTL0[0x0] SEQINTCTL[0x0] SEQ_FLAGS[0x0]
[  151.010876] SEQ_FLAGS2[0x4]:(SELECTOUT_QFROZEN)
[  151.010884] QFREEZE_COUNT[0x1] KERNEL_QFREEZE_COUNT[0x1]
[  151.010896] MK_MESSAGE_SCB[0xff00] MK_MESSAGE_SCSIID[0xff]
[  151.010905] SSTAT0[0x10]:(SELINGO) SSTAT1[0x0]
[  151.010915] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc0]:(HIPERR|HIZERO)
[  151.010930] SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
[  151.010940] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x80]:(PACKETIZED)
[  151.010954] LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x40]
[  151.010975]
[ 151.010977] SCB Count = 4 CMDS_PENDING = 0 LASTSCB 0xffff CURRSCB 0x2 NEXTSCB 0xff80
[  151.010987] qinstart = 159 qinfifonext = 160
[  151.010989] QINFIFO: 0x3
[  151.010994] WAITING_TID_QUEUES:
[  151.011006] Pending list:
[  151.011011]   3 FIFO_USE[0x0] SCB_CONTROL[0x60]:(TAG_ENB|DISCENB)
[  151.011023] SCB_SCSIID[0x27]
[  151.011027] Total 1
[  151.011030] Kernel Free SCB list: 2 1 0
[  151.011036] Sequencer Complete DMA-inprog list:
[  151.011042] Sequencer Complete list:
[  151.011048] Sequencer DMA-Up and Complete list:
[  151.011054] Sequencer On QFreeze and Complete list:
[  151.011068]
[  151.011069]
[  151.011071] scsi5: FIFO0 Free, LONGJMP == 0x8254, SCB 0x3
[ 151.011076] SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) [ 151.011089] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
[  151.011105] SG_CACHE_SHADOW[0x2]:(LAST_SEG)
[  151.011112] SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0]
[  151.011124] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
[  151.011157] HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
[  151.011171]
[  151.011172] scsi5: FIFO1 Free, LONGJMP == 0x8063, SCB 0x3
[ 151.011177] SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) [ 151.011189] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
[  151.011205] SG_CACHE_SHADOW[0x2]:(LAST_SEG)
[  151.011211] SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0]
[  151.011223] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
[  151.011255] HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
[ 151.011262] LQIN: 0x8 0x0 0x0 0x3 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
[  151.011310] scsi5: LQISTATE = 0x1, LQOSTATE = 0x1a, OPTIONMODE = 0x52
[  151.011317] scsi5: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
[  151.011324] scsi5: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
[  151.011329]
[  151.011331] SIMODE0[0xc]:(ENOVERRUN|ENIOERR)
[  151.011339] CCSCBCTL[0x4]:(CCSCBDIR)
[  151.011352] scsi5: REG0 == 0x3, SINDEX = 0x106, DINDEX = 0x106
[  151.011363] scsi5: SCBPTR == 0x2, SCB_NEXT == 0x3, SCB_NEXT2 == 0xff80
[  151.011375] CDB 28 0 0 0 e 0
[  151.011377] STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
[  151.011406] <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
[  151.011425] scsi5:0:2:0: Cmd aborted from QINFIFO
[ 151.011455] sd 5:0:2:0: [sde] Attempting to queue an ABORT message:CDB: 0x28 0x0 0x0 0x0 0xf 0x0 0x0 0x1 0x0 0x0
[  151.011470] sd 5:0:2:0: [sde] Command not found
[ 161.020136] sd 5:0:2:0: [sde] Attempting to queue an ABORT message:CDB: 0x0 0x0 0x0 0x0 0x0 0x0
[  161.021710] scsi5: At time of recovery, card was not paused
[  161.021720] >>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
[  161.021724] scsi5: Dumping Card State at program address 0x20 Mode 0x22
[  161.021728] Card was paused
[  161.021733] INTSTAT[0x0] SELOID[0x2] SELID[0x20]
[  161.021747] HS_MAILBOX[0x0] INTCTL[0x80]:(SWTMINTMASK)
[  161.021758] SEQINTSTAT[0x0] SAVED_MODE[0x11]
[  161.021767] DFFSTAT[0x33]:(CURRFIFO_NONE|FIFO0FREE|FIFO1FREE)
[  161.021777] SCSISIGI[0x25]:(P_DATAOUT_DT|ACKI|BSYI)
[ 161.021787] SCSIPHASE[0x0] SCSIBUS[0x0] LASTPHASE[0x1]:(P_DATAOUT|P_BUSFREE)
[  161.021802] SCSISEQ0[0x40]:(ENSELO) SCSISEQ1[0x12]:(ENAUTOATNP|ENRSELI)
[  161.021815] SEQCTL0[0x0] SEQINTCTL[0x0] SEQ_FLAGS[0x0]
[  161.021827] SEQ_FLAGS2[0x4]:(SELECTOUT_QFROZEN)
[  161.021835] QFREEZE_COUNT[0x1] KERNEL_QFREEZE_COUNT[0x1]
[  161.021847] MK_MESSAGE_SCB[0xff00] MK_MESSAGE_SCSIID[0xff]
[  161.021855] SSTAT0[0x10]:(SELINGO) SSTAT1[0x0]
[  161.021865] SSTAT2[0x0] SSTAT3[0x0] PERRDIAG[0xc0]:(HIPERR|HIZERO)
[  161.021880] SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
[  161.021890] LQISTAT0[0x0] LQISTAT1[0x0] LQISTAT2[0x80]:(PACKETIZED)
[  161.021904] LQOSTAT0[0x0] LQOSTAT1[0x0] LQOSTAT2[0x40]
[  161.021925]
[ 161.021926] SCB Count = 4 CMDS_PENDING = 0 LASTSCB 0xffff CURRSCB 0x2 NEXTSCB 0xff80
[  161.021937] qinstart = 159 qinfifonext = 160
[  161.021939] QINFIFO: 0x3
[  161.021945] WAITING_TID_QUEUES:
[  161.021957] Pending list:
[  161.021961]   3 FIFO_USE[0x0] SCB_CONTROL[0x60]:(TAG_ENB|DISCENB)
[  161.021974] SCB_SCSIID[0x27]
[  161.021977] Total 1
[  161.021980] Kernel Free SCB list: 2 1 0
[  161.021987] Sequencer Complete DMA-inprog list:
[  161.021993] Sequencer Complete list:
[  161.021998] Sequencer DMA-Up and Complete list:
[  161.022004] Sequencer On QFreeze and Complete list:
[  161.022019]
[  161.022020]
[  161.022021] scsi5: FIFO0 Free, LONGJMP == 0x8254, SCB 0x3
[ 161.022026] SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) [ 161.022039] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
[  161.022056] SG_CACHE_SHADOW[0x2]:(LAST_SEG)
[  161.022062] SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0]
[  161.022074] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
[  161.022107] HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
[  161.022121]
[  161.022122] scsi5: FIFO1 Free, LONGJMP == 0x8063, SCB 0x3
[ 161.022127] SEQIMODE[0x3f]:(ENCFG4TCMD|ENCFG4ICMD|ENCFG4TSTAT|ENCFG4ISTAT|ENCFG4DATA|ENSAVEPTRS) [ 161.022140] SEQINTSRC[0x0] DFCNTRL[0x0] DFSTATUS[0x89]:(FIFOEMP|HDONE|PRELOAD_AVAIL)
[  161.022155] SG_CACHE_SHADOW[0x2]:(LAST_SEG)
[  161.022162] SG_STATE[0x0] DFFSXFRCTL[0x0] SOFFCNT[0x0]
[  161.022173] MDFFSTAT[0x5]:(FIFOFREE|DLZERO) SHADDR = 0x00, SHCNT = 0x0
[  161.022205] HADDR = 0x00, HCNT = 0x0 CCSGCTL[0x10]:(SG_CACHE_AVAIL)
[ 161.022213] LQIN: 0x8 0x0 0x0 0x3 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
[  161.022259] scsi5: LQISTATE = 0x1, LQOSTATE = 0x1a, OPTIONMODE = 0x52
[  161.022265] scsi5: OS_SPACE_CNT = 0x20 MAXCMDCNT = 0x0
[  161.022271] scsi5: SAVED_SCSIID = 0x0 SAVED_LUN = 0x0
[  161.022275]
[  161.022278] SIMODE0[0xc]:(ENOVERRUN|ENIOERR)
[  161.022286] CCSCBCTL[0x4]:(CCSCBDIR)
[  161.022299] scsi5: REG0 == 0x3, SINDEX = 0x106, DINDEX = 0x106
[  161.022310] scsi5: SCBPTR == 0x2, SCB_NEXT == 0x3, SCB_NEXT2 == 0xff80
[  161.022322] CDB 28 0 0 0 e 0
[  161.022324] STACK: 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
[  161.022353] <<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
[  161.022372] scsi5:0:2:0: Cmd aborted from QINFIFO
[ 161.022407] sd 5:0:2:0: [sde] Attempting to queue a TARGET RESET message:CDB: 0x28 0x0 0x0 0x0 0xe 0x0 0x0 0x1 0x0 0x0
[  161.022423] scsi5: Device reset code sleeping
[  166.010113] scsi5: Device reset timer expired (active 1)
[  166.010121] scsi5: Device reset returning 0x2003
[  166.010225] Recovery SCB completes
[  176.029843] sd 5:0:2:0: Device offlined - not ready after error recovery
[  176.029868] sd 5:0:2:0: [sde] Unhandled error code
[ 176.029873] sd 5:0:2:0: [sde] Result: hostbyte=DID_ABORT driverbyte=DRIVER_OK [ 176.029882] sd 5:0:2:0: [sde] CDB: Read(10): 28 00 00 00 0f 00 00 01 00 00
[  176.029901] end_request: I/O error, dev sde, sector 3840
[  176.029929] Buffer I/O error on device sde, logical block 480
[  176.029951] Buffer I/O error on device sde, logical block 481
[  176.029970] Buffer I/O error on device sde, logical block 482
[  176.029988] Buffer I/O error on device sde, logical block 483
[  176.030006] Buffer I/O error on device sde, logical block 484
[  176.030023] Buffer I/O error on device sde, logical block 485
[  176.030041] Buffer I/O error on device sde, logical block 486
[  176.030058] Buffer I/O error on device sde, logical block 487
[  176.030075] Buffer I/O error on device sde, logical block 488
[  176.030092] Buffer I/O error on device sde, logical block 489
[  176.030152] sd 5:0:2:0: [sde] Unhandled error code
[ 176.030156] sd 5:0:2:0: [sde] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK [ 176.030162] sd 5:0:2:0: [sde] CDB: Read(10): 28 00 00 00 0e 00 00 01 00 00
[  176.030178] end_request: I/O error, dev sde, sector 3584

/BIOS każdego kontrolera ustawiony jest według zaleceń producenta tj.:
SCSI HBA BIOS settings:
1. All defaults settings from Adaptec HBA (Try to load the defaults settings on the HBA).
2. Disable "QAS" (function not supported by RAID controller).
3. Enabled "BIOS Multiple LUN Support".
4. Disable "Domain Validation".
5. Disable "Host RAID".
6. Drive Predictable Failure Mode (SMART): Enabled

SCSI ID kontrolera Adaptec: 7, ID macierzy: 2 oraz 5 (ma 2 kontrolery wbudowane), zatem są inne. Na każdym z kontrolerów zarówno macierzy jak i Adaptec'a jest taki sam efekt. Kable zostały zmienione i jest to samo.
Nie było tego pod Windows Server, więc to problem systemowy.
Pomaga jedynie ustawienie:
- "Allow disconnecting" na DISABLED
- i ustawienie trybu SCSI-160 w BIOSie kontrolera Adaptec.

Ale przy takich ustawieniach wydajność macierzy jest bardzo niska (80 MB/s odczyt - 5 dysków 500GB w środku), w porównaniu do software RAID5 przez Mdadm - z 4 dysków 250GB wyciska 190 MB/s.

Standardowe jajka 2.6.32-24-generic / 2.6.32-24-server.
Z czego może wynikać problem?


Reply to: