Dear List,

I have been having data corruption problems for the last two months on 7 servers.

After extensive testing, I have finally narrowed the problem down to the Debian Etch 2.6.18-5 kernel combined with the 3ware PCI controller. The same machine using the onboard SATA controller does not corrupt data. The machines would also hang occasionally, with no errors displayed on screen.

I upgraded to a 2.6.23-13 kernel.org kernel 24 hours ago and have not been able to reproduce these problems since then. Previously it would take about 10 minutes for the problem to appear.

I could reproduce these problems by using a Java program to insert logs (30,000,000 records) into a local Postgres 8.2.5 database. After this I would see messages like

DETAIL: Could not open file "pg_clog/0495": No such file or directory.

in my Postgres logs. I had also managed to corrupt my SVN repository: md5s of the files no longer matched what was in the SVN database (svnadmin verify /path/to/repository).

Has anyone seen these problems? Details of my RAID controller are below.

Regards
Andrew

---
03:05.0 RAID bus controller: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID (rev 01)
    Subsystem: 3ware Inc 7xxx/8xxx-series PATA/SATA-RAID
    Control: I/O+ Mem+ BusMaster+ SpecCycle+ MemWINV+ VGASnoop- ParErr- Stepping- SERR+ FastB2B-
    Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
    Latency: 64 (2250ns min), Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 22
    Region 0: I/O ports at e800 [size=16]
    Region 1: Memory at febffc00 (32-bit, non-prefetchable) [size=16]
    Region 2: Memory at fe000000 (32-bit, non-prefetchable) [size=8M]
    Expansion ROM at f0100000 [disabled] [size=64K]
    Capabilities: [40] Power Management version 1
        Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 PME-Enable- DSel=0 DScale=0 PME-
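
For anyone who wants to check a suspect controller without setting up Postgres, the md5 comparison mentioned above can be approximated with a write-then-reread test. What follows is only a rough sketch and is not the Java program used for the tests in this report: the function name write_and_verify, the file name, and the sizes are all made up for illustration. Note that a re-read shortly after a write may be served from the page cache rather than from the disk, so for a meaningful test the file would need to be larger than RAM (or the caches dropped between the write and the read).

```python
import hashlib
import os
import tempfile

CHUNK = 1024 * 1024  # 1 MiB per write (arbitrary choice for this sketch)


def write_and_verify(path, n_chunks, seed=b"corruption-test"):
    """Write deterministic data to path, fsync it, re-read it, compare MD5s.

    Returns (written_md5, read_md5). The two differ if the data read back
    is not the data that was written.
    """
    written = hashlib.md5()
    with open(path, "wb") as f:
        for i in range(n_chunks):
            # Deterministic 1 MiB block derived from the seed and chunk index.
            block = hashlib.sha256(seed + str(i).encode()).digest() * (CHUNK // 32)
            f.write(block)
            written.update(block)
        f.flush()
        os.fsync(f.fileno())  # ask the kernel to push the data to the device

    read = hashlib.md5()
    with open(path, "rb") as f:
        while True:
            block = f.read(CHUNK)
            if not block:
                break
            read.update(block)
    return written.hexdigest(), read.hexdigest()


if __name__ == "__main__":
    # Hypothetical usage: point the directory at a filesystem on the
    # controller under test and raise n_chunks well beyond RAM size.
    with tempfile.TemporaryDirectory() as d:
        w, r = write_and_verify(os.path.join(d, "testfile.bin"), n_chunks=8)
        print("OK" if w == r else "MISMATCH", w, r)
```

A single clean pass proves little. The corruption described above only showed up after about 10 minutes of sustained load, so a loop over many large files would be closer to the original conditions.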