
How to fix I/O errors?



I apologize for this being so long, but since the problem occurs sporadically I wanted to get as much information in this post as possible because I don't know when it will happen again.

This problem started about two weeks ago. I woke up to find a black screen and a kernel panic. I rebooted and was presented with many fsck errors that could not be handled automatically, so I ran fsck manually, as directed, and took all the defaults. Any time I was shown a file name, it seemed to be a Flash file in my daughter's /home directory or otherwise related to Flash. Afterwards, /home was the only partition with anything in lost+found, and all of the files there did indeed show my daughter as owner. I shut down and rebooted to get everything clean, and it seemed good for a while.

Since then, however, every day or two things just stop working properly. Menus cease to do anything, pages don't load in the browser, etc. If I exit from X and work at a console, some commands (like ls) seem to work fine; others do not, giving me I/O error messages. I can't even capture a typescript or redirect the output to a file that I could attach here, since I just get errors. I can't even use Ctrl-Alt-Del to reboot, as I get an error saying:

INIT: cannot execute "/sbin/shutdown"


I have no choice but to power down with the power button, which I really don't like to do.

It happened again, today, and I manually copied down the errors so I hope that I got it all correct. This is what I did before shutting down:

marc@quixote:~$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
udev on /dev type devtmpfs (rw,relatime,size=10240k,nr_inodes=3081484,mode=755)
devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=2472496k,mode=755)
/dev/sda2 on / type ext3 (ro,relatime,errors=remount-ro,data=ordered)
tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
pstore on /sys/fs/pstore type pstore (rw,relatime)
tmpfs on /run/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=6622700k)
/dev/mapper/vg1-home on /home type ext3 (ro,relatime,data=ordered)
/dev/mapper/vg1-tmp--jessie on /tmp type ext3 (ro,relatime,data=ordered)
/dev/mapper/vg1-usr--jessie on /usr type ext3 (ro,relatime,data=ordered)
/dev/mapper/vg1-usrlocal on /usr/local type ext3 (ro,relatime,data=ordered)
/dev/mapper/vg1-photos on /usr/local/photos type ext3 (rw,relatime,data=ordered)
/dev/mapper/vg1-vDisks on /usr/local/vdisks type ext3 (rw,relatime,data=ordered)
/dev/mapper/vg1-var--jessie on /var type ext3 (ro,relatime,data=ordered)
rpc_pipefs on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
fusectl on /sys/fs/fuse/connections type fusectl (rw,relatime)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,nosuid,nodev,noexec,relatime)
cgroup on /sys/fs/cgroup type tmpfs (rw,relatime,size=12k)
cgmfs on /run/cgmanager/fs type tmpfs (rw,relatime,size=100k,mode=755)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,release_agent=/usr/lib/x86_64-linux-gnu/systemd-shim-cgroup-release-agent,name=systemd)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=2472496k,mode=700,uid=1000,gid=1000)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)

Note that almost all of the real (on-disk) filesystems are mounted read-only.
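
For anyone who wants to pull the read-only entries out of a listing like the one above without reading it by eye, here is a minimal Python sketch. It assumes the classic `mount` output format ("DEVICE on MOUNTPOINT type FSTYPE (OPTIONS)"), one entry per line. The function names and the set of pseudo filesystem types are illustrative choices, not part of any standard tool.

```python
import re

# Filesystem types that are not backed by a disk. This set is an assumption
# chosen to cover the listing in this post; it is not exhaustive.
PSEUDO_TYPES = {
    "sysfs", "proc", "devtmpfs", "devpts", "tmpfs", "pstore",
    "rpc_pipefs", "fusectl", "binfmt_misc", "cgroup", "fuse.gvfsd-fuse",
}

# Matches one line of classic `mount` output.
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \(([^)]*)\)$")


def parse_mount_line(line):
    """Return a dict describing one mount entry, or None if the line does not match."""
    match = MOUNT_LINE.match(line.strip())
    if match is None:
        return None
    device, mountpoint, fstype, options = match.groups()
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options.split(","),
    }


def read_only_disk_mounts(mount_output):
    """List mount points of on-disk filesystems that carry the `ro` option."""
    result = []
    for line in mount_output.splitlines():
        entry = parse_mount_line(line)
        if entry is None or entry["fstype"] in PSEUDO_TYPES:
            continue
        # Compare whole options, so `errors=remount-ro` is not mistaken for `ro`.
        if "ro" in entry["options"]:
            result.append(entry["mountpoint"])
    return result
```

Feeding it the saved text of `mount` (or the output of `subprocess.run(["mount"], capture_output=True, text=True).stdout`) returns only the mount points that have flipped to read-only.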


I logged out and back in as root. From /root I attempted to copy a text file to /usr/local/photos (which still shows as rw):


cp wheezy1.script /usr/local/photos

[] sd: 0:0:0:0: [sda] Unhandled error code

[] sd: 0:0:0:0: [sda]

[] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

[] sd: 0:0:0:0: [sda] CDB:

[] Read(10): 28 00 00 3e bc 68 00 00 08 00

[] end_request: I/O error, dev sda, sector 4111464

[] sd: 0:0:0:0: [sda] Unhandled error code

[] sd: 0:0:0:0: [sda]

[] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

[] sd: 0:0:0:0: [sda] CDB:

[] Read(10): 28 00 00 3e bc 68 00 00 08 00

[] end_request: I/O error, dev sda, sector 4111464

-bash /bin/cp: Input/output error


NOTE: all the empty brackets on the left actually had timestamps in them. The same is true in all following cases, as well.
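
As a sanity check on the hand-copied numbers, the CDB bytes can be decoded. The sketch below assumes the standard 10-byte SCSI Read(10)/Write(10) layout: opcode in byte 0, a big-endian logical block address in bytes 2 to 5, and a big-endian transfer length in bytes 7 to 8. The function name `decode_rw10` is made up for illustration.

```python
# Opcodes for the two 10-byte commands that appear in the log.
OPCODES = {0x28: "Read(10)", 0x2A: "Write(10)"}


def decode_rw10(cdb_hex):
    """Decode a Read(10)/Write(10) CDB given as space-separated hex bytes.

    Returns (command name, logical block address, number of blocks).
    """
    cdb = bytes.fromhex(cdb_hex)  # whitespace between bytes is ignored
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB, got %d bytes" % len(cdb))
    if cdb[0] not in OPCODES:
        raise ValueError("not a Read(10)/Write(10) opcode: 0x%02x" % cdb[0])
    lba = int.from_bytes(cdb[2:6], "big")     # bytes 2..5: starting block
    blocks = int.from_bytes(cdb[7:9], "big")  # bytes 7..8: transfer length
    return OPCODES[cdb[0]], lba, blocks


print(decode_rw10("28 00 00 3e bc 68 00 00 08 00"))
# prints ('Read(10)', 4111464, 8)
```

The decoded address 4111464 matches the sector in the `end_request` line for the cp attempt, so at least that part of the transcription is consistent. The request covers 8 blocks, which would be 4 KiB if the disk uses 512-byte logical sectors (an assumption; the log does not say).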


I then changed directory to /usr/local/photos and tried to create a new file with touch:


touch tempfile

[] Write(10): 2a 00 08 56 9e 0c 00 00 08 00

[] sd: 0:0:0:0: [sda] Unhandled error code

[] sd: 0:0:0:0: [sda]

[] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

[] sd: 0:0:0:0: [sda] CDB:

[] Read(10): 28 00 08 56 05 1c 00 00 08 00

[] sd: 0:0:0:0: [sda] Unhandled error code

[] sd: 0:0:0:0: [sda]

[] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

[] sd: 0:0:0:0: [sda] CDB:


Finally, I tried to unmount /home with the intention of remounting it to see if it would come back as rw:


umount /home

[] sd: 0:0:0:0: [sda] Unhandled error code

[] sd: 0:0:0:0: [sda]

[] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK

[] sd: 0:0:0:0: [sda] CDB:

[] Write(10): 2a 00 00 75 5e 0c 00 00 08 00

[] end_request: I/O error, dev sda, sector 209018380

[] Buffer I/O error on device dm-6, logical block 0

[] lost page write due to I/O error on dm-6

[] EXT4-fs error (device dm-6): ext4_put_super: 795: Couldn't clean up the journal


NOTE: NONE of my filesystems are EXT4.  They are ALL EXT3.

Due to the errors I did not try to remount /home.
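
The kernel messages name the device only as dm-6, not by its LVM name. A minimal Python sketch for looking up the device-mapper name follows. It assumes the kernel exposes it in sysfs at `/sys/block/dm-N/dm/name`; the function name `dm_name` and the `sysfs_root` parameter are illustrative, the latter existing only so the lookup can be pointed at a test directory.

```python
from pathlib import Path


def dm_name(dm_device, sysfs_root="/sys"):
    """Return the device-mapper name for a node such as 'dm-6', or None if unreadable."""
    name_file = Path(sysfs_root) / "block" / dm_device / "dm" / "name"
    try:
        return name_file.read_text().strip()
    except OSError:
        # Missing device, missing sysfs, or an I/O error while reading.
        return None


if __name__ == "__main__":
    print(dm_name("dm-6"))
```

Because sysfs lives in memory rather than on the failing disk, the lookup itself should not touch sda, although starting the Python interpreter from a disk that is returning I/O errors may still fail.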

Then I shut down with the power button and rebooted. Everything is working now. All the filesystems that should be rw are rw, but within a day or two this will almost certainly happen again.


If anyone can give me a clue how to correct this I would be most grateful. If further info is necessary it will probably have to wait until this happens again.

BTW: I have plenty of space available in the LV, so I could create a new partition for /home and copy everything over from the current partition while it is not giving me errors, if that is likely to fix the problem. I would still like to know just what went wrong and how to prevent it in the future.

Marc








