broken system after srm -r -d /tmp/.* (user login and several services not working)
Hello,
I've broken my debian/unstable system by executing as root the command
# srm -r -d /tmp/.*
I aborted the command after something between 20 and 40 seconds, but
since then, my system behaves strangely:
If I try to login as normal user on the console, I get the error
"Unable to cd to '/home/user'" and the login aborts.
Login as root works without any issues.
When I try to su to a normal user from root, I get: "Cannot execute
/bin/bash: Permission denied." The permissions for /bin/bash are ok:
-rwxr-xr-x 1 root root 797784 2008-05-12 19:00 /bin/bash
I even tried to add a new user with 'adduser username', which seemed
to work without problems, but if I try to login/su to that new user,
the same errors occur. I already checked that /etc/passwd, /etc/group
and /etc/pam.d are still present.
Additionally several services fail to start, please see below for exact
error messages.
Next, I rebooted to see whether the system was still able to boot at
all. And indeed, the system booted without issues, except that the
services mentioned below didn't start, and the normal user login still
didn't work either.
So I started to try to recover the system (I don't have system backups,
I only backup /home):
First, I did a 'dpkg-reconfigure' for all installed packages, but as that
didn't change anything, I reinstalled all installed packages:
# packages=$(dpkg -l | grep "^i" |awk '{print $2}')
# apt-get -o Dpkg::Options::="--force-confmiss" --reinstall install $packages
That one succeeded for all packages except the ones that contain the
services which don't start anymore, and obviously the reinstall didn't
fix that.
So now I have a system which has all packages reinstalled with the dpkg
option to restore removed config files, but user login and several
services still fail.
Do you have any suggestions on how to continue debugging the issue?
Maybe you have even run into similar issues in the past?
It would be really painful to set up a clean new system and configure it
the way I like, as this system is already several years old and has
several individual configurations.
Now the exact error messages of the services that do fail to start:
# /etc/init.d/exim4 start
Starting MTA:2008-07-01 20:49:06 Exim configuration error in line 662 of /var/lib/exim4/config.autogenerated.tmp:
user mail was not found
exim: could not open panic log - aborting: see message(s) above
Invalid new configfile /var/lib/exim4/config.autogenerated.tmp, not installing
/var/lib/exim4/config.autogenerated.tmp to /var/lib/exim4/config.autogenerated
# /etc/init.d/gdm start
Starting GNOME Display Manager: gdmgdm[16973]: WARNING: gdm_config_parse: Authdir /var/lib/gdm does not exist. Aborting.
gdm_config_parse: Authdir /var/lib/gdm does not exist. Aborting.
# /etc/init.d/mysql start
Starting MySQL database server: mysqld . . . . . . . . . . . . . . failed!
# tail -n5 /var/log/syslog
Jul 1 22:55:25 jonas /etc/init.d/mysql[17210]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in
Jul 1 22:55:25 jonas /etc/init.d/mysql[17210]: /usr/bin/mysqladmin: connect to server at 'localhost' failed
Jul 1 22:55:25 jonas /etc/init.d/mysql[17210]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)'
Jul 1 22:55:25 jonas /etc/init.d/mysql[17210]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists!
Jul 1 22:55:25 jonas /etc/init.d/mysql[17210]:
# /etc/init.d/avahi start
Starting Avahi mDNS/DNS-SD Daemon: avahi-daemon failed!
# tail -n4 /var/log/syslog
Jul 1 22:56:53 jonas avahi-daemon[17228]: Found user 'avahi' (UID 110) and group 'avahi' (GID 115).
Jul 1 22:56:53 jonas avahi-daemon[17228]: Successfully dropped root privileges.
Jul 1 20:56:53 jonas avahi-daemon[17228]: open(/var/run/avahi-daemon//pid): Permission denied
Jul 1 20:56:53 jonas avahi-daemon[17228]: Failed to create PID file: Permission denied
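The messages above name a few accounts and directories. A minimal sketch
for checking them in one go, run as root, could look like this. The
helper names check_account and check_dir are made up for this example;
the account and directory names are the ones from the errors above.

```shell
#!/bin/sh
# check_account NAME: report whether the system can resolve account NAME.
# (hypothetical helper, not part of any package)
check_account() {
    if id -u "$1" >/dev/null 2>&1; then
        echo "account $1: found (uid $(id -u "$1"))"
    else
        echo "account $1: NOT FOUND"
    fi
}

# check_dir PATH: show mode and owner of PATH if it is a directory,
# otherwise report it as missing.
# (hypothetical helper, not part of any package)
check_dir() {
    if [ -d "$1" ]; then
        ls -ld "$1"
    else
        echo "directory $1: MISSING"
    fi
}

# Accounts and directories taken from the error messages above.
for account in mail avahi; do
    check_account "$account"
done
for dir in /var/lib/gdm /var/run/mysqld /var/run/avahi-daemon; do
    check_dir "$dir"
done
```

A missing directory or an unexpected owner or mode in the output would
at least narrow down where to look next.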
Just to prevent any flamewar about why I (should not) have invoked the
command in the first place: I know that I made a mistake. Still, the
reason I invoked the command was that I intended to wipe all
data from /tmp to replace the /tmp on my rootfs with a tmpfs. srm
from the package secure-delete is a tool for secure file deletion,
just like shred or wipe. According to the manpage, the cmdline option
-d makes srm "ignore the two dot special files "." and ".."", so I
thought that the command was safe.
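My assumption about what went wrong: in many shells the pattern /tmp/.*
also expands to /tmp/. and /tmp/.., and the shell does that expansion
before srm ever sees its arguments. A minimal sketch of a pattern pair
that matches hidden entries without ever matching "." or ".." is shown
below. The helper name list_hidden is made up for this example, and it
only prints names, it deletes nothing.

```shell
#!/bin/sh
# list_hidden DIR: print the hidden entries directly inside DIR, one per line.
# (hypothetical helper, not part of any package)
#   .[!.]*  matches a dot followed by a non-dot character (e.g. .bashrc)
#   ..?*    matches two dots followed by at least one character (e.g. ..foo)
# Neither pattern can match "." or "..".
list_hidden() {
    for entry in "$1"/.[!.]* "$1"/..?*; do
        # An unmatched pattern stays literal, so skip names that do not exist.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        printf '%s\n' "$entry"
    done
}
```

Running "list_hidden /tmp" first and reading the output before handing
the same patterns to any delete command would have shown me what was
about to be removed.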
A few seconds later I had to learn that it was not. The execution took
far longer than it should for wiping some small files from /tmp, and I
aborted the execution after something between 20 and 40 seconds.
Afterwards my first guess was that srm might have started to delete the
files in /boot, as these are the first ones in alphabetical order on /.
But it seems like all files in /boot were still present. So I went on
searching for missing files, but the only one I could identify was
/root/.bashrc.
So I rebooted to see whether everything was ok, and indeed the system
booted without issues, only a few services failed to start. Among others
these are avahi, hal, exim4, privoxy, mysql and gdm. But I already
mentioned that above.
If you have any suggestions, please tell me. I would be really glad to
save my system.
My current guess is that it must have something to do with permissions
and/or pam, as both login and su for normal users fail, and services
which start as a user other than root fail to start. On the other hand,
apache2 for example starts without issues, and it runs as a nonroot user
(www-data) as well.
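One way to test the permissions guess: a file can have correct
permissions and still be unreachable for a user if a directory above it
lacks search (x) permission for that user. A minimal sketch that lists
every component of a path, from the file up to /, follows. The helper
name path_perms is made up for this example.

```shell
#!/bin/sh
# path_perms PATH: run "ls -ld" on PATH and on each parent directory,
# ending with /. PATH must be absolute.
# (hypothetical helper, not part of any package)
path_perms() {
    case $1 in
        /*) p=$1 ;;
        *)  echo "path_perms: need an absolute path" >&2; return 1 ;;
    esac
    while :; do
        ls -ld "$p"
        # Stop once the root directory itself has been printed.
        [ "$p" = / ] && break
        p=$(dirname "$p")
    done
}
```

Calling "path_perms /bin/bash" and "path_perms /home/user" as root and
looking for a directory without r-x for "other" (including / itself)
would confirm or rule out this guess.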
If it helps, here is the output of ps -ef:
# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 21:39 ? 00:00:00 init [2]
root 2 0 0 21:39 ? 00:00:00 [kthreadd]
root 3 2 0 21:39 ? 00:00:00 [migration/0]
root 4 2 0 21:39 ? 00:00:00 [ksoftirqd/0]
root 5 2 0 21:39 ? 00:00:00 [migration/1]
root 6 2 0 21:39 ? 00:00:00 [ksoftirqd/1]
root 7 2 0 21:39 ? 00:00:00 [events/0]
root 8 2 0 21:39 ? 00:00:45 [events/1]
root 9 2 0 21:39 ? 00:00:00 [khelper]
root 97 2 0 21:39 ? 00:00:00 [kblockd/0]
root 98 2 0 21:39 ? 00:00:00 [kblockd/1]
root 100 2 0 21:39 ? 00:00:00 [kacpid]
root 101 2 0 21:39 ? 00:00:00 [kacpi_notify]
root 190 2 0 21:39 ? 00:00:00 [ata/0]
root 191 2 0 21:39 ? 00:00:00 [ata/1]
root 192 2 0 21:39 ? 00:00:00 [ata_aux]
root 193 2 0 21:39 ? 00:00:00 [ksuspend_usbd]
root 199 2 0 21:39 ? 00:00:00 [khubd]
root 202 2 0 21:39 ? 00:00:00 [kseriod]
root 250 2 0 21:39 ? 00:00:00 [pdflush]
root 251 2 0 21:39 ? 00:00:00 [pdflush]
root 252 2 0 21:39 ? 00:00:00 [kswapd0]
root 253 2 0 21:39 ? 00:00:00 [aio/0]
root 254 2 0 21:39 ? 00:00:00 [aio/1]
root 404 2 0 21:39 ? 00:00:00 [ivtv0/0]
root 405 2 0 21:39 ? 00:00:00 [ivtv0/1]
root 411 2 0 21:39 ? 00:00:00 [msp34xx]
root 436 2 0 21:39 ? 00:00:00 [firewire_sbp2]
root 462 2 0 21:39 ? 00:00:00 [scsi_eh_0]
root 516 2 0 21:39 ? 00:00:00 [kpsmoused]
root 522 2 0 21:39 ? 00:00:00 [kstriped]
root 1686 2 0 21:39 ? 00:00:00 [scsi_eh_1]
root 1692 2 0 21:39 ? 00:00:00 [scsi_eh_2]
root 1786 2 0 21:39 ? 00:00:00 [scsi_eh_3]
root 1787 2 0 21:39 ? 00:00:00 [scsi_eh_4]
root 1792 2 0 21:39 ? 00:00:00 [scsi_eh_5]
root 1793 2 0 21:39 ? 00:00:00 [scsi_eh_6]
root 1794 2 0 21:39 ? 00:00:00 [scsi_eh_7]
root 1795 2 0 21:39 ? 00:00:00 [scsi_eh_8]
root 1860 2 0 21:39 ? 00:00:00 [ksnapd]
root 1881 2 0 21:39 ? 00:00:00 [kjournald]
root 1980 1 0 21:39 ? 00:00:00 udevd --daemon
root 3040 2 0 21:40 ? 00:00:00 [kdmflush]
root 3050 2 0 21:40 ? 00:00:00 [kcryptd_io]
root 3051 2 0 21:40 ? 00:00:00 [kcryptd]
root 3527 2 0 21:43 ? 00:00:00 [kdmflush]
root 3528 2 0 21:43 ? 00:00:00 [kcryptd_io]
root 3529 2 0 21:43 ? 00:00:05 [kcryptd]
root 3577 2 0 21:43 ? 00:00:00 [kdmflush]
root 3578 2 0 21:43 ? 00:00:00 [kcryptd_io]
root 3579 2 0 21:43 ? 00:00:00 [kcryptd]
root 3651 2 0 21:43 ? 00:00:00 [kdmflush]
root 3653 2 0 21:43 ? 00:00:00 [kdmflush]
root 3656 2 0 21:43 ? 00:00:00 [kdmflush]
root 3659 2 0 21:43 ? 00:00:00 [kdmflush]
root 3761 2 0 21:43 ? 00:00:00 [kjournald]
root 3762 2 0 21:43 ? 00:00:00 [kjournald]
root 3763 2 0 21:43 ? 00:00:00 [kjournald]
root 3767 2 0 21:43 ? 00:00:00 [kjournald]
root 3768 2 0 21:43 ? 00:00:00 [kjournald]
daemon 3943 1 0 21:43 ? 00:00:00 /sbin/portmap -i 127.0.0.1
root 4236 1 0 21:43 ? 00:00:00 /sbin/syslog-ng -p /var/run/syslog-ng.pid
root 4271 1 0 21:43 ? 00:00:00 /usr/sbin/sshd
root 4295 1 0 21:43 ? 00:00:00 /usr/sbin/inetd
root 4311 1 0 21:43 ? 00:00:00 /usr/sbin/cron
root 4329 1 0 21:43 ? 00:00:00 /usr/sbin/atieventsd
root 4400 1 0 21:43 ? 00:00:00 /usr/sbin/acpid -c /etc/acpi/events
root 4405 1 0 21:43 ? 00:00:00 /usr/sbin/apache2 -k start
root 4457 1 0 21:43 ? 00:00:00 /usr/sbin/cupsd
www-data 4539 4405 0 21:43 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 4540 4405 0 21:43 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 4541 4405 0 21:43 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 4543 4405 0 21:43 ? 00:00:00 /usr/sbin/apache2 -k start
www-data 4544 4405 0 21:43 ? 00:00:00 /usr/sbin/apache2 -k start
root 4962 1 0 21:43 ? 00:00:00 /usr/sbin/nmbd -D
root 5026 1 0 21:43 ? 00:00:00 /usr/sbin/smbd -D
root 5061 5026 0 21:43 ? 00:00:00 /usr/sbin/smbd -D
root 5071 1 0 21:43 tty1 00:00:00 /bin/login --
root 5074 1 0 21:43 tty6 00:00:00 /bin/login --
root 5191 5071 0 22:13 tty1 00:00:00 -bash
root 5211 5074 0 22:13 tty6 00:00:00 -bash
root 8116 5211 0 22:20 tty6 00:00:00 mutt
root 10531 8116 0 22:32 tty6 00:00:00 sh -c vim -f '+/^$' -c 'set ft=mail tw=72' '/tmp/mutt-jonas-0-8116-79'
root 10532 10531 0 22:32 tty6 00:00:00 vim -f +/^$ -c set ft=mail tw=72 /tmp/mutt-jonas-0-8116-79
root 14650 1 0 22:46 tty2 00:00:00 /bin/login --
root 15027 14650 0 22:49 tty2 00:00:00 -bash
root 16959 1 0 22:52 ? 00:00:00 /usr/sbin/gpm -m /dev/input/mice -t exps2
105 17458 1 0 22:58 ? 00:00:00 /usr/bin/dbus-daemon --system
root 17657 5191 0 23:10 tty1 00:00:00 ps -ef
greetings, and thanks in advance,
jonas