
Re: Changing binaries



Andrew, thanks for the reply. There's only one filesystem and it's root, so I won't be able to mount it read-only. I haven't tried changing permissions, but I will, and I'll update the output so I can try to figure out what's happening.

As for the processes, yes, they all respond to kill -9, so that's not a problem; it's just the fact that they are hanging around.

Actually, another problem I have seen, which may be related, is that when I scp files to the server today the scp never completes. The files are sent, but the command never finishes, so I have to Ctrl-C it. So it's like my server isn't completing commands correctly.

I'm due to go down to the office this week so I will reboot it then.  Hopefully it'll stay up until then.

Thanks for the suggestions

Giles

Andrew Sackville-West <andrew@farwestbilliards.com> wrote:
On Mon, Jul 31, 2006 at 09:41:51PM +0100, Giles McGarry wrote:
> Dear all, I've just inherited a Debian system. I'm afraid I'm not very experienced with Debian, coming more from a Solaris background, so please be patient if the questions are numpty.
>
> I have a problem at the moment: strangely, various binaries in the /bin directory are changing size and becoming corrupt. When I restore the originals they work OK, and then some time later they change size and stop working. I've now restored all of the files (there are about a dozen) into /bin2, which I can use when the ones in /bin get corrupted. The original (and working) file in /bin2 is as follows:

I am very inexperienced in these things, but from a simple
"problem-solving" point of view, I suggest the following...

I'd be afraid you've either been rooted, or you have failing hardware
that is causing this. Hardware problems should show up elsewhere in
the file tree as well though.

can you remount the partition read-only? then if changes still show
up, you can see whether it's been remounted read-write again... sure
sign of rooting, IMVHO.
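
A minimal sketch of that check, assuming a Linux box where /proc/mounts is readable and Python is installed (neither is confirmed for this system). The function names are made up for illustration.

```python
def mount_options(mounts_text, mount_point="/"):
    """Return the option list of the last entry for mount_point, or None."""
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            options = fields[3].split(",")  # a later entry shadows an earlier one
    return options


def is_read_only(mounts_text, mount_point="/"):
    """True if the mount point currently carries the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    if options is None:
        raise ValueError("mount point not found: %s" % mount_point)
    return "ro" in options


# Possible use on the live system:
#   with open("/proc/mounts") as f:
#       print(is_read_only(f.read(), "/"))
```

Run periodically after the remount, a flip from read-only back to read-write would show that something with root privileges changed the mount.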

>
> /bin2/ls -l /bin2/ls
> -rwxr-xr-x 1 root root 75948 Jul 31 17:17 /bin2/ls
>
> but the one in /bin is different ie
>
> /tmp/ls -al /bin/ls
> -rwxr-xr-x 1 root root 80044 Jul 31 21:29 /bin/ls
>
> As you can see it changed size recently tonight. Looking in /bin all of the following files are also larger than they were and have all changed size at the same time:
>
> -rwxr-xr-x 1 root root 80044 Jul 31 21:29 vdir
> -rwxr-xr-x 1 root root 34456 Jul 31 21:29 touch
> -rwxr-xr-x 1 root root 9716 Jul 31 21:29 tempfile
> -rwxr-xr-x 1 root root 16312 Jul 31 21:29 sync
> -rwxr-xr-x 1 root root 17944 Jul 31 21:29 rmdir
> -rwxr-xr-x 1 root root 34808 Jul 31 21:29 rm
> -rwxr-xr-x 1 root root 17944 Jul 31 21:29 readlink
> -rwxr-xr-x 1 root root 9672 Jul 31 21:29 mktemp
> -rwxr-xr-x 1 root root 23276 Jul 31 21:29 mknod
> -rwxr-xr-x 1 root root 24984 Jul 31 21:29 mkdir
> -rwxr-xr-x 1 root root 80044 Jul 31 21:29 ls
> -rwxr-xr-x 1 root root 27192 Jul 31 21:29 ln
> -rwxr-xr-x 1 root root 57772 Jul 31 21:29 gzip
> -rwxr-xr-x 1 root root 80044 Jul 31 21:29 dir
> -rwxr-xr-x 1 root root 35820 Jul 31 21:29 df
> -rwxr-xr-x 1 root root 32684 Jul 31 21:29 dd
> -rwxr-xr-x 1 root root 55308 Jul 31 21:29 cp
> -rwxr-xr-x 1 root root 38668 Jul 31 21:29 chown
> -rwxr-xr-x 1 root root 35308 Jul 31 21:29 chmod
>
> All of them are slightly larger than they should be. When I run the corrupt version of /bin/ls I get the following:

what process ran at that time? maybe an automatic fsck that is
fsck'ing (heh!) the drive?


>
> # /bin/ls
> Segmentation fault
>
> I've just written a script to watch the files and restore them when they change, but that's no fix at all. I've tried to ascertain why they are changing but cannot get to the bottom of it; sometimes it happens while I'm actually on the system. Strangely, I also have a copy in /tmp that I've had there all day and that has never been corrupted; it has the same ownership, permissions etc. as /bin/ls.

have you tried changing ownership/permissions to see if you can narrow
down the source of this? Also, using your script above, get a snapshot
of what processes are running at the time you see the corruption.

I think you probably need to get a really good run of ps outputs so
you can find something running at the time of corruption.
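
Something along these lines might do it. This is only a sketch, assuming Python is available on the box and that ps itself still works. The function names, log path and polling interval are placeholders, not anything from the existing script.

```python
import os
import subprocess
import time


def stat_signature(path):
    """Return (size, mtime) for path, or None if it cannot be stat'ed."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    return (st.st_size, st.st_mtime)


def changed_paths(baseline, paths):
    """Return the paths whose (size, mtime) differs from the baseline."""
    return [p for p in paths if stat_signature(p) != baseline.get(p)]


def watch(paths, log_path, interval=1.0):
    """Poll paths forever; on any change, append a full ps listing to log_path."""
    baseline = {p: stat_signature(p) for p in paths}
    while True:
        changed = changed_paths(baseline, paths)
        if changed:
            listing = subprocess.check_output(["ps", "-ef"]).decode("utf-8", "replace")
            with open(log_path, "a") as log:
                log.write("%s changed: %s\n" % (time.ctime(), ", ".join(changed)))
                log.write(listing + "\n")
            for p in changed:
                baseline[p] = stat_signature(p)  # wait for the next change
        time.sleep(interval)


# Possible use (hypothetical paths):
#   watch(["/bin/ls", "/bin/cp"], "/tmp/bin-watch.log")
```

Polling only sees the files after they have changed, so a short-lived culprit may already have exited; a shorter interval narrows that window but does not close it.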

>
> Also I've got various commands hanging around in a ps listing, either supposedly still running or defunct, eg
>
> root 2142 1 0 Jul27 ? 00:00:00 mv ls.corrupt ls
>
> From the other day when this occurred, and
>
> root 2143 2142 0 Jul27 ? 00:00:00 [mv]
>
> And I have a few hundred lines like this.

I would assume you need to get rid of these guys... do they respond to
a kill or kill -9 ?

could these be caused by disk corruption at the time of the mv,
leaving the process hanging around?
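
A defunct entry stays in the process table until its parent collects its exit status, so it helps to know which parents are holding them. A rough sketch for listing them, assuming the procps `ps` with `-eo pid,ppid,stat,args` and Python on the box; the function names are made up for illustration.

```python
import subprocess


def find_defunct(ps_text):
    """Return (pid, ppid, command) for each zombie in `ps -eo pid,ppid,stat,args` output."""
    zombies = []
    for line in ps_text.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue
        pid, ppid, stat = fields[0], fields[1], fields[2]
        command = fields[3].strip() if len(fields) > 3 else ""
        if stat.startswith("Z"):  # Z marks a defunct (zombie) process
            zombies.append((int(pid), int(ppid), command))
    return zombies


def current_defunct():
    """Run ps on the live system and return its defunct processes."""
    out = subprocess.check_output(["ps", "-eo", "pid,ppid,stat,args"])
    return find_defunct(out.decode("utf-8", "replace"))
```

In the listing quoted above, the defunct [mv] (2143) is a child of the hung mv (2142), so killing 2142 should let init reap the child.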



>
> Very strange, and I'm pulling my hair out at the moment trying to figure it out. I've not rebooted the system as I'm remote from it and I don't want to take the chance of it not coming back while I'm not there.

you might have to take that trip and get it rebooted, especially if
you end up with unkillable processes.

>
> As I say I have inherited the system and have no real prior knowledge of the box or what our old admin did on there, so any help greatly appreciated.
>

multi-user setup? can you start locking out users and see if one of
them is somehow causing it?

hth

A

