
(solved) Re: / 100% used




Reporting solution.

There were basically two parts: an immediate problem and a long-term fix.

-----------------------------------------------------------
* Immediate problem:
-----------------------------------------------------------
1) Identify what was filling /var and stop it.

In a terminal, while trying to keep the system usable, this command saved the day (it truncates syslog every 10 seconds):
# while true; do echo clean syslog; cat /dev/null > syslog ; sleep 10; done

While at it, reading the syslog showed that the problem was a wireless driver. Blacklisting the driver and disabling the device in the BIOS did the job.

System usable again.
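
For step 1, a possible way to see what is filling a directory is to rank its entries by size. This is only a sketch: the function name biggest is made up for illustration, and it assumes a du that understands -d1 (the same option used in the "check" block further down).

```shell
# List the N largest entries directly under a directory, smallest first.
# Sizes are in kilobytes (du -k); -x keeps du on one filesystem,
# -a also lists plain files, which matters in a log directory.
biggest() {
    dir=${1:-/var}
    count=${2:-10}
    du -xka -d1 "$dir" 2>/dev/null | sort -n | tail -n "$count"
}
```

Something like "biggest /var 5" followed by "biggest /var/log 5" narrows it down quickly. The last line printed is always the total for the directory itself.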

-----------------------------------------------------------
* Long term solution to improve the system:
-----------------------------------------------------------
1) Shrink a hot /home (/dev/sda3) to free space for /var
-----------------------------------------------------------

All done via ssh (remotely)
Tip: you can't use # fuser -km /home, because you'll kick yourself out.

Write down the output of these commands:
# fdisk -l
# df -h
# df -B 4k
# mount -l
# du -chd1
# fdisk -s /dev/sda3
# dumpe2fs -h /dev/sda3

This is the "check" block. Run it again after each critical command, just to see that everything is ok and the way you expect it to be.
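
To make the "check" block easy to repeat and compare, its output could be saved to a timestamped file each time. A minimal sketch, assuming a made-up function name check_block and a made-up snapshot directory /root/checks; du -chd1 is left out because its result depends on the current directory.

```shell
# Save the output of the "check" commands to a timestamped file so that
# the state before and after a critical step can be compared with diff.
# Commands that are missing or fail (for example without root) are
# recorded in the file instead of aborting the run.
check_block() {
    dev=${1:-/dev/sda3}            # partition being resized
    outdir=${2:-/root/checks}      # hypothetical location for the snapshots
    mkdir -p "$outdir" || return 1
    out="$outdir/check-$(date +%Y%m%d-%H%M%S).txt"
    for cmd in "fdisk -l" "df -h" "df -B 4k" "mount -l" \
               "fdisk -s $dev" "dumpe2fs -h $dev"; do
        printf '\n### %s\n' "$cmd" >> "$out"
        $cmd >> "$out" 2>&1 || printf '(exit status %s)\n' "$?" >> "$out"
    done
    printf '%s\n' "$out"           # print the snapshot path
}
```

Running check_block before and after a step and then diffing the two files shows exactly what changed.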

To unmount /home you need to log in directly as root, not as a normal user who then uses sudo (your own session would keep /home busy). So, if needed, edit /etc/ssh/sshd_config and set -> PermitRootLogin yes
# systemctl restart ssh
# ssh root@yourserver
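
The sshd_config keyword for this is PermitRootLogin. If you prefer not to edit the file by hand, the change could be scripted roughly as below. This is a sketch only: set_root_login is a made-up helper, it keeps a .bak copy, and it assumes the file has no Match blocks (an appended line would otherwise land inside the last one).

```shell
# Set PermitRootLogin to yes/no in an sshd_config-style file.
# Replaces an existing (possibly commented-out) line, or appends one.
set_root_login() {
    value=$1                         # "yes" or "no"
    conf=${2:-/etc/ssh/sshd_config}
    case $value in
        yes|no) ;;
        *) echo "usage: set_root_login yes|no [file]" >&2; return 2 ;;
    esac
    cp -p "$conf" "$conf.bak" || return 1
    if grep -Eq '^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]]' "$conf"; then
        sed -E "s/^[[:space:]]*#?[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin $value/" \
            "$conf.bak" > "$conf"
    else
        printf 'PermitRootLogin %s\n' "$value" >> "$conf"
    fi
}
```

After "set_root_login yes", running sshd -t (test mode) before restarting the service catches syntax errors while you still have a working session.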

Make sure no one else can log in; isolate the server. (You can lock passwords, or lock logins, whatever you prefer.)
# umount /home
# fsck -n /dev/sda3
# resize2fs /dev/sda3 850G
resize2fs reports that the filesystem is now 222822400 (4k) blocks long.

Get the size in bytes by multiplying this number by 4*1024:
-> 222822400 * 4096 = 912,680,550,400 bytes
If your sector size is 512 bytes, divide that by 512 to get the number of sectors (to be used in fdisk):

1,782,579,200 sectors

Now your partition /dev/sda3 will still start at the same sector (113672192=initial), but it will end at:

113672192 + 1782579200 = 1896251392
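
The same arithmetic can be left to the shell, which avoids typos in such long numbers. A small sketch using the values from this post; the variable names are just for illustration, and the sector size is an assumption you should confirm with fdisk -l.

```shell
blocks=222822400        # block count reported by resize2fs
block_size=4096         # 4k filesystem blocks
sector_size=512         # assumed; check the fdisk -l output
start=113672192         # first sector of sda3 (stays the same)

bytes=$((blocks * block_size))
sectors=$((bytes / sector_size))
end=$((start + sectors))

echo "bytes:   $bytes"      # 912680550400
echo "sectors: $sectors"    # 1782579200
echo "end:     $end"        # 1896251392
```

Since the end sector in fdisk is inclusive, start + sectors leaves one spare sector beyond the filesystem, which does no harm.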

-----------
# fdisk /dev/sda
delete sda3
new partition 3 (primary in this case)
same start point (113672192)
end point = 1896251392

This would be fine, and it actually works: I rebooted and tested. But to save you one boot, I regret using this number, because when I created the new /var partition (sda4=extended, sda5=logical var), fdisk suggested starting it at 1896253440. So just try creating a new partition and see where fdisk suggests it should start. Subtract one from that. Delete everything again, and recreate sda3 ending there. (Up to this point there has been no "w", so nothing is written; it is all in fdisk's memory. A simple "q" quits without changes.)

So, summarizing: create a new partition, check where fdisk suggests it should start (1896253440), and delete it. Then delete sda3 again and recreate it:

new partition 3 (primary)
same start point (113672192)
end point = 1896253439

new partition 4 (extended)
all size
new partition 5 (logical)
+14G for /var
new partition 6 (logical)
all the rest (~13G) for /tmp

"w" (fdisk write and quit)

-----------
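
Before the "w", it may be worth double-checking that the shrunken filesystem really fits inside the new sda3. The following is only a sketch with a made-up helper name (fs_fits); it uses the numbers from this post and assumes 512-byte sectors unless told otherwise.

```shell
# Succeeds if a filesystem of BLOCKS blocks of BLOCK_SIZE bytes fits in a
# partition running from sector START to sector END (inclusive).
fs_fits() {
    blocks=$1; block_size=$2; start=$3; end=$4; sector_size=${5:-512}
    fs_sectors=$((blocks * block_size / sector_size))
    part_sectors=$((end - start + 1))      # fdisk end sector is inclusive
    slack=$((part_sectors - fs_sectors))
    echo "filesystem: $fs_sectors sectors, partition: $part_sectors sectors, slack: $slack"
    [ "$slack" -ge 0 ]
}

fs_fits 222822400 4096 113672192 1896253439
# prints: filesystem: 1782579200 sectors, partition: 1782581248 sectors, slack: 2048
```

A negative slack would mean the partition ends inside the filesystem, and that partition table must not be written.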

# fsck -n /dev/sda3
# mount /home

Run the whole "check" block of commands from above again.

# reboot (not really necessary, but nice to see all working)

# vi /etc/ssh/sshd_config
PermitRootLogin no
# systemctl restart ssh

-----------

-----------------------------------------------------------
2) Moving /var to /dev/sda5
-----------------------------------------------------------

# mkfs.ext4 /dev/sda5
# mkdir /mnt/var
# mount /dev/sda5 /mnt/var

Now, very important: no one can be writing to /var. Research on the matter suggests the better option is to switch to single-user mode (init 1). But via ssh, I tried something a bit risky. If you follow these instructions, it's at your own risk. Be warned!

# lsof | grep /var
And you will see a lot of processes writing into /var.
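
The plain grep also matches files that are only open for reading. To narrow the list to actual writers, lsof's field output (-F) can be filtered on the access mode. A rough sketch, with a made-up function name var_writers; it relies on the lsof field identifiers p (pid), c (command), f (file descriptor), a (access mode) and n (name).

```shell
# Read "lsof -F pcan" style output on stdin and print, tab-separated,
# command, pid, access mode and path for every file under /var that is
# open for writing (w) or for reading and writing (u).
var_writers() {
    awk '
        function emit() {
            if (name ~ /^\/var\// && (mode == "w" || mode == "u"))
                printf "%s\t%s\t%s\t%s\n", cmd, pid, mode, name
            mode = ""; name = ""
        }
        /^p/ { emit(); pid = substr($0, 2); cmd = "" }   # new process
        /^c/ { cmd = substr($0, 2) }
        /^f/ { emit() }                                  # new open file
        /^a/ { mode = substr($0, 2) }
        /^n/ { name = substr($0, 2) }
        END  { emit() }
    '
}

# Usage (as root):
#   lsof -w -Fpcan | var_writers
```

Memory-mapped files and sockets are not covered by this filter, so treat the result as a hint and not as proof that /var is quiet.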

Well, if you are SURE no one but you can access the system at this point, you "might" risk losing a few seconds of /var (logs and other stuff), but the system will recover ok. So, ignoring the services writing to /var, just do:

# cp -ax /var/* /mnt/var

# ls -l /dev/disk/by-uuid
to get the UUID to use in fstab

# vi /etc/fstab
Add the line:

UUID=181839181821...bla...bla...bla   /var   ext4   defaults   0   0
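
To avoid copying the UUID by hand, the fstab line could be generated instead. A minimal sketch: fstab_line is a made-up helper, the field values mirror the line above, and the commented usage assumes blkid is available (its -s UUID -o value options print just the UUID).

```shell
# Print an fstab entry for a filesystem identified by UUID.
# Fields: device, mount point, type, options, dump, fsck pass.
fstab_line() {
    uuid=$1
    mountpoint=$2
    fstype=${3:-ext4}
    printf 'UUID=%s\t%s\t%s\tdefaults\t0\t0\n' "$uuid" "$mountpoint" "$fstype"
}

# On the real system (as root), review the output first, then append it:
#   fstab_line "$(blkid -s UUID -o value /dev/sda5)" /var
#   fstab_line "$(blkid -s UUID -o value /dev/sda5)" /var >> /etc/fstab
```

A 2 in the last field instead of 0 would let the boot-time fsck check this filesystem after the root filesystem.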

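Just update the new /var one last time before the reboot. Mind the trailing slashes: they copy the contents of /var into /mnt/var; without them rsync creates /mnt/var/var.

# rsync -aihu --stats --progress --delete /var/ /mnt/var/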

# reboot
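
Because --delete removes files on the destination, a dry run just before the final rsync (and the reboot) can show what would change without touching anything. A sketch, assuming rsync is installed; sync_preview is a made-up name, and the trailing slashes it adds make rsync copy the contents of the source directory instead of the directory itself.

```shell
# Preview a final sync: -n is a dry run, -a is archive mode (permissions,
# owner, times, symlinks), -i itemizes each change. Nothing is modified.
sync_preview() {
    src=${1:-/var}
    dst=${2:-/mnt/var}
    rsync -ain --delete "${src%/}/" "${dst%/}/"
}
```

Lines starting with *deleting are files that exist only on the destination. Seeing a large number of them is a sign that the source and destination paths do not line up.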

---------

The system will lose whatever was written to /var in the seconds between your last rsync and the reboot. It will still boot ok, unless something very critical happens in that window. It's your system, so you know best whether you can ignore these "seconds". (Is apache running at full load, or is anything else putting heavy demand on the server? Then you should avoid this and use single-user mode instead. Google for it.)

-------

System back online, time to run all the "check" commands again. Especially:

# mount -l

And after checking everything, your system is ready. (I did a last "clean" reboot just to check dmesg, all ok)


I hope this email helps anyone searching for a hot change of partition remotely.

It was difficult to gather information on this process.


Thanks to all who helped.
Beco

PS. I'll do the same with /tmp now. And /tmp is far less critical than var, so that should be ok.



--
Dr Beco
A.I. researcher

"I know you think you understand what you thought I said but I'm not sure you realize that what you heard is not what I meant" -- Alan Greenspan

Creation date: pgp.mit.edu ID as of 2014-11-09
