
Re: priority of find



Jerry Quinn wrote:

Steve Lamb writes:
> On Tue, 09 Sep 2003 01:07:15 +0200
> Joachim.Klamann@t-online.de (joachim klamann) wrote:
> > I'm using Debian 3.0 on a laptop with 64 MB RAM and a 4 GB hard disk.
> > Processor is Intel Celeron 366. Every time I boot >>find<< is running,
Most likely because you have anacron installed and it sees that it's been over 24 hours since /etc/cron.daily/find has been run.

> > consuming most of the power and blocking every other process. How
> > necessary is find? Do I need it? And if I need it, how can I change
> > priority from +10 to let's say -10? And if I do this, will the system be
> > running stably?

If you've never found yourself typing 'locate <somefilepattern>', then it is probably not very useful to you. You could simply remove the find file from /etc/cron.daily (I'd move it to /root).

I am not aware of any programs that use the updatedb database other than the locate command. If there are some, then maybe it is more important than I thought. If you wanted to use locate occasionally, you would just run the command and it would warn you that the database is old. You could then choose to run updatedb and wait a minute for it to update. It might go faster when you run it by hand than it does at boot because everything else is already loaded and you're sitting waiting for one process, updatedb.

As for 'how to nice find', you could modify /etc/cron.daily/find to include "nice -n 19" when it calls updatedb, but as Jerry points out this may not help things much. Updatedb is scanning over every inode outside of the /etc/updatedb.conf PRUNEPATHS list, while at the same time your other computing needs are asking for the contents of specific files.

This line in /etc/cron.daily/find:
 cd / && updatedb 2>/dev/null
becomes:
 cd / && nice -n 19 updatedb 2>/dev/null


Nice doesn't work very well here because find is hammering the disk.
So anything you try to do works passably until it wants something from
disk as well.  Although, with only 64mb, it's possible that regular
processes are getting swapped out and block trying to swap back in
when you want to do something.

>     This is updatedb updating the locate database.  It runs daily.  You can
> adjust this if you like but know that your locate db will not be as up to date
> if you set it to a longer interval.

It's nice to have the db update run, but the main issue is that it
hammers the disk bringing everything to a standstill.  I notice this
especially on my laptop, a ThinkPad T30.  This is a reasonably fast
machine, but the disk is slow (yes I have DMA on).
What we need is a way to nice disk activity, à la nice for CPU.

Jerry Quinn


If the main issue for laptop users with their slower drives is that the update is nice to have but hammers the disk, then the following line of thought might be a quicker solution than a "nice for disk activity", unless that already exists.

On the distributions that I've used, the updatedb program by default runs from / and excludes a few temp file directories. My home gateway machine cranks away at this dutifully every morning and I wonder, "Why am I scanning the whole system every day when practically nothing changes, especially in /usr?" Since it is a desktop machine and I'm not using it at that time, the thought process has always stopped there, well before the action stage. Now I'm thinking about it.

How often do /etc and /usr (excluding /usr/local) change? They are primarily managed by the distribution's package system, and /usr is huge.

The same goes for / excluding /etc, /usr, much of /var and /home. Root is pretty stable, changing only with base system and kernel updates.

For a non-server system, do I really care to find the daily changes in /var? Aren't I mostly interested in the latest changes in /home? To stretch this further, couldn't I use my package system to find where distribution files _should_ be? (I don't think I would go this far, but it could be used in place of locate to some extent.)

So, rewrite /etc/cron.daily/find to run updatedb only on /home. Grab a copy of it and put it in /etc/cron.weekly, modified for /var and /usr/local, and another in /etc/cron.monthly, modified for / excluding /usr/local, /home and /var. Have each run of updatedb create its own db file in /var/cache/locate. Then in your user .bash_profile (and in the /etc/skel/ copy for future users) put: export LOCATE_PATH=/var/cache/locate/homedb:/var/cache/locate/rootdb:<etc>. Adjust the paths and timings based on your needs.
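A rough sketch of that split, assuming the GNU findutils updatedb (the --localpaths, --prunepaths and --output options belong to that version; other locate implementations use different flags). The database names homedb, vardb and rootdb and the helper functions tier_command and locate_path are made up for illustration. The helpers only print the commands, so the split can be checked before anything goes into cron:

```shell
#!/bin/sh
# Sketch only: database names and the directory split are illustrative.
DBDIR=${DBDIR:-/var/cache/locate}

# Print (do not run) the updatedb command for one tier.
# usage: tier_command <dbname> <paths to index> [paths to prune]
tier_command() {
    db=$1 paths=$2 prune=${3:-}
    cmd="nice -n 19 updatedb --localpaths='$paths' --output='$DBDIR/$db'"
    # Only pass --prunepaths when there is something to exclude.
    [ -n "$prune" ] && cmd="$cmd --prunepaths='$prune'"
    printf '%s\n' "$cmd"
}

# Join database names into a colon-separated LOCATE_PATH value.
locate_path() {
    out=
    for db in "$@"; do
        out=${out:+$out:}$DBDIR/$db
    done
    printf '%s\n' "$out"
}

# Possible tiers (adjust to taste):
#   tier_command homedb /home                          # cron.daily
#   tier_command vardb  '/var /usr/local'              # cron.weekly
#   tier_command rootdb / '/home /var /usr/local /proc /tmp'   # cron.monthly
#   export LOCATE_PATH=$(locate_path homedb vardb rootdb)
```

Printing the command first is just a precaution; once the output looks right, the same line can be pasted into the matching cron script.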

Even cooler would be to keep the cron.monthly updatedb calls in cron.daily but conditional on some file that lets you determine that new packages have been installed since the last time the 'large' update locations have been run so they only run sometime after installing something and not every day.
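One way the conditional could look, as a sketch. It assumes a Debian-style system where /var/lib/dpkg/status is rewritten whenever packages are installed or removed; the stamp file path and the function name needs_reindex are hypothetical. The -nt test is not strictly POSIX but is supported by bash, dash and ksh:

```shell
#!/bin/sh
# needs_reindex <stamp> <marker>
# Succeeds when the stamp file is missing or <marker> is newer than it.
needs_reindex() {
    stamp=$1 marker=$2
    [ ! -e "$stamp" ] || [ "$marker" -nt "$stamp" ]
}

# Hypothetical use inside /etc/cron.daily/find:
#   STAMP=/var/cache/locate/rootdb.stamp
#   if needs_reindex "$STAMP" /var/lib/dpkg/status; then
#       (run the 'large' updatedb here) && touch "$STAMP"
#   fi
```

Touching the stamp only after a successful run means a failed or interrupted update gets retried the next day.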

Another idea would be to remove anacron from all runlevels in /etc/init.d and run it by hand. That would be easy to forget to do, so it is pretty low on the totem pole of good ideas. You could also uninstall anacron and just accept that it is hit-or-miss whether your laptop happens to be on when the normal cron would be running run-parts.

With a laptop, you generally want to boot and get to the business at hand. The fewer daemons and other tasks that start at boot, the quicker you're doing 'your' stuff. If you aren't plugged in, the anacron init.d script already has a provision to not start the anacron daemon. It would be nice if there were a way to have it wait ten or so minutes before kicking in, and even better to not start or run the next item on its list until system activity is below some threshold, like batch or nqs are supposed to do. This delay shouldn't go into the init.d script. You want that to return as quickly as possible.
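A crude version of that "wait until things are quiet" idea can be done in the cron script itself. This is only a sketch, assuming Linux's /proc/loadavg; the function names and the threshold are made up. On Linux the load average also counts processes blocked on disk I/O, so it is a passable stand-in for "the disk is busy":

```shell
#!/bin/sh
# load_below <threshold> [loadavg-file]
# Succeeds when the 1-minute load average (first field) is below <threshold>.
load_below() {
    awk -v max="$1" '{ exit !($1 + 0 < max + 0) }' "${2:-/proc/loadavg}"
}

# wait_for_idle <threshold> <seconds between checks>
# Blocks until load_below succeeds.
wait_for_idle() {
    until load_below "$1"; do
        sleep "$2"
    done
}

# Hypothetical use: give the user ten minutes after boot, then wait for a
# quiet moment before indexing, all in the background:
#   ( sleep 600; wait_for_idle 0.5 60; nice -n 19 updatedb ) &
```

This only delays the start; once updatedb is running it will still compete for the disk.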

To recap, the biggest speed-up to /etc/cron.daily/find would be to not index /usr (excluding /usr/local) every day. The contents of that path should only be changing when software (specifically distro software) is installed into the system. It would also be nice to have an option for disk- or CPU-intensive programs run from (ana)cron to back off to a near halt when there is user activity on non-server systems, or to not start until a certain threshold of system inactivity is met, like batch or nqs claim to do (the batch doc recommends nqs), and especially to not run full-tilt right when the system boots.

Jacob


