
Re: No space left on device (28) but device is NOT full!



On Tue, Nov 05, 2013 at 07:15:19PM +0400, Reco wrote:
> On Tue, Nov 05, 2013 at 02:29:10PM +0000, Jonathan Dowland wrote:
> > On Tue, Nov 05, 2013 at 03:13:10PM +0400, Reco wrote:
> > > find . -type f -name 'popularity-*' -print0 | xargs -0rn 20 rm -f
> > 
> > I idly wonder (don't know) to what extent find might parallelize the
> > unlinks with -delete. A cursory scan of the semantics would suggest it
> > could potentially do so: it's not clear that a single unlink failing
> > should stop future unlinks (merely spew errors and consider the -delete
> > operation as a whole to have failed)
> 
> xargs parallelism is optional. The point is that you have one process
> which finds files, and another one (or another group of) who are
> deleting files. Helps utilizing multiple cpus.

I know about xargs and parallelism. I was wondering whether find
implemented parallelism internally, when it could, and afaics the
semantics of -delete do not preclude it doing so. I did not investigate
whether it does, but…

> $ time find -type f -delete
…
> real    4m27.799s

…suggests it doesn't. (I'm appalled by that!)
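
For comparison, the explicit way to get parallel unlinks is still to hand
the file names to xargs and let it run several rm processes at once. A
minimal sketch, assuming GNU find and xargs (-print0, -0, -r and -P are
extensions, not POSIX); the function name parallel_rm and the default of
4 jobs are my own illustration, not anything find provides:

```shell
#!/bin/sh
# Sketch only: delete files matching a name pattern using several rm
# processes in parallel. parallel_rm is a hypothetical helper name.
parallel_rm() {
    dir=$1          # directory to search
    pattern=$2      # find -name pattern, e.g. 'popularity-*'
    jobs=${3:-4}    # number of concurrent rm processes (assumed default)

    # find produces NUL-separated names; xargs batches them 20 at a time
    # and keeps up to $jobs rm processes running concurrently.
    find "$dir" -type f -name "$pattern" -print0 |
        xargs -0 -r -n 20 -P "$jobs" rm -f
}

# Example (on a scratch directory):
#   parallel_rm /tmp/scratch 'popularity-*' 8
```

Whether this beats a single process will depend on the filesystem, since
unlinks in one directory may well serialize in the kernel anyway.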

> It's not the binary size which matters, it's the algorithm:

The binary size affects the initial load-up time which, for small
numbers of files/short execution times, may be the lion's share of
the total execution time. However, as you point out, for orders of
magnitude like 500,000 it's dwarfed by the algorithm.

I'm quite amazed how much faster your perl implementation was. I
can only imagine that nobody has ever been troubled by find's
performance enough to work on it. This points to find not taking
advantage of parallelism (and also to potential improvements in
speed even for your perl implementation).

> Basically, the difference is in the fact that find uses fstatat64
> syscall for each file, and this perl one-liner uses lstat64 and stat64
> syscalls. Use strace to check it in your environment.  On another OS
> results could be different.

So you believe the discrepancy is entirely down to the difference
between fstatat64 and lstat64/stat64? I find that hard to believe. I
suspect find is just not very efficient.
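
Counting the syscalls each approach makes, and the time spent in them,
would be one way to check. A rough sketch, assuming strace is installed
and tracing is permitted; syscall_summary is a hypothetical helper name,
and the numbers will differ by kernel, filesystem and architecture:

```shell
#!/bin/sh
# Sketch only: print strace's per-syscall count/time table for a command.
# syscall_summary is a hypothetical helper, not part of strace.
syscall_summary() {
    if ! command -v strace >/dev/null 2>&1; then
        echo "strace not found" >&2
        return 127
    fi
    out=$(mktemp) || return 1
    # -c: summarise calls, errors and time per syscall
    # -f: follow child processes
    # -o: write the summary to a file instead of stderr
    strace -c -f -o "$out" "$@"
    status=$?
    cat "$out"
    rm -f "$out"
    return "$status"
}

# Example (run on a scratch copy, since this really deletes the files):
#   syscall_summary find . -type f -delete
```

Running the same helper on the perl one-liner and comparing the rows for
the stat-family calls and unlink should show where the time goes.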

