[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Re: Bug#605009: serious performance regression with ext4



Hi Ted!

On Sun, 2010-11-28 at 23:11:52 -0500, Ted Ts'o wrote:
> I did some experimenting, and I figured out what was going on.  You're
> right, (c) doesn't quite work, because delayed allocation meant that
> the writeout didn't take place until the fsync() for each file
> happened.  I didn't see this at first; my apologies.

Thanks for the analysis.

> However, this *does* work:
> 
>     extract(a);
>     sync_file_range(fd.a, 0, 0, SYNC_FILE_RANGE_WRITE); 
>     extract(b.dpkg-new);
>     sync_file_range(fd.b, 0, 0, SYNC_FILE_RANGE_WRITE); 
>     extract(c.dpkg-new);
>     sync_file_range(fd.c, 0, 0, SYNC_FILE_RANGE_WRITE); 

The man page and the kernel sources seem to indicate this might block
depending on the size of the request queue?

>     sync_file_range(fd.a, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE); 
>     sync_file_range(fd.b, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE); 
>     sync_file_range(fd.c, 0, 0, SYNC_FILE_RANGE_WAIT_BEFORE); 

>     fdatasync(a);
>     fdatasync(b.dpkg-new);
>     fdatasync(c.dpkg-new);
> 
>     rename(b.dpkg-new, b);
>     rename(c.dpkg-new, c);

> What's going on here?  sync_file_range() is a Linux specific system
> call that has been around for a while.  It allows program to control
> when writeback happens in a very low-level fashion.  The first set of
> sync_file_range() system calls causes the system to start writing back
> each file once it has finished being extracted.  It doesn't actually
> wait for the write to finish; it just starts the writeback.

Hmm, ok so what about posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED)
instead, skimming over the kernel source seems to indicate it might
end up doing more or less the same thing but in a portable way?

Could someone with ext4/btrfs/xfs/etc test w/ and w/o the attached patch
against dpkg?

thanks,
guillem
diff --git a/src/archives.c b/src/archives.c
index a2cba6a..a94096f 100644
--- a/src/archives.c
+++ b/src/archives.c
@@ -683,6 +683,9 @@ tarobject(void *ctx, struct tar_entry *ti)
                _("backend dpkg-deb during `%.255s'"),
                path_quote_filename(fnamebuf, ti->name, 256));
     }
+
+    posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
+
     r = ti->size % TARBLKSZ;
     if (r > 0)
       if (safe_read(tc->backendpipe, databuf, TARBLKSZ - r) == -1)

Reply to: