Re: hwcap supporting architectures?

On Mon, Jan 17, 2005 at 05:52:04PM +0900, GOTO Masanori wrote:

 > > > > Yes, and if ev67 is instruction upper compatible with ev56 (I
 > > > > guess so), I think it's acceptable to add a symlink "ln -sf
 > > > > lib/ev67/libfoo.so lib/ev56/libfoo.so".
 > > > 
 > > > Ugh... that pushes the burden of maintaining support for new
 > > > architectures to the package.
 > Yeah - I think it's a trade-off: either we support optimized library
 > packages or we give up a bit of performance improvement.

 So, you are trading maintenance cost for a rather subjective speed
 improvement?  Or should I say, preventing some performance degradation?

 Keep reading.

 > > >  Please bear with me, but I'm trying to understand the issue: is
 > > >  the cost of calling access(2) or stat(2) really so high?
 > > 
 > > I'd consider it quite acceptable in this case. However, as I tried
 > > to express, it's not possible with glibc's current "design", and I
 > > didn't feel like changing that.
 > Note that we should keep in mind: imagine most binaries on all Debian
 > systems over the world start to pay the access(2)/stat(2) system call
 > cost on every execution - "Many a little makes a mickle".

 Ok, I stopped buying this kind of argument long ago.  There's a
 SIGGRAPH paper (2001 IIRC) which justifies a certain kind of rather
 complex optimization because a (graphics) context switch is "too
 expensive", without actually defining the situation that triggers the
 context switch in a clear fashion.  In my own testing context switches
 of the kind described in that paper are at least a factor of 100
 _faster_ than what the authors claim.

 Attached is a program that measures the time a single stat(2) call
 takes.  I get circa 5 microseconds per stat(2) call on my computer (AMD
 Athlon 1600+, can't recall what kind of memory it has right now). Note
 that the code that doesn't directly have to do with the stat(2) has a
 rather low overhead (circa 1 ns on my system).

 What that means is that you need to make about 2000 stat(2) calls
 (roughly 10 ms in total) to get _anywhere_ near what's measurable by a
 human and about 20000 (roughly 100 ms) to start getting said human
 annoyed.

 If a biggish GNOME program (Epiphany Browser) links to 60 libraries,
 you need to perform a lookup in ~ 30 paths for the start up delay to be
 measurable and ~ 300 for it to be annoying.  ls(1) links to 6
 libraries.  That's one order of magnitude less, IOW, you need a path
 with ~ 3000 components to start being annoying.
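
 The arithmetic behind those estimates can be sketched as follows.  This
 is only an illustration: the helper names are hypothetical, the 5 us
 figure is the measurement quoted above, and the 10 ms and 100 ms
 thresholds are simply 2000 and 20000 calls multiplied by 5 us.  It also
 assumes the worst case of one stat(2) per library per search path.

```cpp
// Assumed figures, derived from the numbers quoted in this message.
const double kStatCostSec   = 5e-6;    // ~5 us per stat(2) call (measured)
const double kNoticeableSec = 10e-3;   // 2000 calls * 5 us
const double kAnnoyingSec   = 100e-3;  // 20000 calls * 5 us

// Hypothetical helper: extra start-up time if every library is
// probed once in every search path, one stat(2) per probe.
double startup_delay(int libraries, int paths)
{
    return libraries * paths * kStatCostSec;
}

// Hypothetical helper: number of search paths needed before the
// accumulated delay reaches the given threshold.
double paths_for_delay(int libraries, double threshold_sec)
{
    return threshold_sec / (libraries * kStatCostSec);
}
```

 For instance, startup_delay(60, 30) comes to about 9 ms, and
 paths_for_delay(6, kAnnoyingSec) to a bit over 3000, which matches the
 orders of magnitude given above.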

 So, what exactly are you talking about?

 > > > I see for example that on start up the file /etc/ld.so.nohwcap is
 > > > accessed multiple times (and it's not present, isn't that a race?
 > > > what happens if the file suddenly appears in the middle of
 > > > program start up? what's that file anyway, I can't find it
 > > > mentioned in the documentation).
 > > 
 > > It's supposed to disable the use of hwcaps. Stating it multiple
 > > times seems like a bug.

 The contents do not matter?

 > Debian glibc carries a special patch that checks
 > /etc/ld.so.nohwcap each time before loading libraries.  You can see
 > it in the debian-glibc package, ldso-disable-hwcap.dpatch, written by
 > Ben and Daniel.  It enables us to upgrade smoothly even if we use
 > optimized libraries - this effort is one of Debian's nice features.
 > But the drawback is that it has to pay the access(2) lookup cost, as
 > you pointed out.
 > Checking /etc/ld.so.nohwcap each time (some binaries check it
 > multiple times) is the current patch design.

 Why?  I just can't see a valid reason for "wanting" the file to
 suddenly pop up while the program is running.

 > I think this is safer than checking /etc/ld.so.nohwcap once in
 > program startup time.

 Safer in what way?
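
 To make the two designs under discussion concrete, here is a minimal
 sketch of "check once" versus "check each time".  It is an assumption
 for illustration only, not the code in ldso-disable-hwcap.dpatch; the
 function names are hypothetical, and only the existence of the marker
 file is tested, never its contents.

```cpp
#include <unistd.h>

// Hypothetical "check once" variant: the result of the first access(2)
// call is cached for the lifetime of the process, so a file that
// appears (or disappears) later is not noticed.
bool hwcap_disabled_cached(const char *marker = "/etc/ld.so.nohwcap")
{
    static int cached = -1;            // -1 means "not checked yet"
    if (cached < 0)
        cached = (access(marker, F_OK) == 0) ? 1 : 0;
    return cached == 1;
}

// Hypothetical "check each time" variant: one access(2) call per
// query, so a file that appears mid-run changes the answer.
bool hwcap_disabled_uncached(const char *marker = "/etc/ld.so.nohwcap")
{
    return access(marker, F_OK) == 0;
}
```

 The cached variant costs one system call per process; the uncached one
 costs one per library load, which is the overhead being debated here.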

 Again, I just don't buy that "system calls are too expensive" argument.
 Anyone writing shell scripts cares about a whole lot of things *but*
 performance.  And I'm not talking about increasing running time by a
 factor of anything, I'm talking about adding a bunch of microseconds,
 which get lost in the middle of filesystem stalls, page faults and
 other rather common events.

#include <cmath>
#include <cstdio>
#include <ctime>
#include <sys/stat.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

int main(int argc, char * argv[])
{
    const int N = 6;
    char name[N+1];

    // Start with the file name "000000".
    for(int i=0; i < N; ++i)
        name[i] = '0';
    name[N] = 0;

    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    // Count through all 10^N names like an odometer, calling stat(2) on each.
    for(int i=0; i < N;)
    {
        struct stat buf;
        stat(name, &buf);
        // Increment the name; a digit that overflows wraps to '0' and carries.
        for(i=0; i != N && ++name[i] == '9'+1; ++i)
            name[i] = '0';
    }
    gettimeofday(&t1, NULL);

    float dt = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec)*1E-6;

    // Average time per stat(2) call, in seconds.
    printf("%g\n", dt/powf(10, N));

    return 0;
}
