
Re: Atlas proposal

On Tue, 2010-08-17 at 23:56 +0200, Sylvestre Ledru wrote:
> On Tuesday 17 August 2010 at 22:45 +0100, Roger Leigh wrote:
> > Disabling threading is also suspect: how can the optimal number of
> > threads possibly be determined at build time?  This should also be
> > configurable or at least auto-detectable at runtime.
> Copied and pasted from the FAQ:
> http://math-atlas.sourceforge.net/faq.html#tnum
> "Can I vary the number of threads ATLAS uses dynamically?
> No. The maximum number of threads to use is determined at compile time.
> ATLAS will never use more than this, but may use less if the problem
> sizes are too small to get speedup from the additional parallelism."

Can we set a large maximum at build time and then reduce it at run-time
based on the number of hardware threads found at run-time?  Or does it
do that already?  Given the phrase 'may use less', it is clear that the
code does support varying the number of threads used at run-time.
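
To make the question concrete, here is a minimal sketch of the capping
I have in mind.  It is not ATLAS code and does not use any ATLAS API:
BUILD_TIME_MAX_THREADS and effective_threads() are hypothetical names,
and the value 64 is an arbitrary stand-in for "a large maximum".

```python
import os

# Hypothetical stand-in for the maximum fixed when the library is built.
BUILD_TIME_MAX_THREADS = 64


def effective_threads(build_max=BUILD_TIME_MAX_THREADS, detected=None):
    """Return the thread count to use: never more than the build-time
    maximum, never more than the hardware threads found at run-time."""
    if detected is None:
        # os.cpu_count() may return None if the count is undeterminable.
        detected = os.cpu_count() or 1
    return max(1, min(build_max, detected))


if __name__ == "__main__":
    print(effective_threads())
```

Whether ATLAS can actually be made to do this depends on how the
build-time maximum is baked into its generated kernels, which I have
not checked.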

> > In short, Atlas' approach to optimisation by detecting everything at
> > build time is wrong.  Rather than working around this limitation by
> > totally crippling the library to work on a least-common-denominator
> > system by removing all optimisations and threading, it should be
> > actually fixed, probably best if done in collaboration with upstream.
> OK, I forgot to add something.
> I know upstream is doing it "wrong" from our distro point of view.
> However, I am not upstream, I don't plan to patch atlas to manage this
> and I don't think upstream is interested in it. It is not the approach
> of upstream and I don't think the current build system will allow the
> introduction of such features easily.

The dynamic linker does the run-time selection for you.  All you need to
do is to install the optimised libraries in subdirectories that specify
the hardware they require.  Currently the following platform and
capability flag names are recognised for i386:

    "i386", "i486", "i586", "i686",
    "fpu", "vme", "de", "pse", "tsc", "msr", "pae", "mce",
    "cx8", "apic", "10", "sep", "mtrr", "pge", "mca", "cmov",
    "pat", "pse36", "pn", "clflush", "20", "dts", "acpi", "mmx",
    "fxsr", "sse", "sse2", "ss", "ht", "tm", "ia64", "pbe"

Use nested subdirectories to specify multiple flags.  The library in the
most specific directory (i.e. the one which selects the most flags, all
satisfied by the current hardware) will be used.
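
The rule above can be modelled in a few lines.  This is only a
simplified sketch of the selection as I described it, not the dynamic
linker's real search code: pick_library() is a hypothetical helper, the
soname "libatlas.so.3" and the directory layout are made up for
illustration, and ties are broken by input order here, whereas the real
linker uses its own fixed ordering.

```python
def pick_library(candidates, hw_flags, soname):
    """Pick the candidate path whose directory components are all present
    in hw_flags and which names the most flags.

    candidates: paths relative to the library directory, for example
                "i686/sse2/libatlas.so.3" (hypothetical layout).
    hw_flags:   set of platform/capability names the current CPU supports.
    soname:     file name of the library being looked up.
    """
    best = None
    best_score = -1
    for path in candidates:
        parts = path.split("/")
        if parts[-1] != soname:
            continue
        flags = parts[:-1]  # one directory level per required flag
        if all(flag in hw_flags for flag in flags) and len(flags) > best_score:
            best, best_score = path, len(flags)
    return best


if __name__ == "__main__":
    libs = [
        "libatlas.so.3",              # generic fallback, no flags required
        "i686/libatlas.so.3",
        "i686/sse2/libatlas.so.3",
        "i686/sse2/ht/libatlas.so.3",
    ]
    # A CPU with i686 and sse2 but without ht cannot use the last entry.
    print(pick_library(libs, {"i686", "sse", "sse2"}, "libatlas.so.3"))
```

The unoptimised build in the top-level directory stays as the fallback
for hardware that satisfies none of the subdirectories.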


Ben Hutchings
Once a job is fouled up, anything done to improve it makes it worse.
