
Re: atlas3 on hppa: unexpected reloc type



Greetings!  Am forwarding this from the atlas list.  Clint, did you
see any explicit benefit to the -mpa-risc-2-0 flag as compared to the
earlier abi versions or the gcc default?  The prefetch is an obvious
win, though in the less important cases.  To me the worth of an extra
build might hinge on the performance difference in dgemm with and
without the pa-risc-2-0 gcc flag.

Take care,

=============================================================================
Guys,

OK, I got access to an hp 9000/800, whatever the hell that means.  Because I
find the whole platform completely confusing (how do the CPU types map to
the model names, and how do these relate to the PA-RISC numbers, and
are all cpus of a given PA-RISC level basically the same chip with differing
clock speeds and cache sizes?), I was unable to add good config support,
or anything else.  However, I found out a few things that might be helpful
if we ever want to support these guys explicitly.  I give it below in case
anyone wants to play with it themselves, and as something we can go back to if
we decide it's worth the trouble to support.

One piece of good news is that gcc can get the same or better performance
than HP's cc.  For me, the magic flags were: "-mpa-risc-2-0 -O3".

From reading, it appears that PA-RISC 2.0 has prefetch, but 1.1 did not.
Here's an appropriate cut of atlas_prefetch.h:

#elif defined(ATL_PARISC2) && defined(__GNUC__)
      #define ATL_pfl1R(mem) \
         __asm__ __volatile__ ("ldw %0, %%r0" : : "m" (*((char *)(mem))))
      #define ATL_pfl1W(mem) \
         __asm__ __volatile__ ("ldd %0, %%r0" : : "m" (*((char *)(mem))))
#elif defined(ATL_AltiVec)

You need to define ATL_PARISC2 in your make.inc by hand; config will
not detect it for you.  These guys definitely speed up the Level 1; I didn't
get great results quickly trying them for the Level 3, and didn't check
with Level 2.
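
For anyone who wants to experiment, here is a rough sketch of how a read
prefetch like ATL_pfl1R might be dropped into a Level 1 style loop.  This
is not ATLAS code: the name sketch_ddot, the PF_DIST distance, and the
no-op fallback branch are all made up for illustration, and the "once per
four doubles" stride is a guess rather than a measured line size.  Treat
it as a starting point for timing, not as a tuned kernel.

```c
#include <stddef.h>

/* The PA-RISC branch mirrors the macro quoted above.  The #else branch is
 * an assumed no-op fallback so the sketch also compiles elsewhere. */
#if defined(ATL_PARISC2) && defined(__GNUC__)
   #define ATL_pfl1R(mem) \
      __asm__ __volatile__ ("ldw %0, %%r0" : : "m" (*((char *)(mem))))
#else
   #define ATL_pfl1R(mem) ((void)(mem))
#endif

/* Hypothetical prefetch distance, in doubles; it would need tuning. */
#define PF_DIST 16

/* Hypothetical dot product with read prefetch on both input vectors. */
double sketch_ddot(size_t n, const double *x, const double *y)
{
   double dot = 0.0;
   size_t i;

   for (i = 0; i < n; i++)
   {
      /* Issue a prefetch once per 4 doubles, and only while the
       * prefetched address is still inside the vectors. */
      if ((i & 3) == 0 && i + PF_DIST < n)
      {
         ATL_pfl1R(x + i + PF_DIST);
         ATL_pfl1R(y + i + PF_DIST);
      }
      dot += x[i] * y[i];
   }
   return dot;
}
```

A real kernel would presumably unroll the loop so the prefetch test is not
evaluated on every iteration; the branch is kept here only to make the
sketch short.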

A good dgemm case can be had with the following parameters:
   make mmcase mu=6 nu=3 nb=72 ku=72 MCC=gcc MMFLAGS="-mpa-risc-2-0 -O3"
The search didn't find this case automatically for me (you need to time it
in-cache to reliably see that it beats the one the search picks).

Regards,
Clint

P.S.: Camm, if anyone on that list is interested in such esoterica, feel
      free to forward this on to your HP list.


_______________________________________________
Math-atlas-devel mailing list
Math-atlas-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/math-atlas-devel



=============================================================================

Randolph Chung <tausq@debian.org> writes:

> > Keep in mind the original PA20 book had the prefetch for "read only"
> > mixed up with the prefetch for "read/write". There should be an
> > errata to document the correction. I don't know which version of the
> > parisc2.0.pdf has it corrected.
> 
> the one linked from the website is correct; it notes the errata as a
> comment in the pdf.
> 
> randolph
> -- 
> Randolph Chung
> Debian GNU/Linux Developer, hppa/ia64 ports
> http://www.tausq.org/
> 
> 
> 

-- 
Camm Maguire			     			camm@enhanced.com
==========================================================================
"The earth is but one country, and mankind its citizens."  --  Baha'u'llah


