
Re: 64-bit subtract from vector unsigned int



On Wed, Apr 8, 2020 at 7:31 AM Mathieu Malaterre <malat@debian.org> wrote:
>
> Jeffrey,
>
> On Wed, Apr 8, 2020 at 11:56 AM Jeffrey Walton <noloader@gmail.com> wrote:
> >
> > On Tue, Apr 7, 2020 at 8:27 AM Lennart Sorensen
> > <lsorense@csclub.uwaterloo.ca> wrote:
> > >
> > > On Tue, Apr 07, 2020 at 05:51:54AM -0400, Jeffrey Walton wrote:
> > > > Hi Everyone,
> > > >
> > > > I'm porting a 64-bit algorithm to 32-bit PowerPC (an old PowerMac).
> > > > The algorithm is simple when 64-bit is available, but it gets a little
> > > > ugly under 32-bit.
> > > >
> > > > PowerPC has a "Vector Subtract Carryout Unsigned Word" (vsubcuw),
> > > > https://www.nxp.com/docs/en/reference-manual/ALTIVECPEM.pdf. The
> > > > altivec intrinsics are vec_vsubcuw and vec_subc.
> > > >
> > > > The problem is, I don't know how to use it. I've been experimenting
> > > > with it but I don't see the use (yet).
> > > >
> > > > How does one use vsubcuw to implement a subtract with borrow?
> > >
> > > Does your 32 bit powerpc have altivec?  A lot do not.  It is certainly
> > > not a universal feature.  As far as I remember, G4 and G5 powermacs have
> > > it, but nothing older.
> >
> > Yes, this is an old PowerMac G4 with Power4. It has an Altivec unit,
> > but it is only 32-bit. Add, subtract, shift and rotate (and friends)
> > on 64-bit values are missing.
> >
> > As old as the hardware is (circa 2000), that old PowerPC chip
> > outperforms some modern hardware, like Atoms, Celerons and low-end ARM
> > CPUs in modern gadgets.
> >
> > Testing some algorithms, like Simon-128 and Speck-128, shows a need for
> > Altivec. For example, integer-based Speck-128 was running at about 70
> > cpb. Altivec-based Speck-128 dropped to 10 cpb even with me doing all
> > the 64-bit fixups. (Speck-128 runs around 2.5 cpb when the native
> > hardware supports 64-bit operations, like on Power8).
>
> [Somewhat off-topic here.]
>
> Did you ever try crc32 with altivec? crc32 with altivec in the
> kernel is only for ppc64.

No. I think the CRC32 support comes from Power8 and in-core crypto
using polynomial multiplies. Here's the fellow who has the reference
implementation and tutorial:
https://github.com/antonblanchard/crc32-vpmsum.

I don't use CRC32 much. I do have GCM mode using polynomial multiplies
(along with Power8 AES). It runs around 1.3 cpb.
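
Back on the original vsubcuw question, here is a minimal scalar sketch of
how the carry-out can drive a 64-bit subtract built from 32-bit words. It
is a model of one lane pair in plain C, not AltiVec code. The names
subc_u32, u64_words, sub64_words, join and split are hypothetical helpers
made up for illustration. The assumption is that vsubcuw yields the
carry-out of a + ~b + 1 in each word, so 1 means "no borrow" and 0 means
"borrow".

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one vsubcuw lane (assumed semantics): carry-out of
   a + ~b + 1, i.e. 1 when a >= b (no borrow), 0 when a < b (borrow). */
static uint32_t subc_u32(uint32_t a, uint32_t b)
{
    return (a >= b) ? 1u : 0u;
}

/* Hypothetical layout: a 64-bit value as two 32-bit words, high first. */
typedef struct { uint32_t hi, lo; } u64_words;

static u64_words sub64_words(u64_words a, u64_words b)
{
    u64_words r;
    uint32_t carry  = subc_u32(a.lo, b.lo); /* what the carry-out reports */
    uint32_t borrow = 1u - carry;           /* invert: 1 means borrow     */
    r.lo = a.lo - b.lo;                     /* modular 32-bit subtract    */
    r.hi = a.hi - b.hi - borrow;            /* propagate borrow upward    */
    return r;
}

static uint64_t join(u64_words w)
{
    return ((uint64_t)w.hi << 32) | w.lo;
}

static u64_words split(uint64_t v)
{
    u64_words w = { (uint32_t)(v >> 32), (uint32_t)v };
    return w;
}

/* Compare the word-wise model against native 64-bit subtraction. */
static void sub64_selfcheck(void)
{
    assert(join(sub64_words(split(5), split(7))) == (uint64_t)5 - 7);
}
```

In a vector version the two subtracts would presumably be vec_sub and
vec_subc on the whole register, with the inverted carry moved from the low
word to the high word of each 64-bit element (for example with a byte
shift) before the final subtract. That mapping is untested here.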

Jeff

