how best to package when using hardware vectorization with vector-unit specific code?

To: debian-mentors@lists.debian.org
Subject: how best to package when using hardware vectorization with vector-unit specific code?
From: "Kay F. Jahnke" <_kfj@yahoo.com>
Date: Wed, 10 May 2017 09:17:53 +0200
Message-id: <da5b2a17-8e62-8339-476f-4da1fdf190d2@yahoo.com>

Hi group!

I have code which optionally makes use of hardware vectorization. This is done generically by using Vc:


https://github.com/VcDevel/Vc

When compiling with Vc, the resultant machine code is for a specific vector unit only, like AVX or SSE. There are several possible ways of dealing with these processor-dependent binaries:

- create a set of complete target-specific executables and select which one to deploy/run on the target machine

- create a single binary with all variants linked in, calling only target-specific code at run time

- create a set of shared libraries, deploy one or all and load the target-specific one at run-time

- create only one compromise binary using some commonly available vector unit

The first alternative is nice because the binary is small and simple, but the binary will only run on a specific target, so there would have to be a way to do target-specific deployment, or, alternatively, a population of additional superfluous binaries cluttering .../bin. So far, I have only seen architecture-dependent packages, and I haven't managed to figure out if the package installation process can be made more specific to deploy only code for a specific vector unit. But I'd like to go along this path if possible.

The second alternative would require case-switching inside the code (which has to be maintained for every new vector unit coming along), makes the build more complex (would need to create a set of object files with individually named versions of the code to be called, linking might be difficult) - and it would bloat the binary code. Yet it would provide a single binary useful for all targets, so packaging should be simpler.
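
To make the case-switching of the second alternative concrete, here is a rough sketch of what I have in mind, assuming GCC or Clang on x86 (the __builtin_cpu_supports builtin is compiler-specific). The namespaces pv_avx2, pv_sse42 and pv_generic, the Kernel struct and select_kernel are hypothetical names used only for illustration; in a real build each namespace would be its own object file compiled with the matching -m flag, whereas here they all forward to one scalar loop so the sketch stays self-contained.

```cpp
#include <cassert>
#include <cstddef>

// Plain scalar loop standing in for the real, Vc-based number crunching.
static float scalar_sum(const float* p, std::size_t n) {
    assert(n == 0 || p != nullptr);
    float s = 0.0f;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

// Hypothetical per-target variants. In a real build each namespace would
// live in its own translation unit compiled with e.g. -mavx2 or -msse4.2,
// which gives every variant an individually named symbol to link against.
namespace pv_avx2    { float sum(const float* p, std::size_t n) { return scalar_sum(p, n); } }
namespace pv_sse42   { float sum(const float* p, std::size_t n) { return scalar_sum(p, n); } }
namespace pv_generic { float sum(const float* p, std::size_t n) { return scalar_sum(p, n); } }

// Name of the chosen variant plus the function pointer to call.
struct Kernel {
    const char* name;
    float (*sum)(const float*, std::size_t);
};

// Pick the best variant the running CPU supports, checked once at startup.
Kernel select_kernel() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))   return {"avx2", pv_avx2::sum};
    if (__builtin_cpu_supports("sse4.2")) return {"sse4.2", pv_sse42::sum};
#endif
    return {"generic", pv_generic::sum};  // fallback for everything else
}
```

The maintenance burden I mentioned shows up in select_kernel: every new vector unit needs another branch there and another object file in the build.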

The third alternative is also an interesting option, but it would require tearing the code apart into the 'main' program and some library doing the number crunching. The case-switching inside the 'main' code would also require maintenance over time, and deploying all versions of the .so would also be a waste of space.
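
For the third alternative, the 'main' program might pick and load the library roughly as sketched below, assuming a POSIX system with dlopen and, again, GCC or Clang on x86 for the CPU check. The library names (libpv_avx2.so and so on), the exported symbol pv_sum and the helper functions are all hypothetical; on older glibc versions the program would also need to be linked with -ldl.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <dlfcn.h>  // POSIX: dlopen, dlsym, dlclose

// Choose which (hypothetical) shared library matches the running CPU.
std::string pick_library_name() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))   return "libpv_avx2.so";
    if (__builtin_cpu_supports("sse4.2")) return "libpv_sse42.so";
#endif
    return "libpv_generic.so";
}

using sum_fn = float (*)(const float*, std::size_t);

// Load the library and look up the number-crunching entry point.
// Returns nullptr (and sets *handle_out to nullptr) if either step fails,
// so the caller can fall back to another variant.
sum_fn load_sum(const std::string& lib, void** handle_out) {
    assert(handle_out != nullptr);
    *handle_out = dlopen(lib.c_str(), RTLD_NOW | RTLD_LOCAL);
    if (*handle_out == nullptr) return nullptr;

    // "pv_sum" is a hypothetical extern "C" symbol exported by each variant.
    void* sym = dlsym(*handle_out, "pv_sum");
    if (sym == nullptr) {
        dlclose(*handle_out);
        *handle_out = nullptr;
        return nullptr;
    }
    return reinterpret_cast<sum_fn>(sym);
}
```

With this layout a package could ship only the variants it wants to support, and the main program would fall back to the generic library whenever load_sum returns nullptr for the preferred one.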

The fourth alternative is often used to create a target using only SSE instructions, which are available on most machines. Yet this sacrifices the power of better vector units and makes performance on newer processors suboptimal, so it's not really a good option.

I'd like some advice on how to proceed to get my code to be easily packaged and deployed under the constraints I've outlined. If it helps you to understand more clearly what this is all about, my project (a viewer for panoramic images) is here:


https://bitbucket.org/kfj/pv

With regards

Kay F. Jahnke
