
Reproducible, precompiled .o files: what say policy+gpl?



I am developing a very CPU-intensive, open-source error-correcting code.

The intention of this code is that you can split a large (> 5GB)
file across multiple packets. As soon as you have received enough
packets that their combined size equals the file size, you can decode
them to recover the file, regardless of which packets you got.
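
As a toy illustration of that property (this is NOT the algorithm
discussed in this post, which uses Fast Galois Transforms), the sketch
below splits a file into two halves and adds a single XOR parity packet.
Any two of the three packets, whose combined size equals the file size,
are enough to rebuild the file. The names xor_blocks and toy_decode are
hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <string.h>

/* XOR two equal-length blocks into out. If p = a ^ b, then any one of
 * a, b, p can be rebuilt by XORing the other two. */
void xor_blocks(const unsigned char *x, const unsigned char *y,
                unsigned char *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = x[i] ^ y[i];
}

/* Toy decoder (hypothetical, for illustration only).
 * pkt[0] = first half of the file, pkt[1] = second half,
 * pkt[2] = XOR parity of the two halves; a missing packet is NULL.
 * Each packet is n bytes; file_out must hold 2*n bytes.
 * Returns 0 on success, -1 if fewer than two packets are present. */
int toy_decode(const unsigned char *pkt[3], size_t n,
               unsigned char *file_out)
{
    unsigned char *first = file_out;
    unsigned char *second = file_out + n;

    if (pkt[0] && pkt[1]) {
        /* Both data packets arrived: just copy them. */
        memcpy(first, pkt[0], n);
        memcpy(second, pkt[1], n);
    } else if (pkt[0] && pkt[2]) {
        /* Second half lost: second = first ^ parity. */
        memcpy(first, pkt[0], n);
        xor_blocks(pkt[0], pkt[2], second, n);
    } else if (pkt[1] && pkt[2]) {
        /* First half lost: first = second ^ parity. */
        xor_blocks(pkt[1], pkt[2], first, n);
        memcpy(second, pkt[1], n);
    } else {
        return -1;
    }
    return 0;
}
```

A single parity packet only tolerates one loss; a code that works for
arbitrary packet counts needs arithmetic over a larger Galois field,
which is where the heavy computation comes from.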

This means a lot of calculation over gigabytes worth of data.
Therefore, speed is of utmost importance in this application.

The project itself includes an ocaml compiler (derived from fftw) which
generates C code to perform various types of Fast Galois Transforms. Some
of the output C code uses SSE2 exclusively. This C code is then compiled 
and linked in with the other C sources that make up the application.

Now, on to the dilemma: icc produces object files which run ~2x faster
than the object files produced by gcc when SSE2 is used. (The non-SSE2
versions are also faster, but not so significantly.) Both gcc and icc can 
compile the generated C files. My University will shortly own a licence
for icc which allows us to distribute binaries.

So, when it comes time to release this and include it in a .deb, I ask
myself: what would happen if I included (with the C source and ocaml
compiler) some precompiled object files for i386? As long as the build
target is i386, these object files could be linked in instead of using
gcc to produce (slower) object files. This would mean a 2x speedup for
users, which is vital in order to reach line-speed. Other platforms 
recompile as normal.

On the other hand, is this still open source?
Is this allowed by policy?
Can this go into main?

Some complaints and my answers below:

C: How do we know the object files aren't trojaned? 
A: Because I am both the upstream developer and (will be) the debian 
   maintainer, and I say they aren't.

C: You can't recompile the application without ICC, which is not free.
A: You can still rebuild it with gcc.

C: But you can't rebuild _exactly_ the same binary.
A: This is essentially *my* question: is this required by policy/gpl?
   Remember, you can always get ICC yourself. If there is a GPL problem, 
   then I think no MSVC application can be GPL either.

C: You're just too lazy to hand-optimize the assembler and include that.
A: You're right. Some of those auto-generated C files are > 64k of
   completely incomprehensible math. 
   I could include .S files instead of .o files, though, if that helps.

C: You're just too lazy to fix gcc.
A: I also wouldn't know where to begin, and I already file bugs.
   Even if I did know where to begin, gcc is not my responsibility.

C: A (security) bugfix won't get linked in.
A: A bug in the auto-generated C code is unlikely, and if there were one,
   changing the .c file makes it newer than the .o, which means make will
   rebuild it with gcc.
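
The timestamp argument in the last answer can be sketched in C. This is
a minimal sketch, assuming the Makefile has an ordinary "foo.o: foo.c"
rule; it mirrors the comparison make performs using whole-second
modification times from POSIX stat(). The function names rebuild_needed
and check_rebuild are hypothetical.

```c
#include <sys/stat.h>
#include <time.h>

/* The decision make takes for a rule "foo.o: foo.c": rebuild when the
 * object file is missing or the source is strictly newer than it. */
int rebuild_needed(time_t src_mtime, int obj_exists, time_t obj_mtime)
{
    return !obj_exists || src_mtime > obj_mtime;
}

/* Apply that decision to two paths on disk (hypothetical helper).
 * Returns 1 if the object would be rebuilt, 0 if it is up to date,
 * and -1 if the source file cannot be examined. */
int check_rebuild(const char *src_path, const char *obj_path)
{
    struct stat src, obj;

    if (stat(src_path, &src) != 0)
        return -1;                 /* no source: nothing to build from */
    if (stat(obj_path, &obj) != 0)
        return 1;                  /* no object yet: must build */
    return rebuild_needed(src.st_mtime, 1, obj.st_mtime);
}
```

Under this rule a shipped .o stays in use only while it is at least as
new as its .c file, so the argument depends on the file timestamps
surviving packaging and patching intact.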

That's it!
What are the thoughts of GPL and policy experts?

PS. I will provide the source code to anyone who requests it, but not yet
under the GPL. Only after I publish a paper about the algorithm will the 
code be released under the GPL.

-- 
Wesley W. Terpstra <wesley@terpstra.ca>


