Re: Rethinking tensorflow build (taking shortcuts)

To: debian-devel@lists.debian.org
Subject: Re: Rethinking tensorflow build (taking shortcuts)
From: Mo Zhou <lumin@debian.org>
Date: Fri, 25 Oct 2019 20:35:47 -0700
Message-id: <f73192c4e41b3d9f5ef0044ecb0b9653@debian.org>
In-reply-to: <71f0d061c0505f39c11787d27d45eb54@debian.org>
References: <57355527c93644cb0155973a3f2642c0@debian.org> <20191004162521.axlffn2wfxnvyj3v@function> <71f0d061c0505f39c11787d27d45eb54@debian.org>

updates:

Indeed, after attempting to solve it from various angles, parsing the
build sequence from the Bazel buildlog -> rebuilding the dependency
graph -> generating a Ninja build looks like the best way to reliably
deal with the libtensorflow*.so build. Advantages:

1. It totally avoids having to understand the source code structure.
   The package maintainer only needs to understand how hundreds of
   C++ files are automatically generated in different build stages...

2. The resulting binary will be very close to the official standard
   Bazel build. (Theoretically, if we replay the build sequence, we
   get exactly the same result.)

3. It is completely free of any (download during compilation) stuff,
   which is very distribution-friendly. The Bazel build builds nearly
   every single dependency locally, incl. protobuf, jpeg, or even
   zlib1g ... My build only requires two convenience code copies:
   Eigen3 and abseil-cpp. TensorFlow may FTBFS against libeigen3-dev.
   Convenience code copies may be the best way to use abseil-cpp.[1]

4. Ninja builds are very explicit and easy to debug. Another big
   plus is that ninja won't mess up the buildlog during a parallel
   build like GNU make does -- even easier for debugging, especially
   when g++ dumps thousands of lines of complaints.
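
A minimal sketch of the buildlog -> dependency graph -> Ninja idea,
assuming a heavily simplified log with one gcc-style command per line.
The sample log, the "replay" rule name and both helper functions are
made up for illustration; this is not the code in fakebazel.py, and a
real Bazel log needs far more filtering.

```python
import shlex

# Hypothetical log excerpt: one compiler invocation per line.
SAMPLE_LOG = """\
gcc -O2 -c a.cc -o a.o
gcc -O2 -c b.cc -o b.o
gcc -shared -o libdemo.so a.o b.o
"""


def parse_command(line):
    """Split one logged command into (output, inputs, command text)."""
    argv = shlex.split(line)
    # Assumption: every command names its output with "-o <file>".
    output = argv[argv.index("-o") + 1]
    # Assumption: every non-flag argument other than the output is an input.
    inputs = [a for a in argv[1:] if not a.startswith("-") and a != output]
    return output, inputs, line.strip()


def to_ninja(log_text):
    """Emit Ninja build statements that replay the logged commands."""
    lines = ["rule replay", "  command = $cmd", ""]
    for raw in log_text.splitlines():
        if not raw.strip():
            continue
        output, inputs, cmd = parse_command(raw)
        # Ninja derives the build order from these input/output edges.
        lines.append("build {}: replay {}".format(output, " ".join(inputs)))
        # "$" is Ninja's escape character, so literal dollars are doubled.
        lines.append("  cmd = {}".format(cmd.replace("$", "$$")))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(to_ninja(SAMPLE_LOG), end="")
```

Paths containing spaces or colons would need Ninja escaping as well,
which this sketch leaves out.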

The current build system is much more reliable than those shipped
in the previous two experimental uploads. The old ones added many
useless object files to the final shared object.

I'll upload the package to experimental (NEW) later. It can enter
unstable once protobuf (>= 3.8.0) and a newer grpc have done so.

The Python package is indefinitely postponed, since there is an even
more complex .cc and .py file generation process. On the other hand,
pytorch has recently gained more focus from academia and industry,
surpassing tensorflow.

[1] Google's abseil-cpp and Facebook's folly both face similar problems
    before being packaged for Debian.

On 2019-10-04 16:37, Mo Zhou wrote:
> On 2019-10-04 16:25, Samuel Thibault wrote:
>> Hello,
>>
>> Mo Zhou, le ven. 04 oct. 2019 09:04:25 -0700, a ecrit:
>>> Another angle for addressing the building problem is in the
>>> reverse-engineering style: parse the bazel buildlog, rebuild the
>>> dependency graph and generate a ninja build for it. See [2].
>>
>> Interesting :) But then if one wants to add some files to the software,
>> which changes the dependencies, are we able to insert it correctly in
>> the build process? Being able to change the set of files to be built is
>> part of being able to modify the software, for it to be free :/
> 
> Minor modifications won't be too difficult. For example the object list
> for building libtensorflow.so is explicitly listed here[1]. During build
> the object list is passed to compiler via
> -Wl,@libtensorflow.......params
> I did manually edit the object list to customize the shared object, and
> the changes will be reflected in the dependency graph[2].
> (FYI: ninja writer's "implicit" argument stores dependencies)
> 
> On the other hand, if the user wants significantly different stuff,
> they have to re-generate a buildlog and update the parser accordingly.
> 
> Not bad, right?
> 
> [1]
> https://salsa.debian.org/science-team/tensorflow/blob/lumin/debian/buildlogs/libtensorflow.so.2.0.0-2.params
> [2]
> https://salsa.debian.org/science-team/tensorflow/blob/lumin/debian/fakebazel.py#L413-417
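
To illustrate the object-list editing described in the quoted message,
here is a rough sketch. It assumes the .params file holds one linker
argument per line; the sample contents and the drop_objects helper are
hypothetical and not taken from the real libtensorflow params file.

```python
def drop_objects(params_text, unwanted):
    """Return the params text without object files matching `unwanted`.

    `unwanted` is a list of substrings; any line that ends in ".o" and
    contains one of them is removed. All other lines are kept as-is.
    """
    kept = []
    for line in params_text.splitlines():
        is_object = line.endswith(".o")
        if is_object and any(u in line for u in unwanted):
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


# Hypothetical params content, one argument per line.
SAMPLE_PARAMS = "-shared\n-o\nlibdemo.so\ncore/a.o\ntests/b.o\n"

if __name__ == "__main__":
    print(drop_objects(SAMPLE_PARAMS, ["tests/"]), end="")
```

The edited file can then be handed to the compiler again through the
same -Wl,@... mechanism mentioned above.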
