
Bug#907192: pre-RFS: tensorflow/1.10.0+dfsg-A1 [ITP] -- debian archive += 1 million lines of code



Some updates about this pre-RFS:

Summary:
1. The README.Debian file is now completely out of date. Please don't review the repo.
2. I switched to Python plus ninja for building Debian's TF, which
     may evolve into the final solution.

On Fri, Aug 24, 2018 at 22:58 Lumin <cdluminate@gmail.com> wrote:

 2. any help will be appreciated, especially for CMake.

Well, currently the packaging repo is a total mess since it contains 4
sets of build systems. Don't review the repo before I remove three of them,
because no one except me can understand what's happening there.

It is impossible for Bazel, the native build system for TF, to enter a Debian release.

Initially I forked upstream's contributed CMake build, because CMake
can build a complete libtensorflow.so and pywrap_tensorflow_internal.
However, this set of CMake files is too complicated to read and maintain.

For simplicity I then forked upstream's contributed makefile build. It only
compiles a core set of functionality and is not able to build the Python
wrapper. To extend the makefile build I wrote another set of makefiles
from scratch, imitating the original makefile build. In the end I found it
hard to understand what happened when something went wrong.

Eventually I started to write yet another build system, using Python
plus ninja-build, with the experience gained from the previous attempts.
I don't think there will be a fifth build system, since Python plus ninja
works just the way I want.
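
To illustrate the general idea only (this is not the generator from the packaging repo), here is a minimal sketch of a Python script that emits a ninja build file for a shared library. The function name, the source file names, the compiler and its flags are all assumptions made up for the example.

```python
import os


def generate_ninja(sources, library, cxx="g++", cxxflags="-O2 -fPIC"):
    """Return the text of a build.ninja that compiles each source file
    to an object and links all objects into one shared library.

    Hypothetical helper: names and flags are illustrative only. Paths
    containing spaces would need ninja's "$ " escaping, which this
    sketch does not handle.
    """
    lines = [
        f"cxx = {cxx}",
        f"cxxflags = {cxxflags}",
        "",
        "rule cxx",
        "  command = $cxx $cxxflags -c $in -o $out",
        "  description = CXX $out",
        "",
        "rule link_shared",
        "  command = $cxx -shared $in -o $out",
        "  description = LINK $out",
        "",
    ]
    objects = []
    for src in sources:
        # foo/bar.cc -> foo/bar.o
        obj = os.path.splitext(src)[0] + ".o"
        objects.append(obj)
        lines.append(f"build {obj}: cxx {src}")
    lines.append(f"build {library}: link_shared {' '.join(objects)}")
    lines.append(f"default {library}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up file list; a real generator would collect the sources
    # from the source tree instead.
    print(generate_ninja(["a.cc", "b.cc"], "libdemo.so"), end="")
```

Writing the output to build.ninja and running ninja in the same directory would then compile and link the listed files. The appeal of this split is that all the logic stays in Python, while ninja only has to execute a flat list of build statements.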

As a result, the whole todo list in the debian directory has been invalidated
by these frequent changes of build system. Please don't review any file in the
packaging repo.

The python plus ninja build system can produce libtensorflow_framework.so now.


Details
-------

This is not a real RFS, but a sort of weak request for review/help.
I'm not proficient in CMake, so I'm not sure whether I'm making
the correct choice all the time. Anyway, the good news is that
I'm already able to build libtensorflow.so for Debian experimental,
on both amd64 and ppc64el architectures.

At the time of writing, debomatic-amd64 nearly finished the build
but then failed (maybe not enough memory):
http://debomatic-amd64.debian.net/distribution#experimental/tensorflow/1.10.0+dfsg-A1/buildlog
Note that this build log is 107MB in size.


Here is a list of remaining TODOs for stage A:
---------------------------------------------------------------
(The list is copied from README.Debian at
 https://salsa.debian.org/science-team/tensorflow ,
 please look up README.Debian for the full version)


- [x] prevent the build system from downloading anything.
- [x] deal with all the C/C++ lib dependencies.
- [x] produce libtensorflow.so.1.10 and install it into .deb package.
- [x] ambiguous FFT2D license.

- [ ] build test files (googletest) and run the tests.
- [ ] make sure nothing from contrib is built. they are not officially supported.
- [ ] remove useless parts from cmake build.
- [ ] misc improvements to cmake build. (at least make it easier to read)
- [ ] is the resulting libtensorflow.so.1.10 correct and working?
  - [ ] write autopkgtest with some mini C/C++ programs.
  - [ ] working on amd64?
  - [ ] working on ppc64el?
- [ ] make sure libtensorflow/amd64 is linked against libmkldnn
- [ ] sort out this confusing lintian E
      source-is-missing tensorflow/compiler/aot/codegen_test_o.golden
- [ ] remaining lintian warnings and errors.
- [ ] traverse the 16000+ files in the source tree and complete d/copyright.
      ummmmmmmmmm.............
- [ ] Can't the blob be even smaller?
      -rwxr-xr-x 1 debian debian 3.6G Aug 24 13:53 libtensorflow.so.1.10.0 (unstripped)
      -rwxr-xr-x 1 debian debian 104M Aug 24 14:00 libtensorflow.so.1.10.0 (stripped)
- [ ] 16GB RAM + 16GB swap is not enough to avoid triggering OOM killer?
- [ ] get rid of static linking written for stupid windows
      /usr/bin/ld: error: benchmark_model(.debug_info) is too large (0x35a9f359 bytes)
      /usr/bin/ld: error: benchmark_model(.debug_str) is too large (0x6a545d15 bytes)
      /usr/bin/ld: error: benchmark_model(.debug_loc) is too large (0x1f5b1950 bytes)
      make[3]: Leaving directory '/<<BUILDDIR>>/tensorflow-1.10.0+dfsg/obj-x86_64-linux-gnu'
      [ 98%] Built target benchmark_model
      /usr/bin/ld: error: compare_graphs(.debug_info) is too large (0x366f36be bytes)
      /usr/bin/ld: error: compare_graphs(.debug_str) is too large (0x6a64010e bytes)
      /usr/bin/ld: error: compare_graphs(.debug_loc) is too large (0x1fd19fe0 bytes)
- [ ] how to prevent "make install" from building everything again?

- [ ] upload to experimental.
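
The section sizes in the linker errors above are printed in hexadecimal, which makes them hard to judge at a glance. As a small aid, the following sketch converts the three benchmark_model values copied from the list into MiB. The helper names are made up for the example, and it only does the conversion; it says nothing about where the linker's limit actually lies.

```python
# Section sizes copied from the benchmark_model linker errors above.
SECTION_SIZES_HEX = {
    ".debug_info": "0x35a9f359",
    ".debug_str": "0x6a545d15",
    ".debug_loc": "0x1f5b1950",
}


def to_mib(hex_size):
    """Convert a hexadecimal byte count (as printed by ld) to MiB."""
    return int(hex_size, 16) / (1024 * 1024)


def report(sizes):
    """Map each section name to its size in MiB, rounded to one decimal."""
    return {name: round(to_mib(value), 1) for name, value in sizes.items()}


if __name__ == "__main__":
    for name, mib in report(SECTION_SIZES_HEX).items():
        print(f"{name}: {mib} MiB")
```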


---------------------------------------------------------------
Changes:

tensorflow (1.10.0+dfsg-A1) UNRELEASED; urgency=medium

  * Initial release. (Closes: #804612)
  * Stage A (with Debian revision "A*") indicates that the source
    package only produces C and C++ library and development files.

--
Best,
