
Re: Fwd: Packaging TensorFlow for Debian

Another tip about performance.

Tensorflow kernels are built upon libeigen3, which is known to
perform poorly when built against the default AMD64 ISA baseline.

With the baseline, which only guarantees SSE2, the machine ends up
crunching numbers without the modern SIMD instruction sets.

My recommendation is to at least provide an option in d/rules to
enable optimized builds.

Unlike tensorflow, pytorch is built upon a mixture of eigen3 and
BLAS/LAPACK. By taking advantage of Debian's update-alternatives
system, we can switch the underlying BLAS implementation to MKL
for pytorch and eliminate the major performance issue.
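
To check which BLAS is currently selected, here is a minimal sketch. It
assumes the text comes from `update-alternatives --query <name>`, where on
amd64 the name is typically libblas.so.3-x86_64-linux-gnu (an assumption,
please check on your system). The helper name is made up for illustration.

```python
def current_alternative(query_output):
    """Return the path on the 'Value:' line of `update-alternatives --query`
    output, or None if no such line is present."""
    for line in query_output.splitlines():
        if line.startswith("Value:"):
            # Split only on the first colon so the path is kept intact.
            return line.split(":", 1)[1].strip()
    return None
```

The input can be captured with subprocess.run([...], capture_output=True,
text=True).stdout. A path containing "mkl" would suggest that MKL is the
selected implementation.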

As for tensorflow ... projects based on pure eigen3 have to be
rebuilt using -march=native to gain the maximum performance ...
I don't know whether newer versions of eigen3 support dynamic
dispatching.
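
To get a feeling for what a baseline build leaves unused on a given machine,
here is a rough sketch that lists the SIMD extensions beyond the baseline
from /proc/cpuinfo-style text. The function name and the set of flags are my
own choice for illustration, not something from the tensorflow or eigen3
build systems.

```python
# Flags beyond the AMD64 baseline as spelled in /proc/cpuinfo on Linux.
# Note that SSE3 shows up there as "pni".
EXTRA_SIMD = ("pni", "ssse3", "sse4_1", "sse4_2", "avx", "fma", "avx2", "avx512f")


def extra_simd_flags(cpuinfo_text):
    """Return the flags from EXTRA_SIMD found on the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            present = set(value.split())
            # Keep the order of EXTRA_SIMD so the output is stable.
            return [flag for flag in EXTRA_SIMD if flag in present]
    return []
```

Feeding it the contents of /proc/cpuinfo gives the list for the build
machine. An empty list means a baseline build loses nothing there.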

On Fri, 2021-05-28 at 11:29 +0000, M. Zhou wrote:
> Hi Wookey,
> Thanks for your work and the updates. I think I can comment
> on some of the details.
> As I once wrote a hacky build system for tensorflow,
> I still remember some details of it.
> On Fri, 2021-05-28 at 02:07 +0100, Wookey wrote:
> > 
> > * Collected .h files into -dev package (this is done horribly with
> >   rsync because tensorflow/bazel doesn't have a 'make install' I can
> >   just use - but it does know the list of headers so I'm sure there is
> >   a better way).
> IIRC the list of header files that should be installed varies
> according to the configuration. For example, if we enable the
> component A, but disable the component B when compiling the shared
> objects, then headers for component B should be filtered out.
> Bazel is able to resolve the file-level dependency for a given
> target, where the "files" surely includes headers. (for example,
> in traditional Makefiles we also list headers in the dependency
> list in order to rebuild correctly.)
> We can dump that dependency using bazel, and grep the headers out.
> I've forgotten the concrete command to achieve that. Oops.
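
A minimal sketch of the "grep the headers out" step, assuming the dependency
list was dumped with something like
bazel query 'kind("source file", deps(<target>))' (the target is left open,
and I have not re-checked this against the current tree). The helper name
and the suffix list are hypothetical.

```python
HEADER_SUFFIXES = (".h", ".hpp")


def header_paths(labels):
    """Turn bazel labels such as //pkg/dir:file.h into relative header paths.

    Labels from external repositories (starting with '@') and non-header
    files are skipped.
    """
    paths = set()
    for label in labels:
        label = label.strip()
        if not label.startswith("//") or not label.endswith(HEADER_SUFFIXES):
            continue
        package, _, name = label[2:].partition(":")
        if not name:
            continue  # not in the usual //package:file form
        paths.add(f"{package}/{name}" if package else name)
    return sorted(paths)
```

The resulting list could then be fed to the install step instead of rsync.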
> > * Create symlinks to .so files (bazel does it for
> >   libtensorflow_framework2.so.* but not libtensorflow_cc2.so.* - I
> >   don't know why yet)
> Maybe the corresponding symlink target is not included in the
> dependency tree for the current building targets.
> > * Got it to use system copy of libpng, rather than statically
> >   embedding a copy
> Great. When using a statically linked library that is also used
> in many other system components, we should be cautious about potential
> symbol clashes when the underlying libpng versions are not aligned.
> Tensorflow itself is already very good at introducing CVEs, e.g.
> https://github.com/tensorflow/tensorflow/releases/tag/v2.5.0
> and we won't have the energy to deal with the CVEs in its embedded
> libraries.
> > What guarantees does upstream make about backwards/forwards
> > compatibility? They are putting SONAMEs in and managing major, minor,
> > patch versioning, which is better than many projects these days.
> > 
> > I'm wondering what the right strategy is for abi/api versioning. I
> > presume we will have quite a lot of packages using this so we should
> > try and do it right.
> If upstream sticks tightly to semantic versioning, we should
> probably use the upstream SONAMEs directly. I'm doing this for the
> two other packages I maintain: opencv and pytorch. Their soversions are
> both <MAJOR.MINOR>. Although we have to pass the NEW queue every time
> there is a MINOR bump ... at least we won't easily break reverse deps.
> > However then this question of ABIs gets sidetracked by something I
> > noticed whilst looking at the symbols situation: The symbols file for
> > libtensorflow_cc2 is 24MB (that's really quite fat). Is it worth
> > putting that in the package? I'm not sure anyone is going to actually
> > 'maintain' it beyond autogenerating a new one each version. Symbols
> > files work OK for C but are bloated and awkward for C++. Even so 24MB
> > seems huge.
> C++ symbols are known to be hard to track. As we currently don't
> expect many reverse dependencies of libtensorflow, maybe we should
> not track them manually, at least for now.
> > lintian only complained about an embedded libpng, but now
> > I look I am pretty sure there is still a range of embedded
> > statically-linked libs hiding in there.
> > 
> > We have lots of symbols like:
> > ZN6google8protobuf3MapINSt7__*
> > _ZN4absl14lts_2020_02_*
> > AES_decrypt@Base
> > BORINGSSL_self_test@Base
> > _ZN3Aws22AmazonWebService*
> Maybe it's using static libraries for some of them?
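
One quick way to check is to bucket the exported symbols by the namespace
fragments they contain. A rough sketch follows. The markers are just the
substrings visible in the symbols quoted above, the library names attached
to them are guesses, and the function name is made up.

```python
from collections import Counter

# (substring to look for, guessed library). The first match wins.
MARKERS = [
    ("google8protobuf", "protobuf"),
    ("4absl", "abseil"),
    ("BORINGSSL", "boringssl"),
    ("AES_", "boringssl"),
    ("3Aws", "aws-sdk"),
]


def embedded_libs(symbols):
    """Count symbols per guessed embedded library; unmatched ones are ignored."""
    counts = Counter()
    for sym in symbols:
        for marker, lib in MARKERS:
            if marker in sym:
                counts[lib] += 1
                break
    return counts
```

Run over the lines of the generated symbols file, this gives a rough idea of
how much of the 24MB comes from embedded copies.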
> > So I think that means that despite turning off network downloads it's
> > still embedding protobuf, boringssl, google_absl, highwayhash,
> > farmhash and some AWS stuff (at least). I'm not sure where it is
> > getting them from... Some of this is the stuff Yun told us about at
> > the start of the thread... But it shouldn't be embedding
> > com_google_protobuf or gif, because those are already listed in the
> > --repo_env=TF_SYSTEM_LIBS=<list> bazel command line in the rules
> > file. I guess I'll have to pore over the logs some more and see how
> > the workspace is getting set up.
> > 
> > The build log is here:      
> > http://wookware.org/software/tensorflow/tensorflow_2.3.1-1_amd64.build
> > 
> > Most of this should be fixable in due course, but what is our view on
> > uploading sooner vs expunging all embedded libs?  I am normally
> > something of a purist on this, but there is some demand for this so
> > maybe some embedded libs are OK for the time being?
> > Not sure if the ftpmasters will agree, even if we do...
> My recommendation is to pass the NEW queue first, because the package
> is expected to stay in experimental for a while, and the first upload
> could boost everybody's morale.
> Actually there are many details to improve in the pytorch package,
> and I'm still fixing them bit by bit...
> BTW, please make sure to split the tensorflow shared objects
> into separate binary packages, e.g.
> bin:libtensorflow-framework.*
> bin:libtensorflow.*
> bin:libtensorflow_cc.*
> bin:libtensorflow-dev
> This is because some customized Ops/Kernels only NEED
> libtensorflow_framework.so.*
