
Questions about packaging the 'googleapis' project



Dear mentors,

Wookey and I are trying to come up with a sane concept to package the
googleapis project [1]. During our initial investigation a few
questions came up that we would like to discuss publicly.


BACKGROUND:

'googleapis' is a collection of protocol buffer [2] files (protocol
buffers being an interface description language for stable binary
encoding of data) and gRPC service definitions. From those files,
bindings for a variety of target languages (Python, Ruby, Java, C++
etc.) can be generated using the 'protoc' compiler with gRPC plugins.
Upstream _does_ offer a Makefile for generating those bindings,
although it appears to be out of date (it ignores the protos in the
'grafeas' subfolder). However, this Makefile only generates source
files (and headers), which is fine for Python etc. but not
particularly useful for compiled languages such as Java or C++.
Furthermore, compiling these source files yourself can be quite
tedious, because you need to know the dependency structure within the
project, and this structure changes rather frequently. Example:

Depending on 'google/longrunning/operations.pb.cc' requires you to also
compile and link
  - google/api/annotations.pb.cc
  - google/rpc/status.pb.cc
for an older version of the project (143084a2624b6591ee1f9d23e7f5241856642f4d).

Whereas on current master, you additionally need to compile and link
google/api/client.pb.cc.

Most users probably do not want to deal with such internal
dependencies and would just like to run:
  apt install libgoogleapis-dev
  pkg-config --libs googleapis_longrunning

Therefore, Wookey's idea was to also compile the Java/C++ bindings and
package the resulting libraries. Here is where things become
difficult:
  - We do not have a build description for most of the bindings (some
subfolders have Bazel BUILD files, but most do not)
  - We are talking about ~3,500 proto files. Building all of them
results in very large artifacts:
    - jar: ~160MB
    - shared lib: ~3GB (with debug info) ~180MB (after stripping)
    - static lib: ~11GB (with debug info) ~600MB (after stripping)


QUESTIONS:

1. Due to the missing build description, is it ok if the maintainer
provides a Makefile for building the C++ libraries in ./debian?

2. With such large libraries, I guess it makes sense to split them up.
I think a good way to group proto files into separate libraries would
be by their 'package' identifier (similar to a namespace; it is
declared in each file). Some packages have "sub packages" that might
cause cyclic dependencies (e.g., grafeas.v1beta1.discovery).
Therefore, I would suggest a heuristic that cuts off the package ID
after the first segment matching '^v[1-9]+' (e.g.,
grafeas.v1beta1.discovery becomes grafeas.v1beta1, resulting in
libgoogleapis_grafeas_v1beta1.{a,so}). Doing this results in 'only'
413 different packages/libraries. What do you think about this
approach?
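
To make the heuristic concrete, here is a minimal sketch of the
cut-off step. The function names are hypothetical, and the naming
scheme is only checked against the grafeas example above; whether a
leading 'google.' should be dropped from library names is left open.

```python
import re

# Mirrors the '^v[1-9]+' pattern from the text: a segment counts as a
# version segment if it starts with 'v' followed by a digit 1-9.
VERSION_SEGMENT = re.compile(r"^v[1-9]+")


def library_group(package_id):
    """Cut a proto package ID after its first version-like segment."""
    segments = package_id.split(".")
    for index, segment in enumerate(segments):
        if VERSION_SEGMENT.match(segment):
            return ".".join(segments[: index + 1])
    # No version segment found: keep the full package ID.
    return package_id


def library_name(package_id):
    """Hypothetical naming scheme: dots in the group become underscores."""
    return "libgoogleapis_" + library_group(package_id).replace(".", "_")
```

With this sketch, grafeas.v1beta1.discovery and grafeas.v1beta1 both
end up in the group grafeas.v1beta1, while a package without a version
segment (e.g., google.rpc) is kept as is.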

3. What granularity should we use for packaging? Should we provide
these separated libraries via
  - a single debian package and a single dev package?
  - a debian package and dev package per library?
  - a debian package per library, but a single dev package for all headers?

4. Such a Makefile (and control file) will be quite lengthy. My
current solution is to use a Python script for analysing the proto
files, grouping them according to their package id, building up a
dependency graph, checking it for cycles, and finally generating the
Makefile (and control file/pkg-config files etc.). With upcoming
library releases this script could be extended and rerun.
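
The analysis part of such a script could look roughly like the sketch
below. It is a simplified illustration, not the actual script: the
function names and the regular expressions are assumptions, the proto
files are passed in as a path-to-content mapping, and 'group_of' is a
hypothetical hook for whatever grouping rule is chosen.

```python
import re
from collections import defaultdict

# Simplified patterns for 'package x.y;' and 'import "a/b.proto";' lines.
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(
    r'^\s*import\s+(?:public\s+|weak\s+)?"([^"]+)"\s*;', re.MULTILINE
)


def build_group_graph(protos, group_of=lambda package: package):
    """Map each group to the set of other groups it depends on.

    protos: dict of proto path -> file content.
    """
    group_of_file = {}
    for path, text in protos.items():
        match = PACKAGE_RE.search(text)
        group_of_file[path] = group_of(match.group(1) if match else "")

    graph = defaultdict(set)
    for path, text in protos.items():
        source = group_of_file[path]
        graph[source]  # make sure groups without dependencies appear too
        for imported in IMPORT_RE.findall(text):
            target = group_of_file.get(imported)  # None: not in this tree
            if target is not None and target != source:
                graph[source].add(target)
    return dict(graph)


def find_cycle(graph):
    """Depth-first search; return one cycle as a node list, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}
    stack = []

    def visit(node):
        colour[node] = GREY
        stack.append(node)
        for neighbour in sorted(graph.get(node, ())):
            state = colour.get(neighbour, WHITE)
            if state == GREY:  # back edge: neighbour is on the stack
                return stack[stack.index(neighbour):] + [neighbour]
            if state == WHITE:
                found = visit(neighbour)
                if found:
                    return found
        stack.pop()
        colour[node] = BLACK
        return None

    for node in sorted(graph):
        if colour[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

The Makefile, control file and pkg-config files would then be
generated from the resulting graph once find_cycle() returns None.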

5. The Java bindings are considerably smaller. In my opinion, those
could be provided in a single debian package, containing a single jar
file. What do you think?

6. As the googleapis repository is not versioned, it is hard to judge
which protoc version is compatible with the current proto source base.
I was talking to an ex-Googler and he told me I should look at the
PiperOrigin-RevId (shown in some of the commits). That's their
internal linear commit counter. According to him, we should look up
the protobuf-compiler version that is currently packaged in that
release. Then we should look for that ID in the googleapis commits and
package the revision that fulfils the condition:
  PiperId(googleapis) <= PiperId(protoc)
According to him, that's what has been tested at Google internally and
is guaranteed to work. The same applies to the packaged
protobuf-compiler-grpc, which is also a build dependency of
googleapis. Do you think this is a valid approach?
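
As an illustration of that selection rule, here is a small sketch. It
assumes the ID appears in the commit message as a line of the form
'PiperOrigin-RevId: <number>' and that the commits are supplied newest
first (for example, parsed from 'git log' output); the function names
are hypothetical.

```python
import re

# Assumed trailer format: 'PiperOrigin-RevId: <number>' on its own line.
REV_ID_RE = re.compile(r"^PiperOrigin-RevId:\s*(\d+)\s*$", re.MULTILINE)


def piper_rev_id(commit_message):
    """Return the PiperOrigin-RevId of a commit message, or None."""
    match = REV_ID_RE.search(commit_message)
    return int(match.group(1)) if match else None


def newest_compatible(commits, protoc_rev_id):
    """Pick the newest commit with PiperId(commit) <= PiperId(protoc).

    commits: iterable of (sha, message) pairs, newest first.
    Commits without an ID are skipped.
    """
    for sha, message in commits:
        rev_id = piper_rev_id(message)
        if rev_id is not None and rev_id <= protoc_rev_id:
            return sha
    return None
```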

7. With no version given, what version should we use for this package?


That's all for now. Any suggestions are very welcome. Many thanks!

Oliver


[1] https://github.com/googleapis/googleapis
[2] https://protobuf.dev/

