
Re: Packaging of Apache Arrow - where are we?



Hello again,

> The Arrow library
> https://github.com/apache/arrow
> is about efficiently handling files with large tables.
> https://salsa.debian.org/go-team/packages/golang-github-apache-arrow-go/-/blob/debian/sid/debian/control?ref_type=heads
> describes it nicely.
> 
> From what I have observed, this has emerged as something that folks expect to be available; Debian just does not have it yet.
> 
> There is some initial work from several people, and it was uploaded to the archive at least once by satta
> according to the changelog, and
> https://salsa.debian.org/science-team/arrow/-/blob/master/debian/README.Debian?ref_type=heads
> provides an overview of what remains to be done for an upload:
> 
> [X] Update Maintainer/Uploader in d/control
> [ ] Update d/copyright
>   [ ] Inquire with upstream what files are under the licenses in LICENSE.txt
> [ ] Re-activate CUDA/Flight/Gandiva once dependencies in Debian are ready
> [ ] Optional: clean out embedded code (not too much IIR)
> 
> I don't know how you feel about it all, which is why I am asking with this email :-), but I tend to think that
> beyond the d/copyright and license clarification nothing is ultimately blocking an upload, right?
> 
> Has anything wrt the license clarification with upstream already been initiated?
> 
> Is anybody (somewhat) actively working on this package?

I have updated the package to version 20.0.0. The situation got a bit worse, as the Python interface now requires Apache Thrift
(https://thrift.apache.org/, https://github.com/apache/thrift), which is not available. And I still cannot
get the tests to work.


The license file (https://github.com/apache/arrow/blob/main/LICENSE.txt) is a monster, and so is debian/copyright.
In most cases it is clearly stated which parts are borrowed from where, but I need to chase this up in more detail whenever
time permits. Or has someone already done that work?

Even if the Python interface needs an extra iteration because of Thrift, I consider it worthwhile to proceed with Arrow's
C++ library for the sake of other reverse dependencies, like the R package.

Best,
Steffen

