
Re: Help for asking upstreams about free licenses urgently needed (Was: Help: Seeking source code of guppy base caller)

Hi Charles,

thanks a lot for your insight.

On Mon, May 04, 2020 at 10:37:22AM +0900, Charles Plessy wrote:
> Hi Andreas and everybody,
> I am a regular user of Guppy.  We use it to transform ("basecall") raw
> signal output from the sequencers manufactured by Oxford Nanopore
> Technologies (ONT) to nucleic acid sequence files in the FASTQ format
> accepted by many of the tools that we package in Debian Med.
> I think that even if ONT were to free Guppy, packaging it would be
> a significant challenge.
>  - Guppy is a moving target, and whichever version we would distribute
>    in Stable is unlikely to satisfy the users a year later.
>  - Upgrades are not drop-in replacements for each other and a laboratory
>    typically needs to install several versions side by side.

I wonder how users of that software are dealing with this.

>  - In many cases, a GPU is needed to have Guppy end its computation in
>    a reasonable time.  But Debian does not have an infrastructure to
>    test GPU computations.
>  - As far as I know, Guppy is developed on amd64 and arm64 only.  We
>    can therefore expect the usual portability issues.

I do not see any problem with this item.  We can easily restrict
the architectures as well.

>  - The conversion from raw to FASTQ is done by neural network algorithms
>    for which we do not have access to the training data, and therefore
>    the freedom to modify Guppy would be limited to the sugar around the
>    core algorithms.

That's a strong point actually.  However, we will face more and more
problems of this nature.  Mo's attempt to write a deep learning policy
might help here a bit.

> In that sense, I think that if we want to distribute a basecaller in
> Debian, we had better pick an alternative that is already free.  Some
> of them are reported to perform as well as Guppy.  But which one to
> pick, and what about long-term maintenance?

I once started packaging deepbinner[1], but it is stalled as long as we
do not have python3-tensorflow.  Maybe that is on the horizon, though,
since the bazel packaging sounded quite promising.

> Altogether, I think that we will best serve our users by making sure
> that Free basecallers are easy to install on Debian, providing the
> standard tools for downstream analysis (we are quite good at this), and
> adding value by supporting bioinformatics workflow systems.

That's exactly my opinion here.

Kind regards


[1] https://salsa.debian.org/med-team/deepbinner 

