
Re: Licensing terms for data derived from ARB/SILVA data



Hi Andreas,

>>> what about putting barrnap into main and barrnap-additional-data
>>> into non-free?
>>
>> Do you mean having barrnap depend on or suggest barrnap-additional-data
>> in that case?
> 
> Only *suggests* would be possible to keep barrnap in main.

I see -- makes sense.

>> The problem is that the non-free HMMs are not really "additional", as
>> barrnap would not work without them at all. Hence it would not make
>> sense to install one without the other.
>>
>> In theory, one could split the HMMs into a free and a non-free set, and
>> then patch barrnap to read its HMMs from multiple sources.
> 
> What do you mean by "multiple sources"?  The files of the different
> packages can perfectly end up in the same directory.

Well, I need to explain how barrnap reads its input HMMs then. There is
currently one .hmm file for each kingdom: euk(aryote), bac(terial),
arc(haeal) and mito(chondrial). These four files are
distributed with barrnap in the upstream tarball. Each of these files
contains multiple HMM definitions (in HMMER3 format), some of which are
taken from Rfam (free) and some of which are built from the non-free
SILVA alignments (the 23S and 28S rRNAs, to be exact).
When barrnap is called, the '--kingdom' parameter takes either 'euk',
'bac', 'arc' or 'mito', hence specifying the prefix of the .hmm file to
use in the search.
So if the .hmm files are split into two separate ones, e.g. euk.hmm and
euk_nonfree.hmm, one would need to patch barrnap to actually call nhmmer
with both files, if they exist.
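
To make the idea concrete, here is a rough sketch of the file selection
such a patch would need. It is written in Python purely for
illustration and is not barrnap's actual code. The "_nonfree" suffix is
the hypothetical naming from the example above, and the function names,
the db_dir argument and the output naming are my own assumptions. The
only nhmmer option used is --tblout.

```python
from pathlib import Path

# Kingdom prefixes accepted by barrnap's --kingdom parameter.
KINGDOMS = ("euk", "bac", "arc", "mito")


def hmm_files_for(kingdom, db_dir):
    """Return the HMM files that exist for a kingdom, free set first.

    The '<kingdom>_nonfree.hmm' name is an assumed convention for the
    split-off SILVA-derived models, not something upstream defines.
    """
    if kingdom not in KINGDOMS:
        raise ValueError(f"unknown kingdom: {kingdom!r}")
    db = Path(db_dir)
    candidates = [db / f"{kingdom}.hmm", db / f"{kingdom}_nonfree.hmm"]
    # Keep only the files that are actually installed.
    return [p for p in candidates if p.is_file()]


def nhmmer_commands(kingdom, db_dir, fasta, out_prefix):
    """Build one nhmmer command line per available HMM file."""
    commands = []
    for hmm in hmm_files_for(kingdom, db_dir):
        table = f"{out_prefix}.{hmm.stem}.tbl"
        commands.append(["nhmmer", "--tblout", table, str(hmm), str(fasta)])
    return commands
```

With only the free package installed this yields a single nhmmer call,
and with both installed it yields two, whose hit tables would then have
to be merged before reporting.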

>> However, if someone does not have non-free enabled, they will silently
>> miss all the results only available with the non-free HMM set. I can
>> imagine that would confuse the majority of users, who would want to rely
>> on the output of barrnap as it is.
> 
> I can not comment on this since I'm no end user.

That's just what I can imagine as a potential negative consequence --
the end-user might see the missing results as a flaw in barrnap itself.
If I were the upstream author, I would not be too happy about that.

Best
Sascha

