
Re: GMAP -- Align mRNA and EST sequences to a genome



Hi Charles,

On 11 May 2010 22:01, Charles Plessy <plessy@debian.org> wrote:
...
> Please consider adding a ‘pristine-tar’ branch if it is easy for you.
> http://wiki.debian.org/PackagingWithGit#pristine-tar

Since the pristine tarball can be downloaded from either the upstream
site or ftp.debian.org, it seems redundant to be able to rebuild it
from version control as well. What's the purpose?

>> How about /var/cache/gmap?
>> http://www.pathname.com/fhs/pub/fhs-2.3.html#VARCACHEAPPLICATIONCACHEDATA
>
> I think that /var/cache can be erased anytime by the administrator. Perhaps
> /var/lib is more appropriate?

I considered /var/lib at first, but the FHS says that data there is
specific to one host. That is not the case for a GMAPDB, which can and
should be shared by multiple hosts, since its format is independent of
host architecture. After reading the descriptions of each directory
under /var, the only one that seems appropriate is /var/cache. The FHS
requires that the application be able to regenerate data stored in
/var/cache, and a GMAPDB can be regenerated using gmap_setup.
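
As a rough sketch of what that requirement means in practice: a
consumer of the cache checks for the database and rebuilds it if an
administrator has wiped the directory. The function names and the
rebuild callback below are hypothetical and not part of GMAP; the
callback only stands in for whatever would invoke gmap_setup.

```python
import os
import tempfile


def ensure_db(cache_root, genome, rebuild):
    """Return the database directory, rebuilding it if it is missing or empty.

    `rebuild` is a hypothetical callback standing in for the real indexing
    step; it receives the target directory and must populate it.
    """
    db_dir = os.path.join(cache_root, genome)
    if not os.path.isdir(db_dir) or not os.listdir(db_dir):
        os.makedirs(db_dir, exist_ok=True)
        rebuild(db_dir)
    return db_dir


def fake_rebuild(db_dir):
    # Placeholder for the real rebuild (assumption: it writes files into db_dir).
    with open(os.path.join(db_dir, "index.stub"), "w") as fh:
        fh.write("rebuilt\n")


if __name__ == "__main__":
    # A temporary directory stands in for /var/cache/gmap here.
    root = tempfile.mkdtemp()
    print(ensure_db(root, "testgenome", fake_rebuild))
```

The second call for the same genome finds the files in place and does
not rebuild, which is the behaviour one would want from a cache.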

> Also, we are discussing the possibility of distributing data packages. For
> instance, we could distribute the human genome in FASTA format and indexed for
> various software including gmap. In that case, it may be interesting to change
> the default again to /usr/share/somewhere. The precise path has not been
> seriously discussed yet (it will probably be better to keep all the distributed
> data in the same tree, since it can require a large amount of disk space). You
> can find more information in a recent thread on the debian-science discussion
> list and on the Debian wiki:
>
> http://lists.debian.org/msgid-search/20100508140545.GA19587@meiner
> http://wiki.debian.org/DataPackages

It would certainly be useful, but the number of genomes times the
number of aligners is large. I guess one would have to be selective
and choose the most popular data sets. Packaging the human genome in
FASTA format would be a great start.

>> I had changed the upstream version number from
>> gmap-2010-03-09
>> to
>> gmap-20100309
>>
>> This change isn't strictly necessary. Should I undo this?
>
> I would recommend using the same version number as Upstream when possible.

I've changed the package version number back to match upstream (2010-03-09).
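
For what it's worth, the two forms are trivially interconvertible, so
nothing is lost either way. A small sketch, assuming upstream keeps its
YYYY-MM-DD scheme (the function names are mine):

```python
from datetime import datetime


def dashed_to_compact(version):
    """Convert an upstream-style date version, e.g. 2010-03-09, to 20100309."""
    return datetime.strptime(version, "%Y-%m-%d").strftime("%Y%m%d")


def compact_to_dashed(version):
    """Convert a compact date version, e.g. 20100309, back to 2010-03-09."""
    return datetime.strptime(version, "%Y%m%d").strftime("%Y-%m-%d")


if __name__ == "__main__":
    print(dashed_to_compact("2010-03-09"))
    print(compact_to_dashed("20100309"))
```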

>> > Can you recommend a license that is DFSG free and compatible with the
>> > upstream license?
> Here is a non-comprehensive list of “invariant” alternatives:
>
>  - The Boost Software License (http://www.boost.org/LICENSE_1_0.txt).
>  - The ISC license (https://www.isc.org/software/license).
>  - The FreeBSD license (http://www.freebsd.org/copyright/freebsd-license.html).
>  - The GNU all-permissive license (http://www.gnu.org/licenses/license-list.html#GNUAllPermissive),
>   that is even simpler but still includes a non-warranty disclaimer.

I've switched to the ISC license.

These changes have been committed to git. Any other suggestions before I upload?

Cheers,
Shaun

