
Re: Intent to work on Cufflinks 2.0.0



On Sat, Jun 9, 2012 at 12:09 AM, Charles Plessy <plessy@debian.org> wrote:
> Le Fri, Jun 08, 2012 at 11:33:46AM -0400, Carlos Borroto a écrit :
>>
>> I intend to work on updating Cufflinks to 2.0.0. I see Charles Plessy
>> imported upstream 2.0.0 already but master is still on 1.3.0. Should I
>> merge master and upstream branches and start working? Charles, how did
>> you import the new upstream version into upstream but keep master
>> untouched? I normally do a "git-import-orig
>> /path/to/<packagename>-<versionnumber>.tar.gz" which does both things.
>>
>> Finally is there a reason why we haven't updated to 2.0.0?
>
> Hi Carlos,
>
> I imported version 2.0.0 when we worked on #672744, as I wanted to see if 2.0.0
> solved it.  But upstream, 2.0.0 is marked beta, so unless there are good reasons,
> I would prefer to keep 1.3.0 in Wheezy and provide later versions as backports.  In
> the meantime, it is always possible to upload 2.0.0 to Experimental if you want.
>
> Have a nice week-end,
>

Hi Charles,

I'm happy to leave it UNRELEASED for now. Would having 2.0.0 on the
master branch make it harder to maintain 1.3.0? Is there a way to keep
working on 1.3.0, should that become necessary before the 2.0.0 line
is ready for prime time?

Regarding Debian Med's PPA on Launchpad: would it be OK to upload
2.0.0 there? I am having some issues with 1.3.0, and I'm not the only
one[1].

[1]http://seqanswers.com/forums/showthread.php?t=17662

Even upstream acknowledges, in the 2.0.0 changelog[2], that 1.3.0 has issues.

[2]http://cufflinks.cbcb.umd.edu/
"Some users were reporting a high FAIL rate on gene and transcripts
quantification. This has been resolved according to a battery of tests
using real and simulated data. The root cause was that in conditions
with substantial overdispersion across replicates, the FPKM
variance-covariance matrices produced by the Cuffdiff model were not
always positive-definite. Cuffdiff was detecting this, and marking
those genes as having unreliable confidence intervals. Prior to 2.0.0,
the model contained a heuristic approximation of the covariances
between assigned fragment counts (which are necessary for calculating
the variance on each gene's expression level), and this approximation
was producing poorly conditioned matrices. We have replaced the
heuristic approximation with a direct sampling approach, in effect
"simulating" the assignment of fragments to each isoform many times
for each gene. By simulating fragment generation and assignment to
each transcript, we are reconstructing variance-covariance matrices
for assigned fragment counts that are always properly conditioned. This
sampling approach produces more accurate estimates of variance and
covariance as well, improving accuracy of transcript and gene level
differential analysis. Users should expect more accurate
quantification and shorter, more conservative lists of differentially
expressed genes and transcripts."
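
To make the sampling idea in that changelog entry concrete, here is a
rough toy sketch. It is not Cufflinks code: the function names, the
isoform probabilities and the fragment counts are all made up for
illustration, and the real Cuffdiff model is certainly more involved.
It only shows that a covariance matrix estimated from repeated
simulated assignments is positive semi-definite by construction,
because v'Cv is the sample variance of the combined counts v.x and so
cannot be negative (up to floating-point rounding).

```python
import random


def simulate_assignments(probs, n_fragments, n_draws, seed=0):
    """Assign n_fragments to isoforms n_draws times (hypothetical toy model).

    probs holds made-up per-isoform assignment probabilities.
    Returns one count vector per simulated draw.
    """
    rng = random.Random(seed)
    isoforms = range(len(probs))
    draws = []
    for _ in range(n_draws):
        counts = [0] * len(probs)
        for i in rng.choices(isoforms, weights=probs, k=n_fragments):
            counts[i] += 1
        draws.append(counts)
    return draws


def sample_covariance(draws):
    """Unbiased sample covariance matrix of the simulated count vectors."""
    n, k = len(draws), len(draws[0])
    means = [sum(d[j] for d in draws) / n for j in range(k)]
    cov = [[0.0] * k for _ in range(k)]
    for d in draws:
        for a in range(k):
            for b in range(k):
                cov[a][b] += (d[a] - means[a]) * (d[b] - means[b])
    return [[c / (n - 1) for c in row] for row in cov]


def quadratic_form(cov, v):
    """Compute v' C v, which is non-negative for any v when C is PSD."""
    k = len(v)
    return sum(v[a] * cov[a][b] * v[b] for a in range(k) for b in range(k))


if __name__ == "__main__":
    # Three hypothetical isoforms of one gene, 200 fragments, 300 simulations.
    draws = simulate_assignments([0.5, 0.3, 0.2], n_fragments=200, n_draws=300)
    cov = sample_covariance(draws)
    for v in ([1, 0, 0], [1, -1, 0], [1, 1, 1]):
        print(v, quadratic_form(cov, v))
```

Note that in this toy model the counts always sum to the same total,
so the matrix is singular in the direction (1, 1, 1); it illustrates
the "never negative" property only, not the conditioning that
upstream describes.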

Best,
Carlos

