
Re: metastudent and libgo-perl (Was: [outreachy] autopkgtest-pkg-perl in librg-utils-perl does nothing)



Hi Andreas,

2016-07-14 10:51 GMT+03:00 Andreas Tille <andreas@an3as.eu>:
[CC'ing other uploaders of reprof from rostlab]
Hi Tanya,

On Thu, Jul 14, 2016 at 10:32:44AM +0300, merlettaia wrote:
> Hi Andreas,
> I've added tests for 2 packages - metastudent and libgo-perl, which is used
> by metastudent and which produced an error (the patch in libgo-perl makes
> the metastudent run work).

Wouldn't this mean we need a versioned Depends from libgo-perl in
metastudent?  It seems to be only an indirect dependency since I can't
see it in the dependencies but if it needs a certain version this should
be specified.

I probably expressed the idea badly. metastudent depends on libgo-perl: when the `metastudent` command is called, it runs a Perl script that contains the line
   use GO::IO::Dotty;
That module belongs to libgo-perl, and it produced an error before I patched libgo-perl. I simply wanted to say that my test for metastudent will pass once this patch to libgo-perl is applied. libgo-perl is already present in metastudent's Depends field.
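
As a rough illustration of the kind of load check involved, the sketch below builds a `perl -MGO::IO::Dotty -e 1` command and reports whether the module compiles. The helper names are hypothetical and this is not the test that was added to the package; it only assumes that `perl` is on PATH.

```python
import shutil
import subprocess


def perl_load_command(module):
    """Return the argv that asks perl to load `module` and exit."""
    # `perl -MFoo::Bar -e 1` exits non-zero if the module is
    # missing or fails to compile.
    return ["perl", "-M" + module, "-e", "1"]


def perl_module_loads(module):
    """True if perl can load the module, False if it cannot,
    None if perl itself is not installed (hypothetical helper)."""
    if shutil.which("perl") is None:
        return None
    result = subprocess.run(
        perl_load_command(module),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("GO::IO::Dotty loads:", perl_module_loads("GO::IO::Dotty"))
```

With an unpatched libgo-perl one would expect this to report False, since the compile error shown in the log happens at load time.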


> BTW, I noticed that autopkgtest-pkg-perl skips the module syntax check when
> a specific file is not present and debian/control contains a "Suggests:" line.
> I'll check librg-utils-perl in case it is useful there.

OK, but we now would need a new package version (which is no problem).
However, I have not seen the typical t/ directory in librg-utils-perl.

autopkgtest-pkg-perl also runs a syntax check
(https://pkg-perl.alioth.debian.org/autopkgtest.html#syntax_t), but skips it under some conditions.
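
To make the skip condition concrete, here is a small model of the behaviour as described in this thread only: the check is skipped when debian/control has a "Suggests:" field and the specific opt-in file is absent. The function names are made up for illustration, the file is deliberately left unnamed, and the real rules in the linked documentation may include further conditions.

```python
def has_field(control_text, field):
    """True if any line of debian/control starts with the given field."""
    prefix = field.lower() + ":"
    return any(
        line.lower().startswith(prefix)
        for line in control_text.splitlines()
    )


def syntax_check_skipped(control_text, optin_file_present):
    """Model of the behaviour described in this thread: skip when the
    package has a Suggests field and no opt-in file exists."""
    return has_field(control_text, "Suggests") and not optin_file_present


if __name__ == "__main__":
    control = (
        "Source: libfoo-perl\n"
        "\n"
        "Package: libfoo-perl\n"
        "Depends: perl\n"
        "Suggests: libbar-perl\n"
    )
    print(syntax_check_skipped(control, optin_file_present=False))
```

The package name `libfoo-perl` in the usage example is a placeholder, not a real package from this thread.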


> Now the predictprotein run fails when reprof is called.
> That's why I added tests for the binary package `reprof` (in addition to
> the autopkgtest-pkg-perl tests).
> I added debian/tests/installation-test, which calls reprof. For now it
> fails with the following message:
> "Constructor failed at /usr/share/perl5/RG/Reprof.pm line 225."
> I looked at that file; it seems that for now the problem is in the .model and
> .features files accompanying reprof (which are installed to the
> usr/share/reprof folder). That's why this test fails now and this reprof
> update is not ready for upload.

OK, I simply added a remark in d/changelog to remember this.
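
A first-pass check on the reprof data files could look like the sketch below. It only flags `.model` and `.features` files that are empty or unreadable; the thread does not say what is actually wrong with them, so this is an assumption about one possible cause, and the function name is hypothetical.

```python
import os


def suspect_data_files(directory, suffixes=(".model", ".features")):
    """List data files under `directory` that are empty or unreadable."""
    suspects = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            if not name.endswith(suffixes):
                continue
            path = os.path.join(root, name)
            try:
                if os.path.getsize(path) == 0:
                    suspects.append(path)
                    continue
                # Reading one byte is enough to detect permission problems.
                with open(path, "rb") as handle:
                    handle.read(1)
            except OSError:
                suspects.append(path)
    return suspects


if __name__ == "__main__":
    # Directory taken from the message above; adjust as needed.
    for path in suspect_data_files("/usr/share/reprof"):
        print("suspect:", path)
```

An empty result does not mean the files are valid, only that they exist with some content.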

Thanks for your thorough work

       Andreas.

> 2016-07-13 22:21 GMT+03:00 Andreas Tille <andreas@an3as.eu>:
>
> > Hi Tanya,
> >
> > On Wed, Jul 13, 2016 at 07:24:12PM +0300, merlettaia wrote:
> > >
> > > I also found a problem that involves this package.
> > > Last weekend I started to work on predictprotein. The hardest problem was
> > > to make it work.
> > > https://wiki.debian.org/DebianMed/PredictProtein - at some point I found
> > > these instructions and spent some time downloading the database. When I
> > > had downloaded and installed it and then ran predictprotein, I got multiple
> > > error messages (output_with_errors.txt). It turned out that when one of
> > > the Perl scripts in librg-utils-perl calls blastpgp on that database,
> > >   blastpgp -F F -a 1 -j 3 -b 3000 -e 1 -h 1e-3 -d
> > > /data/src/rostlab-data/data/big/big_80 -i query.fasta -o
> > > query.blastPsiOutTmp -C query.chk -Q query.blastPsiMat
> > >
> > > - blastpgp ends with a "Killed" message and produces an incorrect output
> > > file (query.blastPsiOutTmp is incomplete). The script in librg-utils-perl is
> > > correct, and the call in predictprotein is correct. blastpgp fails with an error.
> > >
> > > I thought that an incorrect database format could be the reason for it,
> > > because the version of the ncbi-blast+ package (blastpgp belongs to this
> > > package) uses the latest version of that database, and the database from
> > > RostLab's website probably isn't the latest.
> > > I downloaded one of the databases from the NCBI FTP
> > > (ftp://ftp.ncbi.nlm.nih.gov/blast/db/) and tried to run predictprotein with
> > > that data. It worked!
> > > But now I get an error during the metastudent run (output in
> > > some_output.txt) - I'm working on fixing it now.
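
A note on the "Killed" message: a shell prints it when a child process is terminated by SIGKILL. A common cause is the kernel's out-of-memory killer, but nothing in this thread confirms that for blastpgp. The sketch below shows how a wrapper could tell a signal death apart from an ordinary non-zero exit, using the fact that Python's `subprocess` reports death by signal N as return code -N on POSIX. The function names are hypothetical.

```python
import signal
import subprocess


def describe_returncode(returncode):
    """Turn a subprocess return code into a short description."""
    if returncode < 0:
        # On POSIX, a negative value -N means "terminated by signal N".
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "killed by " + name
    if returncode == 0:
        return "success"
    return "exit code %d" % returncode


def run_and_describe(argv):
    """Run a command and describe how it ended."""
    completed = subprocess.run(argv)
    return describe_returncode(completed.returncode)
```

Calling `run_and_describe` with the blastpgp argument list from the message above would then distinguish a signal death from a normal blastpgp error exit.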
> >
> > Thanks for your very thorough investigation.  I have put Laszlo in CC -
> > maybe he has some contact information or can help himself even if he
> > is not active in Debian Med any more.
> >
> > > And there are two things I don't understand:
> > >
> > > Is there any package which contains a copy of the current version of the
> > > blastp database, or a small part of it? It seems that the autopkgtest
> > > test suite should use a smaller portion of the blastp database.
> >
> > As far as I know there is no such package.  IMHO it might be a good idea
> > to ship something like a stripped down database since it could be used
> > as test data input for several other packages.  What do others think?
> >
> > > For now it seems unclear how to test predictprotein with autopkgtest,
> > > since a correct run also requires a local copy of a (possibly) huge database
> > > (~30GB in the copy from RostLab's website). Should ncbi-blast+/ncbi-tools6
> > > probably download and install it?
> >
> > For manual user tests this might be OK, but autopkgtest should be
> > offline.
> >
> > > Predictprotein has special parameters for
> > > different databases, and the path to the blast installation can be provided by
> > > hand, which makes it possible to call it with a smaller database in a test
> > > suite run.
> >
> > Sounds convincing.
> >
> > > But that will work only if blastpgp from ncbi-blast+ works correctly
> > > with the same version of the database. That means the better way is to
> > > install and test database usage from the ncbi-blast+ tests, and to use the
> > > default database installed with ncbi-blast+ (if it is installed).
> > >
> > > Could you also check that the database from here:
> > > https://wiki.debian.org/DebianMed/PredictProtein - really doesn't work?
> > > I have an unstable internet connection and am not sure whether that file
> > > was corrupted.
> >
> > Any volunteer for this?  My internet is currently also not the best.
> >
> > Kind regards
> >
> >        Andreas.
> >
> >
> > > cache merging is off at /usr/bin/predictprotein line 230.
> > > work_dir=/data/src/temp at /usr/bin/predictprotein line 336.
> > > make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query
> > -j 1 BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/
> > PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17
> > PROFROOT=/usr/share/profphd/prof/
> > BIGBLASTDB=/data/src/rostlab-data/data/aa/pdbaa
> > BIG80BLASTDB=/data/src/rostlab-data/data/aa/pdbaa
> > PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls
> > PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm
> > PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat
> > PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat
> > PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl
> > SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt
> > SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot
> > NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk
> > all norsp at /usr/bin/predictprotein line 383.
> > > make: Entering directory '/data/src/temp'
> > > metastudent -i query.fasta -o query.metastudent --silent  --debug
> > > mkdir -p /tmp/metastudentulQjHj/methodC;cd
> > /usr/lib/python2.7/dist-packages/metastudentPkg/lib/groupC;./CafaWrapper3.pl
> > /tmp/metastudentulQjHj/query.fasta_eval1.0_iters3_srcgoasp.mfo.blast
> > /tmp/metastudentulQjHj/methodC/output.MFO.txt 0
> > /tmp/metastudentulQjHj/methodC
> > > !!!Error!!! mkdir -p /tmp/metastudentulQjHj/methodC;cd
> > /usr/lib/python2.7/dist-packages/metastudentPkg/lib/groupC;./CafaWrapper3.pl
> > /tmp/metastudentulQjHj/query.fasta_eval1.0_iters3_srcgoasp.mfo.blast
> > /tmp/metastudentulQjHj/methodC/output.MFO.txt 0
> > /tmp/metastudentulQjHj/methodC
> > > 65280
> > > Can't use a hash as a reference at /usr/share/perl5/GO/IO/Dotty.pm line
> > 104.
> > > Compilation failed in require at ./treehandler.pl line 10.
> > > BEGIN failed--compilation aborted at ./treehandler.pl line 10.
> > > ./treehandler.pl -mfo transitiveClosure2014.txt -bpo
> > transitiveClosure2014.txt -cco transitiveClosure2014.txt -method 3 -pred
> > /tmp/metastudentulQjHj/methodC/blast.out -scoring 0 failed: 255 at
> > ./CafaWrapper3.pl line 16.
> > > Error occurred: IOError
> > > Traceback (most recent call last):
> > >   File "/usr/bin/metastudent", line 721, in <module>
> > >     runIt(tempfile, inputFastaFilePath, outputFilePath, outputBlast,
> > blastKickstartDatabasePaths, ontologies, blastOnly, keepTemp, allPreds,
> > debug, noNames, withImages)
> > >   File "/usr/bin/metastudent", line 187, in runIt
> > >     predLinesDict["C"] = runMethodC(blastKickstartDatabasePath,
> > fastaFilePathLocal, tmpDirPath, configMap["GROUP_C_SCORING_%s" % (ontology)
> > ], ontology, configMap, debug)
> > >   File "/usr/lib/python2.7/dist-packages/metastudentPkg/runMethods.py",
> > line 206, in runMethodC
> > >     with open(outputFilePath) as f:
> > > IOError: [Errno 2] No such file or directory:
> > '/tmp/metastudentulQjHj/methodC/output.MFO.txt'
> > > /usr/share/predictprotein/MakefilePP.mk:403: recipe for target
> > 'query.metastudent.BPO.txt' failed
> > > make: *** [query.metastudent.BPO.txt] Error 1
> > > make: Leaving directory '/data/src/temp'
> > > make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query
> > -j 1 BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/
> > PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17
> > PROFROOT=/usr/share/profphd/prof/
> > BIGBLASTDB=/data/src/rostlab-data/data/aa/pdbaa
> > BIG80BLASTDB=/data/src/rostlab-data/data/aa/pdbaa
> > PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls
> > PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm
> > PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat
> > PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat
> > PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl
> > SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt
> > SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot
> > NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk
> > all norsp failed: 512 at /usr/bin/predictprotein line 392.
> >
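
The values 65280 and 512 in the log above look like raw wait statuses, as stored in Perl's `$?` after `system` or returned by Python's `os.system` on Unix, rather than plain exit codes. That reading is an assumption based on the numbers alone. Under it, the conventional encoding puts the exit code in the high byte and the terminating signal in the low seven bits. The function name below is hypothetical.

```python
def decode_wait_status(status):
    """Split a 16-bit POSIX-style wait status into (kind, value).

    Stopped/continued states and the core-dump bit are ignored;
    this sketch only covers normal exit and death by signal.
    """
    signal_number = status & 0x7F
    if signal_number:
        return ("signal", signal_number)
    return ("exit", (status >> 8) & 0xFF)


if __name__ == "__main__":
    for status in (65280, 512):
        print(status, "->", decode_wait_status(status))
```

Read this way, 65280 is exit code 255, which is what a Perl script exits with after a fatal `die` when no other error code is set. 512 is exit code 2, which is what make returns when a recipe fails.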
> > > cache merging is off at /usr/bin/predictprotein line 230.
> > > work_dir=/data/src/temp at /usr/bin/predictprotein line 336.
> > > make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query
> > -j 1 BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/
> > PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17
> > PROFROOT=/usr/share/profphd/prof/
> > BIGBLASTDB=/data/src/rostlab-data/data/big/big
> > BIG80BLASTDB=/data/src/rostlab-data/data/big/big_80
> > PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls
> > PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm
> > PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat
> > PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat
> > PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl
> > SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt
> > SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot
> > NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk
> > all norsp at /usr/bin/predictprotein line 383.
> > > make: Entering directory '/data/src/temp'
> > > make: Warning: File 'query.in' has modification time 3.2 s in the future
> > > /usr/share/librg-utils-perl//copf.pl query.in formatIn=fasta
> > formatOut=fasta fileOut=query.fasta exeConvertSeq=convert_seq
> > > /usr/share/librg-utils-perl//copf.pl query.in formatIn=fasta
> > formatOut=gcg fileOut=query.seqGCG exeConvertSeq=convert_seq
> > > ncbi-seg query.fasta -x > query.segNorm
> > > /usr/share/librg-utils-perl//copf.pl query.segNorm formatOut=gcg
> > fileOut=query.segNormGCG
> > > # blast call may throw warnings on STDERR - silence it when we are not
> > in debug mode; blastpgp and blastall create a normally 0-sized 'error.log'
> > - remove it
> > > trap "rm -f error.log" EXIT; \
> > > if ! ( blastpgp -F F -a 1 -j 3 -b 3000 -e 1 -h 1e-3 -d
> > /data/src/rostlab-data/data/big/big_80 -i query.fasta -o
> > query.blastPsiOutTmp -C query.chk -Q query.blastPsiMat   ); then \
> > >       EXIT=$?; cat error.log >&2; exit $EXIT; \
> > > fi
> > > Killed
> > > cat: error.log: No such file or directory
> > > # blast call may throw warnings on STDERR - silence it when we are not
> > in debug mode
> > > trap "rm -f error.log" EXIT; \
> > > if ! ( blastpgp -F F -a 1 -b 1000 -e 1 -d
> > /data/src/rostlab-data/data/big/big -i query.fasta -o query.blastPsiAli.nz
> > -R query.chk   ); then \
> > >       EXIT=$?; cat error.log >&2; exit $EXIT; \
> > > fi
> > > [blastpgp] WARNING: -t larger than 1 not supported when restarting from
> > a checkpoint; setting -t to 1
> > >
> > > [blastpgp] WARNING: posReadCheckpoint: Attempting to recover data from
> > previous checkpoint
> > >
> > > [blastpgp] WARNING: posReadPosFreqsStandard: Could not open checkpoint
> > file
> > >
> > > [blastpgp] WARNING: posReadCheckpoint: Data recovery failed
> > >
> > > [blastpgp] FATAL ERROR: blast: Error recovering from checkpoint
> > > cat: error.log: No such file or directory
> > > gzip -c -6 < 'query.blastPsiAli.nz' > 'query.blastPsiAli.gz'
> > > # lkajan: we have to switch off filtering (default for blastpgp) or
> > sequences like ASDSADADASDASDASDSADASA fail with
> > > # 'WARNING: query: Could not calculate ungapped Karlin-Altschul
> > parameters due to an invalid query sequence or its translation. Please
> > verify the query sequence(s) and/or filtering options'
> > > # Does switching off filtering hurt us? Loctree uses the results of this
> > for extracting keywords from swissprot, so I am not worried.
> > > # This blast call also often writes 'Selenocysteine (U) at position 59
> > replaced by X' - we are not really interested. Silence this in non-debug
> > mode.
> > > trap "rm -f error.log" EXIT; \
> > > if ! ( blastall -F F -a 1 -p blastp -d
> > /data/src/rostlab-data/data/swissprot/uniprot_sprot -b 1000 -e 100 -m 8 -i
> > query.fasta -o query.blastpSwissM8   ); then \
> > >       EXIT=$?; cat error.log >&2; exit $EXIT; \
> > > fi
> > > /usr/share/librg-utils-perl//blastpgp_to_saf.pl
> > fileInBlast=query.blastPsiOutTmp fileInQuery=query.fasta
> > fileOutRdb=query.blastPsi80Rdb fileOutSaf=query.safBlastPsi80 red=100
> > maxAli=3000 tile=0
> > > opened query.fasta at /usr/share/librg-utils-perl//blastpgp_to_saf.pl
> > line 126.
> > > blastfile: query.blastPsiOutTmp at /usr/share/librg-utils-perl//
> > blastpgp_to_saf.pl line 127.
> > > nohits: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 128.
> > > iter: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 129.
> > > blast+: 0 at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 130.
> > > Died at /usr/share/librg-utils-perl//blastpgp_to_saf.pl line 76.
> > > *** ERROR blastpgp_to_saf.pl : *** ERROR blastp_to_saf: blast file
> > format not recognized
> > > /usr/share/predictprotein/MakefilePP.mk:465: recipe for target
> > 'query.safBlastPsi80' failed
> > > make: *** [query.safBlastPsi80] Error 255
> > > rm query.blastPsi80Rdb query.blastPsiAli.nz
> > > make: Leaving directory '/data/src/temp'
> > > make --no-builtin-rules INFILE=query.in -C /data/src/temp JOBID=query
> > -j 1 BLASTCORES=1 LIBRGUTILS=/usr/share/librg-utils-perl/
> > PPROOT=/usr/share/predictprotein/ PROFNUMRESMIN=17
> > PROFROOT=/usr/share/profphd/prof/
> > BIGBLASTDB=/data/src/rostlab-data/data/big/big
> > BIG80BLASTDB=/data/src/rostlab-data/data/big/big_80
> > PFAM2DB=/data/src/rostlab-data/data/pfam_legacy/Pfam_ls
> > PFAM3DB=/data/src/rostlab-data/data/pfam/Pfam-A.hmm
> > PROSITEDAT=/data/src/rostlab-data/data/prosite/prosite.dat
> > PROSITECONVDAT=/data/src/rostlab-data/data/prosite/prosite_convert.dat
> > PSICEXE=/usr/share/rost-runpsic/runNewPSIC.pl
> > SPKEYIDX=/data/src/rostlab-data/data/swissprot/keyindex_loctree.txt
> > SWISSBLASTDB=/data/src/rostlab-data/data/swissprot/uniprot_sprot
> > NORSPCTRL="--win=100" DEBUG=1 -f /usr/share/predictprotein/MakefilePP.mk
> > all norsp failed: 512 at /usr/bin/predictprotein line 392.
> >
> >
> > --
> > http://fam-tille.de
> >
> >
>
>
> --
> Best wishes,
> Tanya.

--
http://fam-tille.de




--
Best wishes,
Tanya.
