
Re: VIP





On Wed, Jul 8, 2020 at 3:16 PM Tony Travis <tony.travis@minke-informatics.co.uk> wrote:
Re: https://github.com/keylabivdc/VIP

Hi,
 
Hey Tony,
 
Does anyone have experience of running VIP?

I don't, but I've taken a look.
 
I've been trying to get it working under Ubuntu 18.04/20.04 with
Debian-Med installed in a Bioconda environment, but I've only managed to
get it running in the author's Ubuntu 14.04 VIP Docker container...

Perhaps try the author's vip2 container? https://hub.docker.com/r/yang4li/vip2 I can find no source for either container, though, so it is a bit risky.

The yang4li/vip-docker container uses a very old version of mafft, from 2013.
 
It's running, but very slowly, and "mafft" is only running on one core
even though there are 24 cores available and the "mafft" command-line
specifies 12 cores. I think it might be caused by running it on an AMD
Opteron 6000 that does not support SSE2 instructions, but any advice

FYI: Mafft does not appear to use SSE2 in its codebase. I checked and the 32-bit mafft 7.467-1 binaries have no SSE2 operands. The 64-bit version (amd64) does use SSE2, probably due to optimizations by the compiler.

Anyhow, I thought the AMD Opterons are 64-bit? Then they do support SSE2 for sure, and https://en.wikipedia.org/wiki/List_of_AMD_Opteron_microprocessors#Opteron_6100-series_%22Magny-Cours%22_(45_nm) says they support SSE3 as well.
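
If you want to confirm this on the machine itself, here is a minimal sketch that reads the feature flags the Linux kernel reports. It assumes a Linux x86 host with /proc/cpuinfo; the helper name `cpu_flags` is made up for illustration. Note that the kernel lists SSE3 under the name "pni".

```python
import os


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Each logical CPU has its own "flags" line; merge them all.
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


if __name__ == "__main__" and os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as handle:
        found = cpu_flags(handle.read())
    # "pni" is how the Linux kernel names SSE3.
    for name in ("sse2", "pni"):
        print(name, "yes" if name in found else "no")
```

Inside the Docker container this should report the same flags as on the host, since the container shares the host's CPU and kernel.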

about VIP would be welcome - It's been running for a week now :-(

Could be that mafft has reached a point in the computation that is still single-threaded.

Their script specifies 8 threads: https://github.com/keylabivdc/VIP/blob/d69b5e7615d8da76ef0dd66e51867c8ec42588d4/MSA.sh#L28

That is a different script from the one in the container, though, which references `--thread $total_cores`, where `total_cores` appears to be set to half of your available cores. That matches your report of 12 of 24 cores.

You could replace that `--thread $total_cores` with `--thread -1` to see if that helps.
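
A rough sketch of that substitution, in case it is easier to patch the script text than to edit it by hand. It assumes the container's script passes the thread count to mafft literally as `--thread $total_cores` (or `${total_cores}`); I have not seen the full script, and the function name `use_all_cores` is only illustrative.

```python
import re

# Matches "--thread $total_cores" or "--thread ${total_cores}".
# The exact spelling in the container's script is an assumption.
THREAD_ARG = re.compile(r"--thread\s+\$\{?total_cores\}?(?!\w)")


def use_all_cores(script_text):
    """Rewrite the mafft thread argument so mafft picks the thread count itself."""
    return THREAD_ARG.sub("--thread -1", script_text)
```

You would read the script into a string, pass it through `use_all_cores`, and write the result back, keeping a copy of the original so the change is easy to revert.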
 
We're using the pipeline to check Tsetse fly microbiota for viruses, but
it may also be useful for Covid-19 work because the host databases are
human by default.

Thanks,

   Tony.

--
Michael R. Crusoe
