Hello,
I had an exchange with Stian yesterday about which CWL workflow from his
database he would suggest as a first example for us to gain experience
with. He proposed the GATK workflow by Farah Zaib Khan et al., since it is
a good one to cite on workflows and reproducibility.
https://doi.org/10.1186/s12859-017-1747-0
https://github.com/skanwal/GATK-CaseStudy/tree/master/CWL
From what I understand, BWA, GATK and the Picard toolkit are already in
Debian (I am not sure about the state of GATK). Stian also pointed to
https://github.com/h3abionet/h3agatk/blob/master/workflows/GATK/GATK-complete-WES-Workflow-h3abionet.cwl
as a current variant of the same workflow, but then again, I would not
mind starting with a smaller one. Any comments?
The main point for me is to have a small test case that lets us run this
workflow repeatedly. We will therefore also need to decide on appropriate
test data at some point. Should we also introduce a package like
"genome-human"?
Best,
Steffen