
How to direct users to packages in Debian Med Re: Please do not do heavy changes on metapackages without discussion (Was: [Git][blends-team/med][master] Update bio - substituted (most) recommendations with suggestions)




On 24.06.20 20:53, Andreas Tille wrote:
Hi Steffen,

On Wed, Jun 24, 2020 at 07:45:07PM +0200, Steffen Möller wrote:
On 24.06.20 18:44, Andreas Tille wrote:
your recent change to the bio task was quite invasive.  I think
we should discuss those things here first.  Leaving just daligner,
jalview (which is severely broken due to missing upgrades), qiime
and r-cran-rotl does not make sense at all.
you are right. I should not have done that. Sorry.
No problem - Git enabled an easy revert.
Keep that patch :o)

No, those leftovers should not be recommended either. ;o)

So please, any massive change that alters the nature of
metapackages and might break user expectations that have grown over time
should be discussed here first.

As I explained I have created a new branch demote-recommends that
can be used as a sandbox for a more fine grained metapackage layout.
But please do not make massive changes in the production branch without
any discussion.
In the past you explained (or I understood) you meant the "bio" task to
be a bit of a summary of everything available and a superset of bio-dev
and bio-ngs. And I agree that we need something like this - just for us
to see what should be updated, basically, and for others to see what is
out there, that is more than the Debian Med QA page. But to auto-derive
an installation of these packages - this is no longer feasible, I tend
to think and I feel that we agree on that.
Yes.  It might be that it's a rare case that a user wants the whole task
med-bio.  I'd love to keep it basically untouched at least for the
moment.  Simply create one or more new tasks and once these are properly
designed we can strip down med-bio.
I like our tasks. And Bio-Linux has proven that a task list makes sense
also when interpreted as an instruction for installation. But that is all a
bit "yesterdayish". We are so much more connected now, so much more
"built on demand" cloud/docker-wise - minimal latency is what we
should promote.
My expectation is that our users are after one or two use cases/workflows
that will vertically install all the required tools. They don't install
five different workflows for RNA-seq analyses but decide for one (or
two). We have no means to express "please make a pick among those
packages that are equal", though.
So any workflow should have its own dependencies.  A user who wants to
work with exactly one workflow will not install med-bio.
Right. And even larger core facilities will only have a few workflows
that they run all the time.
So, I suggest that the bio task list should not install anything.
But why?  If a user does not want to install anything just not
installing med-bio is the easiest thing to do.  I do not see any sense
in providing a metapackage that does nothing.
You can install the suggested, still. So, nothing is taken away from anyone.
The metapackage for bio should be substituted with a guide tool to install
the right set of packages and maybe run tests with a local setup.

I would not mind to blend (pun intended) a bit of education with such a
tool.

And likely
neither the bio-dev task list. And then we need something expressing
(partial) functional equivalence to support a user-driven selection.
Please do so in med-bio-X1, med-bio-X2, etc.  and maintain these tasks
afterwards.  I consider all bio-something (something != dev) as not
properly maintained.  We have added lots of packages that most probably
need to go into those tasks but I can not do this.
Every workflow should represent itself as a regular package with its
own set of dependencies to be functional. So, in my model, the number
of packages you need for every workflow is - one. A bit tricky is the
selection of optional packages that come with extra functionality - just
think about everything bcbio can do. But - this can be described formally,
too.

What is missing is something like an equivalence relationship to suggest
that packages could be used interchangeably. That will need to be somewhat
fuzzy, but I have little doubt that this can be achieved in a
straight-forward manner from what we already have. And then we have
your  med-bio-X1, ..., -Xn clusters and their formal description with it.
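
A minimal sketch of how such a fuzzy equivalence could be derived, assuming
each package carries a set of EDAM operation labels. The package names, the
labels and the 0.5 threshold are made up for illustration and are not taken
from any real d/u/edam file. Packages whose label sets overlap strongly
(Jaccard similarity) are merged into one cluster of "pick one of these":

```python
from itertools import combinations

# Hypothetical annotations: package name -> set of EDAM operation labels.
# Illustrative values only, not real Debian Med metadata.
ANNOTATIONS = {
    "aligner-a": {"Read mapping", "Sequence alignment"},
    "aligner-b": {"Read mapping", "Sequence alignment", "Indexing"},
    "assembler-c": {"Genome assembly"},
    "viewer-d": {"Visualisation"},
}


def jaccard(a, b):
    """Overlap of two annotation sets, between 0.0 and 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def equivalence_clusters(annotations, threshold=0.5):
    """Group packages whose annotation overlap reaches `threshold`.

    Clusters are merged with union-find (single linkage), which makes the
    fuzzy "similar enough" relation transitive.
    """
    parent = {name: name for name in annotations}

    def find(x):
        # Follow parent links to the cluster representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sorted(annotations), 2):
        if jaccard(annotations[a], annotations[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in annotations:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


clusters = equivalence_clusters(ANNOTATIONS)
```

Single linkage is only one possible choice here. It can chain loosely related
packages together, so a stricter clustering may turn out to be needed.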

I did something like it a llloooonnngggg time ago
(https://doi.org/10.1093/bioinformatics/15.3.219) and so have others
somewhat similarly before me
(https://doi.org/10.1016/0300-9084(96)84761-4) and after me (all the
semantics folks and the latest effort I am aware of is by Mr bio.tools
himself https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6378944/).


My
hunch is, and I think also the somewhat unspoken idea from the last
sprints, that the EDAM ontology directs us towards it. For the moment,
that annotation is too sparse and not detailed enough - but ... to
describe a few tens of packages properly from where we then derive
installation directions - that sounds doable. I imagine deriving the
kind of lists you are vaguely describing in an automated manner from the
UDD (which you had engineered to parse the d/u/edam annotation already).
If we could define criteria to auto-generate sensible tasks from UDD
by using EDAM ontology I'd be super happy since this would solve the
maintenance task.
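
As a rough sketch of what such auto-generation might look like, assume a UDD
query yields (package, EDAM topic) pairs. The rows, the min_size criterion and
the "med-bio-<topic>" naming below are assumptions for illustration only, and
the actual UDD schema is not modelled here:

```python
from collections import defaultdict

# Hypothetical rows as a UDD query might return them: (package, EDAM topic).
ROWS = [
    ("pkg-a", "Phylogenetics"),
    ("pkg-b", "Phylogenetics"),
    ("pkg-c", "Transcriptomics"),
    ("pkg-a", "Sequence analysis"),
]


def tasks_from_rows(rows, min_size=2):
    """Collect packages per topic and keep topics with enough packages.

    Returns a dict mapping a hypothetical task name to its sorted packages.
    """
    by_topic = defaultdict(set)
    for package, topic in rows:
        by_topic[topic].add(package)
    return {
        "med-bio-" + topic.lower().replace(" ", "-"): sorted(packages)
        for topic, packages in sorted(by_topic.items())
        if len(packages) >= min_size
    }


tasks = tasks_from_rows(ROWS)
```

The min_size cut-off stands in for the "criteria" mentioned above. Which
criteria actually yield sensible tasks is still open.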

Great. I am swamped now, but later this year we should have something.
Let us also think about how to distinguish tasks that make users happy
when installing all the packages in there (and dependencies with it)
from tasks that make users happy by just installing one package in there.

Best,

Steffen


