
Re: Integrating Machine Learning Software and Datasets within debian



(apologies if quoting is broken, I wasn't subscribed yet at that point,
and the archives are only available via the web interface)

On Thu, 10 Jun 2010, Yaroslav Halchenko wrote:
>On Thu, 10 Jun 2010, Soeren Sonnenburg wrote:
>> we have been setting up freshmeat like repositories for machine learning
>> open source software ( http://mloss.org ) and data sets
>> ( http://mldata.org ) trying to make open source software/open data more
>> widely known within the machine learning community (also organizing
>> workshops and establishing in the ``biggest'' machine learning journal
>> http://jmlr.org).
>>
>>
>> I would wish to improve integration between the above repositories and
>> debian. So my question: Who is interested in packaging machine learning
>> packages?
>
> hm...  I guess at least
> you: shogun, weka
> me: mvpa, scikit-learn, mlpy, vowpal-wabbit

I would very much like to contribute here. As I am not a DD, I would
greatly appreciate sponsoring :-) I am very grateful for the work
Yaroslav already did as a mentor/sponsor for libfann.

The question is: where would I start packaging? Looking at mloss.org, I
can see it contains hundreds of registered projects. Apart from the
obvious candidates (packages I have a personal interest in), choosing
among them would require some form of strategy.

There would be many factors to consider. Relevance of the project is of
course important. I do not see much merit in (for example) providing
every implementation or variant of X out there; in fact, I think that
would even hurt more than help. Diversity is good, but too much of it
would IMHO only lead to fragmentation and confusion.

Another factor to consider would be upstream activity. When updating
libfann, for example, it was obvious that upstream development ceased
around 2007. I only updated that package because it apparently is still
quite popular and I couldn't find a suitable alternative.

> Since maintainer group requires people,
> atm it might be worth starting with a wiki page pointing to the blends
> task, and trying to formalize the longer standing goal for possible
> maintainer group (otherwise, without clear advantages, co-existence
> within debian-science seems to be logical way forward)

A formalized goal would help new contributors such as me very much by
pointing them in a general direction.

> In addition, having heard Andreas Tille's talk about debian blends
>  http://blends.alioth.debian.org/science/tasks/machine-learning
> I wonder if we could have pointers from mloss.org to the respective
> debian packages/wnpp's etc.
>
> In the long term, it would have been great if such inter-referencing
> between software portals, Debian blends, and Debian could become more
> pronounced at debian.org proper (e.g. packages listing etc).

Those ideas sound great!

