
Re: good multivariate regression packages?



On 11/25/05, Dirk Eddelbuettel <edd@debian.org> wrote:
> You didn't say anything about the functional form of your model.

The data samples I want to use for regression are obtained empirically
from a physical process (observations of a dynamical system).  In the
future I'd like to use my code with arbitrary dynamical systems, not
just the one whose plot I've linked to, and so in general I will not
have a good idea of the underlying model.  I would expect that quite
often it will not be linear.

I should mention that this has led me to believe I should be looking
at nonparametric models, and I have thus looked at using libsvm (not in
Debian), a Support Vector Machine library and toolset.  It seemed
excellent, but with my limited knowledge of kernel machines I had a
hard time getting a useful regression.  It seems there is still a bit
of black magic in figuring out which model parameters will yield the
most faithful model; although you can do a grid search through the
possible parameter space, you only have correlation and mean-squared
error as quality metrics, and these often seem rather inadequate.
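
To make the grid-search idea concrete, here is a minimal sketch in plain
Python.  It does not use libsvm: as a stand-in nonparametric model it uses
Nadaraya-Watson kernel regression with a Gaussian kernel, and it scores each
candidate bandwidth by leave-one-out mean-squared error.  The function names,
the bandwidth grid and the noisy sine data are all assumptions made up for
illustration; they are not my actual data or setup.

```python
import math
import random


def gaussian_kernel(u):
    """Unnormalised Gaussian kernel weight for a scaled distance u."""
    return math.exp(-0.5 * u * u)


def nw_predict(x_train, y_train, x, bandwidth):
    """Nadaraya-Watson estimate at x: a kernel-weighted mean of y_train."""
    weights = [gaussian_kernel((x - xi) / bandwidth) for xi in x_train]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed to zero; fall back to the plain mean.
        return sum(y_train) / len(y_train)
    return sum(w * yi for w, yi in zip(weights, y_train)) / total


def loo_mse(xs, ys, bandwidth):
    """Leave-one-out mean-squared error for a given bandwidth."""
    err = 0.0
    for i in range(len(xs)):
        x_rest = xs[:i] + xs[i + 1:]
        y_rest = ys[:i] + ys[i + 1:]
        pred = nw_predict(x_rest, y_rest, xs[i], bandwidth)
        err += (pred - ys[i]) ** 2
    return err / len(xs)


def grid_search(xs, ys, bandwidths):
    """Score every candidate bandwidth and return the best one with all scores."""
    scores = {h: loo_mse(xs, ys, h) for h in bandwidths}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data: a noisy sine wave standing in for real observations.
random.seed(0)
xs = [i * 0.1 for i in range(60)]
ys = [math.sin(x) + random.gauss(0, 0.1) for x in xs]

best_h, scores = grid_search(xs, ys, [0.05, 0.1, 0.2, 0.5, 1.0, 2.0])
for h in sorted(scores):
    print("bandwidth %.2f  LOO MSE %.4f" % (h, scores[h]))
print("best bandwidth:", best_h)
```

The search itself is the easy part.  The limitation I described is still
there: the "best" setting is only best with respect to mean-squared error,
which says nothing about whether the fitted curve respects the dynamics of
the underlying system.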

> That said, you probably should look at R as it provides a real environment
> for statistical computing, modeling, visualization, estimation, inference,
> simulation, ...  I think there's also a full blown R / CRAN mirror at U of T
> but I've forgotten where it is hosted.

I have used R in the past, mostly for simple statistical analysis, and
for plotting.  I have looked into using this package for my
regression, but I encountered two issues.  1) I had the feeling that R
is not particularly lean, in that it is voracious with regard to
memory, especially when dealing with large data sets, and is not
particularly speedy; 2) there is an outright overabundance of
regression methods in R, to the point where I am drowning in
information; with my meagre knowledge of regression methods I cannot
assess which methods would be most appropriate to my task, what the
tradeoffs, advantages and disadvantages are, etc.  I made a similar
posting to the R users list, but it did not yield any replies.

Because of these two issues I was thinking I might have more luck with
a more specialized package, one which centers only on regression, as
it might be more optimized, and have better regression-specific
documentation...

> Sure, R can be driven from Python via RPy. And
>
>         $ apt-get install python-rpy r-base
>
> gets them both for you.

Used'em, loved'em. :)

> Greetings back to Ontario, and good luck,  Dirk

Whereabouts in Ontario?  Just curious.  Hmm, is there a map
illustrating the geo-distribution of Debian developers?  Wondering how
many of you are hiding in this neck of the woods...  :)

--
Maciej Kalisiak
<mkalisiak@gmail.com>
http://www.dgp.toronto.edu/~mac


