Re: good multivariate regression packages?
On 11/25/05, Dirk Eddelbuettel <edd@debian.org> wrote:
> You didn't say anything about the functional form of your model.
The data samples I want to use for regression are obtained empirically
from a physical process (observations of a dynamical system). In the
future I'd like to use my code with arbitrary dynamical systems, not
just the one whose plot I've linked to, and so in general I will not
have a good idea of the underlying model. I would expect that quite
often it will not be linear.
I should mention that this has led me to believe I should be looking
at nonparametric models, and have thus looked at using libsvm (not in
Debian), a Support Vector Machine library/tools. These seemed
excellent, but with my limited knowledge of kernel machines I had a
hard time getting a useful regression. It seems there is still a bit
of black magic in figuring out which model parameters will yield the
most faithful model; although you can do a grid search through the
possible parameter space, you only have correlation/mean-squared error
as quality metrics, and these seem to be rather inadequate a lot of
the time.
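
To make the grid-search idea concrete, here is a minimal sketch in
plain Python. It is not libsvm and not an SVM: as a stand-in I use a
Nadaraya-Watson kernel smoother with a Gaussian kernel, whose single
tuning parameter (the bandwidth) is picked by a grid search scored
with leave-one-out mean-squared error. The function names, the
synthetic two-input data set, and the candidate bandwidths are all
made up for illustration.

```python
import math
import random


def nw_predict(train_x, train_y, query, bandwidth):
    """Nadaraya-Watson estimate at `query` using a Gaussian kernel."""
    num = 0.0
    den = 0.0
    for x, y in zip(train_x, train_y):
        dist2 = sum((a - b) ** 2 for a, b in zip(x, query))
        w = math.exp(-dist2 / (2.0 * bandwidth ** 2))
        num += w * y
        den += w
    # If every weight underflows to zero, fall back to the sample mean.
    if den == 0.0:
        return sum(train_y) / len(train_y)
    return num / den


def loo_mse(xs, ys, bandwidth):
    """Leave-one-out mean squared error for one bandwidth value."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        err = nw_predict(rest_x, rest_y, xs[i], bandwidth) - ys[i]
        total += err * err
    return total / len(xs)


def grid_search(xs, ys, bandwidths):
    """Return (best_bandwidth, {bandwidth: leave-one-out MSE})."""
    scores = {h: loo_mse(xs, ys, h) for h in bandwidths}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical synthetic data: a nonlinear function of two inputs plus noise.
rng = random.Random(0)
xs = [(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(80)]
ys = [math.sin(x1) * math.cos(x2) + rng.gauss(0, 0.05) for x1, x2 in xs]

best_h, scores = grid_search(xs, ys, [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2])
for h in sorted(scores):
    print("bandwidth %.2f  LOO MSE %.4f" % (h, scores[h]))
print("best bandwidth:", best_h)
```

The only quality metric here is again a mean-squared error, so this
shares the weakness I described: it tells you which grid point scores
best, not whether the resulting model is faithful to the dynamics.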
> That said, you probably should look at R as it provides a real environment
> for statistical computing, modeling, visualization, estimation, inference,
> simulation, ... I think there's also a full blown R / CRAN mirror at U of T
> but I've forgotten where it is hosted.
I have used R in the past, mostly for simple statistical analysis, and
for plotting. I have looked into using this package for my
regression, but I encountered two issues. 1) I had the feeling that R
is not particularly lean, in that it is voracious with regard to
memory, especially when dealing with large data sets, and is not
particularly speedy; 2) there is an outright overabundance of
regression methods in R, to the point where I am drowning in
information; with my meagre knowledge of regression methods I cannot
assess which methods would be most appropriate to my task, what the
tradeoffs, advantages and disadvantages are, etc. I made a similar
posting to the R users list, but it did not yield any replies.
Because of these two issues I was thinking I might have more luck with
a more specialized package, one which centers only on regression, as
it might be more optimized, and have better regression-specific
documentation...
> Sure, R can be driven from Python via RPy. And
>
> $ apt-get install python-rpy r-base
>
> gets them both for you.
Used'em, loved'em. :)
> Greetings back to Ontario, and good luck, Dirk
Whereabouts in Ontario? Just curious. Hmm, is there a map
illustrating the geo-distribution of Debian developers? Wondering how
many of you are hiding in this neck of the woods... :)
--
Maciej Kalisiak
<mkalisiak@gmail.com>
http://www.dgp.toronto.edu/~mac