
R performance



Hello

I have recently had cause to compare performance of running the R 
language on my 10+-year-old PC running Buster (Intel Core i7-920 CPU) 
and in the cloud on AWS. I got a surprising result, and I am wondering 
if the R packages on Debian have been built with any flags that account 
for the difference.

My PC was a mean machine when it was built, but that was in 2009. I'd 
expect it to be outperformed by up-to-date hardware.

I have written an R script that performs a moderately involved 
calculation column-by-column on a 4000-row, 10000-column matrix. On my 
Buster PC, performing the calculation on a single column takes 9.5 
seconds. The code does not use any multi-CPU capabilities, so it uses 
just one of the 8 available virtual CPUs in my PC while doing so (4 
cores with hyperthreading = 8 virtual CPUs).

Running the same code on the same data on a fairly high-spec AWS EC2 
server in the cloud (the r5a.4xlarge variety, for those who know about 
AWS), the same calculation takes 2 minutes and 6 seconds.
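
For anyone wanting to reproduce this kind of comparison, here is a 
minimal sketch of a single-column timing that could be run unchanged on 
both machines. The function column_calc is a hypothetical stand-in 
invented for illustration, not the actual script, and the random data is 
likewise made up; only the 4000-row column length comes from the 
description above. It also prints sessionInfo(), which reports the R 
version and platform and, on R 3.4 or later, the BLAS/LAPACK libraries 
in use, so the two environments can be compared side by side.

```r
# Hypothetical stand-in for the real per-column calculation (an assumption,
# not the actual script): some dense linear algebra on part of the column.
column_calc <- function(x) {
  m <- tcrossprod(x[1:500])          # 500 x 500 outer product
  sum(diag(solve(m + diag(500))))    # invert a well-conditioned matrix
}

set.seed(1)
x <- rnorm(4000)                     # one made-up column of a 4000-row matrix

# Time the calculation on a single column, as in the figures quoted above.
timing <- system.time(result <- column_calc(x))
print(timing[["elapsed"]])

# Record how this R installation is built and configured.
print(sessionInfo())
```

Running the same file on each machine and comparing both the elapsed 
time and the sessionInfo() output would at least show whether the two 
R installations differ in version or linked libraries.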

Obviously there is virtualisation involved here, but at low load with 
just one instance running and the machine not being asked to do anything 
else I would have expected the AWS machine to be much closer to local 
performance if not better, given the age of my PC.

In the past I have run highly parallel Java programs in the two 
environments and have seen much better results from using AWS in 
Java-land. That led me to wonder if it is something about how R is 
configured. I am not getting anywhere in the AWS forums (unless you pay 
a lot of money you basically don't get a lot of attention), so I was 
wondering if anyone familiar with how the R packages are configured in 
Debian might know whether anything has been done to optimise 
performance that would explain why it is so much faster in Debian. Or 
is it purely local hardware versus virtualised? I am struggling to 
believe that, because I don't see the same phenomenon in Java programs.

Thanks for any ideas

Mark

