
Re: OT: Can lot of RAM can slow down a calculation workstation?



On Fri, Jun 26, 2015 at 2:34 PM,  <tomas@tuxteam.de> wrote:
>
> On Fri, Jun 26, 2015 at 11:58:40AM +0200, Dan wrote:
>> On Fri, Jun 26, 2015 at 11:46 AM,  <tomas@tuxteam.de> wrote:
>
> [...]
>
>> No, I do not know that. I am a scientist, and I use computers as a
>> tool to run simulations that I write in C++ with Threading Building
>> Blocks (TBB). I have limited knowledge of computer architecture. Do
>> you mean that a calculation that fits in 64 GB will run slower with
>> 256 GB installed, or that the calculation will get slower as I
>> increase the problem size?
>
> Note that I'm deep in hand-waving territory here.
>
> Suppose you could reduce your problem size to one-fourth its current
> size, so that it fits in your 64 GB. Let's say it then needs a time
> T_0.
>
> Now consider your (real) four-fold problem. Leaving out all effects
> of swapping and all that (and only considering the "pure" algorithm),
> your run time will be T_1, which most probably is (depending on
> the algorithm's properties) bigger than T_0. For a linear algorithm:
> T_1 >= 4 * T_0 (you're lucky!), for a quadratic one T_1 >= 16 * T_0
> and so on (if you're _very_ lucky, you have a sublinear algorithm,
> but given all you've written before I wouldn't bet on that).
>
> Taking into account the effects of RAM, you get for 64 GB and your
> problem ("real" size) some time T_1_64 which is most probably
> significantly bigger than T_1: T_1_64 >> T_1, due to all the paging
> (swap) overhead. How much bigger will depend on how cache-friendly
> the algorithm is: if it is random-accessing data from all over the
> place, the slowdown will be horrible (in the limit, of the order of
> magnitude of the ratio of SSD speed to RAM speed; latency, bandwidth,
> some combination of both, pick the worst of them ;-)
>
> Now to the 256 GB case. Ideally, the thing fits in there, and ideally
> the time would be T_1_256 =~ T_1, since there is no swapping overhead,
> etc.
>
> What I was saying is that you might quite well get T_1_256 > T_1,
> because there are other factors (the CPU has a whole hierarchy
> of caches between itself and the RAM, because the RAM is horribly
> slow from the POV of the CPU). Those caches might be more
> overwhelmed by the bigger addressable memory.
>
> Now how much bigger, that is a tough question. Most probably you
> get
>     T_1_64 >> T_1_256 > T_1
>
> so the extra RAM will help, but in some cases the slowdown from
> the "ideal" T_1 to the "real" T_1_256 might prove disappointing.
>
> Sometimes, partitioning the problem might give you more speed
> than throwing RAM at it. Sometimes!
>
> I think you have no choice but to try it out.

Hi,

I took a closer look into this. I found out that it is important to
have at least one DIMM per channel, but that populating several DIMMs
per channel (DPC) can cause a performance hit: many servers clock the
memory down to a lower speed when you add a second or third DIMM per
channel.

https://marchamilton.wordpress.com/2012/02/07/optimizing-hpc-server-memory-configurations/
http://frankdenneman.nl/2015/02/20/memory-deep-dive/

I did some research, and it seems that "in general" there is no drop
with 2 DIMMs per channel, but there is a drop with 3 DIMMs per channel.

I can buy 8 x 32 GB or 16 x 16 GB. The first option is more expensive
than the second one, but with it I will have only one DIMM per channel.
Any suggestions or experience with this?

Thanks,
Dan

