
Re: OT: Can lot of RAM can slow down a calculation workstation?



On Fri, Jun 26, 2015 at 2:34 PM,  <tomas@tuxteam.de> wrote:
>
> On Fri, Jun 26, 2015 at 11:58:40AM +0200, Dan wrote:
>> On Fri, Jun 26, 2015 at 11:46 AM,  <tomas@tuxteam.de> wrote:
>
> [...]
>
>> No, I do not know that. I am a scientist, and I use computers as a
>> tool to run simulations that I write in C++ with Threading Building
>> Blocks (TBB). I have limited knowledge of computer architecture. Do
>> you mean that a calculation that fits in 64 GB will run slower with
>> 256 GB installed, or that the calculation will get slower as I
>> increase the problem size?
>
> Note that I'm deep in hand-waving territory here.
>
> Suppose you could reduce your problem size to one-fourth its current
> size, so that it fits in your 64 GB. Let's say it then needs a time
> T_0.
>
> Now consider your (real) four-fold problem. Leaving out all effects
> of swapping and all that (and only considering the "pure" algorithm),
> your run time will be T_1, which most probably is (depending on
> the algorithm's properties) bigger than T_0. For a linear algorithm:
> T_1 >= 4 * T_0 (you're lucky!), for a quadratic one T_1 >= 16 * T_0
> and so on (if you're _very_ lucky, you have a sublinear algorithm,
> but given all you've written before I wouldn't bet on that).
>
> Taking into account the effects of RAM, you get for 64 GB and your
> problem ("real" size) some time T_1_64 which is most probably
> significantly bigger than T_1: T_1_64 >> T_1, due to all the paging
> (swap) overhead. How much bigger will depend on how cache-friendly
> the algorithm is: if it is random-accessing data from all over the
> place, the slowdown will be horrible (in the limit, of the order of
> magnitude of the ratio of SSD speed to RAM speed; latency, bandwidth,
> some combination of both, pick the worst of them ;-)
>
> Now to the 256 GB case. Ideally, the thing fits in there, and ideally
> the time would be T_1_256 =~ T_1, since there is no swapping overhead,
> etc.
>
> What I was saying is that you might quite well get T_1_256 > T_1,
> because there are other factors (the CPU has a whole hierarchy
> of caches between itself and the RAM, because the RAM is horribly
> slow from the POV of the CPU). Those caches might be more
> overwhelmed by the bigger addressable memory.
>
> Now how much bigger, that is a tough question. Most probably you
> get
>     T_1_64 >> T_1_256 > T_1
>
> so the extra RAM will help, but in some cases the slowdown from
> the "ideal" T_1 to the "real" T_1_256 might prove disappointing.
>
> Sometimes, partitioning the problem might give you more speed
> than throwing RAM at it. Sometimes!
>
> I think you have no choice but to try it out.

Hi,

I took a closer look into this. I found out that it is important to
have at least one DIMM per channel, but that populating several DIMMs
per channel (DPC) can cause a performance hit: many servers clock the
memory down to a lower speed when you add a second or third DIMM per
channel.

https://marchamilton.wordpress.com/2012/02/07/optimizing-hpc-server-memory-configurations/
http://frankdenneman.nl/2015/02/20/memory-deep-dive/

I did some research, and it seems that "in general" there is no drop
with 2 DIMMs per channel, but there is a drop with 3 DIMMs per channel.

I can buy 8 x 32 GB or 16 x 16 GB. The first option is more expensive
than the second one, but with it I will have only one DIMM per channel.
Any suggestions or experience with this?

Thanks,
Dan

