
Fwd: workstation for CUDA



Following a market search, I have to reformulate my question by
replacing GTX-470 with GTX-570. The former seems to have nearly
disappeared, or it costs nearly as much as the latter.
francesco pietra


---------- Forwarded message ----------
From: Francesco Pietra <chiendarret@gmail.com>
Date: Sat, May 14, 2011 at 8:06 AM
Subject: Re: workstation for CUDA
To: Lennart Sorensen <lsorense@csclub.uwaterloo.ca>


Hi:
In the meantime I had the opportunity to carry out my simulations on
both a high-end Tesla GPU computer and a very simple consumer-type
computer based on a single GTX-470, both running Debian amd64
lenny. As all classical molecular dynamics codes are compiled for
single precision, there was no advantage with the Tesla; actually, the
GTX-470 ran a bit faster when comparing the same number of graphics
cards (1:1).

Therefore, as I will pay all expenses for this computer myself, I am
looking for up-to-date consumer components, except the power supply,
hard disks and fans, which should be server-type. Ideally, the
motherboard should support four GTX-470, fewer if no such consumer
motherboard exists. No X server will be installed; graphics will be
examined on a desktop connected via ssh. The choice of the CPUs and
the power supply follows from this. As the simulations last many days,
the case must have room for many large-diameter fans, for use in the
open air. On a server-type, four-socket CPU machine that I set up a
few years ago, there are eleven fans on the case and no air cooler was
ever needed at the latitude where I am based.

I am at a loss any time I have to select hardware, especially here.
This type of computation requires little RAM (a few GB in the
four-GTX-470 case), performance depending entirely on the CPU/GPU. I
understand that a consumer motherboard may well be a bottleneck, and
it should not be. This explains my difficulty in choosing the
essential hardware. I am prepared to accept a compromise in
performance.

Thanks a lot for any advice.
francesco pietra

On Wed, Jan 26, 2011 at 7:27 PM, Lennart Sorensen
<lsorense@csclub.uwaterloo.ca> wrote:
> On Tue, Jan 25, 2011 at 05:00:51PM -0500, Lennart Sorensen wrote:
>> Most boards technically don't have enough bandwidth.  However when
>> they use the NF200 PCIe switch, it tends to work quite well.  The NF200
>> actually allows broadcast of the same data to multiple cards if that is
>> what is needed.  Each card has potentially 16 lanes, but it is sharing
>> with another card for those 16 lanes.  If you are sending data to just
>> one card, it will get full speed.  If you are sending to both at once,
>> they get half speed, unless you are sending the same data to both in
>> which case they can get full bandwidth.
>>
>> Some designs have simply got 8 lanes per slot all the time.  That will
>> be slower of course.  So if it has the NF200 it should be very good,
>> and otherwise it will be speed limited to 8x.  Now if you are doing heavy
>> calculations with data that fits in the card, it doesn't really matter.
>> If your data set is larger than the card can hold and you have to move
>> data to the card all the time, the bandwidth could be an issue.  That's
>> also when the extra memory on a tesla might start to make a difference.
>>
>> Really serious boards do have enough lanes for 16x on each slot.
>>
>> The Tyan board I listed has the intel 5520 chipset, which has 36 lanes,
>> but since it has two 5520's, each with 36 lanes, it actually has enough
>> to run all four slots at 16x all the time.  So that is probably going
>> to be as fast as you can currently get.  The supermicro dual opteron
>> board in the machine you mentioned also has dual chipset, which also
>> gives it full 16x on all four slots.  The Asus P6T7 WS board uses the
>> NF200 chips instead.  It only has 36 lanes (and of course only one
>> CPU socket).
>
> Of course for really crazy (and expensive) there is this:
> http://www.colfax-intl.com/ms_tesla.asp?M=102
>
> Dual xeon and up to 8 tesla cards (it uses PCIe switches to share 16
> lanes between pairs of cards).
>
> Price is rather high when filled with tesla cards and ram and such.
> Probably fast though.
>
> --
> Len Sorensen
>
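
The lane arithmetic in the quoted message can be sketched in a few
lines of Python. This is only a rough illustration under stated
assumptions, not a measurement: it assumes PCIe 2.0 at about 500 MB/s
per lane per direction, treats every chipset lane as available to the
GPU slots (real boards reserve some for other devices), and models a
shared switch as an even split of a 16-lane uplink. The function names
are hypothetical.

```python
# Assumption: PCIe 2.0, roughly 500 MB/s per lane per direction
# (5 GT/s with 8b/10b encoding). Real throughput is lower.
PCIE2_MB_PER_LANE = 500


def slot_bandwidth_mb(lanes):
    """Theoretical one-way bandwidth of a slot with the given lane count."""
    return lanes * PCIE2_MB_PER_LANE


def lanes_fit(chipset_lanes, chipsets, slots, lanes_per_slot=16):
    """True if the chipsets together offer enough lanes for all slots.

    Simplification: assumes every chipset lane can be routed to a slot.
    """
    return chipset_lanes * chipsets >= slots * lanes_per_slot


def shared_link_per_card_mb(cards_active, same_data, uplink_lanes=16):
    """Per-card bandwidth behind a switch that shares one uplink.

    One active card, or identical data broadcast to all cards, gets the
    full uplink; different data to several cards splits it evenly.
    """
    full = slot_bandwidth_mb(uplink_lanes)
    if cards_active <= 1 or same_data:
        return full
    return full / cards_active


if __name__ == "__main__":
    print("x16 slot:", slot_bandwidth_mb(16), "MB/s")
    print("x8 slot: ", slot_bandwidth_mb(8), "MB/s")
    print("two chipsets, 36 lanes each, four x16 slots:",
          lanes_fit(36, 2, 4))
    print("one chipset, 36 lanes, four x16 slots:",
          lanes_fit(36, 1, 4))
    print("two cards, different data:",
          shared_link_per_card_mb(2, same_data=False), "MB/s each")
    print("two cards, same data:",
          shared_link_per_card_mb(2, same_data=True), "MB/s each")
```

With two 36-lane chipsets the four x16 slots need 64 of 72 lanes, which
is why such a board can run all slots at 16x, while a single 36-lane
chipset has to rely on a switch or drop to 8x per slot.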

