
Re: Fwd: upgrade to jessie from wheezy with cuda problems



I am attacking the problem from another angle, directly from within the OS itself:

# lspci -vvvv

shows that the link status ("LnkSta") reports a speed of 5 GT/s, no matter whether the system is number crunching or not. That is, my system is running at PCIe 2.0. This might explain why upgrading from Sandy Bridge to Ivy Bridge gave no speed gain in molecular dynamics: PCIe 3.0 was never achieved.

As far as I could investigate, nvidia suggests to either:
(1) modify /etc/modprobe.d/local.conf (which does not exist on jessie), or create a new

/etc/modprobe.d/nvidia.conf, adding to it the line

options nvidia NVreg_EnablePCIeGen3=1

Actually, on my jessie, nvidia.conf reads

alias nvidia nvidia-current
remove nvidia-current rmmod nvidia
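For what it's worth, here is a sketch of what the suggested edit would leave in place, assuming jessie's stock nvidia.conf above; I have not tested this myself:

```
# /etc/modprobe.d/nvidia.conf
alias nvidia nvidia-current
remove nvidia-current rmmod nvidia
options nvidia NVreg_EnablePCIeGen3=1
```

After editing, running update-initramfs -u (as root) rebuilds the initramfs so the option is present at boot.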


Some found that useless even when both grub-efi and the initramfs were edited accordingly, so nvidia offered a different move: updating the kernel boot string by appending the same module option in its kernel command-line form:

nvidia.NVreg_EnablePCIeGen3=1
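On jessie with grub, the boot-string variant would presumably go into /etc/default/grub (module options take the form module.option=value on the kernel command line), followed by update-grub; a sketch, untested on my side:

```
# /etc/default/grub  (run update-grub afterwards as root)
GRUB_CMDLINE_LINUX_DEFAULT="quiet nvidia.NVreg_EnablePCIeGen3=1"
```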
***************************

I did nothing yet, as I hope that the best adaptation to jessie may be suggested by those who know the OS better than I do.
The kind of information lspci reports about links includes:

LnkSta: the actual negotiated link speed and width

LnkCap: the maximum speed and width the specific port supports, as far as I can understand

LnkCtl: the link control settings (ASPM power management, link retraining, and the like)


One could also run

# lspci -vt

to determine the bus where the GPU card is located, then run

# lspci -vv -s ##

where "##" is the bus location.
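To illustrate pulling the negotiated speed out of such output, here is a small shell snippet run against a sample LnkSta line; the line itself is example data, while on a real system it would come from lspci -vv -s on your bus location:

```shell
# A sample LnkSta line as lspci -vv prints it (example data, not from a live system)
line='LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-'

# Extract the negotiated speed field with sed
speed=$(printf '%s\n' "$line" | sed -n 's/.*Speed \([0-9.]*GT\/s\).*/\1/p')
echo "$speed"
```

On a PCIe 2.0 link this prints 5GT/s; seeing 8GT/s there would confirm Gen3 operation.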
******************************

So, it is a tricky matter, but perhaps not so much once one knows where to put one's hands. At any rate, being unable to reach the 8 GT/s of PCIe 3.0 means losing time and energy (= money and pollution), at least when the GPUs are used for long number crunching.
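To put rough numbers on that loss: a back-of-envelope per-direction bandwidth estimate for a x16 link, using the raw rates (5 GT/s vs 8 GT/s) and the line encodings (8b/10b for Gen2, 128b/130b for Gen3); these are theoretical ceilings, not measurements:

```shell
# GB/s per direction = GT/s * encoding efficiency * lanes / 8 bits per byte
awk 'BEGIN {
  printf "PCIe 2.0 x16: %.2f GB/s\n", 5.0 * 8/10    * 16 / 8
  printf "PCIe 3.0 x16: %.2f GB/s\n", 8.0 * 128/130 * 16 / 8
}'
```

So staying at Gen2 roughly halves the host-to-GPU transfer ceiling, which matters for transfer-heavy number crunching.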

I'll continue investigating. The above seems promising. I hope to get help.

francesco pietra

PS
With my jessie, /etc/modprobe.d includes the following files:
alsa-base.conf
alsa-case-blacklist.conf
dkms.conf (which has no active statements)
fbdev-blacklist.conf
i915-kms.conf
nvidia.conf
nvidia-blacklist-nouveau.conf
radeon-kms.conf
**************************


On Thu, Nov 14, 2013 at 3:33 AM, Lennart Sorensen <lsorense@csclub.uwaterloo.ca> wrote:
On Wed, Nov 13, 2013 at 05:43:47PM -0500, Lennart Sorensen wrote:
> On Wed, Nov 13, 2013 at 10:53:30PM +0100, Francesco Pietra wrote:
> > francesco@gig64:~/tmp$ file ./CUDA-Z-0.7.189.run
> > ./CUDA-Z-0.7.189.run: data
> > francesco@gig64:~/tmp$
>
> OK that's weird.  I expected to see x86 32 or 64bit binary.
>
> Seems to be a shell script with compressed code in it.  Yuck. :)

OK I got it running.  It is a 32bit binary.

I had to install these:

ii  libcuda1:i386                         331.20-1                        i386         NVIDIA CUDA runtime library
ii  libcudart5.0:i386                     5.0.35-8                        i386         NVIDIA CUDA runtime library
ii  libgl1-nvidia-glx:i386                331.20-1                        i386         NVIDIA binary OpenGL libraries
ii  libstdc++6:i386                       4.8.2-4                         i386         GNU Standard C++ Library v3
ii  libxrender1:i386                      1:0.9.8-1                       i386         X Rendering Extension client library
ii  zlib1g:i386                           1:1.2.8.dfsg-1                  i386         compression library - runtime

Then I was able to run it.  No messing with LD_LIBRARY_PATH or anything.

To install :i386 packages you first have to enable multiarch support
with dpkg and run apt-get update.  So something like:

dpkg --add-architecture i386
apt-get update
apt-get install libcuda1:i386 libcudart5.0:i386 libgl1-nvidia-glx:i386 libstdc++6:i386 libxrender1:i386 zlib1g:i386

Don't worry about the exact versions, since I am running
unstable+experimental.  You don't need to do that to get it working.

For your 64bit code you probably need libcuda1 libcudart5.0 and such
installed in the 64bit version.

--
Len Sorensen

