
Re: upgrade to jessie from wheezy with cuda problems



# apt-get --purge remove *legacy*
did the job.
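
A note on the command above: an unquoted *legacy* is expanded by the shell first if any file in the current directory happens to match, so it can be safer to preview the package names before purging. A minimal sketch, assuming the usual `dpkg -l` column layout (status, then name); the helper name list_legacy_pkgs is hypothetical:

```shell
# Hypothetical helper: read `dpkg -l` output on stdin and print the names
# of packages that are installed ("ii") or removed with config files left
# behind ("rc") and whose name contains "legacy".
list_legacy_pkgs() {
    awk '$1 ~ /^(ii|rc)$/ && $2 ~ /legacy/ { print $2 }'
}

# Preview only, nothing is removed:
#   dpkg -l | list_legacy_pkgs
# After checking the list, purge exactly those packages:
#   apt-get --purge remove $(dpkg -l | list_legacy_pkgs)
```

Feeding the names explicitly to apt-get avoids relying on pattern matching at all.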

I wonder how these legacy packages entered the scene while updating/upgrading from a clean wheezy.

The bad news is that with the new driver 319.60 there was no acceleration of molecular dynamics for a job of modest size (150K atoms) and only a slight acceleration (0.12 s/step vs 0.14 s/step) for a heavy job (500K atoms). Either moving from PCIe 2.0 (with the 304.xx driver of wheezy) to PCIe 3.0 (with driver 319.60 of jessie), which should increase the bandwidth between GPUs and RAM from 5 to 8GB/s, does not have the effect on the calculations that I hoped for, or PCIe is still 2.0 with jessie.

Now, with cuda 5.0, it should be easy to measure the bandwidth directly. I have to learn how, and I'll report on it in due course.
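
Independently of a throughput measurement (the CUDA samples include a bandwidthTest program for that), the negotiated PCIe link speed can be read from the LnkSta line of `lspci -vv`, which reports per-lane rates: 2.5GT/s for PCIe 1.x, 5GT/s for 2.0, 8GT/s for 3.0. A minimal sketch; the helper name pcie_gen_from_lnksta is hypothetical, and the parsing assumes the "LnkSta: Speed NGT/s" wording that lspci prints:

```shell
# Hypothetical helper: read `lspci -vv` output on stdin and print, for each
# LnkSta line found, the negotiated per-lane speed and the PCIe generation
# that speed corresponds to.
pcie_gen_from_lnksta() {
    sed -n 's/.*LnkSta:[[:space:]]*Speed \([0-9.]*\)GT\/s.*/\1/p' |
    while read -r speed; do
        case "$speed" in
            2.5) echo "$speed GT/s -> PCIe 1.x" ;;
            5)   echo "$speed GT/s -> PCIe 2.0" ;;
            8)   echo "$speed GT/s -> PCIe 3.0" ;;
            *)   echo "$speed GT/s -> unknown" ;;
        esac
    done
}

# Usage (as root, so that lspci can read the link capabilities;
# 10de is the NVIDIA PCI vendor ID):
#   lspci -vv -d 10de: | pcie_gen_from_lnksta
```

The link may be clocked down while the GPU is idle, so it is probably worth running the check while a job is active.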


Now
nvidia-smi activates the GPUs for normal work,
nvidia-smi -L lists the GPUs,
dpkg -l |grep nvidia shows everything at 319.60 or 5.0.35-8,
the X server can be started and gnome loaded (startx, gnome-session),
nvcc --version gives 5.0. However,


# modinfo nvidia
ERROR: module nvidia not found

In analogy with wheezy 3.2.0-4, I expected /lib/modules/3.10-3-amd64/updates/dkms/nvidia.ko

Instead, there is

/lib/modules/3.10-3-amd64/nvidia/nvidia-current.ko

Is that a feature of jessie, or is something wrong?
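
To see which module files are actually present, whatever their names, the kernel's module tree can be searched directly. A minimal sketch; the helper name find_nvidia_modules is hypothetical, and it defaults to the tree of the running kernel:

```shell
# Hypothetical helper: list every nvidia*.ko file below a modules directory.
# With no argument it searches the tree of the currently running kernel.
find_nvidia_modules() {
    dir=${1:-/lib/modules/$(uname -r)}
    find "$dir" -type f -name 'nvidia*.ko' | sort
}

# Usage:
#   find_nvidia_modules
# modinfo also accepts a file path, so the module found can be inspected:
#   modinfo /lib/modules/3.10-3-amd64/nvidia/nvidia-current.ko
# and lsmod shows the name under which it is loaded:
#   lsmod | grep nvidia
```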



Thanks a lot for advice.

francesco pietra.


On Tue, Nov 12, 2013 at 5:59 PM, Lennart Sorensen <lsorense@csclub.uwaterloo.ca> wrote:
On Tue, Nov 12, 2013 at 05:22:18PM +0100, Francesco Pietra wrote:
> Yes. Also,
>
> # apt-get remove nvidia-kernel-dkms
>
> # apt-get install nvidia-kernel-dkms
>
> (which, in 2011, served to clear the driver at
> /lib/modules/2.6.38-2-amd64/updates/dkms; now the kernel is 3.2) left
> the issue unaltered.
>
> # modinfo nvidia
>    ERROR: module nvidia not found
>
> $ dpkg -l |grep nvidia |less
>
> shows
>
> libgl1-nvidia-glx:amd64 319.60
>
> and
>
> libgl1-nvidia-legacy-304xx-glx:amd64 304.108-4
>
> NVIDIA metapackage rc nvidia-glx 304.88-1-deb7u1
>
> nvidia-legacy-304xx-driver 304.108-4
>
>
> nvidia-legacy-304xx-kernel-dkms  304.108-4
>
> nvidia-settings-legacy-304xx  304.108-2
>
> xserver-xorg-video-nvidia-legacy-304xx    304.108-4
>
>
> Everything else 319.60-1 and cuda 5.0
>
> I don't understand why these 304xx are threatening.
>
> I had also run
> # nvidia-xconfig

I think you should remove all packages with legacy-304xx in the name,
and install the current ones (nvidia-kernel-dkms, nvidia-glx, etc).

legacy-304xx will never move beyond version 304.xx after all as the
name implies.

--
Len Sorensen

