
Fwd: upgrade to jessie from wheezy with cuda problems



This addendum is to let you know that simply adding

    options nvidia NVreg_EnablePCIeGen3=1

to /etc/modprobe.d/nvidia.conf

as suggested in

https://devtalk.nvidia.com/default/topic/545186/enabling-pcie-3-0-with-nvreg_enablepciegen3-on-titan/

had no effect. Also, please note that what should be added to the kernel boot string, according to the same source, is

    nvidia.NVreg_EnablePCIeGen3=1

unlike what I wrote before (i.e., no "options", and a dot between nvidia and NVreg).
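A minimal sketch for checking whether the option actually reached the driver, whichever of the two ways it was set. It assumes that the loaded nvidia module lists its registry keys as "Name: value" lines in /proc/driver/nvidia/params; the function names are made up for illustration.

```shell
# Print the EnablePCIeGen3 value reported by the loaded nvidia module.
# Assumption: registry keys appear as "Name: value" lines in
# /proc/driver/nvidia/params; another file can be passed to try the parsing.
gen3_setting() {
    params_file="${1:-/proc/driver/nvidia/params}"
    awk -F': *' '$1 == "EnablePCIeGen3" { print $2 }' "$params_file"
}

# Print the parameter as it appears on the kernel command line, if at all.
gen3_on_cmdline() {
    cmdline_file="${1:-/proc/cmdline}"
    grep -o 'nvidia\.NVreg_EnablePCIeGen3=[0-9]*' "$cmdline_file"
}
```

Called without arguments after a reboot, gen3_setting should print 1 if the driver picked the option up, and gen3_on_cmdline prints nothing when the boot string does not carry the parameter.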

francesco pietra

---------- Forwarded message ----------
From: Francesco Pietra <chiendarret@gmail.com>
Date: Sun, Nov 17, 2013 at 11:42 AM
Subject: Fwd: upgrade to jessie from wheezy with cuda problems
To: amd64 Debian <debian-amd64@lists.debian.org>, Lennart Sorensen <lsorense@csclub.uwaterloo.ca>


Very sorry, please disregard my previous post. There, I had started the MD run from the gnome terminal, without activating the GPUs.

When running MD in the usual way from the Linux prompt, without the X server, and activating the cards with

#nvidia-smi -L
#nvidia-smi -pm 1

as in all previous tests, both LnkCap and LnkSta are 5GT/s, as for PCIe 2.0.

Thus, the problem seems to be activating PCIe 3.0, as I said before.
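To compare the two values without reading the whole lspci dump, something like the following sketch could be used. The function name link_speeds is hypothetical, and the pattern assumes the "LnkCap: ... Speed N GT/s" layout that lspci printed in the output quoted in this thread.

```shell
# Read `lspci -vv` text on stdin and print one line each for the advertised
# link speed (LnkCap) and the currently negotiated one (LnkSta).
# LnkCtl2 and LnkSta2 lines are deliberately not matched.
link_speeds() {
    sed -n -E 's/^[[:space:]]*(LnkCap|LnkSta):.*Speed ([0-9.]+GT\/s).*/\1 \2/p'
}
```

Usage, as root so that the capability block is shown, for the card at 02:00.0: lspci -vv -s 02:00.0 | link_speeds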

francesco pietra


---------- Forwarded message ----------
From: Francesco Pietra <chiendarret@gmail.com>
Date: Sun, Nov 17, 2013 at 11:29 AM
Subject: Fwd: upgrade to jessie from wheezy with cuda problems
To: amd64 Debian <debian-amd64@lists.debian.org>, Lennart Sorensen <lsorense@csclub.uwaterloo.ca>


I have to correct my previous report. When running molecular dynamics, the link capability of the GPU is 5GT/s, but the actual link speed is 2.5GT/s, i.e., below PCIe 2.0.

#lspci -vvvv
02:00.0 VGA compatible controller: NVIDIA Corporation GK104 [GeForce GTX 680] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: NVIDIA Corporation Device 0969
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0
    Interrupt: pin A routed to IRQ 16
    Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: Memory at c0000000 (64-bit, prefetchable) [size=128M]
    Region 3: Memory at c8000000 (64-bit, prefetchable) [size=32M]
    Region 5: I/O ports at e000 [size=128]
    [virtual] Expansion ROM at fb000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000  Data: 0000
    Capabilities: [78] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl:    Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta:    CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Latency L0 <512ns, L1 <4us
            ClockPM+ Surprise- LLActRep- BwNot-
        LnkCtl:    ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM+ AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
             EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
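
For reading the Speed fields above: 2.5GT/s per lane is the PCIe 1.x rate, 5GT/s is PCIe 2.0 and 8GT/s is PCIe 3.0. A small illustrative helper (the name pcie_gen is made up) that does the mapping:

```shell
# Map a per-lane signalling rate, as printed by lspci, to its PCIe generation.
pcie_gen() {
    case "$1" in
        2.5GT/s) echo "PCIe 1.x" ;;
        5GT/s)   echo "PCIe 2.0" ;;
        8GT/s)   echo "PCIe 3.0" ;;
        *)       echo "unknown" ;;
    esac
}
```

With the values above, LnkCap (5GT/s) maps to PCIe 2.0 and LnkSta (2.5GT/s) to PCIe 1.x; a link running at PCIe 3.0 would show 8GT/s.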

francesco pietra

---------- Forwarded message ----------
From: Francesco Pietra <chiendarret@gmail.com>
Date: Tue, Nov 12, 2013 at 3:54 PM
Subject: upgrade to jessie from wheezy with cuda problems
To: amd64 Debian <debian-amd64@lists.debian.org>


Hello:
I decided to try jessie to get PCIe 3.0 with a recent nvidia driver, thus upgrading from wheezy.

wheezy was
uname -r
3.2.0-4-amd64

nvidia-smi
304.88

nvcc --version
4.2

(the latter is also the version with which the molecular dynamics code was compiled; the code is run without starting the X server)
********************

Following aptitude update

aptitude-upgrade

a number of dependencies related to gnome were not met (evolution-common libfolks25 gnome-panel gnome-shell gnome-theme-extras gnome-theme-standard libreoffice-evolution). This notwithstanding, I decided to upgrade.

After rebooting to get linux matching with nvidia:

nvcc --version
  5.0

uname -r
  3.10-3-amd64

nvidia-smi
  the nvidia kernel module has version 304.108 but the nvidia driver component has version 319.60.


Driver 319.60 is just what I wanted. Now, how best to fix the problems? Install linux image 3.2?
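
To see which side of the mismatch is stale, the version of the loaded kernel module can be read from /proc. This is only a sketch: the function name is hypothetical, and it assumes the first line of /proc/driver/nvidia/version has the form "NVRM version: ... Kernel Module  <version>  <build date>".

```shell
# Extract the kernel-module version from text laid out like
# /proc/driver/nvidia/version (layout assumed, see above).
nvrm_module_version() {
    version_file="${1:-/proc/driver/nvidia/version}"
    sed -n -E 's/^NVRM version:.*Kernel Module +([0-9.]+).*/\1/p' "$version_file"
}
```

The result can be compared with "modinfo -F version nvidia", which reports the module file installed for the running kernel, and with "dpkg -l | grep -i nvidia", which lists the versions of the installed packages.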

In the past I tried dist-upgrade and ran into devastating problems.


thanks
francesco pietra

PS I was advised that debian is getting bounces from my address above. If so, please try my institutional address <francesco.pietra@accademialucchese.it>



