
"cuda error cudastreamcreate"



Hello:
On a gaming machine with the following configuration:
Gigabyte GA 890FXAUD5
Six-core AMD PhenomII 1075T
2x GTX 470
Debian GNU-Linux amd64 wheezy


I have been running NAMD (molecular dynamics simulations) successfully.
Now I am having problems getting the GTX 470s to work, and I can't tell
whether it is a hardware or a software problem and, if software, whether
the OS is involved. I am submitting the same problem to NAMD, as it
might be NAMD-specific.

When the code works, the top of the log file says:

Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
Info: 1 NAMD  CVS-2011-06-04  Linux-x86_64-CUDA  6    gig64  francesco
Info: Running on 6 processors, 6 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.00650811 s
Pe 5 sharing CUDA device 1 first 1 next 1
Pe 2 sharing CUDA device 0 first 0 next 4
Did not find +devices i,j,k,... argument, using all
Pe 5 physical rank 5 binding to CUDA device 1 on gig64: 'GeForce GTX
470'  Mem: 1279MB  Rev: 2.0
Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'GeForce GTX
470'  Mem: 1279MB  Rev: 2.0
Pe 0 sharing CUDA device 0 first 0 next 2
Pe 3 sharing CUDA device 1 first 1 next 5
Pe 1 sharing CUDA device 1 first 1 next 3
Pe 1 physical rank 1 binding to CUDA device 1 on gig64: 'GeForce GTX
470'  Mem: 1279MB  Rev: 2.0
Pe 0 physical rank 0 binding to CUDA device 0 on gig64: 'GeForce GTX
470'  Mem: 1279MB  Rev: 2.0
Pe 3 physical rank 3 binding to CUDA device 1 on gig64: 'GeForce GTX
470'  Mem: 1279MB  Rev: 2.0
Pe 4 sharing CUDA device 0 first 0 next 0
Pe 4 physical rank 4 binding to CUDA device 0 on gig64: 'GeForce GTX
470'  Mem: 1279MB  Rev: 2.0
Info: 1.64104 MB of memory in use based on CmiMemoryUsage
Info: Configuration file is min-02.conf

When it fails:

Info: Based on Charm++/Converse 60303 for net-linux-x86_64-iccstatic
Info: Built Sat Jun 4 02:22:51 CDT 2011 by jim on lisboa.ks.uiuc.edu
Info: 1 NAMD  CVS-2011-06-04  Linux-x86_64-CUDA  6    gig64  francesco
Info: Running on 6 processors, 6 nodes, 1 physical nodes.
Info: CPU topology information available.
Info: Charm++/Converse parallel runtime startup completed at 0.0124412 s
Pe 5 sharing CUDA device 0 first 0 next 0
Pe 5 physical rank 5 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)'  Mem: 0MB  Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 5 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 5 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 5 (gig64 device
0): no CUDA-capable device is available

Did not find +devices i,j,k,... argument, using all
Pe 0 sharing CUDA device 0 first 0 next 1
Pe 0 physical rank 0 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)'  Mem: 0MB  Rev: 9999.9999
Pe 3 sharing CUDA device 0 first 0 next 4
Pe 3 physical rank 3 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)'  Mem: 0MB  Rev: 9999.9999
Pe 1 sharing CUDA device 0 first 0 next 2
Pe 1 physical rank 1 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)'  Mem: 0MB  Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 0 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 0 (gig64 device
0): no CUDA-capable device is available

FATAL ERROR: CUDA error cudaStreamCreate on Pe 3 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 3 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 3 (gig64 device
0): no CUDA-capable device is available

FATAL ERROR: CUDA error cudaStreamCreate on Pe 1 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 1 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 1 (gig64 device
0): no CUDA-capable device is available

Pe 2 sharing CUDA device 0 first 0 next 3
Pe 2 physical rank 2 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)'  Mem: 0MB  Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 2 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 2 (gig64 device
0): no CUDA-capable device is available

Pe 4 sharing CUDA device 0 first 0 next 5
Pe 4 physical rank 4 binding to CUDA device 0 on gig64: 'Device
Emulation (CPU)'  Mem: 0MB  Rev: 9999.9999
FATAL ERROR: CUDA error cudaStreamCreate on Pe 4 (gig64 device 0): no
CUDA-capable device is available
------------- Processor 4 Exiting: Called CmiAbort ------------
Reason: FATAL ERROR: CUDA error cudaStreamCreate on Pe 4 (gig64 device
0): no CUDA-capable device is available

[0] Stack Traceback:

--------------------------------
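The difference between the two logs is the device each Pe binds to: 'GeForce GTX 470' in the good run, 'Device Emulation (CPU)' in the failing one. A minimal Python sketch for pulling that out of a saved log is below. The function names are made up for illustration, and the pattern only assumes the "binding to CUDA device" line format shown above; it has not been checked against other NAMD versions.

```python
import re

# Matches NAMD startup lines such as:
#   Pe 5 physical rank 5 binding to CUDA device 1 on gig64: 'GeForce GTX 470'
BINDING = re.compile(
    r"Pe (\d+) physical rank \d+ binding to CUDA device (\d+) on \S+: '([^']*)'"
)


def cuda_bindings(log_text):
    """Return {pe: (device_index, device_name)} from a NAMD startup log."""
    # Mail clients wrap long lines, so collapse all whitespace runs first.
    flat = " ".join(log_text.split())
    return {int(pe): (int(dev), name) for pe, dev, name in BINDING.findall(flat)}


def emulated_pes(log_text):
    """List the Pe numbers that bound to the CPU emulation device."""
    return sorted(
        pe
        for pe, (_, name) in cuda_bindings(log_text).items()
        if name == "Device Emulation (CPU)"
    )
```

An empty list from emulated_pes() would mean every Pe found a real GPU; in the failing log above it would list all six.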

In both cases:

/var/lib/dkms/nvidia/270.41.19/2.6.38-2-amd64/x86_64/module/nvidia.ko

/lib/module/2.6.38-2-amd64/update/dkms/nvidia.ko

are in order.

I tried:

nvidia-smi -r (or nvidia-smi -a)
NVIDIA: could not open the device file /dev/nvidia1 (no such file)
Failed to initialize NVML: unknown error.

I am unsure whether these commands are for Tesla cards only.
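Since nvidia-smi complains about a missing /dev/nvidia1, a quick check of which device nodes exist may help narrow things down. The sketch below is in Python and assumes the usual NVIDIA Linux driver layout: one /dev/nvidiaN node per GPU plus a /dev/nvidiactl control node. The function names are hypothetical, and the expected count of 2 is simply the number of GTX 470 cards in this machine.

```python
import glob
import os
import re


def nvidia_device_nodes(dev_dir="/dev"):
    """Return the sorted GPU indices N for which <dev_dir>/nvidiaN exists."""
    indices = []
    for path in glob.glob(os.path.join(dev_dir, "nvidia*")):
        m = re.fullmatch(r"nvidia(\d+)", os.path.basename(path))
        if m:  # skips nvidiactl and any other non-numbered node
            indices.append(int(m.group(1)))
    return sorted(indices)


def missing_nodes(expected_gpus, dev_dir="/dev"):
    """Return the indices in 0..expected_gpus-1 that have no device node."""
    present = set(nvidia_device_nodes(dev_dir))
    return [i for i in range(expected_gpus) if i not in present]


if __name__ == "__main__":
    # 2 = number of GTX 470 cards installed (assumption for this machine)
    print("present:", nvidia_device_nodes())
    print("missing:", missing_nodes(2))
```

If the nodes are missing in the failing case but present in the working one, that would point to device-file creation rather than to NAMD or the hardware, although this check alone does not prove it.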

Thanks for any advice.

francesco pietra

