Fwd: Failure to activate node zero in shared memory machine
I forgot to add that I tried both the AMBER and the CHARMM (all27)
force fields, in both cases with systems and scripts that had
previously worked.
Also, I am using the precompiled NAMD binary (self-contained
parallelization), not a message-passing library from Debian.
francesco
---------- Forwarded message ----------
From: Francesco Pietra <chiendarret@gmail.com>
Date: Fri, Mar 9, 2012 at 7:14 PM
Subject: Failure to activate node zero in shared memory machine
To: amd64 Debian <debian-amd64@lists.debian.org>
Hello:
I was running NAMD-CUDA 2.8 4JUN2011nb (a molecular dynamics
simulation code) successfully with nvidia 280.13-1. I am now back
to NAMD after a few months, on the same machine, now with nvidia
295.20-1 (the version that matches the Debian amd64 xserver and all
libraries). First I activate CUDA:
# nvidia-smi -L
# nvidia-smi -pm 1
Then, on launching NAMD, node zero fails to connect:
Charmrun> charmrun started...
Charmrun> node programs all started
Charmrun> error 0 attaching to node:
Timeout waiting for node-program to connect
Charmrun> adding client 0: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 1: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 2: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 3: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 4: "127.0.0.1", IP:127.0.0.1
Charmrun> adding client 5: "127.0.0.1", IP:127.0.0.1
Charmrun> Charmrun = 127.0.0.1, port = 41824
Charmrun> start 0 node program on localhost.
Charmrun> start 1 node program on localhost.
Charmrun> start 2 node program on localhost.
Charmrun> start 3 node program on localhost.
Charmrun> start 4 node program on localhost.
Charmrun> start 5 node program on localhost.
Charmrun> Waiting for 0-th client to connect.
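For anyone triaging a similar log, here is a minimal Python sketch that scans Charmrun output for the attach error shown above. The function name find_attach_failures and the regular expression are my own, hypothetical and based only on the message format in this log; they are not part of NAMD or Charm++. I am also assuming that the number printed after "error" identifies the node.

```python
import re

# Matches lines such as "Charmrun> error 0 attaching to node:" as they
# appear in the log above; other Charmrun versions may word this differently.
ATTACH_ERROR = re.compile(r"^Charmrun> error (\d+) attaching to node:")


def find_attach_failures(log_text):
    """Return a list of (number, reason) pairs, one per attach error found.

    The number is whatever follows "error" (assumed to be the node index).
    """
    failures = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = ATTACH_ERROR.match(line.strip())
        if match:
            # In the log above the reason is printed on the following line.
            reason = lines[i + 1].strip() if i + 1 < len(lines) else ""
            failures.append((int(match.group(1)), reason))
    return failures
```

Fed the output above, this should report one failure whose reason is the "Timeout waiting for node-program to connect" line. It only locates the message; it says nothing about the cause.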
Hardware
Gigabyte Technology Co., Ltd. GA-890FXA-UD5/GA-890FXA-UD5, BIOS F6 11/24/2010
AMD Phenom(tm) II X6 1075T Processor (6 cpu cores) (version 2.20.00)
16GB RAM
Two GTX-580
Scanning NUMA topology in Northbridge 24
[ 0.000000] No NUMA configuration found
(Should NUMA be activated? It was not when I ran in parallel in the past.)
All nvidia tests were OK:
francesco@gig64:~/1PLC$ dpkg -l | grep nvidia
ii  glx-alternative-nvidia      0.2.1       allows the selection of NVIDIA as GLX provider
ii  libgl1-nvidia-alternatives  295.20-1    transition libGL.so* diversions to glx-alternative-nvidia
ii  libgl1-nvidia-glx           295.20-1    NVIDIA binary OpenGL libraries
ii  libglx-nvidia-alternatives  295.20-1    transition libgl.so diversions to glx-alternative-nvidia
ii  libnvidia-compiler-ia32     295.20-1    NVIDIA runtime compiler library (32-bit)
ii  libnvidia-ml1               295.20-1    NVIDIA management library (NVML) runtime library
ii  nvidia-alternative          295.20-1    allows the selection of NVIDIA as GLX provider
ii  nvidia-compute-profiler     4.0.17-3    NVIDIA Compute Visual Profiler
ii  nvidia-cuda-dev             4.0.17-3    NVIDIA CUDA development files
ii  nvidia-cuda-doc             4.1.28-1    NVIDIA CUDA and OpenCL documentation
ii  nvidia-cuda-gdb             4.1.28-1    NVIDIA CUDA GDB
ii  nvidia-cuda-toolkit         4.0.17-3    NVIDIA CUDA toolkit
ii  nvidia-glx                  295.20-1    NVIDIA metapackage
ii  nvidia-installer-cleanup    20111111+3  Cleanup after driver installation with the nvidia-installer
ii  nvidia-kernel-common        20111111+3  NVIDIA binary kernel module support files
ii  nvidia-kernel-dkms          295.20-1    NVIDIA binary kernel module DKMS source
ii  nvidia-libopencl1           295.20-1    NVIDIA OpenCL library
ii  nvidia-libopencl1-ia32      295.20-1    NVIDIA OpenCL 32-bit library
ii  nvidia-opencl-common        295.20-1    NVIDIA OpenCL driver
ii  nvidia-opencl-dev           4.0.17-3    NVIDIA OpenCL development files
ii  nvidia-opencl-icd-ia32      295.20-1    NVIDIA OpenCL ICD (32-bit)
ii  nvidia-smi                  295.20-1    NVIDIA System Management Interface
ii  nvidia-support              20111111+3  NVIDIA binary graphics driver support files
ii  nvidia-vdpau-driver         295.20-1    NVIDIA vdpau driver
ii  nvidia-xconfig              295.20-1    X configuration tool for non-free NVIDIA drivers
ii  xserver-xorg-video-nvidia   295.20-1    NVIDIA binary Xorg driver
francesco@gig64:~/1PLC$
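To make it easier to see which packages in the listing above share a version, here is a small Python sketch that groups the "ii" lines by version string. The helper name packages_by_version is hypothetical, and the parsing assumes the status, name and version are the first three whitespace-separated fields, as in the dpkg -l output pasted here.

```python
from collections import defaultdict


def packages_by_version(dpkg_text):
    """Map each version string to the list of installed packages carrying it."""
    groups = defaultdict(list)
    for line in dpkg_text.splitlines():
        fields = line.split()
        # Installed packages start with the status "ii", followed by
        # the package name and the version; other lines are skipped.
        if len(fields) >= 3 and fields[0] == "ii":
            groups[fields[2]].append(fields[1])
    return dict(groups)
```

On the listing above this separates the 295.20-1 driver packages from the CUDA packages, which are split between 4.0.17-3 (toolkit, dev) and 4.1.28-1 (doc, gdb). Whether that split has anything to do with the failure is not established.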
root@gig64:/home/francesco/1PLC# modinfo nvidia
filename: /lib/modules/2.6.38-2-amd64/updates/dkms/nvidia.ko
alias: char-major-195-*
version: 295.20
supported: external
license: NVIDIA
alias: pci:v000010DEd00000E00sv*sd*bc04sc80i00*
alias: pci:v000010DEd00000AA3sv*sd*bc0Bsc40i00*
alias: pci:v000010DEd*sv*sd*bc03sc02i00*
alias: pci:v000010DEd*sv*sd*bc03sc00i00*
depends: i2c-core
vermagic: 2.6.38-2-amd64 SMP mod_unload modversions
parm: NVreg_EnableVia4x:int
parm: NVreg_EnableALiAGP:int
parm: NVreg_ReqAGPRate:int
parm: NVreg_EnableAGPSBA:int
parm: NVreg_EnableAGPFW:int
parm: NVreg_Mobile:int
parm: NVreg_ResmanDebugLevel:int
parm: NVreg_RmLogonRC:int
parm: NVreg_ModifyDeviceFiles:int
parm: NVreg_DeviceFileUID:int
parm: NVreg_DeviceFileGID:int
parm: NVreg_DeviceFileMode:int
parm: NVreg_RemapLimit:int
parm: NVreg_UpdateMemoryTypes:int
parm: NVreg_InitializeSystemMemoryAllocations:int
parm: NVreg_UseVBios:int
parm: NVreg_RMEdgeIntrCheck:int
parm: NVreg_UsePageAttributeTable:int
parm: NVreg_EnableMSI:int
parm: NVreg_MapRegistersEarly:int
parm: NVreg_RegisterForACPIEvents:int
parm: NVreg_RegistryDwords:charp
parm: NVreg_RmMsg:charp
parm: NVreg_NvAGP:int
root@gig64:/home/francesco/1PLC#
Thanks a lot for any advice.
francesco pietra