
Re: Very weird behavior on new nodes



This is a standalone box (now), so everything was compiled there. The
only group of routines not compiled on the machine is ATLAS. I'll try
using the native routines.

Art
On Tue, Jan 07, 2003 at 04:38:32PM -0600, Ron Johnson wrote:
> On Tue, 2003-01-07 at 11:34, user list wrote:
> > Under the 2.2 kernel that I'm now running, I did just that. That is,
> > I started the job locally and then ran top. I saw no strange
> > indications. The job was running and then exited from the top 
> > list. I then received 
> > 
> > [1]  + Exit 139                      test_bsh.sh o123_vac_lda1
> > 
> > I'll try running the job again to see if memory was increasing, but it
> > seemed not to.
> 
> What about on the 2.4 box?
> 
> Did/do these fortran programs run on other boxen?  What if you compile
> the source on the target machine?  There may be subtle library
> conflicts that a local compilation would cure.
> 
> > Art Edwards
> > 
> > On Tue, Jan 07, 2003 at 10:32:02AM -0600, Ron Johnson wrote:
> > > On Tue, 2003-01-07 at 09:25, Arthur H.
> > > Edwards,1,505-853-6042,505-256-0834 wrote:
> > > > Under the 2.4.19 kernel, it depends on whether the job was spawned using 
> > > > mpi. If I start it on the node, it does not kill communications. If I 
> > > > start it from a head node, I can still ping. I have only tried the 2.2 
> > > 
> > > So it doesn't kill the box, but makes it effectively unusable?
> > > 
> > > If you have a remote telnet/ssh session already open to the box, is it
> > > knocked out?  I'd open that remote session and run top(1) on it to try
> > > to glean a little more info.  If you *do* get kicked off, that certainly 
> > > would tell us something!!!!
> > > 
> > > > kernel job as a stand-alone. There it exits quite gracefully with exit 
> > > > 139. Do you know what exit 139 is?
> > > 
> > > Nope, but it looks like Colin does...
> > > 
> > > > Art Edwards
> > > > 
> > > > Ron Johnson wrote:
> > > > 
> > > > >On Mon, 2003-01-06 at 22:48, Arthur H.
> > > > >Edwards,1,505-853-6042,505-256-0834 wrote:
> > > > >  
> > > > >
> > > > >>I'm having monumental difficulty getting a new set of PC's working. I 
> > > > >>had been installing a 2.4.19 kernel with debian on a MB with a via chip 
> > > > >>set, and athlon XP2100, a promise ide system. Debian seems to install 
> > > > >>correctly. However, when running large fortran jobs (under g77-3.2), the 
> > > > >>system would either die immediately, or start running and then die. 
> > > > >>When I say die I mean that I can't login. I have backed off to a 2.2.20 
> > > > >>kernel and g77 2.95. Now the program dies with an exit 139, but the 
> > > > >>system stays up.
> > > > >>    
> > > > >>
> > > > >
> > > > >Can't login because your fortran program is taking too much CPU?
> > > > >
> > > > >Can you still ping the box from another node?
> > > > >
> > > -- 
> > > +------------------------------------------------------------+
> > > | Ron Johnson, Jr.     mailto:ron.l.johnson@cox.net          |
> > > | Jefferson, LA  USA   http://members.cox.net/ron.l.johnson  |
> > > |                                                            |
> > > | "Basically, I got on the plane with a bomb. Basically, I   |
> > > |  tried to ignite it. Basically, yeah, I intended to damage |
> > > |  the plane."                                               |
> > > |    RICHARD REID, who tried to blow up American Airlines    |
> > > |                  Flight 63                                 |
> > > +------------------------------------------------------------+
> > > 
> 
> 
> .
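
On the "exit 139" question in the quoted thread: by shell convention, a status above 128 normally means the process was terminated by a signal, and the signal number is the status minus 128. So 139 would be signal 11, which is SIGSEGV (segmentation fault) on Linux. Below is a minimal sketch of that decoding. The function name describe_exit_status is made up for illustration, and it assumes the status came from a shell's job report rather than from a program that deliberately called exit with a large value.

```python
import signal


def describe_exit_status(status):
    """Interpret an exit status as a shell would report it.

    Assumption: values above 128 follow the shell convention of
    128 + signal number. A program could also return such a value
    on purpose, so treat the result as a strong hint, not proof.
    """
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    return f"exited normally with code {status}"


if __name__ == "__main__":
    # The status reported for test_bsh.sh in the thread.
    print(describe_exit_status(139))
```

If that reading holds here, the job is segfaulting rather than exiting cleanly, which would fit with a bad array bound, a stack-size limit, or a library mismatch more than with a gradual memory leak.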

-- 
Arthur H. Edwards
712 Valencia Dr. NE
Abq. NM 87108

(505) 256-0834


