Bug#301153: libc6: Occasional EPERM during first fread() following popen()



At Thu, 24 Mar 2005 05:00:27 +0900,
Adam C Powell IV wrote:
> I have an MPI program which does a popen and fread, something like:
> 
>       if (snprintf (filename, 999, "gunzip -c < %s.cpu%.4d.data",
>                     basename, rank) > 999)
>         return 1;
>       if (!(infile = popen (filename, "r")))
>         return 1;
>       if (ferror (infile))
>       {
>           printf ("[%d] Pipe open has error %d\n", rank, ferror(infile));
>           fflush (stdout);
>       }
>       ... some stuff ...
>         nmemb=fread (globalarray, sizeof (PetscScalar), gridpoints * dof, infile);
>         if (nmemb != gridpoints*dof)
>         {
>             printf ("[%d] ferror = %d\n", rank, ferror (infile));
>             fflush (stdout);
>         }
> 
> So, there seems to be no error in the popen, but on between one and five
> CPUs out of about 20, the fread results in an EPERM error.  On the other
> cluster, the error is less frequent but still there.  They're both
> identically-configured Debian beowulfs using the diskless package and
> mpich, though the one with fewer errors is made of dual AthlonXP 1.53
> GHz boxes and the one with more errors of dual Opteron 240 boxes running
> Debian stock -k7-smp kernels and 32-bit userland.
> 
> On the other hand, the same program earlier fopen()s a file whose path
> and name are identical to the popen redirected input except for the
> extension, and those work flawlessly.

I think this problem should be separated from MPI and clusters.  This
kind of random behavior is usually caused by an invalid memory access.  I
recommend checking your program with valgrind first, and then
isolating the problem from MPI.
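
As a starting point for isolating it, here is a minimal sketch of the
popen()/fread() step with no MPI involved.  The helper name
read_from_pipe and its signature are hypothetical, not taken from your
program.  Note that ferror() only reports a nonzero error indicator,
not an errno value, so a printed 1 does not by itself mean EPERM; the
sketch saves errno right after the failing call instead.

```c
#define _POSIX_C_SOURCE 200809L  /* for popen()/pclose() under strict -std modes */
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: run `cmd` through popen() and read up to `want`
 * bytes of its standard output into `buf`.
 * Returns the number of bytes read, or -1 on failure.  On failure *err
 * holds the errno value saved right after the failing call (it may be 0
 * if the library did not set errno). */
long read_from_pipe(const char *cmd, char *buf, size_t want, int *err)
{
    FILE *fp;
    size_t got = 0;

    *err = 0;
    errno = 0;
    fp = popen(cmd, "r");
    if (!fp) {
        *err = errno;
        return -1;
    }

    while (got < want) {
        size_t n;
        int saved;

        errno = 0;
        n = fread(buf + got, 1, want - got, fp);
        saved = errno;          /* save before any other call can change it */
        got += n;

        if (ferror(fp)) {       /* real read error: report the saved errno */
            *err = saved;
            pclose(fp);
            return -1;
        }
        if (feof(fp))           /* short read caused by end of data */
            break;
    }

    pclose(fp);
    return (long)got;
}
```

Calling this from a small non-MPI main() with the same "gunzip -c < ..."
command line, and printing strerror(err) when it returns -1, should show
whether the failure is reproducible outside MPI.  Running that small
program under valgrind is then much easier than running the full job.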

Regards,
-- gotom
