Bug#301153: libc6: Occasional EPERM during first fread() following popen()
At Thu, 24 Mar 2005 05:00:27 +0900,
Adam C Powell IV wrote:
> I have an MPI program which does a popen and fread, something like:
>
> if (snprintf (filename, 999, "gunzip -c < %s.cpu%.4d.data",
>               basename, rank) > 999)
>   return 1;
> if (!(infile = popen (filename, "r")))
>   return 1;
> if (ferror (infile))
>   {
>     printf ("[%d] Pipe open has error %d\n", rank, ferror(infile));
>     fflush (stdout);
>   }
> ... some stuff ...
> nmemb=fread (globalarray, sizeof (PetscScalar), gridpoints * dof, infile);
> if (nmemb != gridpoints*dof)
>   {
>     printf ("[%d] ferror = %d\n", rank, ferror (infile));
>     fflush (stdout);
>   }
>
> So, there seems to be no error in the popen, but on between one and five
> CPUs out of about 20, the fread results in an EPERM error. On the other
> cluster, the error is less frequent but still there. They're both
> identically-configured Debian beowulfs using the diskless package and
> mpich, though the one with fewer errors is made of dual AthlonXP 1.53
> GHz boxes and the one with more errors of dual Opteron 240 boxes running
> Debian stock -k7-smp kernels and 32-bit userland.
>
> On the other hand, the same program earlier fopen()s a file whose path
> and name are identical to the popen redirected input except for the
> extension, and those work flawlessly.
I think this problem should be separated from MPI and clusters. This
kind of random behavior is usually caused by an invalid memory access.
I recommend that you first check your program with valgrind, then
isolate the problem from MPI.
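
A minimal sketch of what an MPI-free test case might look like, assuming
a POSIX system. The helper name read_from_pipe and its parameters are
hypothetical, not taken from the reporter's program. It runs a command
through popen(), reads a fixed number of items with fread(), and reports
the cause of a short read. Note that ferror() only returns a nonzero
flag when the stream's error indicator is set; it is not an errno value,
so the sketch prints strerror(errno) instead and also checks the child's
exit status from pclose().

```c
#define _POSIX_C_SOURCE 200809L   /* for popen() and pclose() */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/wait.h>

/* Hypothetical helper: run cmd via popen(), read up to nmemb items of
 * `size` bytes into buf, and store the number of items read in *nread.
 * Returns 0 on a full read with a clean child exit, -1 otherwise. */
int read_from_pipe(const char *cmd, void *buf, size_t size, size_t nmemb,
                   size_t *nread)
{
    FILE *in;
    int saved_errno, status, failed = 0;

    *nread = 0;
    errno = 0;
    in = popen(cmd, "r");
    if (in == NULL) {
        fprintf(stderr, "popen: %s\n", strerror(errno));
        return -1;
    }

    errno = 0;
    *nread = fread(buf, size, nmemb, in);
    saved_errno = errno;          /* save before other calls overwrite it */

    if (*nread != nmemb) {
        failed = 1;
        if (ferror(in))
            /* ferror() is only a flag; on POSIX the cause is in errno. */
            fprintf(stderr, "fread: error after %zu items: %s\n",
                    *nread, strerror(saved_errno));
        else if (feof(in))
            fprintf(stderr, "fread: EOF after %zu of %zu items\n",
                    *nread, nmemb);
    }

    /* pclose() returns the wait status of the shell that ran cmd. */
    status = pclose(in);
    if (status == -1) {
        fprintf(stderr, "pclose: %s\n", strerror(errno));
        failed = 1;
    } else if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        fprintf(stderr, "command did not exit cleanly (wait status %d)\n",
                status);
        failed = 1;
    }

    return failed ? -1 : 0;
}
```

Calling this from a small main() with the same "gunzip -c < ..." command
line, on a single node and without MPI, and then running that program
under valgrind, would help show whether the failure comes from the pipe
itself or from memory corruption elsewhere in the program.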
Regards,
-- gotom