[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#516530: [unpack,reporting] support multiple sections



Hi,

On Sunday 05 July 2009 01:13:52 Russ Allbery wrote:
> Raphael Geissert <atomo64@gmail.com> writes:
> >     Wait until the harness finished before processing its output.
> >
> >     Not doing so can cause SIGPIPE's, terminating the main process.
> >
> > reporting/harness kept dying without any error message when testing the
> > changes in experimental (so that it wouldn't take too long to process the
> > Packages files), and tracked it back to lib/Util.pm's get_deb_info.
> > The tail of the strace:
> >
> > clone(child_stack=0,
> > flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
> > child_tidptr=0xb7ecb918) = 26047
> > close(10)                               = 0
> > read(9, ""..., 10000)                   = 0
> > close(9)                                = 0
> > close(6)                                = 0
> > close(4)                                = 0
> > close(7)                                = 0
> > close(3)                                = 0
> > read(5, "Package: gforge-mta-postfix\nSourc"..., 4096) = 871
> > --- SIGPIPE (Broken pipe) @ 0 (0) ---
> > +++ killed by SIGPIPE +++
>
> This doesn't make any sense.  The SIGPIPE is going to be in the child,
> which isn't going to kill reporting/harness.  You only get SIGPIPE if
> you write to a pipe that has no reader on the other end, not if you
> read, and reporting/harness isn't writing to a pipe.  It also wouldn't
> die with no error message if it died due to a broken pipe; you'd see a
> broken pipe error.

Indeed, that's true.

>
> I'm fairly sure the change is also incorrect.  It introduces a potential
> deadlock.
>
> --- a/lib/Util.pm
> +++ b/lib/Util.pm
> @@ -162,8 +162,9 @@ sub get_deb_info {
>           ['ar', 'p', $file, 'control.tar.gz'],
>           '|', ['tar', '--wildcards', '-xzO', '-f', '-', '*control'])
>         or fail("cannot fork to unpack $file: $opts->{exception}\n");
> -    my @data = parse_dpkg_control($opts->{pipe_out});
> +    # Wait until the harness finishes, otherwise we might die with a
> SIGPIPE $opts->{harness}->finish();
> +    my @data = parse_dpkg_control($opts->{pipe_out});
>      return $data[0];
>  }
>
> What if the command outputs more data than fits in the pipe buffer?  The
> child will stall forever waiting for the parent to consume some data so
> that it can write more, and the parent will wait forever for the child
> to complete.

True

>
> One can avoid killing the child process with a SIGPIPE by being sure to
> consume all of its data before exiting:
>
> --- a/lib/Util.pm
> +++ b/lib/Util.pm
> @@ -156,15 +156,18 @@ sub get_deb_info {
>         return undef;
>      }
>
> -    # `dpkg-deb -f $file' is very slow. Instead, we use ar and tar.
> +    # dpkg-deb -f $file is very slow. Instead, we use ar and tar.
>      my $opts = { pipe_out => FileHandle->new };
>      spawn($opts,
>           ['ar', 'p', $file, 'control.tar.gz'],
>           '|', ['tar', '--wildcards', '-xzO', '-f', '-', '*control'])
>         or fail("cannot fork to unpack $file: $opts->{exception}\n");
> -    # Wait until the harness finishes, otherwise we might die with a
> SIGPIPE -    $opts->{harness}->finish();
>      my @data = parse_dpkg_control($opts->{pipe_out});
> +
> +    # Consume all data before exiting so that we don't kill child
> processes +    # with SIGPIPE.
> +    1 while readline $opts->{pipe_out};
> +    $opts->{harness}->finish();
>      return $data[0];
>  }
>
> but that doesn't explain why you would see reporting/harness exiting.
>

I don't know either. A possible explanation would be that while (<$CONTROL>) 
is returning false before it actually reaches the EOF which in turn makes 
get_deb_info finish() the harness before the all the child's output is 
consumed, causing a SIGPIPE which is incorrectly? handled by perl which 
somehow ends up killing reporting/harness.

I'm afraid I don't have enough time right now to dedicate more than a couple 
of hours (which is way less time I spent when originally found the problem) 
to this issue.

Anyway, it might be a good idea to play safe and add the above workaround 
(starving child's input + a warning whenever that happens) and give it a try. 
I mean, if it has never failed before chances are that it won't fail now.

P.S. sorry for the late reply.

Cheers,
-- 
Raphael Geissert - Debian Developer
www.debian.org - get.debian.net



Reply to: