
Re: Bug#647254: remctl: FTBFS(kFreeBSD): testsuite failures



Christoph Egger <christoph@debian.org> writes:

> Hi!

> Your package failed to build on the kfreebsd-* buildds:

> Failed Set                 Fail/Total (%) Skip Stat  Failing Tests
> -------------------------- -------------- ---- ----  ------------------------
> portable/getaddrinfo          1/75     1%    0    0  72
> util/network                  1/100    1%    0    0  43

Hi folks,

Can I get some help with this?  I thought I understood the problem, and
the updated 3.0-2 package's tests work fine on asdfasdf, but on the
buildds the behavior is even worse than it was before.

These tests are intended to check that network_connect with a timeout
works.  It's hard to artificially create a case where connect will time
out.  What the test tries to do is create a listener socket with a short
queue and then keep connecting to it until the OS stops accepting new
connections because the listener queue has been exhausted.  Then I can
test that the timeout is applied properly on a non-blocking connect.
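
For anyone not familiar with the technique, here is a rough standalone
sketch of a non-blocking connect with a timeout.  This is not the remctl
network_connect code: connect_with_timeout and open_loopback_listener are
hypothetical names made up for this illustration, the sketch is IPv4-only,
and it does not retry poll after EINTR.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: listen on an ephemeral loopback port with the given
 * backlog.  On success, *sin holds the address that was actually bound.
 */
static int
open_loopback_listener(struct sockaddr_in *sin, int backlog)
{
    socklen_t len = sizeof(*sin);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin->sin_port = 0;          /* let the kernel pick a free port */
    if (bind(fd, (struct sockaddr *) sin, sizeof(*sin)) < 0
        || listen(fd, backlog) < 0
        || getsockname(fd, (struct sockaddr *) sin, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper: connect to addr, giving up after timeout seconds.
 * Returns the connected socket, or -1 with errno set (ETIMEDOUT if the
 * connection did not complete in time).
 */
static int
connect_with_timeout(const struct sockaddr *addr, socklen_t addrlen,
                     int timeout)
{
    struct pollfd pfd;
    int fd, flags, status, err = 0;
    socklen_t errlen = sizeof(err);

    fd = socket(addr->sa_family, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;
    status = connect(fd, addr, addrlen);
    if (status < 0 && errno != EINPROGRESS)
        goto fail;
    if (status < 0) {
        /* Connection in progress: wait until writable or timed out. */
        pfd.fd = fd;
        pfd.events = POLLOUT;
        status = poll(&pfd, 1, timeout * 1000);
        if (status == 0)
            errno = ETIMEDOUT;
        if (status <= 0)
            goto fail;
        /* Writable does not mean connected; fetch the real result. */
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0)
            goto fail;
        if (err != 0) {
            errno = err;
            goto fail;
        }
    }
    fcntl(fd, F_SETFL, flags);  /* restore blocking mode */
    return fd;

fail:
    err = errno;
    close(fd);
    errno = err;
    return -1;
}
```

The caller looks at errno after a -1 return to tell a timeout (ETIMEDOUT)
apart from an active rejection by the peer, which is the distinction the
test below depends on.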

This works with the Linux kernel, but on FreeBSD the test originally was
failing because after the listener queue was exhausted, the OS started
returning connection refused instead of timing out the connection.  Okay,
that's perfectly reasonable, so I modified the test so that it would
notice and skip that case.  On asdfasdf:

ok 41 - Timeout: first connection worked
# Finally timed out on socket 2
ok 42 - Timeout: later connection timed out
ok 43 # skip unable to test timeouts with short listening queue

Everything is fine.

But on the amd64 buildd, not only does no subsequent connect appear to
fail, but something then kills the whole test case and causes an abnormal
exit:

util/network............MISSED 44-100; FAILED 42-43

On i386, it just fails both of the relevant tests:

util/network............FAILED 42-43

I'm not sure how to fix this without being able to reproduce it on the
porter box.  Any ideas?  Maybe every one of the connections is being
accepted despite the listener queue length of 1?  That would explain the
i386 result, but not the abnormal abort of the test on amd64....

Here's the relevant test code (not completely standalone since it's
testing the remctl utility library):

static void
test_timeout_ipv4(void)
{
    socket_type fd, c;
    pid_t child;

    fd = network_bind_ipv4("127.0.0.1", 11119);
    if (fd == INVALID_SOCKET)
        sysbail("cannot create or bind socket");
    if (listen(fd, 1) < 0) {
        sysdiag("cannot listen to socket");
        ok_block(3, 0, "IPv4 network client with timeout");
        close(fd);
        return;
    }
    child = fork();
    if (child < 0)
        sysbail("cannot fork");
    else if (child == 0) {
        struct sockaddr_in sin;
        socklen_t slen = sizeof(sin);   /* accept reads this as buffer size */

        alarm(10);
        c = accept(fd, (struct sockaddr *) &sin, &slen);
        if (c == INVALID_SOCKET)
            _exit(1);
        sleep(9);
        _exit(0);
    } else {
        socket_type block[20];
        int i;

        close(fd);
        c = network_connect_host("127.0.0.1", 11119, NULL, 1);
        ok(c != INVALID_SOCKET, "Timeout: first connection worked");

        /*
         * For some reason, despite a listening queue of only 1, it can take
         * up to seven connections on Linux before connections start actually
         * timing out.
         */
        alarm(10);
        for (i = 0; i < (int) ARRAY_SIZE(block); i++) {
            block[i] = network_connect_host("127.0.0.1", 11119, NULL, 1);
            if (block[i] == INVALID_SOCKET)
                break;
        }
        diag("Finally timed out on socket %d", i);
        ok(block[i] == INVALID_SOCKET, "Timeout: later connection timed out");
        if (socket_errno == ECONNRESET)
            skip("unable to test timeouts with short listening queue");
        else
            is_int(ETIMEDOUT, socket_errno, "...with correct error");
        alarm(0);
        kill(child, SIGTERM);
        waitpid(child, NULL, 0);
        close(c);
        for (; i >= 0; i--)
            if (block[i] != INVALID_SOCKET)
                close(block[i]);
    }
    close(fd);
}
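
One thing worth noting about the hypothesis that every connection is
accepted: in that case the for loop above ends without hitting the break,
so i equals ARRAY_SIZE(block), and the later reads of block[i] go one
element past the end of the array.  A possible way to make that case
explicit, sketched here with hypothetical names (first_failed_connect,
fake_connect, and struct fake_listener are not part of remctl, and
fake_connect only stands in for network_connect_host so the logic can be
exercised without a network):

```c
#include <stddef.h>

/* Hypothetical connection callback: returns a descriptor, or -1 on error. */
typedef int (*connect_fn)(void *data);

/*
 * Try up to count connections, storing each result in fds.  Returns the
 * index of the first failed attempt, or -1 if every attempt succeeded, so
 * the caller never has to index fds with an out-of-range value.
 */
static int
first_failed_connect(int *fds, size_t count, connect_fn attempt, void *data)
{
    size_t i;

    /* Mark every slot unused so that cleanup can scan the whole array. */
    for (i = 0; i < count; i++)
        fds[i] = -1;
    for (i = 0; i < count; i++) {
        fds[i] = attempt(data);
        if (fds[i] < 0)
            return (int) i;
    }
    return -1;
}

/* Stand-in for a listener that accepts a fixed number of connections. */
struct fake_listener {
    int remaining;              /* connections still accepted */
    int next_fd;                /* descriptor number to hand out next */
};

static int
fake_connect(void *data)
{
    struct fake_listener *listener = data;

    if (listener->remaining <= 0)
        return -1;
    listener->remaining--;
    return listener->next_fd++;
}
```

With something like this, a return value of -1 could be reported as a
skip ("listener accepted every connection") rather than a failure, and
the cleanup loop can walk all of block unconditionally, closing whatever
is not -1.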

-- 
Russ Allbery (rra@debian.org)               <http://www.eyrie.org/~eagle/>

