
Bug#959139: numpy breaks scikit-learn arm64 autopkgtest: assert_uniform_grid(Y, try_name)



Source: numpy, scikit-learn
Control: found -1 numpy/1:1.18.3-1
Control: found -1 scikit-learn/0.22.2.post1+dfsg-5
Severity: serious
Tags: sid bullseye
X-Debbugs-CC: debian-ci@lists.debian.org
User: debian-ci@lists.debian.org
Usertags: breaks needs-update

Dear maintainer(s),

With a recent upload of numpy the autopkgtest of scikit-learn fails in
testing on arm64 when that autopkgtest is run with the binary packages
of numpy from unstable. It passes when run with only packages from
testing. In tabular form:

                       pass            fail
numpy                  from testing    1:1.18.3-1
scikit-learn           from testing    0.22.2.post1+dfsg-5
all others             from testing    from testing

I copied some of the output at the bottom of this report.

Currently this regression is blocking the migration of numpy to testing
[1]. Due to the nature of this issue, I filed this bug report against
both packages. Can you please investigate the situation and reassign the
bug to the right package?

More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation

Paul

[1] https://qa.debian.org/excuses.php?package=numpy

https://ci.debian.net/data/autopkgtest/testing/arm64/s/scikit-learn/5194679/log.gz

=================================== FAILURES ===================================
________________________ test_uniform_grid[barnes_hut] _________________________

method = 'barnes_hut'

    @pytest.mark.parametrize('method', ['barnes_hut', 'exact'])
    def test_uniform_grid(method):
        """Make sure that TSNE can approximately recover a uniform 2D grid

        Due to ties in distances between point in X_2d_grid, this test is platform
        dependent for ``method='barnes_hut'`` due to numerical imprecision.

        Also, t-SNE is not assured to converge to the right solution because bad
        initialization can lead to convergence to bad local minimum (the
        optimization problem is non-convex). To avoid breaking the test too often,
        we re-run t-SNE from the final point when the convergence is not good
        enough.
        """
        seeds = [0, 1, 2]
        n_iter = 500
        for seed in seeds:
            tsne = TSNE(n_components=2, init='random', random_state=seed,
                        perplexity=20, n_iter=n_iter, method=method)
            Y = tsne.fit_transform(X_2d_grid)

            try_name = "{}_{}".format(method, seed)
            try:
>               assert_uniform_grid(Y, try_name)

/usr/lib/python3/dist-packages/sklearn/manifold/tests/test_t_sne.py:784:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
_ _ _ _

Y = array([[ 52.326397  , -15.92225   ],
       [ 46.679527  , -20.175953  ],
       [ 40.870537  , -24.181147  ],
       ...[-35.291374  ,  22.122814  ],
       [-42.2738    ,  18.793724  ],
       [-48.922283  ,  15.606232  ]], dtype=float32)
try_name = 'barnes_hut_1'

    def assert_uniform_grid(Y, try_name=None):
        # Ensure that the resulting embedding leads to approximately
        # uniformly spaced points: the distance to the closest neighbors
        # should be non-zero and approximately constant.
        nn = NearestNeighbors(n_neighbors=1).fit(Y)
        dist_to_nn = nn.kneighbors(return_distance=True)[0].ravel()
        assert dist_to_nn.min() > 0.1

        smallest_to_mean = dist_to_nn.min() / np.mean(dist_to_nn)
        largest_to_mean = dist_to_nn.max() / np.mean(dist_to_nn)

        assert smallest_to_mean > .5, try_name
>       assert largest_to_mean < 2, try_name
E       AssertionError: barnes_hut_1
E       assert 6.67359409617653 < 2

/usr/lib/python3/dist-packages/sklearn/manifold/tests/test_t_sne.py:807:
AssertionError

During handling of the above exception, another exception occurred:

method = 'barnes_hut'

    @pytest.mark.parametrize('method', ['barnes_hut', 'exact'])
    def test_uniform_grid(method):
        """Make sure that TSNE can approximately recover a uniform 2D grid

        Due to ties in distances between point in X_2d_grid, this test is platform
        dependent for ``method='barnes_hut'`` due to numerical imprecision.

        Also, t-SNE is not assured to converge to the right solution because bad
        initialization can lead to convergence to bad local minimum (the
        optimization problem is non-convex). To avoid breaking the test too often,
        we re-run t-SNE from the final point when the convergence is not good
        enough.
        """
        seeds = [0, 1, 2]
        n_iter = 500
        for seed in seeds:
            tsne = TSNE(n_components=2, init='random', random_state=seed,
                        perplexity=20, n_iter=n_iter, method=method)
            Y = tsne.fit_transform(X_2d_grid)

            try_name = "{}_{}".format(method, seed)
            try:
                assert_uniform_grid(Y, try_name)
            except AssertionError:
                # If the test fails a first time, re-run with init=Y to see if
                # this was caused by a bad initialization. Note that this will
                # also run an early_exaggeration step.
                try_name += ":rerun"
                tsne.init = Y
                Y = tsne.fit_transform(X_2d_grid)
>               assert_uniform_grid(Y, try_name)

/usr/lib/python3/dist-packages/sklearn/manifold/tests/test_t_sne.py:792:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
_ _ _ _

Y = array([[-18.169476  ,   6.0802336 ],
       [-18.278513  ,   2.8822129 ],
       [-18.671782  ,  -0.4646889 ],
       ...[ 22.550077  ,  19.698557  ],
       [ 21.399723  ,  22.933178  ],
       [ 16.22136   ,  28.22955   ]], dtype=float32)
try_name = 'barnes_hut_1:rerun'

    def assert_uniform_grid(Y, try_name=None):
        # Ensure that the resulting embedding leads to approximately
        # uniformly spaced points: the distance to the closest neighbors
        # should be non-zero and approximately constant.
        nn = NearestNeighbors(n_neighbors=1).fit(Y)
        dist_to_nn = nn.kneighbors(return_distance=True)[0].ravel()
        assert dist_to_nn.min() > 0.1

        smallest_to_mean = dist_to_nn.min() / np.mean(dist_to_nn)
        largest_to_mean = dist_to_nn.max() / np.mean(dist_to_nn)

        assert smallest_to_mean > .5, try_name
>       assert largest_to_mean < 2, try_name
E       AssertionError: barnes_hut_1:rerun
E       assert 2.145051767903112 < 2

/usr/lib/python3/dist-packages/sklearn/manifold/tests/test_t_sne.py:807:
AssertionError
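
For readers unfamiliar with the check that fails above, here is a minimal sketch of what assert_uniform_grid measures. It is not the scikit-learn code: it swaps NearestNeighbors for a brute-force search using only the Python standard library, and the helper names (nearest_neighbor_distances, grid_ratios) and the distorted example are hypothetical, chosen for illustration. It does not reproduce the arm64 failure itself. It only shows why a single far-away point pushes largest_to_mean above the threshold of 2 while smallest_to_mean stays above .5.

```python
import math
from statistics import mean


def nearest_neighbor_distances(points):
    """Brute-force distance from each point to its closest other point.

    Stand-in for NearestNeighbors(n_neighbors=1) in the real test.
    """
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]


def grid_ratios(points):
    """Return (min, smallest/mean, largest/mean) of nearest-neighbor distances."""
    d = nearest_neighbor_distances(points)
    avg = mean(d)
    return min(d), min(d) / avg, max(d) / avg


# A perfect 10x10 grid with unit spacing: every nearest-neighbor distance is 1.
grid = [(float(i), float(j)) for i in range(10) for j in range(10)]

# Hypothetical distortion: move one corner point far away from the rest.
distorted = list(grid)
distorted[0] = (-20.0, -20.0)

grid_min, grid_small, grid_large = grid_ratios(grid)
bad_min, bad_small, bad_large = grid_ratios(distorted)

# Same thresholds as the assertions in the log above.
print("perfect grid passes:", grid_min > 0.1 and grid_small > .5 and grid_large < 2)
print("distorted grid passes:", bad_min > 0.1 and bad_small > .5 and bad_large < 2)
```

In the log above, largest_to_mean is 6.67 on the first attempt and 2.15 on the rerun, so in both embeddings at least one point sits several mean spacings away from its nearest neighbor.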
