[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#931921: clutter's autopkgtests hang when ran with a libglib2.0-0 built with gcc-9



Control: reassign -1 clutter-1.0-tests
Control: severity -1 serious

We were trying so hard to solve this in either gcc-9 or libglib2.0-0
that we didn't consider whether it could be a clutter bug. (It is.)

On Fri, 12 Jul 2019 at 11:16:53 +0100, Iain Lane wrote:
> Here's the bit of code.
> 
>   https://sources.debian.org/src/clutter-1.0/1.26.2+dfsg-10/tests/conform/actor-offscreen-redirect.c/#L172
> 
> It's adding some stuff to a main loop and expecting it to finish when a
> particular signal handler is called.
...
> Things which make it work again
> 
>   - Building glib2.0 w/gcc-9 -O1 (and -O0)
>   - Building w/gcc-8

This appears to have been because building gtestutils.c with different
optimizations results in different junk being left on the stack afterwards.
When running the clutter test under valgrind, we get:

# Start of actor tests
# Start of offscreen tests
==13864== Conditional jump or move depends on uninitialised value(s)
==13864==    at 0x10AD7C: actor_offscreen_redirect (actor-offscreen-redirect.c:331)
==13864==    by 0x10AD7C: actor_offscreen_redirect (actor-offscreen-redirect.c:299)
==13864==    by 0x492F889: clutter_test_func_wrapper (clutter-test-utils.c:138)
==13864==    by 0x4B6F3BD: ??? (in /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0.6000.6)

This is the variable 'data' here:

> static void
> actor_offscreen_redirect (void)
> {
>   Data data;
>
>   ... data.was_painted is never initialized ...
>
>   while (!data.was_painted)
>     g_main_context_iteration (NULL, FALSE);
> }

It seems that data.was_painted was intended to be initialized to FALSE
(all-zeroes), but this never actually happened.

If the uninitialized value of data.was_painted happens to be nonzero,
this results in basically the entire test being skipped - we never enter
the main loop, and never have the opportunity for the test to hang while
waiting for a paint signal that will never happen.

Adding some debug code to hexdump the contents of the data struct reveals
that gcc-9 -O1, or gcc-9 -O2 with -fno-tree-pre, fairly reliably fills
data.was_painted with a nonzero value, so most of the test is effectively
never run. gcc-9 -O2 fills it with zeroes, so the test runs. The paint
signal never happens (at least in my testing) and the test hangs.

The attached is probably a good starting point for someone who has some
sort of understanding of Clutter to start to investigate this.

    smcv
diff --git a/tests/conform/actor-offscreen-redirect.c b/tests/conform/actor-offscreen-redirect.c
index f47af3617..44b2e43c6 100644
--- a/tests/conform/actor-offscreen-redirect.c
+++ b/tests/conform/actor-offscreen-redirect.c
@@ -150,25 +150,37 @@ verify_results (Data *data,
   g_free (pixel);
 }
 
+static void
+paint_handler_cb (GMainLoop *main_loop,
+                  gpointer nil)
+{
+  g_debug ("in paint_handler_cb");
+  g_main_loop_quit (main_loop);
+}
+
 static void
 verify_redraw (Data *data, int expected_paint_count)
 {
   GMainLoop *main_loop = g_main_loop_new (NULL, TRUE);
   guint paint_handler;
 
+  g_debug ("in verify_redraw");
+
   paint_handler = g_signal_connect_data (data->stage,
                                          "paint",
-                                         G_CALLBACK (g_main_loop_quit),
+                                         G_CALLBACK (paint_handler_cb),
                                          main_loop,
                                          NULL,
                                          G_CONNECT_SWAPPED | G_CONNECT_AFTER);
 
   /* Queue a redraw on the stage */
+  g_debug ("queueing redraw");
   clutter_actor_queue_redraw (data->stage);
 
   data->foo_actor->paint_count = 0;
 
   /* Wait for it to paint */
+  g_debug ("running main loop");
   g_main_loop_run (main_loop);
 
   g_signal_handler_disconnect (data->stage, paint_handler);
@@ -181,6 +193,8 @@ run_verify (gpointer user_data)
 {
   Data *data = user_data;
 
+  g_debug ("in run_verify");
+
   group_has_overlaps = FALSE;
 
   /* By default the actor shouldn't be redirected so the redraw should
@@ -298,7 +312,7 @@ run_verify (gpointer user_data)
 static void
 actor_offscreen_redirect (void)
 {
-  Data data;
+  Data data = {};
 
   if (!cogl_features_available (COGL_FEATURE_OFFSCREEN))
     return;

Reply to: