
[Bug tree-optimization/18694] [4.0 regression] loop miscompilation at -O1 (-ftree-ch)



------- Additional Comments From law at redhat dot com  2004-12-10 18:10 -------
Subject: Re: [4.0 regression] loop miscompilation at -O1 (-ftree-ch)

On Thu, 2004-12-09 at 20:02 +0000, kazu at cs dot umass dot edu wrote: 
> ------- Additional Comments From kazu at cs dot umass dot edu  2004-12-09 20:02 -------
> Created an attachment (id=7715)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=7715&action=view)
> a new patch
> 
> This patch is basically the same as the last patch except that
> I've added a fast path for common cases.
> 
> I am testing this with submission to gcc-patches@ in mind.
BTW, here's the analysis of why the first patch didn't work.

First, as I mentioned earlier, we're failing to register a
definition with the original patch you submitted.  This can
lead to different (and incorrect) jump threading.

Second, the comment at the start of the code is somewhat misleading,
particularly in the case where we're threading the back edge in a
loop.

When dealing with a backedge in a loop, we may have already recorded
a value for the result of the PHI node.  Re-evaluating the PHI node
along the backedge may produce a new (and different) value for its
result.  Failing to note that can result in extracting the old (and
now invalid) value.

A great example is this code extracted from stmt.c (yes, I know it
uses uninitialized variables and such.  I was too lazy to turn those
variables into parameters -- and it doesn't matter for the sake of
this analysis).


union tree_node;
typedef union tree_node *tree;

struct tree_common
{
  tree chain;
};


union tree_node
{
  struct tree_common common;
};

/* Reduced from stmt.c: tmp and next are deliberately left
   uninitialized (see above) -- only the CFG shape matters here.  */
static unsigned char
check_operand_nalternatives (tree outputs, tree inputs)
{
  tree next, tmp;
  while (tmp)
    {
      /* Whenever the chain is non-null, tmp picks up next's
         current value and next is cleared.  */
      if (((tmp)->common.chain))
        tmp = next, next = 0;
    }
}
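
(For the curious: a dump like the one below can be had by compiling
the testcase with something along the lines of -O1 -fdump-tree-all;
the exact pass name on the dump file may vary, so treat that as
approximate.)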

check_operand_nalternatives (outputs, inputs)
{
  # BLOCK 0
  # PRED: ENTRY (fallthru)
  goto <bb 3> (<L2>);
  # SUCC: 3 (fallthru)

  # BLOCK 1
  # PRED: 3 (true)
<L0>:;
  #   VUSE <TMT.0_8>;
  D.1130_5 = tmp_1->common.chain;
  if (D.1130_5 != 0B) goto <L1>; else goto <L2>;
  # SUCC: 2 (true) 3 (dfs_back,false)

  # BLOCK 2
  # PRED: 1 (true)
<L1>:;
  tmp_6 = next_2;
  next_7 = 0B;
  # SUCC: 3 (fallthru,dfs_back)

  # BLOCK 3
  # PRED: 0 (fallthru) 1 (dfs_back,false) 2 (fallthru,dfs_back)
  # next_2 = PHI <next_3(0), next_2(1), 0B(2)>;
  # tmp_1 = PHI <tmp_4(0), tmp_1(1), next_2(2)>;
<L2>:;
  if (tmp_1 != 0B) goto <L0>; else goto <L4>;
  # SUCC: 1 (true) 4 (false)

  # BLOCK 4
  # PRED: 3 (false)
<L4>:;
  return;
  # SUCC: EXIT

}

One possible ordering of blocks for the DOM walker would be
0, 3, 1, 2, ... (block 3 dominates blocks 1, 2 and 4, so the walker
visits it before them).


Traversing the edge 3->1 will record the equivalence
(tmp_1 != 0B) == true into our hash tables.  When we're done
processing block #2, we try to thread the edge from 2->3.

Note carefully the PHIs at the start of block #3.  If we simply don't
record anything for tmp_1 because its argument references next_2 from
the previous PHI, then when we hit the tmp_1 != 0B conditional again,
we will assume that it's true (it was true when we traversed from 3->1,
and we failed to record that tmp_1's value changed).
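
To make that failure mode concrete, here is a minimal, self-contained
sketch of the bookkeeping.  This is not the actual tree-ssa-dom.c
code -- the real table is keyed by SSA names, not strings -- it just
shows how a stale entry survives when the PHI is skipped:

#include <stdio.h>
#include <string.h>

/* Toy stand-in for DOM's equivalence table, string-keyed purely
   for illustration.  */
#define MAX 8
static const char *keys[MAX];
static const char *vals[MAX];
static int n;

static void
record (const char *key, const char *val)
{
  int i;
  for (i = 0; i < n; i++)
    if (strcmp (keys[i], key) == 0)
      {
        vals[i] = val;
        return;
      }
  keys[n] = key;
  vals[n] = val;
  n++;
}

static const char *
lookup (const char *key)
{
  int i;
  for (i = 0; i < n; i++)
    if (strcmp (keys[i], key) == 0)
      return vals[i];
  return "unknown";
}

int
main (void)
{
  /* Traversing 3->1: the condition tmp_1 != 0B was true.  */
  record ("tmp_1 != 0B", "true");

  /* Threading 2->3: if we skip tmp_1's PHI because its argument
     references next_2 from the previous PHI, nothing overwrites
     the stale entry...  */
  printf ("condition is %s\n", lookup ("tmp_1 != 0B"));  /* "true" -- wrong */

  /* ...so correct handling must invalidate (or re-record) the PHI
     result before consulting the table.  */
  record ("tmp_1 != 0B", "unknown");
  printf ("condition is %s\n", lookup ("tmp_1 != 0B"));  /* "unknown" */
  return 0;
}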

Note that recording tmp_1 = next_2 isn't particularly good either,
since tmp_1 really isn't equivalent to next_2.  It's equivalent to
the previous value of next_2, which was next_3.  Ugh.
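
That "previous value" behavior is just the parallel-copy semantics of
PHI nodes: on any given edge, all PHI arguments are read before any
PHI result is written.  Here's a minimal sketch in plain C of the
edge 2->3 (the stand-in object is made up purely so there's a
non-null value to play the role of next_3):

#include <stdio.h>

int
main (void)
{
  char stand_in;            /* made-up object whose address plays next_3 */
  void *next_3 = &stand_in;
  void *next_2 = next_3;    /* next_2 = PHI <next_3(0), ...> on entry */
  void *tmp_1;

  /* Edge 2->3: read every PHI argument first...  */
  void *arg_for_next = 0;       /* 0B(2) feeding next_2's PHI */
  void *arg_for_tmp = next_2;   /* next_2(2) feeding tmp_1's PHI */

  /* ...then write every PHI result.  */
  next_2 = arg_for_next;
  tmp_1 = arg_for_tmp;

  /* tmp_1 now holds the value next_2 had *before* the edge was
     taken (i.e. next_3), not the new next_2 (which is null).  */
  printf ("tmp_1 == next_3? %d   next_2 == 0? %d\n",
          tmp_1 == next_3, next_2 == (void *) 0);
  return 0;
}

Anyway, I'll verify that Kazu's patch handles this correctly.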

I'll add this to my list of nasty things related to following backedges
in the threader.  More than once I've pondered not threading back edges.

jeff



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18694

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.


