
Bug#43401: (c++ unaligned traps) DWARF2 exception support in gcc is broken



Hi,

I've been tracing this bug down today, and here is what I have found.
Sorry it's kind of long, but there's a lot of information involved...

First, a minimal testcase:

--- begin Makefile ---
topdir   := $(shell pwd)

all:: libtest.so test

# note, we use 'gcc' not 'g++' to link so that we don't see the
# unaligned access errors that are caused by the simple act of linking
# against libstdc++.

libtest.so: libtest.o
	gcc -shared -o $@ $<

test: test.o
	gcc -o $@ $< -L. -Wl,-rpath -Wl,$(topdir) -ltest

clean:
	rm -f libtest.so test *.o
--- end Makefile ---
--- begin libtest.cc ---
class Random_Lossage {};

void foo()
{
  throw Random_Lossage();
}
--- end libtest.cc ---
--- begin test.cc ---
#include <cstdio>
using namespace std;

/* from libtest.cc */
class Random_Lossage {};
extern void foo();

int main()
{
  try {
    foo();
  } catch(Random_Lossage) {
    printf("caught random lossage\n");
  }
  return 0;
}
--- end test.cc ---

Running the 'test' program gives me these errors in syslog:

Feb 19 20:05:21 blood-axp kernel: test(22028): unaligned trap at 000002000000e954: 000002000022b534 29 1
Feb 19 20:05:21 blood-axp kernel: test(22028): unaligned trap at 000002000000e95c: 000002000022b534 2d 1
Feb 19 20:05:21 blood-axp kernel: test(22028): unaligned trap at 000002000000e954: 000002000022ba44 29 1
Feb 19 20:05:21 blood-axp kernel: test(22028): unaligned trap at 000002000000e95c: 000002000022ba44 2d 1
Feb 19 20:05:21 blood-axp kernel: test(22028): unaligned trap at 000002000000e954: 000002000022bdbc 29 1
Feb 19 20:05:21 blood-axp kernel: test(22028): unaligned trap at 000002000000e95c: 000002000022bdbc 2d 1

Now, when we trace down the instructions that fault, we find this code,
compiled into glibc-2.1.3/elf/dl-reloc.c (the source lines actually come
from glibc-2.1.3/sysdeps/alpha/dl-machine.h):

  if (r_type == R_ALPHA_RELATIVE)
 46c:	a1 75 23 41 	cmpeq	s0,0x1b,t0
 470:	07 00 20 e4 	beq	t0,490 <_dl_relocate_object+0x470>
    {
#ifndef RTLD_BOOTSTRAP
      /* Already done in dynamic linker.  */
      if (map != &_dl_rtld_map)
 474:	a6 00 e0 f5 	bne	fp,710 <_dl_relocate_object+0x6f0>
#endif
	*reloc_addr += map->l_addr;
 478:	00 00 2a a4 	ldq	t0,0(s1) # 0x2000000e954 - trap #1
 47c:	01 04 27 40 	addq	t0,t6,t0
 480:	00 00 2a b4 	stq	t0,0(s1) # 0x2000000e95c - trap #2

    }

(annotations mine)
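
To make the failure mode concrete, here is a minimal sketch of what that
relocation step does to one 64-bit word.  This is not glibc's code:
is_quad_aligned() and apply_relative() are hypothetical names of mine,
and memcpy stands in for the ldq/stq pair so that the sketch stays
legal C even when the offset is not a multiple of 8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if addr is a multiple of 8, i.e. safe for an aligned
   64-bit load or store. */
int is_quad_aligned(uint64_t addr)
{
    return (addr & 7u) == 0;
}

/* Hypothetical helper (not glibc code): add load_base to the 64-bit
   word stored at image + offset.  Going through memcpy avoids the
   direct 64-bit access that traps when offset is misaligned. */
void apply_relative(unsigned char *image, size_t offset, uint64_t load_base)
{
    uint64_t value;

    memcpy(&value, image + offset, sizeof value);   /* the "ldq" */
    value += load_base;                             /* the "addq" */
    memcpy(image + offset, &value, sizeof value);   /* the "stq" */
}
```

The load and the store each touch the same word, which is why every
bad relocation shows up as a pair of traps in syslog.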

Basically we are seeing lots of unaligned R_ALPHA_RELATIVE relocations
being generated by either g++ or binutils when linking programs that
use exceptions.  Note that a C++ program that does not link with
libstdc++ and does not itself use any shared libraries with exceptions
will not see these errors.  So the bad relocations must be in the
exception handling info.  And in fact, our friend objdump will tell us this:

dhd@blood-axp:~/work/g++-brokenness$ objdump -R libtest.so | grep '[4c] RELATIVE'
000000000010b534 RELATIVE          *ABS*
000000000010ba44 RELATIVE          *ABS*
000000000010bdbc RELATIVE          *ABS*

(note: RELATIVE relocations add the base address of the loaded object
to a 64-bit word, thus they are 8 bytes wide, and on machines like the
Alpha that require data to be naturally aligned, must therefore be
8-byte aligned)

You'll notice that those addresses, once the load address of the
library is added, match up exactly with the three pairs of unaligned
traps we saw in syslog.
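
For anyone who wants to check that match mechanically, here is a small
sketch.  It assumes the second hex field on each syslog line is the
faulting data address; the function names are mine.  The tables are
copied from the objdump and syslog output above.

```c
#include <stdint.h>

/* Link-time addresses from "objdump -R libtest.so". */
const uint64_t reloc_vaddr[3] = {
    0x10b534ULL, 0x10ba44ULL, 0x10bdbcULL
};

/* Faulting data addresses from syslog (assumed to be the second
   hex field of each line). */
const uint64_t trap_addr[3] = {
    0x2000022b534ULL, 0x2000022ba44ULL, 0x2000022bdbcULL
};

/* Low three bits of an address; nonzero means not 8-byte aligned. */
unsigned misalignment(uint64_t addr)
{
    return (unsigned)(addr & 7u);
}

/* Returns the load base if every trap address equals the matching
   relocation address plus one common constant, otherwise 0. */
uint64_t common_load_base(void)
{
    uint64_t base = trap_addr[0] - reloc_vaddr[0];
    int i;

    for (i = 1; i < 3; i++)
        if (trap_addr[i] - reloc_vaddr[i] != base)
            return 0;
    return base;
}
```

With these tables, common_load_base() returns 0x20000120000 and
misalignment() is 4 for all three relocations, so each one sits
exactly half a quadword off.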

So hmm, where in the object file are those unaligned relocations?

dhd@blood-axp:~/work/g++-brokenness$ objdump -h libtest.so
<... snip ...>
 17 .eh_frame     00001150  000000000010b528  000000000010b528  0000b528  2**3
                  CONTENTS, ALLOC, LOAD, DATA

So let's look at the actual assembler output, shall we...

dhd@blood-axp:~/work/g++-brokenness$ g++ -dA -S libtest.cc

(from resulting libtest.s)
.section	.eh_frame,"aw",@progbits
__FRAME_BEGIN__:
	.4byte	$LECIE1-$LSCIE1	 # Length of Common Information Entry
$LSCIE1:
	.4byte	0x0	 # CIE Identifier Tag
	.byte	0x1	 # CIE Version
	.ascii "eh\0"	 # CIE Augmentation
	.8byte	__EXCEPTION_TABLE__	 # pointer to exception region info

Now, do you see the problem? :-)
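
In case it isn't jumping out, here is the arithmetic as a small sketch.
The field sizes are read straight off the directives above; the
function names are mine, and I am assuming __FRAME_BEGIN__ sits at the
very start of .eh_frame in the linked libtest.so.

```c
#include <stddef.h>
#include <stdint.h>

/* Field sizes in bytes, taken from the directives in libtest.s. */
enum {
    CIE_LENGTH_SIZE  = 4,   /* .4byte $LECIE1-$LSCIE1 */
    CIE_ID_SIZE      = 4,   /* .4byte 0x0             */
    CIE_VERSION_SIZE = 1,   /* .byte  0x1             */
    CIE_AUGMENT_SIZE = 3    /* .ascii "eh\0"          */
};

/* Offset of the .8byte pointer from __FRAME_BEGIN__. */
size_t eh_ptr_offset(void)
{
    return CIE_LENGTH_SIZE + CIE_ID_SIZE
         + CIE_VERSION_SIZE + CIE_AUGMENT_SIZE;
}

/* Address of the pointer, given where the section starts
   (assumption: the CIE is the first thing in the section). */
uint64_t eh_ptr_address(uint64_t section_start)
{
    return section_start + eh_ptr_offset();
}
```

The offset comes out to 12, so with .eh_frame starting at 0x10b528 the
pointer lands at 0x10b534, which is the first RELATIVE relocation in
the objdump output.  The three-byte augmentation string gets the
pointer to a 4-byte boundary, not an 8-byte one.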

So, some work with grep -r on the gcc source reveals the culprit
... dwarf2out.c (lines 1765 to 1808, in output_call_frame_info()):

  if (flag_debug_asm)
    fprintf (asm_out_file, "\t%s CIE Identifier Tag", ASM_COMMENT_START);

  fputc ('\n', asm_out_file);
  if (! for_eh && DWARF_OFFSET_SIZE == 8)
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    {
      ASM_OUTPUT_DWARF_DATA4 (asm_out_file, DW_CIE_ID);
      fputc ('\n', asm_out_file);
    }

  ASM_OUTPUT_DWARF_DATA1 (asm_out_file, DW_CIE_VERSION);
  if (flag_debug_asm)
    fprintf (asm_out_file, "\t%s CIE Version", ASM_COMMENT_START);

  fputc ('\n', asm_out_file);
  if (eh_ptr)
    {
      /* The CIE contains a pointer to the exception region info for the
         frame.  Make the augmentation string three bytes (including the
         trailing null) so the pointer is 4-byte aligned.  The Solaris ld
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
         can't handle unaligned relocs.  */
      if (flag_debug_asm)
	{
	  ASM_OUTPUT_DWARF_STRING (asm_out_file, "eh");
	  fprintf (asm_out_file, "\t%s CIE Augmentation", ASM_COMMENT_START);
	}
      else
	{
	  ASM_OUTPUT_ASCII (asm_out_file, "eh", 3);
	}
      fputc ('\n', asm_out_file);

      ASM_OUTPUT_DWARF_ADDR (asm_out_file, "__EXCEPTION_TABLE__");
      if (flag_debug_asm)
	fprintf (asm_out_file, "\t%s pointer to exception region info",
		 ASM_COMMENT_START);
    }
  else
    {
      ASM_OUTPUT_DWARF_DATA1 (asm_out_file, 0);
      if (flag_debug_asm)
	fprintf (asm_out_file, "\t%s CIE Augmentation (none)",
		 ASM_COMMENT_START);
    }

Also see the comment at the top of dwarf2out.c:

/* The size in bytes of a DWARF field indicating an offset or length
   relative to a debug info section, specified to be 4 bytes in the DWARF-2
   specification.  The SGI/MIPS ABI defines it to be the same as PTR_SIZE.  */

#ifndef DWARF_OFFSET_SIZE
#define DWARF_OFFSET_SIZE 4
#endif

However, on Alpha, DWARF_OFFSET_SIZE is not defined to be PTR_SIZE.  I
think this is an oversight on the part of the gcc maintainers.  Since
it appears that this DWARF2 stuff is mostly used for debugging, it
should hopefully be safe to change it, so long as BFD is updated to
know about this.  One strange thing I noticed is that while comments
in the sparc64 code suggest that this is the right thing to do, the
relevant line of code is actually commented out (from
gcc/config/sparc/linux64.h):

/* DWARF bits.  */

/* Follow Irix 6 and not the Dwarf2 draft in using 64-bit offsets. 
   Obviously the Dwarf2 folks havn't tried to actually build systems
   with their spec.  On a 64-bit system, only 64-bit relocs become
   RELATIVE relocations.  */

/* #define DWARF_OFFSET_SIZE PTR_SIZE */

Since the 64-bit ELF ABI for Alpha is not standardized, I suggest that
we fix this by doing the same thing as MIPS and defining
DWARF_OFFSET_SIZE to be the same as PTR_SIZE.  Note that since gas
tries to "optimize" DWARF CIE exception info in some rather evil ways
(see gas/ehopt.c in the binutils source), we probably need to fix
binutils too.  I haven't tried this yet, but I will tonight or
tomorrow.

GCC maintainers: can you forward this report upstream?

