
Re: Having fun with the following C code (UB)



On Fri, 11 Apr 2014, Ansgar Burchardt wrote:

> On 04/11/2014 12:42, Ian Jackson wrote:

> > What people expect is that the compiler compiles programs the way C
> > was traditionally compiled.

Actually, I think we need to go further. We need a programming
language (with at least two compiler implementations), which I
will call Ͻ, that looks like C so much that *every* C program¹
is also a valid Ͻ program, and *every* Ͻ program that does not
make use of the additional guarantees (i.e. no C UB) is also a
valid C program.

Ͻ shall have absolutely no UB².

> Shouldn't -O0 come close to that expectation?

Sadly, no.

① That works with the additional guarantees³.
② No UB, every UB is defined (e.g. signed integers wrap around,
  which makes this unusable on DSPs, but we don’t care about a
  DSP for Unix system programming), and most IB is also
  harmonised³.
③ This involves things like: only two’s complement⁴, bytes are
  8 bits (octets)⁵, right-shifting signed numbers DTRT, ABI is
  LP64 or ILP32, etc. but no implementation/environment stuff,
  i.e. no requirement for Linux TLS, POSIX pthreads, etc.
④ AIUI, there is no practical implementation of C on a machine
  that doesn’t use two’s complement, anyway.
⑤ Times of 18-/36-bit machines are over. We can just assume an
  8, 16, 32 and possibly 64 bit integer type exists. Also, PDP
  endian is dead. Only LE and BE for integers.
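
To make footnotes ③ to ⑤ a bit more concrete, here is a rough sketch of
how a plain C11 program could check those assumptions today. The names
(host_byte_order, enum byte_order) are made up for illustration, and I
am assuming that “DTRT” for right shifts means an arithmetic
(sign-extending) shift. Byte order cannot be tested portably at compile
time, so that part is a runtime probe.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Footnote 5: bytes are octets. */
_Static_assert(CHAR_BIT == 8, "bytes must be 8 bits");

/*
 * Footnote 4: two's complement only. (-1 & 3) is 3 on two's
 * complement, 2 on ones' complement and 1 on sign-magnitude.
 */
_Static_assert((-1 & 3) == 3, "two's complement required");

/*
 * Footnote 3: right-shifting a negative number is
 * implementation-defined in C; assumed here to be arithmetic.
 */
_Static_assert((-4 >> 1) == -2, "arithmetic right shift required");

/* Footnote 3: ABI is ILP32 or LP64. */
_Static_assert(sizeof(int) == 4 &&
    sizeof(long) == sizeof(void *) &&
    (sizeof(long) == 4 || sizeof(long) == 8),
    "ABI must be ILP32 or LP64");

/* Hypothetical names, not from any existing header. */
enum byte_order { ORDER_OTHER, ORDER_LE, ORDER_BE };

/* Footnote 5: only LE and BE; anything else (e.g. PDP) is ORDER_OTHER. */
enum byte_order
host_byte_order(void)
{
    const uint32_t probe = 0x01020304;
    unsigned char b[sizeof(probe)];

    memcpy(b, &probe, sizeof(b));
    if (b[0] == 4 && b[1] == 3 && b[2] == 2 && b[3] == 1)
        return ORDER_LE;
    if (b[0] == 1 && b[1] == 2 && b[2] == 3 && b[3] == 4)
        return ORDER_BE;
    return ORDER_OTHER;
}
```

On a platform that violates one of the compile-time assumptions, this
simply fails to build, which is roughly what a Ͻ compiler would have to
do as well.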

Some of this, and some other things, have been traditionally
guaranteed by some BSDs. Some of C’s rules have even been
“relaxed” (or, made stricter, depending on the PoV) by POSIX,
e.g. lengths of basic data types.

The important thing here is to not make Ͻ too different from C
(not make it too system-specific, so we can build a Ͻ compiler
on DEC ULTRIX 4.5 just as easily as mksh compiles on it (which
it does) or even nōn-Unix systems, even embedded systems, just
not on machines that don’t implement e.g. integer wraparound).
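
As an illustration of what “signed integers wrap around” (footnote ②)
would mean in practice, here is a minimal sketch of a wrapping addition
written in today’s C without relying on UB. The function name
wrap_add32 is hypothetical; in Ͻ the plain + operator would presumably
behave like this by itself.

```c
#include <stdint.h>
#include <string.h>

/*
 * Add two 32-bit signed integers with wraparound. The addition
 * is carried out on unsigned values, where overflow is defined
 * as arithmetic modulo 2^32.
 */
int32_t
wrap_add32(int32_t a, int32_t b)
{
    uint32_t sum = (uint32_t)a + (uint32_t)b;
    int32_t r;

    /*
     * int32_t is required to be two's complement without padding,
     * so copying the bits back is well-defined and sidesteps the
     * implementation-defined unsigned-to-signed conversion.
     */
    memcpy(&r, &sum, sizeof(r));
    return r;
}
```

With this, wrap_add32(INT32_MAX, 1) yields INT32_MIN, whereas
INT32_MAX + 1 written directly is UB in C.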

This is a challenge. Well, three: find a non-sucking name that
is ISO 646 IRV; design the language spec (base it on C11 maybe,
even though I don’t like all the bloat that crept into it, but
it’d be consistent with the “not too different, just more sane
and more traditional” aspect); and write at least two compilers
for it.

I expect compilers to have support for C89 and C99 features no
longer in C11, possibly by using extra switches. Some K&R
compatibility, too (some of the code I deal with is ancient). The
default mode should probably be standards-compliant plus added
guarantees only, though.

bye,
//mirabilos
-- 
Sometimes they [people] care too much: pretty printers [and syntax highligh-
ting, d.A.] mechanically produce pretty output that accentuates irrelevant
detail in the program, which is as sensible as putting all the prepositions
in English text in bold font.	-- Rob Pike in "Notes on Programming in C"

