
Re: OT: Language War (Re: "C" Manual)



Richard Cobbe wrote:
> 
> Lo, on Monday, December 31, Erik Steffl did write:
> 
> > "Eric G. Miller" wrote:
> > >
> > > On Mon, 31 Dec 2001 13:46:15 -0800, Erik Steffl <steffl@bigfoot.com> ...
> > > > it's the resource allocation that's important, not types. garbage
> > > > collectors are generally more robust as far as segfaulting (and
> > > > similar errors) go (of course, just because the program doesn't
> > > > > segfault it doesn't mean it's working correctly). the other
> > > > important factor is how much runtime check language does
> > > > (e.g. checking for array boundaries etc.)
> > >
> > > But it's the type safety of the language that prevents the abuse of
> > > memory, not how it was allocated.  Strong typing eliminates a huge
> > > number of error cases.  C and C++ both subvert the ability of the
> > > compiler to do static type checking via casts and void pointers.
> > > Strong static type checking also has the possible advantage that the
> > > compiler can better optimize the generated code.
> >
> > consider perl which doesn't have strong types but it's quite
> > impossible to make it segfault and C++ on the other side which is
> > fairly dangerous even without casting (I would even go as far as
> > > saying that casting makes no difference (statistically), but I'd have to
> > think about it).
> 
> Perl does have strong types, but they don't really correspond to the
> types that most people are used to thinking of.  Perl's types are
> 
>     * scalars (no real distinction between strings, numbers, and the
>       undefined value)
>     * lists
>     * hashes
>     * filehandles
> 
> (I haven't really used Perl since Perl 4, so this list may not be
> complete.)

  actually there is a real distinction between strings and numbers, it's
just that it's internal only (perl stores numbers and strings
differently, it also treats them differently).

  it also has references and it knows what kind of references they are
(and since references are used to implement objects it even knows the
type of a reference: what it was blessed into, plus whether it's a
scalar, array or hash).

  the point was that it's not a strong type system - by which I mean
that you can assign pretty much any value to any l-value, no questions
asked. You don't get a segfault but you still get a non-working program
(e.g. when you mistakenly assign an array to a scalar, you get the size
of the array in the scalar).

  the reason you don't get segfaults is that perl takes care of memory
allocation, e.g. if you try to assign something to a variable that's
undefined (no storage place yet), it allocates the appropriate amount of
memory, and if you try to read the value of a variable that doesn't have
one it says that an undefined value was used (it doesn't give you a
random piece of memory like you can get in c).

> > most of the segfaults are because of the resource allocation mistakes,
> > not because of mistaken types... at least that's my impression.
> 
> Resource allocation mistakes (at least, the kind that typically lead to
> seg faults) *are* type errors, from a certain point of view.

  when you stretch it far enough.

  generally there are two distinct problems:

  1. resource allocation error:

  you use a resource after it was freed
  you use a resource that was never allocated

  e.g.

  char *string;
  strcpy(string, "should segfault"); /* string was never allocated */

  2. wrong type used:

  char c;
  int *i = (int*)(&c);
  *i = 12345; /* BOOM! */

  the first case COULD be described as a type problem as well, if you say
that the compiler should check what the string actually points to, but
that's not really the basic problem. just like the second problem can be
described as an allocation problem (not enough space allocated for the
value written through i) but that would miss the point...

> Consider the following:
> 
>     char *a, *b;
> 
>     a = strdup("This is a sample string");
>     b = a;
> 
>     free(a);
> 
>     /* Much code follows here, none of which modifies b. */
> 
>     printf("%s\n", b);
> 
> This may or may not segfault, but it's obviously not correct.  The
> problem is, in fact, a type error in the reference to b in the printf()
> call.  The compiler and library think that b is a pointer to a
> null-terminated character string, but this is in fact not the case.  In
> a type-safe language, you would not be allowed to bring about this state
> of affairs.

  IMO it's good to have a clear distinction between resource allocation
and type safety. take perl as an example - it doesn't have type safety,
it lets you assign (almost) anything to anything, but you still don't
get problems like the one you describe above because it handles memory
allocation automatically.
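
  to make that concrete, here is a rough sketch (not how perl does it
internally, just the general idea) of what "the language handles the
allocation" could look like if you did it by hand in c. the rc_string
type, the rc_* functions and the rc_live counter are all made up for
this example. the a/b aliasing from your snippet stops being a problem
because the storage goes away only when the last owner lets go:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical reference-counted string; every name here is invented
   for this sketch */
typedef struct {
    size_t refs;   /* number of live owners */
    char  *text;   /* NUL-terminated payload */
} rc_string;

/* number of rc_string objects currently alive; it exists only so the
   behaviour of the sketch can be observed */
static int rc_live = 0;

/* allocate a new string with one owner; returns NULL if out of memory */
rc_string *rc_new(const char *s)
{
    rc_string *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->text = malloc(strlen(s) + 1);
    if (r->text == NULL) {
        free(r);
        return NULL;
    }
    strcpy(r->text, s);
    r->refs = 1;
    rc_live++;
    return r;
}

/* take another reference, the counted equivalent of "b = a" */
rc_string *rc_retain(rc_string *r)
{
    assert(r != NULL && r->refs > 0);
    r->refs++;
    return r;
}

/* drop one reference; the storage is released only when the last
   owner lets go */
void rc_release(rc_string *r)
{
    assert(r != NULL && r->refs > 0);
    if (--r->refs == 0) {
        free(r->text);
        free(r);
        rc_live--;
    }
}
```

  with this, "b = rc_retain(a); rc_release(a);" leaves b->text
readable until b is released too. note that nothing here checks types,
it's purely bookkeeping about who still uses the memory.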

> > also note that in java you have to cast (when getting items from
> > containers) but it doesn't make java programs more likely to segfault.
> 
> That's because Java's typecasts are safe (i.e., checked at run-time):
> they're equivalent to C++'s dynamic_cast, not static_cast.  (Casting a
> float to an int, of course, is equivalent to C++'s static_cast, but
> errors with those sorts of casts are not likely to cause memory
> problems.)
> 
> > > >   and as far as runtime system goes - only interpreted languages have
> > > > runtime systems.
> > >
> > > Well, I dare you to remove 'ld' or 'libc.so' and see how many programs
> > > run ;-)  I think it's fair to characterize required language libraries
> > > as part of the "run time" system.  Whether or not a program is statically
> > > compiled is unimportant, as the language library still performs actions
> > > at runtime that "your" program depends on, and which "your" program
> > > could not function without.  Among those things, might be checking
> > > array accesses and raising exceptions for range errors...
> >
> > IMO the important distinction is whether the program runs itself
> > (compiled programs) or whether it is run by other program, which
> > controls it and takes care of various runtime checks etc.
> 
> Performing run-time checks, as with array indexing, does not necessarily
> imply the existence of an interpreter.  If a compiler sees the
> expression a[i], it can either assume that i's value is a valid index
> for the array a (as C and C++ do), or it can insert the appropriate
> checks directly into the executable code.  I still claim this is part of
> the language's run-time system, regardless of how it's interpreted.

  just because something does the same thing doesn't mean it is the
same thing. there are two extremes - on one side is e.g. a c program
that doesn't use any libraries or standard start-up code. on the other
side there's e.g. VB, which requires an interpreter to run.

  run time checks were just one example, the point is that HW is
generally simpler and compiled languages tend to have less extra stuff,
while SW interpreters are generally more complex and carry extra baggage
(not saying it's a bad thing).
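
  for reference, the kind of check being discussed for a[i] might look
roughly like this if written out by hand in c. this is only a sketch:
int_array, checked_get and checked_set are hypothetical names, and a
real compiler that inserts such checks would emit them inline and
raise an exception instead of returning a status code:

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical checked array: the length travels with the data */
typedef struct {
    int   *data;
    size_t len;
} int_array;

/* roughly what could be emitted for "x = a[i]": test first, then load.
   returns 1 on success, 0 on a range error (*out is left untouched) */
int checked_get(const int_array *a, size_t i, int *out)
{
    assert(a != NULL && out != NULL);
    if (i >= a->len)
        return 0;
    *out = a->data[i];
    return 1;
}

/* same idea for "a[i] = value": nothing is written on a range error */
int checked_set(int_array *a, size_t i, int value)
{
    assert(a != NULL);
    if (i >= a->len)
        return 0;
    a->data[i] = value;
    return 1;
}
```

  the check is a compare and a branch in front of every access. it is
ordinary compiled code, no interpreter involved.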

> > you do not necessarily need ld and standard libraries for c or c++
> > program, however you need vm for java program.
> 
> Oh, I don't know, I'd say that ld is pretty essential!  (It may not be
> necessary for statically-linked binaries, but I don't know for certain.)

  I am not saying it's not essential for most applications (on
desktop/server type computers), I am saying it's not an integral part of
a c program. you can run a c program without the help of ld and without
any extra libraries, standard start-up code etc. (that's more common in
embedded systems).

  I think there's a clear distinction between the java VM and standard
libraries, as far as their relationship to the actual program goes.

> And, in fact, you don't necessarily need the JVM for Java, either.  I
> believe there are some compilers out there which compile Java straight
> to native machine code.  And, for a while, Sun was talking about
> implementing the Java bytecodes in hardware, so the Java .class files
> would *be* native machine code.  (I'm not sure if that effort ever saw
> the light of day or not.)

  of course, as long as the HW is Turing complete you can do anything
(sort-of) in SW (and you can then duplicate the same in HW). see the
quote below:

> > or, in other words - runtime for compiled program is HW, runtime for
> > interpreted program is SW (HW usually provides generic, basic &
> > low-level functionality, SW usually does provide higher level
> > functionality, specific for given language).
> 
> Some final thoughts:
> 
> 1) If that's how you define runtime systems, then how do you explain the
>    code necessary to support exception handling under C++?

  how do I explain the code needed to access items in array? etc.

  the source code is compiled into machine code that does what the
source code is supposed to do according to the specs of the language. as
long as it can be represented like this (VERY simplified to get the
point across):

  source code line 1 -> machine code chunk 1
  source code line 2 -> machine code chunk 2
  ...

  I consider the chunks of machine code to be compiled source code and
the program to be 'standalone', thus not requiring 'runtime' (there
might be shared libs etc. but as long as they are used as libraries they
are not really 'runtime').

  if the code is processed like this:

  source code line 1 -> pseudocode chunk 1
  source code line 2 -> pseudocode chunk 2
  ...

  runtime:
    for chunk in pseudocode-chunks {
      interpret(chunk)
    }

  then I call the runtime a runtime. that's what I see it used for in
most cases. 
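
  a toy version of such a runtime, written in c, might look like the
following. it's just a sketch to show the shape of the loop: the
opcodes, the chunk struct and the interpret() function are all
invented for this example and don't correspond to any real VM:

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical pseudocode for a tiny stack machine */
typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT } opcode;

typedef struct {
    opcode op;
    int    arg;   /* used by OP_PUSH only */
} chunk;

enum { STACK_MAX = 16 };

/* the "runtime": walks the chunks and interprets each one.
   returns 1 and stores the top of the stack in *result on OP_HALT,
   returns 0 for a malformed program */
int interpret(const chunk *code, size_t n, int *result)
{
    int stack[STACK_MAX];
    size_t sp = 0;   /* number of values currently on the stack */

    assert(code != NULL && result != NULL);
    for (size_t pc = 0; pc < n; pc++) {
        switch (code[pc].op) {
        case OP_PUSH:
            if (sp == STACK_MAX)
                return 0;            /* stack overflow */
            stack[sp++] = code[pc].arg;
            break;
        case OP_ADD:
        case OP_MUL:
            if (sp < 2)
                return 0;            /* not enough operands */
            sp--;
            if (code[pc].op == OP_ADD)
                stack[sp - 1] += stack[sp];
            else
                stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            if (sp == 0)
                return 0;            /* nothing to return */
            *result = stack[sp - 1];
            return 1;
        }
    }
    return 0;   /* ran off the end without OP_HALT */
}
```

  here the program being run is just data (an array of chunks) and
it's interpret() that decides what happens for each one, including the
checks. that's the "who's calling who" difference I mean.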

> 2) How do you explain those compilers that compile Java straight to
>    native machine code?  There's a lot of support that has to be
>    included in the resulting program, like GC, synchronization
>    operations, and exception handling.

  see above - it depends on who's calling whom. if the program is
interpreted, the interpreter is the runtime; if the program is not, then
it has no runtime. I guess you can make a runtime for any compiled
language (there are c interpreters), and I would say that it can mostly
go the other way around as well (it's trickier though: to support e.g.
perl's eval you'd need to call the compiler).

> 3) Just an interesting idea, but from a certain point of view, *ALL*
>    programs are interpreted: it's just a question of where the
>    interpreter is.  For your standard binaries, the interpreter is
>    implemented in hardware.  For Java, it's in software.

  have you read the paragraph that you quoted just above your three
points??? you didn't:-) [check it, it's still there, just before your
first point]

	erik


