
Re: A possible GFDL compromise



On Mon, 25 Aug 2003, David Starner wrote:

>Fedor Zuev <fedor@earth.crust.irk.ru> writes:
>> Documentation is not software.

>This has been refuted so many times. What about help2man, which
>turns software into documentation? What about the numerous other
>times documentation is embedded into source code or source code is
>embedded into documentation? What about literate programming?

	I am aware. Yes, the distinction is often unclear.

	But this is irrelevant. It is enough that the _law_ (the
majority of existing copyright laws) makes this distinction. A
distinction based not on the structure of the work but on its
function, btw. In some cases you can't ignore the distinction
anyway, because the law requires you to make it.  And in other cases
you should not ignore it, even if you can, because the distinction
benefits you.

>> There is no one-way transformation from the source to the binary.

>It so happens that I do a lot of work for Project Gutenberg, and
>have experience in this matter. Our output - no output I've seen to
>anything meaningfully called source - is not convertible into the
>original. We lose a lot of book related detail, and even stuff that
>may or may not be relevant like fonts and font sizes. The original
>in this digital age is maybe the result of a lossy conversion from
>an original that was marked up with content orientated tags to a
>paper format or a more presentation orientated format.  HTML ->
>ASCII loses information and has no reliable reverse transformation
>even for the information it doesn't lose.

	Of course. But, please note,

	1) All of these are elements of formatting, not the
copyrighted or public-domain literary work itself. Formatting is
usually not copyrighted, and where it is (AFAIK, in the UK), the
copyright in the formatting is separate from the copyright in the
work itself.

	2) You probably lose information while converting not
because you can't preserve it at all, but because you do not have
a proper convention for preserving such things.

	3) And you do not have a convention simply because in the
majority of cases these elements of formatting are completely
unimportant for using the text.
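
	The HTML -> ASCII case can be illustrated with a minimal
sketch, assuming Python and its standard html.parser module. The
names TextOnly and html_to_text are invented for this illustration
and do not describe any Project Gutenberg tool. Two differently
marked-up inputs give the same plain text: the markup cannot be
recovered from the output, while the words themselves survive
unchanged.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data and discard every tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Called for the text between tags; the tags themselves are dropped.
        self.parts.append(data)


def html_to_text(markup):
    """Return the text content of `markup` with all tags stripped."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


emphasised = "<p>Read the <em>manual</em> first.</p>"
titled = "<p>Read the <cite>manual</cite> first.</p>"

# Both inputs collapse to the same plain text, so nothing computed from
# the output alone can tell which markup it came from.
assert html_to_text(emphasised) == html_to_text(titled)
print(html_to_text(emphasised))
```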


>On the flip side, the transformation from the source to the binary for
>programs is not one-way. You can turn that binary back into source - look
>at dozens of Java disassemblers, and the theory is the same for any
>source->binary language.

	No. It is essentially one-way. At least for the PDP-11, x80,
x86, 68xxx. In many cases you can't even disassemble the binary
unambiguously. As for more abstract languages... into _which_
language (or dialect of a language) are you going to decompile the
binary?

	Of course, there are some exceptions, where decompilation is
possible, but they are rare.
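
	One source of the ambiguity can be shown with a small sketch
in Python. It is a toy decoder, not a real disassembler: it knows
only two 32-bit x86 encodings, 0x90 (nop) and 0xB8 followed by a
32-bit immediate (mov eax, imm32), and the function name decode is
invented for this illustration. The same five bytes decode as one
instruction from offset 0 and as four different instructions from
offset 1, and nothing in the bytes themselves says which reading was
intended.

```python
def decode(code, start):
    """Linear-sweep decode of `code` starting at byte offset `start`.

    Toy decoder: handles only 0x90 (nop) and 0xB8 imm32 (mov eax, imm32).
    Any other byte is emitted as raw data.
    """
    out = []
    i = start
    while i < len(code):
        op = code[i]
        if op == 0x90:
            out.append("nop")
            i += 1
        elif op == 0xB8 and i + 5 <= len(code):
            # The four bytes after the opcode are a little-endian immediate.
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov eax, 0x{imm:08x}")
            i += 5
        else:
            out.append(f"db 0x{op:02x}")
            i += 1
    return out


code = bytes([0xB8, 0x90, 0x90, 0x90, 0x90])
print(decode(code, 0))  # one instruction: mov eax, 0x90909090
print(decode(code, 1))  # four instructions: nop, nop, nop, nop
```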

>> if you can read the document, you always, technically, can OCR it.

>No. OCR programs only work at DPIs and quality levels much higher than
>the human threshold. And only if they can get images, which may
>be hard to do for a proprietary reader. 72 or 100 DPI isn't high
>enough to OCR from, anyway.

	You can resize the picture in GIMP. Or you can photograph a
computer display. Both techniques have actually been used by me or
my friends and give reasonable results. Not perfect, but good enough
to use.

>> it takes no more than 24-48 man-hours to completely OCR a
>> large 500-700 page book.

>For a simple novel, yes. A computer software manual would be much
>harder.

	Many OCR programs also preserve much of the formatting.

>How long would it take to turn ls back into a reasonable facsimile
>of the source code? Probably not a whole lot longer, given a
>skilled programmer. A simple quantitative difference does not a
>qualitative difference make.

	Longer. Much longer. Specifically, much longer than rewriting
ls from scratch, using only the manpage for reference.

	There are _many_ OCR programs in the world. There is _no_
x86 disassembler in the world that guarantees compilable output.


