
Re: TeX Licenses & teTeX (Was: Re: forwarded message from Jeff Licquia)



On Sat, 10 Aug 2002 15:40:19 -0400, Boris Veytsman <borisv@lk.net> wrote:
>I must say, however, that your letter gave an insight to me. I've
>reread DFSG-4 once more and I think I see how TeX, CM and LaTeX ARE in
>fact DFSG-free.
>
>     The license may restrict source-code from being distributed in
>     modified form _only_ if the license allows the distribution of
>     "patch files" with the source code for the purpose of modifying
>     the program at build time. The license must explicitly permit
>     distribution of software built from modified source code. The
>     license may require derived works to carry a different name or
>     version number from the original software. (This is a
>     compromise. The Debian group encourages all authors not to
>     restrict any files, source or binary, from being modified.)
>
>Now this gives specific references to source code, build time and
>built programs. To see whether this applies to TeX, we need to
>determine, what is source-code, what is building and what is a built
>program in TeX world.

Yes, one should. I don't think you get it quite right, but it seems the gaps
can be repaired.

>There is several binaries in TeX: /usr/bin/tex, /usr/bin/mf etc. For
>them source code are tex.web, mf.web, etc. Everybody is allowed to
>distribute patch files with them and distribute the modified versions
>(thanks to Thomas for reminding this), so they are DFSG-free.
>
>What are the other parts of TeX-the-system? They are fonts and macro
>packages like plain.tex. If you call these pieces source code, you
>must determine, what is building. If this is source code, when is it
>built?
>
>The creation of plain.fmt and cmr10.tfm might be considered
>compilation. However, I now see it is not *building* because it is
>actually something like packing. When iniTeX creates plain.fmt, it
>actually reads the source and saves the memory state for future
>use. On a faster machine you can probably dispense with this step,
>reading plain.tex instead of plain.fmt. The same is true for creation
>of cmr10.tfm: here you pack the information from cmr10.mf for a future
>TeX run; this is *not* a complete build.

Making a format is more like compiling than it is like packing. The format
contains a lot of information about what things are, but one cannot tell
what sources made them so. It is not too difficult to reverse engineer a
format and produce a stream of TeX tokens that would have produced it -- it
might even be possible to produce a TeX input file that would reproduce
that stream of tokens (although the catcodes make this non-trivial in
several cases) -- but this can hardly be described as unpacking, even
though it would often produce something fairly close to the original. In
the case of cmr10.tfm you haven't got a chance in the world to recover
anything which is even remotely similar to cmr10.mf.

Not considering a format or a font as "built" is playing word games. OTOH I
don't think it would be necessary to restrict distributions of formats,
since requiring a changed banner line is a sufficient modification flag.
This part of the argument is therefore unnecessary.

>What is then a build time for TeX? The usual usage of the word
>"compilation" in our community refers to a TeX run, which takes a file
>foo.tex and produces a file tex.dvi. This is reasonable because it has
>all features of compilation: you have source code (foo.tex+a number of
>.sty or .tex files), you have binary .dvi files and you have even log
>files. You can have compilation errors, compilation warnings, aborted
>compiles etc. Therefore TeX documents are NOT documents in the sense
>.txt documents -- they are PROGRAMS.

This part I agree with. (It might be observed that TeX is Turing complete,
whereas the format of a .dvi file is very much like a stream of commands
for a phototypesetting device, with commands such as: put character $n,
switch to font $m, move horizontally $x units, and so on.)
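To make the contrast concrete, here is a minimal sketch of an interpreter for such a command stream. The command names ("font", "right", "put") are symbolic stand-ins chosen for illustration, not real DVI opcodes, and the function name is hypothetical. The point is only that the stream has no loops, conditionals or definitions: it is executed once, front to back.

```python
def run_commands(commands):
    """Interpret a simplified, DVI-like command list.

    Returns the glyphs placed, as (font, character, x-position) tuples.
    The command names are illustrative and are not real DVI opcodes.
    """
    x = 0          # current horizontal position
    font = None    # currently selected font
    placed = []
    for op, arg in commands:
        if op == "font":       # switch to font number arg
            font = arg
        elif op == "right":    # move horizontally by arg units
            x += arg
        elif op == "put":      # put character number arg at the current position
            placed.append((font, arg, x))
        else:
            raise ValueError(f"unknown command: {op}")
    return placed


# A made-up four-command "page".
page = [("font", 10), ("put", 84), ("right", 7), ("put", 101)]
```

Calling run_commands(page) walks the list exactly once; nothing in the stream can change how later commands are interpreted, which is quite unlike a TeX input file.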

>However, there is a big difference between TeX programs and, say, C or
>Perl programs. The innards of the C compiler or Perl interpreters are
>hidden from the user program. You cannot patch your compiler or
>interpreter DURING the run. TeX is different by design. You can patch
>it from the program it runs. Everything defined in plain.tex, cmr10.mf
>(or latex.ltx and article.sty) can go under knife from
>foo.tex. Therefore you obviously CAN patch the sources during the
>build and CAN distribute both the built files AND patches.

This is, as Simon Law remarked, mostly wrong. The meanings of macros and
commands can be changed at any point, but that (i) doesn't change output
that has already been shipped out -- when cmr10.mf has been read to the end,
the font in its entirety has already been written to file --, (ii) is
no different from the situation in many other languages, and (iii) doesn't
make it possible to achieve everything by redefining the commands before
reading the unmodified input. In the case of TeX it is possible (but unreasonably
expensive) to redefine \catcodes and so on so that input is given a
completely different meaning, but METAFONT has a much more rigid parser. In
particular I'm pretty sure that you cannot (as you can with LaTeX, and in
that case fairly simply) patch the input command to load a different file
than it normally would, since the argument of METAFONT's input primitive is
not like any kind of argument that a METAFONT macro can grab.
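Points (i) and (ii) can be illustrated in a language other than TeX or METAFONT. The following is a small Python sketch with hypothetical names: a function is redefined halfway through a run, much as a macro could be, and the output written before the redefinition is unaffected.

```python
import io


def greet():
    return "original"


# The buffer stands in for the font or .dvi file being written.
out = io.StringIO()
out.write(greet() + "\n")   # produced with the original definition


def greet():                # redefinition at run time, like a later \def
    return "patched"


out.write(greet() + "\n")   # only output produced from here on is affected

shipped = out.getvalue().splitlines()
# shipped is ["original", "patched"]
```

The first line of output keeps its original content; the redefinition only affects what is produced afterwards.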

>I think it makes TeX, TeX fonts and LaTeX DFSG-free not by the virtue
>of the licenses, bu by the design of the software itself. It was
>*designed* to be free.
>
>Am I not right?

That last part is all wrong. Whether a piece of software is free has
nothing to do with how it works -- what matters is what you may do to it.

However, the idea of patching at runtime certainly can be applied to TeX
and METAFONT sources, just as it might be applied to WEB sources. The v1.2
LPPL certainly _does_ allow distribution of .diff files with original files
-- if they are not considered to be derivatives then they are not governed
by the LPPL, and if they are derivatives then the LPPL gives permission to
distribute them as long as the filename is different from that of the
original (using the same filename would be rather pointless anyway) --
although there is certainly room for making this more explicit. The LPPL does not
_explicitly_ allow software built from patched sources to be distributed,
but I don't think it is controversial to add a clause that it is irrelevant
for matters of distribution whether a .dvi file was created by a TeX that
was reading patched sources at the time. With respect to format files I
suspect it will cause some grumbling, but it is likely to be accepted (I
presume it is still possible to require that the LaTeX banner message is
modified). Hence it would be possible to modify the LPPL so that it is free
by virtue of the first two sentences of DFSG4 rather than according to the
third, whose interpretation is more controversial.

OK, so the patch files can be distributed, but where is the mechanism which
causes TeX to use them? Well, the DFSG doesn't say there has to be one!
Patch files must be allowed to be distributed, but there is no condition
that requires any software to actually use them, or even to have the
ability to use them. On the other hand it would be mean to simply stop at
that observation, since the spirit of the condition is clearly that such
patching should be a simple operation. Therefore I suggest the following
mechanism for patching TeX input files, which can be implemented in
kpathsea, in Web2C, in teTeX, or merely in the Debian distribution of teTeX:

 1. Patch files have the same format as is used by the Unix (or is that
    GNU?) patch and diff programs. (This is more robust than the
    traditional format of WEB change files.)

 2. Whenever TeX wants to input a file that may be patched, foo.sty say,
    the patching mechanism also looks for a file with the same name
    with .diff appended, thus foo.sty.diff. Patch files
    are looked for along a separate search path, which by default is
    identical to the search path for TeX inputs.

    If no patch file is found, then TeX prints "(<filename>" on the
    terminal and log file, and inputs the file as usual. If a patch
    file is found then TeX prints "(<filename>{<patch-file-name>}"
    on the terminal and log file. TeX then reads input as if it were
    reading the requested file with the .diff patch applied.

 3. Whether a file may be patched is determined by a kpathsea variable.
    (Kpathsea variables are like environment variables, and environment
    variable values are mirrored to corresponding kpathsea variables at
    program initialisation, but normally it is all handled independently
    of the program environment.) The value of this variable is a list
    of glob-style file patterns; a file may be patched if and only if
    its name matches any of the patterns in this list. By default the
    variable is empty (meaning no files may be patched) but it can be set
    through an environment variable, by a system-wide configuration file
    (I'm not sure what it is called right now; it might be texmf.cnf),
    or via a command line switch.

 4. As a courtesy, if some input files were patched then a line should
    be written on the terminal at the end of the run, saying how many
    files were patched. (Thus instead of

       Output written on foo.dvi (3 pages, 30000 bytes).
       Transcript written on foo.log.

    it might say

       Output written on foo.dvi (3 pages, 30000 bytes).
       Transcript written on foo.log.
       3 input files were patched.

    It is not necessary to determine how many _distinct_ input files
    were patched, just a simple counter will do.)
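
The lookup in steps 2 to 4 can be sketched as follows. This is only an illustration in Python under stated assumptions, not kpathsea or Web2C code: the class name PatchLocator and its methods are hypothetical, the glob patterns are matched against the base name of the file, and actually applying the .diff (step 1) is left to the patch program and not shown.

```python
import fnmatch
import os


class PatchLocator:
    """Hypothetical helper for steps 2-4; not part of kpathsea or Web2C."""

    def __init__(self, patchable_patterns, patch_search_path):
        # Step 3: glob-style patterns; an empty list means no file may be patched.
        self.patchable_patterns = list(patchable_patterns)
        # Step 2: directories searched for <name>.diff files.
        self.patch_search_path = list(patch_search_path)
        # Step 4: a simple counter, not a count of distinct files.
        self.patched_count = 0

    def may_patch(self, filename):
        # Assumption: patterns are matched against the base name only.
        name = os.path.basename(filename)
        return any(fnmatch.fnmatchcase(name, pattern)
                   for pattern in self.patchable_patterns)

    def find_patch(self, filename):
        """Return the path of the patch file for filename, or None."""
        if not self.may_patch(filename):
            return None
        patch_name = os.path.basename(filename) + ".diff"
        for directory in self.patch_search_path:
            candidate = os.path.join(directory, patch_name)
            if os.path.isfile(candidate):
                return candidate
        return None

    def open_banner(self, filename):
        """Return what would be printed when the file is opened (step 2)."""
        patch = self.find_patch(filename)
        if patch is None:
            return "(" + filename
        self.patched_count += 1
        return "(" + filename + "{" + patch + "}"

    def closing_line(self):
        """Return the extra end-of-run line (step 4), or None if nothing was patched."""
        if self.patched_count == 0:
            return None
        if self.patched_count == 1:
            return "1 input file was patched."
        return f"{self.patched_count} input files were patched."
```

With PatchLocator(["*.sty"], ["."]), a request for foo.sty gives the banner "(foo.sty{./foo.sty.diff}" when ./foo.sty.diff exists and "(foo.sty" otherwise, while a request for foo.tex is never patched because it matches no pattern.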

It would seem to me that such a patching mechanism would be a more
convenient tool for distributors to use, should they ever find the need to
distribute a modified system, than the file mapping mechanism described in
cfgguide.tex. And of course, it could just as well be built into METAFONT
and all the other programs in a TeX system.

Are there then any catches in this? There are at least two:

It is essential that .sty etc. files generated by docstrip are not
considered to be "built software" in the sense of DFSG4 sentence 2, since
that would render the necessary protection of file names useless. Hopefully
the fact that these files _can_ be patched by the standard patch program is
sufficient for not making them "built software".

It is also perfectly possible that the license for the Computer Modern
fonts, if a clear and authorized statement of such could be found, would
disallow such runtime patching of the Computer Modern METAFONT sources. In
that case the CM fonts might well turn out to be non-DFSG-free, but that
need not be a critical problem, as all the TFMs could be included in main
anyway and there are free Type 1 fonts that can be used instead of the
MF-generated bitmaps.

Lars Hellström



