
Questions to candidates: what is source?

The Debian Free Software Guidelines state that "The program must
include source code".

1. How do you define "source code" yourself?
2. I think that people have different ideas of what "source code" means.
   Do you agree? Are there significant disagreements regarding this
   issue within the Debian Project?
3. (If you answered "yes" to 2) Is that a problem?
4. (If you answered "yes" to 3) Is it necessary to amend the DFSG?
5. (If you answered "yes" to 4) How should it be amended?

6. Which of the following satisfy DFSG #2? What is the general
   principle? Or should it be decided case by case?

 * ELF binary without C source
 * Java class file without Java source
   (This is reasonably decompilable: cf. jad package)
 * Python bytecode without Python source
   (This is easily decompilable: cf. decompyle package)
 * Binary firmware data
 * configure script without configure.in
 * C source generated by Bison without .y source
 * In general, automatically generated source without good way to
   (But generated file may include every line of original source,
    perhaps as comments "This is generated from original line blah
 * Prebuilt HTML file without LaTeX source
   (cf. python-doc)
 * Prebuilt CHM (Compiled HTML) file without source HTML
   (This can be extracted: cf. chmlib, but perhaps not indexing
 * TrueType font made with autotracing without original bitmaps
   (cf. autotrace, potrace)
 * Opening book for board games without editing tools
   (gnuchess-book and gnugo package have opening books, but these
    are in well-known PGN and SGF format, so this is a hypothetical
 * Binary encoded data without source or encoding tools
   (Wordlist, thesaurus, etc. cf. bug #241279)
 * Automatically generated character set encoding table without
   the tools originally used for generation.
   (This rarely changes, so it's possible that even upstream no
    longer has the tools)
 * Dump of neural network data without training data or without
   the exact method to duplicate the network
 * In general, statistical data gathered from a large number of samples
   (I am not sure, but I think Mozilla's "Universal Charset Detection"
    uses character distribution tables of East Asian languages gathered
    from large samples)
 * JPEG image without higher quality image from which it was compressed
   (JPEG is lossy)
 * Bitmap image merged from many layers without layer information
   (e.g. GIMP's XCF format)
 * Bitmap image without corresponding vector format
   (e.g. SVG)
 * MP3 compressed sound without original sound source
   (Are MP3 encoders patent-encumbered? Also, MP3 is lossy)
 * Ogg Vorbis compressed sound without original sound source
   (Ogg Vorbis is lossy)
 * FLAC compressed sound without original sound source
   (FLAC is not lossy)
 * Offline version of documentation in a Wiki or FAQ CGI script, etc.,
   downloaded by, say, wget, without the original Wikitext or FAQ database
 * Binary image of a programming environment used for bootstrapping
   that does not exactly correspond to the environment being bootstrapped
   (Think Lisp, Smalltalk, etc.)
 * What else?

Seo Sanghyeon
