On Tue, Oct 22, 2002 at 08:40:37AM -0500, John Goerzen wrote:
> On Mon, Oct 21, 2002 at 03:47:55PM -0500, Branden Robinson wrote:
> > Is "original" completely meaningless?
>
> No, I am merely saying that you have not proven your implied premise that
> "all wordlists are created by extraction from dictionaries." I am saying
> that it is possible for someone to create a wordlist from scratch, as a new
> original work based on original research, rather than as a derivative from a
> dictionary.

So what? If the end product is unoriginal, it doesn't matter what the
nature or extent of the research is:

    In Feist Publications v. Rural Telephone Service Company, the
    Supreme Court recently put to rest the "sweat of the brow"
    doctrine, holding that originality is a sine qua non of copyright
    law, regardless of the author's efforts in collecting and
    assembling facts.[1]

> > What is "original" about piping the contents of 5 different dictionaries
> > through the moral equivalent of "awk {print $1}"?
>
> Perhaps nothing; but in any case, I maintain that the equivalent of "awk
> {print $1}" is not the only way to create a wordlist.

No, but it also doesn't really matter. A mere list of words, whether
unordered or organized according to some algorithm[2], is not
copyrightable, for the same reason that individual words are not
copyrightable. There is no opportunity for "originality" to inject
itself.

If there is some sort of staggering cleverness in the algorithm used to
organize the words, you might want to pursue a patent claim, but that is
even farther beside the point than the current discussion is ranging. I
don't think aspell or any similar program in Debian is using a patented
algorithm to sort its word lists.

This isn't to say that I think you couldn't sneak a patent on
alphabetization past the corporate-boot-licking, rubber-stamping
automatons at the USPTO. That is, however, irrelevant.

> But your conclusion is based on the premise that the only way to create a
> wordlist is by derivation from a dictionary (or some similar work). I still
> don't buy that.

No, my premise is that one mechanism is as good as any other because the
end result is just as unoriginal for the problem domain under
discussion.

> > Coincidence, automated generation, and independent innovation are all
> > evidence that a given expression is unoriginal. This is a point that
> > seems to be completely lost on most pundits today. "First past the
>
> More importantly, it may be lost on most *courts* today. If that is the
> case, then your analysis, while possibly correct, may be irrelevant anyway.

We'll have to wait and see. The reasons to take up the black flag grow
more numerous every day anyway.

> While we're at it, we should also consider international copyright laws and
> treaties; they may have bearing on the situation as well. I do not know
> what they are, though.

Now it's my turn to point out the (grim?) realities. :-/

U.S. law has been seen to be the tail that wags the Debian Project and
Free Software Foundation dogs. Word lists like those under discussion
aren't copyrightable in the United States under the controlling Supreme
Court precedent[3].

> What about uncommon words?

"Uncommon" doesn't mean "original".

> Words from ancient Greek? Nordic words? If one is copyrightable, why
> not common English words?

No single word is copyrightable in and of itself. Furthermore, even if
single words were, any word in usage before A.D. 1900 is in the public
domain even under the U.S.'s draconian copyright terms.

> What if, say, Oxford comes up with a list of common English words 5
> times larger than the existing lists?

So what if they do? You're positing hypotheticals that are irrelevant to
the thread. A word list containing nothing but words that no one uses,
or which are unique to one person, is not worth distributing.

> Copyright law does not distinguish based on quality or usefulness.

That's correct, but neither Debian nor the GNU Project is concerned with
that which is useless. The subject at issue, in case you'd missed the
Subject header, is the English word list for aspell.

> Now here's the other point. Let's take your originality argument and expand
> upon it a bit. Let's say that you generated a wordlist by your awk method
> that was substantially similar to an existing wordlist we consider
> authoritative. Your wordlist would therefore be uncopyrightable. However,
> the first wordlist may well still be copyrightable.

No, because under the standard I have proposed, the original list is
provably unoriginal.

> In fact, it may have been created well before automated processes even
> existed -- 50 years after the author's death could extend well before
> the digital computer era. Your wordlist could be considered (possibly
> incorrectly) infringing on the first one. And there would be no
> reason to suppose that the first one was not original.

What's original about it? Your standard of originality is an
unfalsifiable hypothesis. When we cannot distinguish a human-made
original work from that of a computer engaged in an algorithmic process,
"originality" has no meaning.

Facts cannot be copyrighted. The existence of a word in a word list, by
the very purpose and nature of the list, indicates a statement of fact
about a word in usage. That the expression of that fact is the simple
rendering of the word itself cannot be construed as an original
expression of that fact, else we are permitting copyrights on words
themselves, the vast majority of which were in use before the author of
such a word list was even born.

I have been unable to find a cite that says one can't copyright
individual words, but this is very strongly implied by every resource on
copyright law I've been able to find.
Perhaps the notion is considered part of the common law, and no judge
has yet been moronic enough to let a claim of copyright infringement
over a single word go to trial, and thus produce any holdings on the
matter.

If we're going to permit ourselves to be restrained from doing our work
on Free Software because some person, whether through malice or
stupidity, asserts copyright on an uncopyrightable thing, we might as
well find some other line of work.

If this joker who claims to hold a copyright on an alphabetized list of
common words in the English language gives us any guff, we should simply
threaten him with a claim under Title 17, Section 506(c)[4]. If the
problem is due to ignorance, then I'm sure this misunderstanding can be
easily cleared up. If it isn't, then we're a long way toward
establishing the fraudulent intent required by 506(c).

    (c) Fraudulent Copyright Notice. - Any person who, with fraudulent
    intent, places on any article a notice of copyright or words of the
    same purport that such person knows to be false, or who, with
    fraudulent intent, publicly distributes or imports for public
    distribution any article bearing such notice or words that such
    person knows to be false, shall be fined not more than $2,500.

Copyright law only has the power that we permit it to have. If we quail
at every bogus, nonsense claim, then we grant it the omnipotence that
the authors of DMCA seek. We must have the backbone to fight, and decry
bullshit for what it is.

We're very clearly in the right here. It's a no-brainer. You can't
legitimately assert copyright on an alphabetized[5] list of words that
were coined before you were born. The notion is ludicrous.

[1] http://www.lgu.com/cr38.htm

[2] I don't consider the writing of a poem or novel to be an algorithmic
    process. I also really don't feel like getting into the
    Penrose-versus-Dennett-style argument that debating this point would
    entail.

[3] _Feist Publications, Inc., v. Rural Telephone Service Company, Inc.,
    499 U.S. 340 (1991)_
    <URL:http://www.law.cornell.edu/copyright/cases/499_US_340.htm>

[4] http://caselaw.lp.findlaw.com/casecode/uscodes/17/chapters/5/sections/section_506.html

[5] binary tree, red-black, whatever -- it doesn't matter

--
G. Branden Robinson                |    "I came, I saw, she conquered."
Debian GNU/Linux                   |    The original Latin seems to have
branden@debian.org                 |    been garbled.
http://people.debian.org/~branden/ |    -- Robert Heinlein