Re: [aspell-devel] Problems with aspell-en license

On Tue, Oct 22, 2002 at 08:40:37AM -0500, John Goerzen wrote:
> On Mon, Oct 21, 2002 at 03:47:55PM -0500, Branden Robinson wrote:
> > Is "original" completely meaningless?
> No, I am merely saying that you have not proven your implied premise that
> "all wordlists are created by extraction from dictionaries."  I am saying
> that it is possible for someone to create a wordlist from scratch, as a new
> original work based on original research, rather than as a derivative from a
> dictionary.

So what?  If the end product is unoriginal, it doesn't matter what the
nature or extent of the research is:

	In Feist Publications v. Rural Telephone Service Company, the
	Supreme Court recently put to rest the "sweat of the brow"
	doctrine, holding that originality is a sine qua non of
	copyright law, regardless of the author's efforts in collecting
	and assembling facts.[1]

> > What is "original" about piping the contents of 5 different dictionaries
> > through the moral equivalent of "awk {print $1}"?
> Perhaps nothing; but in any case, I maintain that the equivalent of "awk
> {print $1}" is not the only way to create a wordlist.

No, but it also doesn't really matter.  A mere list of words, whether
unordered or organized according to some algorithm[2], is not
copyrightable, for the same reason that individual words are not
copyrightable.  There is no opportunity for "originality" to inject
itself.  If there is some sort of staggering cleverness in the algorithm
used to organize the words, you might want to pursue a patent claim, but
that is even farther beside the point than the current discussion is
ranging.  I don't think aspell or any similar program in Debian is using
a patented algorithm to sort its word lists.

This isn't to say that I think you couldn't sneak a patent on
alphabetization past the corporate-boot-licking, rubber-stamping
automatons at the USPTO.  That is, however, irrelevant.

> But your conclusion is based on the premise that the only way to create a
> wordlist is by derivation from a dictionary (or some similar work).  I still
> don't buy that.

No, my premise is that one mechanism is as good as any other because the
end result is just as unoriginal for the problem domain under discussion.

> > Coincidence, automated generation, and independent innovation are all
> > evidence that a given expression is unoriginal.  This is a point that
> > seems to be completely lost on most pundits today.  "First past the
> More importantly, it may be lost on most *courts* today.  If that is the
> case, then your analysis, while possibly correct, may be irrelevant anyway.

We'll have to wait and see.  The reasons to take up the black flag grow
more numerous every day anyway.

> While we're at it, we should also consider international copyright laws and
> treaties; they may have bearing on the situation as well.  I do not know
> what they are, though.

Now it's my turn to point out the (grim?) realities.  :-/

U.S. law has been seen to be the tail that wags the Debian Project and
Free Software Foundation dogs.

Word lists like those under discussion aren't copyrightable in the
United States under the controlling Supreme Court precedent[3].

> What about uncommon words?

"Uncommon" doesn't mean "original".

> Words from ancient Greek?  Nordic words?  If one is copyrightable, why
> not common English words?

No single word is copyrightable in and of itself.  Furthermore, even if
they were, any word in usage before A.D. 1900 is in the public domain
even under the U.S.'s draconian copyright terms.

> What if, say, Oxford comes up with a list of common English words 5
> times larger than the existing lists?

So what if they do?  You're positing hypotheticals that are irrelevant
to the thread.

A word list containing nothing but words that no one uses, or which are
unique to one person, is not worth distributing.

> Copyright law does not distinguish based on quality or usefulness.

That's correct, but neither Debian nor the GNU Project is concerned
with that which is useless.  The subject at issue, in case you'd missed
the Subject header, is the English word list for aspell.

> Now here's the other point.  Let's take your originality argument and expand
> upon it a bit.  Let's say that you generated a wordlist by your awk method
> that was substantially similar to an existing wordlist we consider
> authoritative.  Your wordlist would therefore be uncopyrightable.  However,
> the first wordlist may well still be copyrightable.

No, because under the standard I have proposed, the original list is
provably unoriginal.

> In fact, it may have been created well before automated processes even
> existed -- 50 years after the author's death could extend well before
> the digital computer era.  Your wordlist could be considered (possibly
> incorrectly) infringing on the first one.  And there would be no
> reason to suppose that the first one was not original.

What's original about it?  Your standard of originality is an
unfalsifiable hypothesis.  When we cannot distinguish a human-made
original work from that of a computer engaged in an algorithmic process,
"originality" has no meaning.

Facts cannot be copyrighted.  The existence of a word in a word list, by
the very purpose and nature of the list, indicates a statement of fact
about a word in usage.  That the expression of that fact is the simple
rendering of the word itself cannot be construed as an original
expression of that fact, else we are permitting copyrights on words
themselves, the vast majority of which were in use before the
author of such a word list was even born.  I have been unable to find a
cite that says one can't copyright individual words, but this is very
strongly implied by every resource on copyright law I've been able to
find.  Perhaps the notion is considered part of the common law, and no
judge has yet been moronic enough to let a claim of copyright
infringement over a single word go to trial, and thus produce any
holdings on the matter.

If we're going to permit ourselves to be restrained from doing our work
on Free Software because some person, whether through malice or
stupidity, asserts copyright on an uncopyrightable thing, we might as
well find some other line of work.

If this joker who claims to hold a copyright on an alphabetized list of
common words in the English language gives us any guff, we should simply
threaten him with a claim under Title 17, Section 506(c)[4].  If the
problem is due to ignorance, then I'm sure this misunderstanding can be
easily cleared up.  If it isn't, then we're a long way toward
establishing the fraudulent intent required by 506(c).

      (c) Fraudulent Copyright Notice. - Any person who, with
    fraudulent intent, places on any article a notice of copyright or
    words of the same purport that such person knows to be false, or
    who, with fraudulent intent, publicly distributes or imports for
    public distribution any article bearing such notice or words that
    such person knows to be false, shall be fined not more than $2,500.

Copyright law only has the power that we permit it to have.  If we quail
at every bogus, nonsense claim, then we grant it the omnipotence that
the authors of DMCA seek.  We must have the backbone to fight, and decry
bullshit for what it is.  We're very clearly in the right here.  It's a
no-brainer.  You can't legitimately assert copyright on an
alphabetized[5] list of words that were coined before you were born.
The notion is ludicrous.

[1] http://www.lgu.com/cr38.htm

[2] I don't consider the writing of a poem or novel to be an algorithmic
process.  I also really don't feel like getting into the
Penrose-versus-Dennett-style argument that debating this point would

[3] _Feist Publications, Inc., v. Rural Telephone Service Company, Inc.,
499 U.S. 340 (1991)_

[4] http://caselaw.lp.findlaw.com/casecode/uscodes/17/chapters/5/sections/section_506.html

[5] binary tree, red-black, whatever -- it doesn't matter

G. Branden Robinson                |      "I came, I saw, she conquered."
Debian GNU/Linux                   |      The original Latin seems to have
branden@debian.org                 |      been garbled.
http://people.debian.org/~branden/ |      -- Robert Heinlein
