Re: location of UnicodeData.txt
On Wed, Nov 27, 2002 at 03:54:35PM -0500, Branden Robinson wrote:
> This is a question for -legal, FYI.
Okay, here's my analysis. I've studied the language closely and it
might after all be free.
> > Limitations on Rights to Redistribute This Data
> > Recipient is granted the right to make copies in any form for
> > internal distribution and to freely use the information supplied in
> > the creation of products supporting the Unicode(TM) Standard.
"for internal distribution" is useless to us, though it's nice to know
that ingesting the files is okay. The rest of the clause is restated
with more precision further on. In any case, I think this clause waives
whatever "Database rights" exist on the character maps.
> > The
> > files in the Unicode Character Database can be redistributed to
> > third parties or other organizations (whether for profit or not) as
> > long as this notice and the disclaimer notice are retained.
This makes it freely redistributable.
> > Information can be extracted from these files and used in
> > documentation or programs, as long as there is an accompanying
> > notice indicating the source.
If you hold this up to the light in the right way, it might be permission
to distribute modified versions.
From what little I know of Unicode, the Consortium never expected people
to use these files directly in their programs. Instead, they expected
programs to compress and recode the information in these files, for
efficiency. Many of the tables have been designed with this in mind.
Would converting a file to YAML count? YAML would still be a good
format for making modifications, i.e. "source". What about compressing
it with gzip?
In general, what kinds of changes would count as "extracting information"?
More to the point, what kinds of changes would _not_? I can't really
think of an operation on the files that does not count either as
"redistribution" or as "extracting information". Even simply deleting
parts of the files falls within the intended use -- the Consortium
explicitly acknowledges that many programs don't need to support the
full Unicode range and can optimize by stripping unneeded information
from the character tables.
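To make the "stripping" case concrete, here is a minimal sketch in Python
of the kind of extraction I have in mind. It assumes only the documented
semicolon-separated layout of UnicodeData.txt (code point in the first
field, general category in the third). The function name, the sample
lines, and the wording of the notice are my own inventions for
illustration, not anything the Consortium prescribes.

```python
# Hypothetical notice text; the real wording would have to satisfy the license.
SOURCE_NOTICE = "Derived from UnicodeData.txt (Unicode Character Database)"


def extract_categories(lines, first, last):
    """Return {code point: general category} for code points in [first, last].

    Each line is expected in UnicodeData.txt form: fields separated by ';',
    field 0 the code point in hex, field 2 the general category.
    Range markers such as "<CJK Ideograph, First>" get no special handling.
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(";")
        code_point = int(fields[0], 16)
        if first <= code_point <= last:
            table[code_point] = fields[2]
    return table


# A few sample lines in the UnicodeData.txt layout.
SAMPLE = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041",
    "03B1;GREEK SMALL LETTER ALPHA;Ll;0;L;;;;;N;;;0391;;0391",
]

# Keep only the ASCII range and drop everything else.
ascii_only = extract_categories(SAMPLE, 0x0000, 0x007F)

print(SOURCE_NOTICE)
print(ascii_only)
```

The result is a deleted-down, recoded table that ships with a notice
indicating the source, which is hard to describe as anything other than
"extracting information".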
There's another limitation in this clause, though: "... and used in
documentation or programs". This would preclude using the files as
part of a work of art, for example. Is that too limiting for the DFSG?
I get the impression that this license only needs some clarification
to be free. The Unicode Consortium might be willing to change it
enough to make these files usable in free software, and doing so would
certainly be in its interest: its stated goals are
"to develop, extend and promote use of the Unicode Standard".