
Re: is statistical data extracted from web DFSG compliant?



On Tue, 2008-11-11 at 00:00 +0800, Kov Chai wrote:
> Thanks a lot for your insightful analysis, Neil. =)
> But I am still confused about some problems.

I can't give a definitive answer because I'm not on the ftpmaster team,
but I can try to give some idea of how I'd expect it to be handled
within that team.

> Actually, it is another package (sunpinyin-slm) that supports generation 
> of the data. sunpinyin-slm is still in its ITP phase [1]. And it only 
> supports generation/merge/query of data. Sunpinyin-SLM stores the data 
> in a trie for character sequences of different lengths. So I guess it is 
> easier for this software to regenerate the language model from a modified
> corpus to fix bugs in the language model than to modify the generated file.
> 
> Will this change your assessment? Or is the acceptance of sunpinyin-slm the
> prerequisite of the package (sunpinyin) in question?

Yes, sunpinyin-slm is a prerequisite: the tools to generate/merge/query
the binary blob should exist in Debian *before*, or be uploaded together
with, the package that requires the binary blob (that package must
depend on the tools that generate the blob).

README.Debian should also explicitly (and verbosely) describe how to use
the tools.

debian/copyright must include a short note about how the binary blob is
generated and how it can be modified or regenerated using tools already
in Debian main.

If the tools need to go into non-free, then the package relying on the
data needs to be in contrib.

I would not advise uploading sunpinyin until sunpinyin-slm is accepted
into Debian main.

-- 


Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/

