
Bug#683881: RFP: registered-domain-libs -- Extract the registered domain from a DNS label using the public suffix list



On 08/04/2012 11:21 PM, Daniel Kahn Gillmor wrote:
> Package: wnpp
> Severity: wishlist
> 
> * Package name    : registered-domain-libs
>   Version         : 20120705
>   Upstream Author : Florian Sager <sager@agitos.de>
> * URL             : http://www.dkim-reputation.org/regdom-lib-downloads/
> * License         : Apache
>   Programming Lang: C, Perl, PHP
>   Description     : Extract the registered domain from a DNS label using the public suffix list
> 
> The Registered Domain libraries provide a mechanism for code to
> determine (via the public suffix list) what the registered domain name
> is.  For example, foo.example.org is part of example.org, but
> foo.example.org.uk is part of example.org.uk.
> ..
> This source package provides C, Perl, and PHP libraries that embed the
> public suffix list directly.  Given the nature of the public suffix
> list, this package may be a candidate for frequent updates, comparable
> to tzdata.

Hmm, having read the source for regdom-libs, I'm not convinced that the
libraries are structured particularly well for packaging and
redistribution.

There are several other tools that work with the public suffix list as
distributed by http://publicsuffix.org/ and Mozilla, including those
listed here:

http://stackoverflow.com/questions/288810/get-the-subdomain-from-a-url#answer-960790

A better approach for Debian might be a frequently-updated package that
contains just the public suffix list's effective_tld_names.dat [0] in a
single canonical location in the filesystem, plus libraries that parse
this file and can compare domains against it (e.g.
Domain::PublicSuffix [1]).
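
To illustrate what "parse this file and compare domains against it"
could look like, here is a minimal Python sketch. It is not taken from
regdom-libs or Domain::PublicSuffix; the function names and the tiny
inline sample rules are made up for illustration, and a real consumer
would read the lines of the shared effective_tld_names.dat instead
(wherever such a package ended up installing it). It follows the
matching rules as published for the list: normal rules, "*." wildcard
rules, "!" exception rules, and an implicit "*" default.

```python
def parse_rules(lines):
    """Split public-suffix-list lines into exact, wildcard and exception sets."""
    exact, wildcard, exception = set(), set(), set()
    for raw in lines:
        fields = raw.split()
        # Skip blank lines and "//" comments; a rule ends at the first whitespace.
        if not fields or fields[0].startswith("//"):
            continue
        rule = fields[0].lower()
        if rule.startswith("!"):
            exception.add(rule[1:])   # "!www.ck" -> "www.ck"
        elif rule.startswith("*."):
            wildcard.add(rule[2:])    # "*.ck" -> "ck"
        else:
            exact.add(rule)
    return exact, wildcard, exception


def registered_domain(name, rules):
    """Return the public suffix plus one label, or None if name is a suffix itself."""
    exact, wildcard, exception = rules
    labels = name.lower().rstrip(".").split(".")
    suffix_len = 1  # implicit default rule "*": the last label is a suffix
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        n = len(labels) - i
        if candidate in exception:
            # An exception rule prevails; the suffix is the rule minus its first label.
            suffix_len = n - 1
            break
        parent = ".".join(labels[i + 1:])
        if candidate in exact or parent in wildcard:
            suffix_len = max(suffix_len, n)
    if len(labels) <= suffix_len:
        return None
    return ".".join(labels[-(suffix_len + 1):])


# Hypothetical sample rules, NOT the real list.
SAMPLE_RULES = parse_rules("""
// illustrative sample only
org
uk
org.uk
*.ck
!www.ck
""".splitlines())

print(registered_domain("foo.example.org", SAMPLE_RULES))     # example.org
print(registered_domain("foo.example.org.uk", SAMPLE_RULES))  # example.org.uk
```

With the data kept in one file and the logic in a parser like this,
updating the list would not require regenerating or rebuilding any
per-language tables.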

It's also not clear to me how the regdom-libs conversion tables
(effectiveTLD.inc.php, etc.) actually get updated.  If they're
programmatically generated, it would be good to know the mechanism
(which doesn't appear to be included in the tarballs).  If it's done
by hand, it would be nicer to have more automated tests to ensure
nothing breaks.

	--dkg

[0]
https://mxr.mozilla.org/mozilla-central/source/netwerk/dns/effective_tld_names.dat?raw=1

[1] http://search.cpan.org/~nmelnick/Domain-PublicSuffix/
