
Re: Bug#105364: installer allows user to insert underscores in the hostname



Matt Kraai <kraai@debian.org> writes:

> On Mon, Jul 23, 2001 at 12:54:31PM +0200, Kjetil Torgrim Homme wrote:
> > It's very early, yet.  But a few things are reasonably clear.  They'll
> > use Unicode, they just haven't decided on the encoding.  That is, all
> > characters which aren't US-ASCII will probably be added to the list of
> > allowed characters.
> 
> I don't think so.  According to [1], Appendix F, there are quite a
> few prohibited non-ASCII code points.
>
> 1. http://www.ietf.org/internet-drafts/draft-ietf-idn-nameprep-04.txt

You are right.  (btw, it's now replaced by -05)

> There are also some normalization rules.  I'm not sure if the
> normalizations should be performed in dbootstrap or in some lower
> layer, however.

Ugh.  Do we really want to go there?

> > The limit of 63 octets per name component probably won't change,
> > but notice that the number of characters will be less, depending
> > on the encoding.
> > 
> > One thing: It would be good to disallow the use of ASCII Compatible
> > Encoding-prefixes.  They look like "xx--", where x is an arbitrary
> > letter.
> 
> I've never seen these before, this being my first foray into
> internationalization.  I'll keep this in mind, however.
> 
> Will the input be encoded in UTF-8?

No, that will break too many protocols.  That's the reason for ASCII
Compatible Encoding, which uses only the characters "a-z0-9-".
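
To make the rules discussed in this thread concrete, here is a rough
sketch in Python of what a check on one hostname component could look
like.  It is only an illustration, not dbootstrap code: the name
check_label and the exact set of rules (letters, digits and hyphen
only, at most 63 octets, no "xx--" prefix) are my assumptions.

```python
import re

MAX_LABEL_OCTETS = 63  # per-component limit mentioned in this thread

# Letters, digits and hyphen only (assumed rule set for one component).
LDH = re.compile(r"[a-z0-9-]+")
# "xx--" where x is an arbitrary letter: looks like an ACE prefix.
ACE_PREFIX = re.compile(r"[a-z]{2}--")


def check_label(label):
    """Return a list of problems found in one hostname component."""
    problems = []
    lowered = label.lower()
    if not lowered:
        problems.append("empty")
    # The limit is counted in octets, not in characters.
    if len(lowered.encode("utf-8")) > MAX_LABEL_OCTETS:
        problems.append("too long")
    if lowered and not LDH.fullmatch(lowered):
        problems.append("bad character")
    if ACE_PREFIX.match(lowered):
        problems.append("ACE-like prefix")
    return problems
```

With this sketch, check_label("my_host") reports a bad character
because of the underscore, and check_label("bq--abc") is flagged for
its ACE-like prefix.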

Look at some of the examples (/^Exampl) in
  http://www.i-d-n.net/draft/draft-ietf-idn-amc-ace-w-00.txt
Notice that UTF-8 is inefficient for Hangeul and other scripts, even
though it uses the full 8 bits of each octet instead of 5.
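
As a small illustration of the octet cost (a sketch only; the helper
name utf8_octets is made up for this example), each Hangeul syllable
takes three octets in UTF-8, so a 63-octet component would hold at
most 21 of them:

```python
def utf8_octets(text):
    """Number of octets needed to store text in UTF-8."""
    return len(text.encode("utf-8"))


hangeul = "\ud55c\uae00"  # the two syllables of the word "Hangeul"

# Octets used by the two syllables together.
print(utf8_octets(hangeul))
# How many such syllables fit in one 63-octet name component.
print(63 // utf8_octets(hangeul[0]))
```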

(Personally, I hope something like
 http://www.i-d-n.net/draft/draft-ietf-idn-udns-02.txt
 passes.  I'm not too optimistic, this reminds me of all the warts of
 MIME we probably never will be rid of.)


Kjetil T.


