Re: Bug#575209 closed by Holger Levsen <email@example.com> (Re: Bug#575209: general: Error resolving hostname [resent])
reassign 575209 eglibc
found 575209 2.10.2-6
found 575209 2.11-0exp6
severity 575209 important
retitle 575209 Please resolve domain names with hyphens as border chars
tags 575209 + patch
Hi Holger et al (please drop -devel out of the list of CCs if you feel
this is getting off-topic),
Sorry, but I find it unacceptable to close this bug by referring to a
single paragraph in a (random) RFC. There is a multitude of reasons
why I think this bug *is* an issue:
- Sites with domain names like <ker-.deviantart.com> already exist!
Do you think they should be accessible from any proprietary
operating system, but not from Debian? Not really!
- There is already an inconsistency among the different
implementations in Debian (or Linux as a whole): ping and any
other program using gethostbyname() fail to resolve such names,
whereas nslookup and host succeed.
- The advice in the cited RFC is already ignored. Domain names that
start with a digit, e.g. 12345.foo.bar, can be resolved, whereas the
RFC tells us "They [labels] must start with a letter, end with a
letter or digit [...]". So let's just relax the rules in the RFC (they
are only recommendations after all) a bit more and also allow hyphens
as border characters in labels. It doesn't harm anyone, it just
enables us to resolve a few more actual domain names!
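The inconsistency from the second point can be checked with a few
lines of C. This is only a minimal sketch: the helper name
resolves_via_gethostbyname is made up for illustration, and the result
for a name like ker-.deviantart.com depends on the libc version and
the network, so no expected output is given here.

```c
#include <assert.h>
#include <netdb.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if gethostbyname() yields an entry
 * for name, 0 otherwise. On failure, h_errno holds the resolver's
 * reason, which hstrerror() can turn into a message. */
static int resolves_via_gethostbyname(const char *name)
{
    assert(name != NULL);
    return gethostbyname(name) != NULL;
}
```

Calling it with the same name that is passed to host or nslookup
shows whether the libc path rejects a name that the DNS tools accept.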
For further discussion, please see the bug reports opened against
Ubuntu and upstream.
Technically speaking, what IMHO needs to be done is to allow
hyphenchar as a borderchar in resolv/res_comp.c in eglibc. Please find
my patch below (and while we are at it, why not allow underscorechar
@@ -146,8 +146,8 @@
|| ((c) >= 0x61 && (c) <= 0x7a))
#define digitchar(c) ((c) >= 0x30 && (c) <= 0x39)
-#define borderchar(c) (alphachar(c) || digitchar(c))
-#define middlechar(c) (borderchar(c) || hyphenchar(c) ||
+#define borderchar(c) (alphachar(c) || digitchar(c) || hyphenchar(c))
+#define middlechar(c) (borderchar(c) || underscorechar(c))
#define domainchar(c) ((c) > 0x20 && (c) < 0x7f)
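To illustrate what the changed borderchar means in practice, here is
a simplified stand-alone sketch of a label check driven by such
macros. It is not the eglibc source: hostname_ok, old_borderchar,
new_borderchar and sketch_middlechar are names made up for this
example, and since the old middlechar line in the diff above is cut
off, the sketch assumes the same middle character set (letters,
digits, hyphen, underscore) in both modes.

```c
#include <assert.h>
#include <stddef.h>

#define hyphenchar(c)     ((c) == 0x2d)
#define periodchar(c)     ((c) == 0x2e)
#define underscorechar(c) ((c) == 0x5f)
#define alphachar(c)      (((c) >= 0x41 && (c) <= 0x5a) \
                           || ((c) >= 0x61 && (c) <= 0x7a))
#define digitchar(c)      ((c) >= 0x30 && (c) <= 0x39)

/* Border rule before the patch: letters and digits only. */
#define old_borderchar(c) (alphachar(c) || digitchar(c))
/* Border rule after the patch: hyphens are accepted as well. */
#define new_borderchar(c) (old_borderchar(c) || hyphenchar(c))
/* Assumption: one middle rule for both modes (see text above). */
#define sketch_middlechar(c) \
    (old_borderchar(c) || hyphenchar(c) || underscorechar(c))

/* Returns 1 if every label in dn passes the border/middle rules,
 * 0 otherwise. patched != 0 selects the relaxed border rule. */
static int hostname_ok(const char *dn, int patched)
{
    assert(dn != NULL);
    int pch = 0x2e;                 /* pretend a period precedes dn */
    int ch = (unsigned char)*dn;

    while (ch != '\0') {
        int nch = (unsigned char)*++dn;

        if (!periodchar(ch)) {
            /* First or last character of a label? */
            int at_border = periodchar(pch) || periodchar(nch)
                            || nch == '\0';
            int ok = at_border
                ? (patched ? new_borderchar(ch) : old_borderchar(ch))
                : sketch_middlechar(ch);
            if (!ok)
                return 0;
        }
        pch = ch;
        ch = nch;
    }
    return 1;
}
```

With this sketch, hostname_ok("ker-.deviantart.com", 0) rejects the
name because of the trailing hyphen in the first label, while
hostname_ok("ker-.deviantart.com", 1) accepts it. A name such as
12345.foo.bar passes in both modes.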
There are even other RFCs that either relax or contradict the
advice of RFC 1035 (thanks Christoph Loehr, who could even write a
short essay about this):
Don't use digits at the beginning of the name.
Many programs accept a numerical internet address as well as a
name. Unfortunately, some programs do not correctly
distinguish between the two and may be fooled, for example, by
a string beginning with a decimal digit.
Names consisting entirely of hexadecimal digits, such as
"beef", are also problematic, since they can be interpreted
entirely as hexadecimal numbers as well as alphabetic strings.
Don't use non-alphanumeric characters in a name.
Don't expect case to be preserved.
This is a relaxation of RFC 1035, as there is no mention of hyphen
characters at all.
No blank or space characters are permitted as part of a name. No
distinction is made between upper and lower case. The first character
must be an alpha character. The last character must not be a minus
sign or period. A host which serves as a GATEWAY should have
"-GATEWAY" or "-GW" as part of its name. Hosts which do not serve as
Internet gateways should not use "-GATEWAY" and "-GW" as part of their
names. A host which is a TAC should have "-TAC" as the last part of
its host name, if it is a DoD host. Single character names or
nicknames are not allowed.
this is contradictory, since there is c.psi.net, which is resolved by
The syntax of a legal Internet host name was specified in RFC-952
[DNS:4]. One aspect of host name syntax is hereby changed: the
restriction on the first character is relaxed to allow either a letter
or a digit. Host software MUST support this more liberal syntax.
This is clearly another relaxation of RFC 1035.