
Host lookup for different apps...



I'm seeing a very weird thing on my Linux box (Debian testing, but not
the latest).  I'm wondering if anyone has any insight.

So, I have a local hosts file with a few hosts in it.  I also have NIS,
with a whole _boatload_ of hosts in it (8782, to be precise).  And, I
have DNS nameservers.

My /etc/host.conf file says:

  order hosts,bind
  multi off

My /etc/nsswitch.conf file says:

  hosts:          files dns

and my /etc/resolv.conf file says:

  search one.domain.com two.domain.com three.domain.com four.domain.com
  nameserver xx.yy.zz.qq
  nameserver ss.tt.uu.vv

Suppose I have a host which is in my /etc/hosts, and in NIS and DNS.
If I use nslookup on this host, it is found on my first nameserver, as
myhost.one.domain.com.  I can also use nslookup on the IP address and
get back the right value.

If I write a tiny program which invokes gethostbyname() on my hostname,
and I use strace on it, I see that it obeys the settings in host.conf,
gets the value from /etc/hosts, and never tries to contact my DNS
servers.  Ditto for programs like rsh and rlogin.
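For reference, a test of this kind can be sketched in C roughly as
follows.  This is a minimal sketch, not the program actually used above:
the helper name lookup_ipv4 is hypothetical, and it assumes an IPv4-only
lookup through the classic gethostbyname() interface.

```c
#include <assert.h>
#include <stddef.h>
#include <netdb.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Resolve `name` with gethostbyname() and write its first IPv4 address,
 * in dotted-quad form, into `out`.  Returns 0 on success, -1 on failure.
 * lookup_ipv4 is a hypothetical helper name, not from the original post. */
int lookup_ipv4(const char *name, char *out, size_t outlen)
{
    assert(name != NULL && out != NULL);

    /* The resolver decides here which sources (files, dns, ...) to consult. */
    struct hostent *he = gethostbyname(name);
    if (he == NULL || he->h_addrtype != AF_INET || he->h_addr_list[0] == NULL)
        return -1;

    /* Convert the binary address to text; fails if `out` is too small. */
    if (inet_ntop(AF_INET, he->h_addr_list[0], out, (socklen_t)outlen) == NULL)
        return -1;
    return 0;
}
```

A main() that passes argv[1] to lookup_ipv4() and prints the result is
enough to run under strace and see which files and sockets the resolver
touches for a given name.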

_However_, if I try to use telnet or ssh it _doesn't_ stop at
/etc/hosts, but rather continues and tries to look up the hostname in
DNS.  Strace shows this definitively.

I see an open of /etc/hosts, then a read of the contents, then a close.
Then it loads libresolv.so.2, then I see a connect to the first
nameserver entry, and a send() whose argument includes the first domain
on the resolv.conf "search" path.  Then I see a recvfrom(), that returns
an OK value (>0).  Then I see another connect to the same server, and a
send of the host plus the second domain on the "search" path.  This also
succeeds.  Then it appears to repeat this a couple more times, but it
doesn't ask about the other two domains in the search path.

Now that is weird enough, but it gets stranger: there are _some_
hosts where it doesn't stop at the second domain, but instead also tries
the third and fourth.  Those domains don't know about this host, so the
lookups fail.  For the third domain, the poll waiting for a reply to the
send() takes 4 seconds to return (strace -r shows an elapsed time of
3.921295s).  Then it tries the fourth, with similar results.  Then it
looks like it just tries the hostname by itself.  Each one fails, but
not before waiting 4+ seconds.
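
The order of names seen in the trace (host plus each "search" domain in
turn, then the bare hostname) can be modelled with a few lines of C.
This is only a simplified sketch of that ordering under stated
assumptions: search_candidate is a hypothetical helper, and it ignores
everything else a real resolver does (options such as ndots, timeouts,
retries, multiple nameservers).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build the idx-th name a resolver would try for `host`, given a search
 * list of `nsearch` domains: indexes 0..nsearch-1 give "host.domain",
 * index nsearch gives the bare host.  Returns the length of the name
 * written to `out`, or -1 if idx is past the end or `out` is too small.
 * search_candidate is a hypothetical name used for illustration only. */
int search_candidate(const char *host, const char *const *search,
                     size_t nsearch, size_t idx, char *out, size_t outlen)
{
    int n;

    assert(host != NULL && out != NULL);

    if (idx < nsearch)
        n = snprintf(out, outlen, "%s.%s", host, search[idx]);
    else if (idx == nsearch)
        n = snprintf(out, outlen, "%s", host);   /* last resort: bare name */
    else
        return -1;

    /* snprintf reports the length it wanted; reject truncated output. */
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return n;
}
```

With the four-domain search list from the resolv.conf above, indexes 0
through 3 yield myhost.one.domain.com through myhost.four.domain.com,
and index 4 yields plain myhost, which matches the sequence of queries
described for the slow hosts.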

This latter behavior is what started me investigating: in order to
telnet or ssh to these particular hosts I have to wait 15-20 seconds,
just sitting there, before I get a prompt.  With other hosts it's
essentially instantaneous.  And, I can't come up with any differences in
the way these hosts are configured; adding them or not to /etc/hosts
makes no difference, and they're equally available in my DNS server.

I'm stumped!

BTW, I should point out that I have some Solaris boxes whose telnet and
OpenSSH don't have this delay to any hosts...

-- 
-------------------------------------------------------------------------------
 Paul D. Smith <psmith@baynetworks.com>    HASMAT--HA Software Methods & Tools
 "Please remain calm...I may be mad, but I am a professional." --Mad Scientist
-------------------------------------------------------------------------------
   These are my opinions---Nortel Networks takes no responsibility for them.


