Re: I suspect the kernel: `ping', and name resolution in general, hangs
Since name resolution works with 2.0.36, your /etc/resolv.conf is
probably fine.
Can you ping or traceroute -n to your DNS server successfully with
2.2.12 (use the IP address you have in resolv.conf)?
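
If it helps, here is a rough sketch of that test as a small shell script. The function names are made up for illustration, and it assumes your `ping' accepts -c and that traceroute is installed. It uses only numeric addresses, so it does not depend on name resolution itself.

```shell
#!/bin/sh
# Hypothetical helper: print the nameserver addresses from a
# resolv.conf-style file (defaults to /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Hypothetical helper: ping and trace each nameserver by IP address.
probe_nameservers() {
    for ns in $(list_nameservers "$1"); do
        echo "== $ns =="
        ping -c 3 "$ns"
        traceroute -n "$ns"
    done
}
```

Running `probe_nameservers' under 2.2.12 should show whether the nameservers are reachable at all. If the pings succeed but lookups still hang, basic routing is probably fine and the DNS traffic itself would be the next thing to look at.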
On Thu, Oct 07, 1999 at 09:23:47AM -0700, Eric Hanchrow wrote:
>
> Last month I had a problem: in short, I installed potato, and noticed
> that name resolution hung, although things worked fine if I used a
> numeric IP address.
>
> I've included below the plea for help that I sent last month. It
> describes the problem in detail.
>
> Well, in case anyone's interested, I have some more information that
> leads me to suspect that the problem is in the kernel (and thus,
> presumably, in the Vortex driver). Here's what I did:
>
> * I installed potato from scratch. I did this by installing slink
> from an official Debian 2.1 CD, and then doing `apt-get
> dist-upgrade' with my /etc/apt/sources.list pointing at
>
> http://http.us.debian.org/debian unstable
>
> Thus I wound up with the latest (as of this morning) binaries, but
> with the 2.0.36 kernel from the CD. (Apparently `apt-get
> dist-upgrade' didn't automatically give me a new kernel.) This
> system worked flawlessly; in particular, name resolution worked
> fine.
>
> * I installed kernel-image-2.2.12 (version 2.2.12-3), and rebooted.
> Name resolution hung exactly as described below.
>
> * I reinstalled kernel-image-2.0.36, and rebooted; name resolution
> worked just fine.
>
> So it seems to me that the newer kernel is doing something wrong. If
> anyone would like me to perform some experiments, so as to isolate the
> problem, I'd be happy to do them; just tell me what you need done.
> Unfortunately, I know nothing about how the net card driver works, so
> I don't know how to investigate this on my own.
>
> Here's the plea that I sent last month:
>
> Can anyone tell me what's wrong with my system? At first I assumed it
> was a bug in the resolver library, and opened a bug against libc6 in
> Debian potato (http://www.debian.org/Bugs/db/45/45912.html); but the
> Debian libc6 maintainer is sure that my system is merely
> misconfigured.
>
> Here's the problem:
>
> When I type `ping blarg.net' at a shell, `ping' hangs. I expect it to display
>
> PING blarg.net (206.124.128.1): 56 data bytes
> 64 bytes from 206.124.128.1: icmp_seq=0 ttl=62 time=25.7 ms
> ...
>
> Other name resolution also fails. For example, Netscape hangs when
> trying to visit web pages on machines other than mine.
>
> On the other hand, if I type `ping 206.124.128.1', that works fine.
> So I know that IP and the network card aren't entirely broken.
>
> I've never sat around and waited to see if `ping' eventually gets
> unstuck; I've always given up and hit control-C after no more than
> perhaps a minute.
>
> I'm using potato (that is, the still-unreleased version of Debian
> GNU/Linux), which I installed by first installing slink (i.e., Debian
> 2.1) from an official CD-ROM, and then using `apt-get dist-upgrade'
> from
>
> http://http.us.debian.org/debian unstable main
>
> I did that update around 24 September.
>
> Here is some information about the broken system:
>
> Package: netbase
> Version: 3.16-2
>
> Package: kernel-image-2.2.9
> Version: 2.2.9-2
>
> My network card driver is 3c59x:
>
> Sep 24 07:21:13 potato kernel: 3c59x.c:v0.99H 11/17/98 Donald Becker http://cesdis.gsfc.nasa.gov/linux/drivers/vortex.html
> Sep 24 07:21:13 potato kernel: eth0: 3Com 3Com Boomerang (unknown version) at 0xb800, 00:50:04:1b:f6:df, IRQ 11
> Sep 24 07:21:13 potato kernel: 8K byte-wide RAM 5:3 Rx:Tx split, autoselect/Autonegotiate interface.
> Sep 24 07:21:13 potato kernel: MII transceiver found at address 24, status 182d.
> Sep 24 07:21:13 potato kernel: Enabling bus-master transmits and whole-frame receives.
>
> This problem didn't always happen, although I don't remember exactly
> when it started. I know for certain that it didn't happen immediately
> after I installed slink, nor did it happen immediately after I
> upgraded to potato the first time.
>
> I've also seen this problem on a different installation of slink (on
> the same machine with the same hardware), but that problem
> mysteriously went away. I now have both slink and potato on this
> machine, and slink works flawlessly. Only potato has this
> name-resolution problem.
>
> I haven't noticed any error messages -- certainly none at the shell on
> which I ran `ping', and none in /var/log.
>
> I connect to the Internet via DSL, using a Cisco 675 router, which is
> a little grey box that sits on the floor (the phone company gave it to
> me when I signed up for DSL). I have a phone cord that connects the
> router and my phone jack; I have an Ethernet cable that connects the
> router and my network card.
>
> The router is quite configurable, and perhaps its configuration is
> relevant:
>
> * I've got it set to act as a DHCP server, although since I don't know
> how to make Debian use DHCP, I've told Debian to use a static IP
> address. Since I only have one computer, there is no risk of having
> two IP addresses conflict.
>
> * It's doing something called `network address translation', which, as
> I understand it, means that my machine "appears" to the outside
> world to have a different IP address than what the machine thinks.
> That is (as you can see below in my network configuration files), my
> machine thinks its IP address is 10.0.0.2, but the outside world
> uses 206.124.128.30 (that address might change from time to time,
> because the router might be a DHCP client of my ISP). Also, if I
> were to connect other machines to the router (with an Ethernet hub),
> they would get IP addresses like 10.0.0.3, 10.0.0.4, etc.; but they
> would *all* appear to the outside world as 206.124.128.30. It would
> appear that this would cause total confusion, but it doesn't;
> somehow this `network address translation' keeps things from getting
> confused. I don't understand how it does this, but it seems to work
> OK. (The place I work used to have a similar setup; they had five
> machines connected to the Internet, all "sharing" an outside IP
> address; the machines all worked fine.) The one tradeoff that I
> know of is that nobody in the outside world can connect to any
> servers that I run, because the network address translation
> apparently futzes with port numbers. For example, my SMTP server
> listens on port 25, but someone who tries to connect to that port
> using my outside IP address 206.124.128.30 won't be able to.
> Presumably, if they could guess the port to which the router has
> "mapped" port 25, they could connect to that port.
>
> There may be some more information about the configuration of this
> box that is relevant. Please feel free to ask me about it, if you
> think it would help.
>
> Perhaps some of the following network configuration files are
> relevant:
>
> /etc/resolv.conf:
> nameserver 206.124.128.1
> nameserver 206.124.128.3
>
> /etc/hosts:
> 127.0.0.1 localhost loopback
> 10.0.0.1 cisco-router
> 10.0.0.2 potato
>
> /etc/init.d/network:
> #! /bin/sh
> ifconfig lo 127.0.0.1
> route add -net 127.0.0.0
> IPADDR=10.0.0.2
> NETMASK=255.255.255.0
> NETWORK=10.0.0.0
> BROADCAST=10.0.0.255
> GATEWAY=10.0.0.1
> ifconfig eth0 ${IPADDR} netmask ${NETMASK} broadcast ${BROADCAST}
> route add -net ${NETWORK}
> [ "${GATEWAY}" ] && route add default gw ${GATEWAY} metric 1
>
> Note that those three files are almost-exact copies of the same files
> on my slink system, which as I said works fine. The only differences
> are
> --- /slink/etc/resolv.conf Sun Sep 12 04:06:13 1999
> +++ /potato/etc/resolv.conf Mon Sep 20 22:00:49 1999
> @@ -1,3 +1,2 @@
> -search hanchrow.org
> nameserver 206.124.128.1
> nameserver 206.124.128.3
>
> (I don't know what that `search' line is doing on my slink system; I
> assume that it got put there when I installed the system)
>
> --- /slink/etc/hosts Sun Sep 12 12:49:07 1999
> +++ /potato/etc/hosts Tue Sep 21 22:29:44 1999
> @@ -1,3 +1,4 @@
> 127.0.0.1 localhost loopback
> 10.0.0.1 cisco-router
> - 10.0.0.2 snowball
> \ No newline at end of file
> + 10.0.0.2 potato
> +
>
> Now, here's the kicker: the problem goes away if I run `tcpdump': I do
>
> tcpdump &
> ping blarg.net
>
> and `ping' responds correctly. I can then kill `tcpdump', and until
> the next time I boot, the network works fine. It's as if `tcpdump'
> changed something, and that change allows name resolution to work.
>
> So that's the deal. Any ideas why my system is behaving this way, and
> what I can do about it?
>
> Thanks
>
>
>
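
On the tcpdump observation: by default tcpdump puts the interface into promiscuous mode, so it might be worth checking whether that alone makes the difference. This is only a guess at where to look, not a diagnosis. A rough sketch, assuming the interface is eth0 and that you run it as root; `has_promisc_flag' is a name I made up:

```shell
#!/bin/sh
# Hypothetical helper: succeed if ifconfig output read from stdin
# shows the PROMISC flag.
has_promisc_flag() {
    grep -q PROMISC
}

# Assumed usage after a fresh boot under 2.2.12 (eth0 is an assumption):
#   ifconfig eth0 | has_promisc_flag && echo "already promiscuous"
#   ifconfig eth0 promisc      # enable promiscuous mode without tcpdump
#   ping -c 3 blarg.net        # does name resolution work now?
#   ifconfig eth0 -promisc     # switch it back off
```

The opposite experiment is `tcpdump -p', which captures without enabling promiscuous mode. If lookups still hang with -p but work with plain tcpdump, that would point toward the driver's receive filtering.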
--
Bob Nielsen Internet: nielsen@primenet.com
Tucson, AZ AMPRnet: w6swe@w6swe.ampr.org
DM42nh http://www.primenet.com/~nielsen