| Summary: | Hanging while connecting | ||
|---|---|---|---|
| Product: | Portable OpenSSH | Reporter: | Mike Harrold <ao> |
| Component: | ssh | Assignee: | OpenSSH Bugzilla mailing list <openssh-bugs> |
| Status: | CLOSED WORKSFORME | ||
| Severity: | normal | CC: | onu |
| Priority: | P2 | ||
| Version: | -current | ||
| Hardware: | UltraSPARC | ||
| OS: | Linux | ||
|
Description
Mike Harrold
2003-04-08 06:42:49 AEST
Is there a firewall, packet filter or NAT device between client and server?
Does the hostname that you are trying to connect to resolve to multiple addresses?
I am experiencing the same problem.
First, the answer to both the questions of Darren and Damien is no.
My SSH server is a Sun Ultra 5 running Debian. The choice of client machine
seems irrelevant. Connections using protocol 1 seem much less likely to hang.
The problem occurs both with the Ultra 5's built-in network interface and
with a 3Com network card I installed to diagnose this problem. Connections
made through the loopback network interface on the server, however, are
successful.
The connection seems to hang in different places, for example:
(client) debug1: SSH2_MSG_KEXINIT sent
(server) debug1: kex: server->client aes128-cbc hmac-md5 none~.
debug3: preauth child monitor started
debug3: mm_request_receive entering
or
(client) debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
(server) debug1: expecting SSH2_MSG_NEWKEYS
Please let me know if you would like me to provide any additional information.
Does the server use ssh-rand-helper? Linuxes normally have a /dev/[u]random device.

Does the server have any iptables/ipchains rules?

It may be DNS reverse-resolution; you can try starting sshd with -u0 to prevent DNS lookups.

Though it's not entirely clear whether /dev/random or /dev/urandom is used (see http://article.gmane.org/gmane.linux.debian.ports.sparc/3037), it does seem that ssh-rand-helper isn't involved. iptables is not part of the equation either:
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
I tried running sshd with -u0, but I didn't find the number of hung connections changed much. What continues to puzzle me is that connections from the server to itself (i.e. ssh localhost) never fail. Nonetheless, changing network cards didn't help. I wonder if some other hardware component might be faulty.

If it works fine on localhost then it really does sound like an MTU problem, although you don't seem to have any of the usual suspects for that (e.g. NAT). Humour me and set your network interface's MTU to 576 (make a note of the current settings then run "ifconfig eth0 mtu 576") and retest.

With an MTU setting of 576, connections would still hang, but noticeably less often. Encouraged by this, I set the MTU to 200. With this MTU, I haven't experienced a single hung session. I've tried two different network cards, one switch and one hub and in all cases, only with an MTU of 200 can I avoid hung sessions. I still don't understand whether this is a software or a hardware problem. I'll probably leave the MTU at 200 for now. In a month or two, I expect to have a second identical Ultra 5. Maybe it will help diagnose the problem.

Could your Ultra 5 be connected to a Cisco hub/switch? Some Sun hardware is notorious for getting into negotiation loops with the switch on network parameters, i.e. 10/100, half/full duplex. The solution is to lock down at least one side with the parameters you want. It may not be your problem but it's worth a try.

According to RFC 1122, a host must be able to handle an MTU of 576. If your machine still has hangs with that MTU but works with a lower one then that would seem to indicate a hardware (or driver) problem with the ethernet (or possibly the PCI bus). Does it have problems with other network applications? (Particularly one that does full-duplex communications?) What kernel version is it running?

For completeness, I'll mention that I'm probably not suffering from auto-negotiation problems. The NIC is being set to 10Mb/s, half-duplex, as it should. I discovered only yesterday that downloading files using the wget program is also problematic. Sometimes the download completes, other times not. I am using a 'stock' Debian kernel, version 2.4.19.

OK, then it seems to be a system or hardware problem unrelated to OpenSSH.

Mass change of RESOLVED bugs to CLOSED |
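
For anyone wanting to reproduce the size-dependent hangs seen in this thread without involving ssh at all, here is a minimal sketch of a probe. It assumes a plain TCP echo service is reachable on the machine under test; nothing like it was used in the original report, and the names `probe_echo` and `start_local_echo_server` are hypothetical. The probe sends payloads of different sizes and reports whether every byte comes back before a timeout. If small payloads succeed while larger ones stall, that would be consistent with the MTU-related behaviour described above, though it does not prove it.

```python
import socket
import threading


def probe_echo(host, port, size, timeout=5.0):
    """Send `size` bytes to an echo service; True if all bytes come back in time."""
    payload = b"x" * size
    received = 0

    def send_all(sock):
        # Send in a helper thread so that reading and writing can overlap.
        try:
            sock.sendall(payload)
        except OSError:
            pass  # a stalled or closed connection is reported by the reader

    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sender = threading.Thread(target=send_all, args=(sock,), daemon=True)
            sender.start()
            while received < size:
                chunk = sock.recv(65536)  # raises a timeout error if stalled
                if not chunk:
                    break
                received += len(chunk)
            sender.join(timeout)
    except OSError:
        return False
    return received == size


def start_local_echo_server():
    """Hypothetical helper: start a loopback echo server for a dry run."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(5)

    def serve():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                return
            with conn:
                while True:
                    data = conn.recv(65536)
                    if not data:
                        break
                    conn.sendall(data)

    threading.Thread(target=serve, daemon=True).start()
    return server, server.getsockname()[1]


if __name__ == "__main__":
    # Dry run over loopback; replace host and port to test a real interface.
    _server, port = start_local_echo_server()
    for size in (100, 500, 1400, 1500, 4000):
        ok = probe_echo("127.0.0.1", port, size)
        print(size, "ok" if ok else "stalled or failed")
```

Over loopback every size should succeed, which matches the report that "ssh localhost" never fails. The interesting run is from a second machine across the suspect network card, once with the default MTU and once with the reduced one.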