Bug 613 - sshd hanging
Summary: sshd hanging
Status: CLOSED INVALID
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: PAM support
Version: 3.6.1p2
Hardware: ix86 Linux
Importance: P2 critical
Assignee: OpenSSH Bugzilla mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-07-07 18:50 AEST by Ervin Németh
Modified: 2004-04-14 12:24 AEST

See Also:


Attachments
server log (4.96 KB, text/plain)
2003-07-07 18:54 AEST, Ervin Németh
sshd strace output (59.13 KB, text/plain)
2003-07-07 20:42 AEST, Ervin Németh

Description Ervin Németh 2003-07-07 18:50:49 AEST
On my system (i686, Linux From Scratch) I have upgraded several software
packages.  Now the OpenSSH daemon is not working.

Symptom: after a client connects, the daemon hangs (in a sleeping state).  If
the client is killed, the forked server process remains "stuck".

I changed to glibc-2.3.2 and Linux-PAM-0.77 and also recompiled openssh-3.6.1p2.
Comment 1 Ervin Németh 2003-07-07 18:54:48 AEST
Created attachment 353
server log

This is the output of "sshd -ddd -p 2121".  After the last line is printed
there is no response from the daemon.
Comment 2 Darren Tucker 2003-07-07 18:59:04 AEST
Do you have a firewall, NAT or packet filter between client and server?
Try "ifconfig eth0 mtu 576" and see if you can reproduce your problem.
Comment 3 Ervin Németh 2003-07-07 19:13:30 AEST
There is a simple packet filter on the server, no NAT.  The rules were not changed.

After the ifconfig command sshd prints even fewer lines.  The last lines are:

debug3: mm_auth_password: waiting for MONITOR_ANS_AUTHPASSWORD
debug3: monitor_read: checking request 10
debug3: mm_request_receive_expect entering: type 11
debug3: mm_request_receive entering
Comment 4 Darren Tucker 2003-07-07 19:30:43 AEST
The things I mentioned are notorious for causing hung sessions through MTU
problems (not just with SSH, although it seems to be more sensitive to them).

When it's in the hung state, can you do "netstat -an" and check the values in 
the Recv-Q and Send-Q columns?

Also, can you reproduce the problem without the packet filter?
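
For anyone repeating the queue check above, here is a minimal sketch that picks out the TCP rows of "netstat -an" output whose Recv-Q or Send-Q is non-zero. It assumes the usual Linux net-tools column order (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The function name and the sample addresses are made up for illustration and are not taken from this report.

```python
def nonzero_queues(netstat_output):
    """Return TCP rows of `netstat -an` output that still have bytes queued.

    Each result is (proto, recv_q, send_q, local, remote, state).
    Assumes the Linux net-tools column layout.
    """
    rows = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banners, the header row, and non-TCP sockets.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        try:
            recv_q, send_q = int(fields[1]), int(fields[2])
        except ValueError:
            continue
        if recv_q or send_q:
            rows.append((fields[0], recv_q, send_q,
                         fields[3], fields[4], fields[5]))
    return rows


# Invented sample output; the addresses are documentation-range placeholders.
SAMPLE = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:2121            0.0.0.0:*               LISTEN
tcp        0   1448 192.0.2.10:2121         192.0.2.20:40112        ESTABLISHED
tcp        0      0 192.0.2.10:22           192.0.2.30:51515        ESTABLISHED
"""

if __name__ == "__main__":
    for row in nonzero_queues(SAMPLE):
        print(row)
```

A Send-Q that stays non-zero on an established connection would suggest data that is not getting through, which is what an MTU problem tends to look like. An empty result points away from the network path.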
Comment 5 Ervin Németh 2003-07-07 19:55:28 AEST
After clearing iptables the problem persists.

Both Recv-Q and Send-Q are zero in netstat.  For all connections.
Comment 6 Darren Tucker 2003-07-07 20:07:09 AEST
OK, it looks like it's probably not MTU.  One other thing to check: can you do 
the netstat on the client too?

You can try strace'ing a session which might give a better idea exactly where 
it's hanging (I suggest you use -D for no-daemon rather than the debug options 
as it will keep the output volume down).

strace -f /path/to/sshd -D -p [portno]

Warning: There will be a lot of output, and the trace will contain your 
password if you're using password auth, so blank it out (or change it for the 
test then change it back).
Comment 7 Ervin Németh 2003-07-07 20:42:19 AEST
Created attachment 354
sshd strace output

Here is the strace output.  Looks like sshd is hanging in rt_sigsuspend().

Just discovered: if I connect from localhost, the login is successful.
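
With a large "strace -f" log, one way to spot where each process is stuck is to keep only the last syscall seen per pid. The sketch below is a rough heuristic, not a full strace parser. It assumes the "[pid N] " line prefix that strace -f writes to the terminal, and treats a call as still blocked when no ") = result" has been printed for it. The sample trace is invented and is not taken from the attachment.

```python
import re

PID_PREFIX = re.compile(r"^\[pid\s+(\d+)\]\s+(.*)$")
CALL_START = re.compile(r"^(\w+)\(")
CALL_RESUMED = re.compile(r"^<\.\.\. (\w+) resumed>")
CALL_RETURNED = re.compile(r"\)\s+= ")


def last_calls(trace_text):
    """Map each pid to (syscall, still_blocked) for its last traced syscall.

    Lines without a "[pid N]" prefix are filed under the key "main".
    """
    last = {}
    for line in trace_text.splitlines():
        m = PID_PREFIX.match(line)
        pid, rest = (m.group(1), m.group(2)) if m else ("main", line.strip())
        started = CALL_START.match(rest)
        resumed = CALL_RESUMED.match(rest)
        if started:
            # No return value on the line means the call has not finished yet.
            blocked = CALL_RETURNED.search(rest) is None
            last[pid] = (started.group(1), blocked)
        elif resumed:
            last[pid] = (resumed.group(1), False)
    return last


# Invented sample in the style of `strace -f` output.
SAMPLE = r"""
[pid  4242] read(5, "\0\0\0\1", 4) = 4
[pid  4243] select(6, [5], NULL, NULL, NULL <unfinished ...>
[pid  4242] rt_sigsuspend([] <unfinished ...>
[pid  4243] <... select resumed> )      = 1 (in [5])
[pid  4243] write(5, "\0", 1)           = 1
"""

if __name__ == "__main__":
    for pid, (name, blocked) in sorted(last_calls(SAMPLE).items()):
        print(pid, name, "BLOCKED" if blocked else "returned")
```

In this sample, pid 4242 ends in an unfinished rt_sigsuspend(), which is the same kind of signature described above.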
Comment 8 Darren Tucker 2003-07-07 22:26:53 AEST
Did some digging and there are similar reports relating to glibc/libpthreads.  
This appears to be glibc bug #1305 (reported against glibc-2.2.4).  What version of 
glibc do you have?  Is sshd linked against pthreads and if so can you compile 
without it?

My guess is pthreads messes up select() somehow.

See also:
http://bugs.gnu.org/cgi-bin/gnatsweb.pl?database=glibc&pr=1305
http://gcc.gnu.org/ml/gcc/2001-12/msg00814.html
http://sources.redhat.com/ml/libc-alpha/1999-09/msg00078.html
Comment 9 Ervin Németh 2003-07-07 23:04:09 AEST
I have come to similar conclusions, too.  Searching the web for rt_sigsuspend
turns up lots of problem reports but no solutions.

My glibc is 2.3.2, the kernel is 2.4.21.  They are really not old.

The OpenSSH configure script has no option to choose a threading model, so I
checked sshd itself.  It is not linked against libpthread.  The trace shows
that libpthread is pulled in as a dependency of libnss_lwres.

So I disabled lwres in nsswitch.conf and rejoice, sshd is now working.
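
For reference, the nsswitch.conf change can be sketched as a small text filter that drops one service name from one database line. This is only an illustration: the function name and the sample file contents are hypothetical, and it does not handle bracketed action items such as [NOTFOUND=return], so anything more elaborate should be edited by hand.

```python
def drop_nss_service(conf_text, database, service):
    """Return nsswitch.conf text with `service` removed from `database`.

    Other lines are passed through unchanged.  Trailing comments on the
    edited line are kept.  Bracketed action items are NOT handled.
    """
    out = []
    for line in conf_text.splitlines():
        body, hash_sep, comment = line.partition("#")
        name, colon, sources = body.partition(":")
        if colon and name.strip() == database:
            kept = [tok for tok in sources.split() if tok != service]
            line = name.rstrip() + ": " + " ".join(kept)
            if hash_sep:
                line += " #" + comment
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical file contents, not the reporter's actual configuration.
SAMPLE = "passwd: files\nhosts: files lwres dns  # resolver order\n"

if __name__ == "__main__":
    print(drop_nss_service(SAMPLE, "hosts", "lwres"), end="")
```

Writing the result to a temporary file and checking it before replacing /etc/nsswitch.conf is safer than editing in place, since a broken hosts line affects every program that resolves names.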

This is clearly not an sshd bug.  Marking invalid.

Thanks for helping, Darren.
Comment 10 Damien Miller 2004-04-14 12:24:19 AEST
Mass change of RESOLVED bugs to CLOSED