When connecting to a server as root with a key pair while stacked PAM modules are in use, the connection hangs on disconnect. This affects only the root user, and only when the connection is made with the key pair. I have attached (or will attach) the /etc/pam.conf in question, the debug output from both the client and the server with the hang point indicated, the build output, and a stack backtrace. The server in question is a fairly recently patched Solaris 8 (117350-28), and I would be happy to answer any questions about anything else. The PAM module in question, by the way, is from RSA and provides SecurID access.
Created attachment 1133: Build options
Created attachment 1134: Stack backtrace
Created attachment 1135: /etc/pam.conf file
Created attachment 1136: Debug output from server
Created attachment 1137: Debug output from client
Additional testing reveals that

1) the hang is caused by having the PAM module in question alone performing authentication - it doesn't have to be stacked
2) non-root users will also hang using pubkey auth if sshd is configured without PrivSep
3) not all PAM modules exhibit this behavior

I suppose this bug boils down to one of, if pubkey auth succeeded, why would the auth PAM modules be getting touched at all? Even if I have a clunky PAM module, I would have thought it wouldn't matter if it is not being called for auth.

I am about to attach the output of truss -vpoll -f -d on the sshd command in question. The hang occurs between the timestamps 15.69 and 26.18 (which is where I hit Ctrl-C).

Thanks in advance for any help or pointers to a clue, if I am overlooking something (aside from getting rid of the PAM module in question).
Created attachment 1138: Truss output from sshd (truss -vpoll -f -d)
(In reply to comment #6)
> Additional testing reveals that
>
> 1) the hang is caused by having the PAM module in question alone
> performing authentication - it doesn't have to be stacked
> 2) non-root users will also hang using pubkey auth if sshd is
> configured without PrivSep
> 3) not all PAM modules exhibit this behavior
>
> I suppose this bug boils down to one of, if pubkey auth succeeded, why
> would the auth PAM modules be getting touched at all? Even if I have a
> clunky PAM module, I would have thought it wouldn't matter if it is not
> being called for auth.

pam_setcred() uses the auth stack too and that's called regardless of the ssh authentication method.

> I am about to attach the output of truss -vpoll -f -d on the sshd
> command in question. The hang occurs between the timestamps 15.69 and
> 26.18 (which is where I hit Ctrl-C).
>
> Thanks in advance for any help or pointers to a clue, if I am
> overlooking something (aside from getting rid of the PAM module in
> question).

Try lsof'ing (or equivalent) the hanging sshd (and/or its shell subprocess if it still has one). I suspect that your recalcitrant module is leaking file descriptors and sshd is waiting for the leaked descriptor to close.

Excellent bug report, btw :-)
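The leaked-descriptor theory above rests on a general pipe property: a reader only sees EOF once every copy of the write end has been closed. The following is a minimal sketch of that property, not sshd's actual code. The names eof_visible and demo are hypothetical, and the os.dup() call is only a stand-in for a descriptor that a PAM module might leave open.

```python
import os
import select


def eof_visible(read_fd, timeout=0.2):
    """Return True if read_fd reports EOF within `timeout` seconds."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    if not ready:
        return False  # nothing to read and no EOF: a reader would block here
    return os.read(read_fd, 1) == b""


def demo():
    """Show that a stray copy of the write end delays EOF on a pipe."""
    r, w = os.pipe()
    leaked = os.dup(w)        # hypothetical stand-in for a leaked descriptor
    os.close(w)               # the session closes its own copy of the write end
    before = eof_visible(r)   # still no EOF: the leaked copy keeps the pipe open
    os.close(leaked)          # the stray descriptor finally goes away
    after = eof_visible(r)    # now the reader sees EOF
    os.close(r)
    return before, after


if __name__ == "__main__":
    print(demo())
```

If a process waits for EOF on such a descriptor, it will keep waiting for as long as any other process still holds a copy of the write end, which would look like a hang on disconnect.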
I'm attaching the lsof and pfiles output of the child sshd process (the shell process is still there, but labelled a defunct process with no open files) - I am not familiar enough with the mechanics of sshd at this point to spot a leaked FD awaiting closure, but ain't nothing leaping out to me. I'll also open a case with RSA about their module to see if they can shed any light. Thanks for the help.
Created attachment 1140: lsof of child sshd process
Created attachment 1141: pfiles of child sshd process
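For readers without lsof or pfiles to hand, the kind of listing attached above can be roughly approximated for the current process by probing descriptor numbers with fstat(). This is only a sketch under that assumption: the function name open_descriptors and the limit of 1024 are hypothetical choices, and unlike lsof or pfiles it cannot inspect another process such as a hung sshd.

```python
import os
import stat


def open_descriptors(limit=1024):
    """Return {fd: kind} for descriptors currently open in this process."""
    found = {}
    for fd in range(limit):
        try:
            mode = os.fstat(fd).st_mode
        except OSError:
            continue  # EBADF: this descriptor number is not open
        if stat.S_ISFIFO(mode):
            kind = "fifo"
        elif stat.S_ISSOCK(mode):
            kind = "socket"
        elif stat.S_ISCHR(mode):
            kind = "chr"
        elif stat.S_ISREG(mode):
            kind = "file"
        elif stat.S_ISDIR(mode):
            kind = "dir"
        else:
            kind = "other"
        found[fd] = kind
    return found


if __name__ == "__main__":
    for fd, kind in sorted(open_descriptors().items()):
        print(fd, kind)
```

A pipe or socket that appears in such a listing after the session's child has exited is the sort of entry worth a closer look.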
Updated summary for accuracy
Descriptor 8 in the lsof output seems a likely suspect. I went back to the truss output, and one thing jumped out at me: the child process closes descriptor 8 and then exits. This makes me think that the cause is what is described in bug #926. There's a patch in that bug which is not right, but I think it will solve your problem well enough to prove whether or not this guess is correct. Could you please try it? Thanks.
It DOES help in the privsep case. As a side note, it doesn't help when privsep is turned off (though this appears to be noted in the 926 bug report). If I am reading this correctly, then, this patch is "doing the right thing" as long as you keep privsep enabled? I would be happy to perform any testing that people like for this patch or any others that come down the pike in order to confirm that. Thanks again for the help. I guess this bug can be labelled a duplicate of 926.
(In reply to comment #14)
> It DOES help in the privsep case. As a side note, it doesn't help when
> privsep is turned off (though this appears to be noted in the 926 bug
> report). If I am reading this correctly, then, this patch is "doing the
> right thing" as long as you keep privsep enabled?

Yeah that's basically it. Doing the same thing for privsep=no would also mean breaking it for other situations where it currently works (or maybe adding another process per connection, which I'm not wild about).

Patch #1143 doesn't change the behaviour for privsep=no, and is almost certainly an improvement on what we have now for privsep=yes, so I would like to see it or something similar in the next release.

> I would be happy to
> perform any testing that people like for this patch or any others that
> come down the pike in order to confirm that.

Based on the timing, I'm guessing you tested patch #1143? I would be interested to know if it also solves your problem for privsep=yes and user=root, assuming you permit this.

> Thanks again for the help. I guess this bug can be labelled a duplicate
> of 926.

Thanks, marking as duplicate of 926.

*** This bug has been marked as a duplicate of bug 926 ***
Yes, I tested patch 1143 (sorry I wasn't specific - I didn't see that that patch had been posted just this morning). The only case with trouble when privsep was on was root via pubkey - non-root users only had trouble when privsep was off - so this solved my issue. Again, I'd be happy to test any future patches against this known test case. Thanks for the help.
(In reply to comment #16)
> Yes, I tested patch 1143 (sorry I wasn't specific - I didn't see that
> that patch had been posted just this morning). The only case with
> trouble when privsep was on was root via pubkey - non-root users only
> had trouble when privsep was off - so this solved my issue.

That's what I suspected. When privsep=yes and you're logging in as root then after successful authentication, post-auth privsep is disabled (since there's no point). I'll think about this some more.
Change all RESOLVED bugs to CLOSED with the exception of the ones fixed post-4.4.