Tested on Solaris 8 HW02/02 SPARC, fully patched with the latest recommended set. With 3.6.1p2, OpenSSH correctly authenticated via the PAM framework against all modules. With 3.7p1 and 3.7.1p1, logins do not successfully perform a keylogin via the pam_dhkeys.so.1 module. As a result, users do not have their credentials set correctly in keyserv and therefore cannot authenticate against "sec=dh" shares or other services requiring DH authentication.
Created attachment 434 [details] pam_dhkeys debug option This shows that the module is not finding any keys from NIS+.
After looking at the problem today, I believe I have found its source. auth-pam.c performs the authentication and the credential setting in separate sub-processes. The workaround I have used is to force the use of USE_POSIX_THREADS and link with -lpthread. As an example, in.telnetd will call pam_sm_authenticate() and then fork. Using the same PAM handle, the child will then perform a pam_sm_setcred() and then exec the shell. What sshd does is this: the main process (A) initializes the PAM framework, then, by simulating "pthreads", spawns a process B. Process B performs the pam_sm_authenticate(). Sometime later A spawns a process C. Process C performs the pam_sm_setcred(). Then A spawns D to exec and become the shell. pam_sm_setcred() (according to truss) seems to also be called in D. The problem is that any module-specific data set when calling pam_sm_authenticate() and pam_sm_setcred() ends up in separate copies of the PAM handle (i.e. in B, C and D). It seems that something is set at each stage that the other components rely upon. Most likely it is because the password is stored in B (via pam_set_item(...,PAM_AUTHTOK,...)), and hence C or D cannot perform the keylogin (in pam_sm_setcred) as the password is not present in the module data defined via the PAM handle.
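The process layout described above can be reproduced without PAM at all. The following is only a minimal sketch: `fake_handle`, `fake_authenticate()` and `fake_setcred()` are hypothetical stand-ins for the PAM handle and the module entry points, not real PAM calls. It contrasts the in.telnetd pattern (authenticate in the parent, then fork) with the sshd pattern (authenticate in one child, set credentials in another).

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-handle state a PAM module keeps. */
struct fake_handle {
    char authtok[64];
};

static struct fake_handle handle;

/* Stand-in for the authenticate step: remember the password. */
static void fake_authenticate(const char *password)
{
    assert(password != NULL);
    strncpy(handle.authtok, password, sizeof(handle.authtok) - 1);
    handle.authtok[sizeof(handle.authtok) - 1] = '\0';
}

/* Stand-in for the setcred step: works only if the password is present. */
static int fake_setcred(void)
{
    return handle.authtok[0] != '\0';
}

/* Wait for a child; return its exit code, or -1 on abnormal termination. */
static int wait_for(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Fork a child that runs the setcred step; 1 = success, 0 = failure. */
static int setcred_in_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(fake_setcred() ? 0 : 1);
    int code = wait_for(pid);
    return code < 0 ? -1 : code == 0;
}

/* sshd pattern: child B authenticates, a separate child C sets creds. */
int sshd_style(void)
{
    memset(&handle, 0, sizeof(handle));
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        fake_authenticate("secret"); /* only B's copy is modified */
        _exit(0);
    }
    if (wait_for(pid) != 0)
        return -1;
    return setcred_in_child();
}

/* in.telnetd pattern: parent authenticates, then forks the child. */
int telnetd_style(void)
{
    memset(&handle, 0, sizeof(handle));
    fake_authenticate("secret");
    return setcred_in_child();
}
```

Here `sshd_style()` returns 0, because the password written in B never reaches C, while `telnetd_style()` returns 1, because the child inherits the parent's already-populated copy of the handle.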
Here's my understanding of what's going on. Currently this is only known to affect Solaris, but it's possible the problem exists on other PAM-using systems. During pam_authenticate, the modules in question (pam_dhkeys, pam_krb5) stash some private data using the pam_set_data() calls. In the normal case, this data is present only in a separate process (the "authentication thread") and is lost when that process exits after completing the authentication. Later, when pam_setcred is called to establish the process credentials (e.g. a PAG for AFS or the stored DH keys, however they are stored), that private data is not available to the module, so the credentials are not established. The data stored by pam_set_data is completely inaccessible to the application (i.e. sshd). If it was stored via pam_set_item, pam_putenv or the normal environment space, it can be copied to the main sshd process (and in 3.8 and up, it is). Currently, the only known workaround is to enable the use of POSIX threads, as Paul discovered. This works because the module-private data is then stored in the same address space as the main sshd, and thus survives the termination of the "authentication thread". It is then copied to the child forked to run the user's shell and is available to the pam_setcred() calls in session.c. Note that enabling threads opens up sshd to any number of races with any PAM module, so it is not recommended unless absolutely necessary, and you had better hope your PAM modules are thread-safe. Thread support in sshd is currently a necessary evil for these cases. Once there is a better solution, thread support becomes an *unnecessary* evil, and will be removed.
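Why the threads workaround helps can be sketched in a few lines. This is an illustration only, not sshd code: `fake_module_data` is a hypothetical stand-in for whatever a module stores with pam_set_data(), and no PAM functions are called. The "authentication thread" is a real POSIX thread here, so what it writes stays in the main process and is inherited by the child forked afterwards. On some systems the sketch must be linked with -lpthread.

```c
#include <pthread.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for data a module stores via pam_set_data(). */
struct fake_module_data {
    char secret[64];
};

static struct fake_module_data module_data;

/* "Authentication thread": stashes the secret in the shared address space. */
static void *auth_thread_main(void *arg)
{
    const char *password = arg;
    strncpy(module_data.secret, password, sizeof(module_data.secret) - 1);
    module_data.secret[sizeof(module_data.secret) - 1] = '\0';
    return NULL;
}

/*
 * Authenticate in a thread, then fork the "shell" child and let it check
 * whether the stashed data is visible. Returns 1 if visible, 0 if not,
 * -1 on error.
 */
int threaded_auth_then_setcred(void)
{
    pthread_t tid;
    int status;

    memset(&module_data, 0, sizeof(module_data));
    if (pthread_create(&tid, NULL, auth_thread_main, (void *)"secret") != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;

    /* The thread has exited, but its writes remain in this process. */
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(module_data.secret[0] != '\0' ? 0 : 1);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

`threaded_auth_then_setcred()` returns 1: the data outlives the thread that wrote it and is copied into the forked child. The sketch says nothing about thread safety of real PAM modules, which is the risk noted above.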
*** Bug 768 has been marked as a duplicate of this bug. ***
*** Bug 717 has been marked as a duplicate of this bug. ***
Created attachment 642 [details] Forces Storage of Kerberos Credentials right after authentication via PAM This is useful for situations where the password is validated through Kerberos via a PAM module and the credentials obtained during authentication need to be stored.
Created attachment 643 [details] Enables storage of forwarded GSSAPI credentials Storage of forwarded GSSAPI credentials when privilege separation is enabled fails. The pam_session part is reexecuted after the delegated GSSAPI credentials are stored.
Re-adding PAM+password authentication support is one possible solution to this (see bug #874).
Note that the patch in bug #874, which will be in the release, provides a solution to this if you're using password authentication.
pre-3.7p1 style PAM password auth was in 3.9p1. Does this solve your problem?
Given that no practical solution to this exists for the general case right now, this is not going to be in the next release.
The "use_shmem=sshd" option of pam_krb5-2.2.0-0.5 (or later?) from the RedHat CVS tree appears to work around this bug, without needing to disable challenge-response in openssh.
I would like to know what the current view of this problem is: is it still considered an OpenSSH-specific bug, or a general problem of PAM implementations combined with fork()ing applications such as OpenSSH? Or is the wisest course to encourage PAM module writers to use their own mechanisms instead of pam_set_data, as pam_krb5 does with shared memory?
Created attachment 1347 [details] use setjmp/longjmp instead of pthreads or fork Here's a patch for this problem that I have started using on Solaris for pam_dhkeys. It avoids pthreads and fork by using setjmp/longjmp and sigaltstack, and by cooperatively yielding only when reading from the IPC socket. (https://bugsrc.quest.com/show_bug.cgi?id=280)