Bug 273 - sshd hangs on shell exit if user spawned child with /bin/nohup
Summary: sshd hangs on shell exit if user spawned child with /bin/nohup
Status: CLOSED DUPLICATE of bug 52
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: -current
Hardware: UltraSPARC Solaris
Importance: P2 normal
Assignee: OpenSSH Bugzilla mailing list
Reported: 2002-06-12 05:54 AEST by Kerry Schwab
Modified: 2004-04-14 12:24 AEST
Description Kerry Schwab 2002-06-12 05:54:35 AEST
Basically, if during a remote session, the user starts
something with "nohup", sshd hangs when they try to
exit the shell.

Running truss on sshd shows it running poll() on fd0,fd1,fd2.

If I wrap nohup with something that closes stdin/out/err before
it calls nohup, everything works fine.
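
A minimal sketch of such a wrapper, written as a POSIX shell function. The name nohup_detached is hypothetical, and it points stdin/stdout/stderr at /dev/null rather than closing them outright, which is an assumption here (some programs misbehave when fds 0-2 are closed). The wrapper actually used for this report is not shown.

```sh
#!/bin/sh
# Hypothetical helper: run a command under nohup in the background with
# stdin, stdout and stderr pointed at /dev/null, so the job holds none of
# the session's descriptors once the login shell exits.
nohup_detached() {
    nohup "$@" </dev/null >/dev/null 2>&1 &
}
```

Usage would be something like "nohup_detached ./somescript" followed by exiting the shell as usual. The job's output is discarded in this sketch; redirect to a file inside the function instead if the output matters.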

So, I suspect that for whatever reason, sshd doesn't get the SIGCHLD
from the shell (ksh in this case), and the fd0/1/2 are open because
the process spawned via nohup has them open.
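
The descriptor half of this suspicion can be reproduced without sshd. A rough sketch, assuming a POSIX sh and a date(1) that supports +%s; elapsed_for is a made-up helper, and cat reading from a pipe only stands in for sshd waiting on the session's descriptors. It illustrates the general mechanism, not sshd's own code.

```sh
#!/bin/sh
# Hypothetical helper: report how many seconds a pipe reader (cat) waits
# for EOF when the writing shell runs the given command string.
elapsed_for() {
    start=$(date +%s)
    sh -c "$1" | cat >/dev/null
    end=$(date +%s)
    echo $((end - start))
}

# The background sleep inherits the pipe as its stdout, so cat keeps
# waiting after the shell itself has exited (roughly 3 seconds).
held=$(elapsed_for 'sleep 3 & exit 0')

# Same job with stdio detached: the shell's exit closes the last write
# end of the pipe and cat sees EOF almost immediately.
detached=$(elapsed_for 'sleep 3 </dev/null >/dev/null 2>&1 & exit 0')

echo "held=${held}s detached=${detached}s"
```

In the first case the reader is released only when the background job exits, which matches the session closing once the nohupped process is killed.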

This problem exists in both 3.1p1 and 3.2.3p3.
Comment 1 Kerry Schwab 2002-06-12 06:26:32 AEST
Debug output, with comments:
I get this once I'm in:
>>[some omitted for brevity]
>>debug1: session_new: session 0
>>debug1: Allocating pty.
>>debug1: session_pty_req: session 0 alloc /dev/pts/12
>>debug1: fd 4 setting TCP_NODELAY
>>debug1: Entering interactive session.
>>debug1: fd 7 setting O_NONBLOCK
>>debug1: fd 10 setting O_NONBLOCK
>>debug1: fd 11 setting O_NONBLOCK
>>debug1: server_init_dispatch_13
>>debug1: server_init_dispatch_15
Now I start the "nohup job" with ./nohup somescript &
I then try to exit the ksh shell.
Ksh first tells you "You have running jobs" (normal...),
then I give the second exit.
At that point, my ksh becomes a <defunct> process (zombie).
sshd apparently gets the SIGCHLD:
>>debug1: Received SIGCHLD.
But, my ssh session is "hung".

sshd itself is running poll() over and over on fd0,1,2.

If I then kill the nohupped process (from another session), the session
is closed:

>>debug1: End of interactive session; stdin 20, stdout (read 584, sent 584), stderr 0 bytes.
>>debug1: Command exited with status 0.
>>debug1: Received exit confirmation.
>>debug1: session_close: session 0 pid 24164
>>debug1: session_pty_cleanup: session 0 release /dev/pts/12

Please let me know if you need more detail.
I'll be happy to help any way I can.



Comment 2 Sam 2002-07-13 01:22:26 AEST
I also have this exact same problem using
OpenSSH_3.4p1, SSH protocols 1.5/2.0, OpenSSL 0x0090603f

I start a program with "nohup <program> &" and upon returning to my native
machine, the terminal is blank, frozen with no bash prompt. I must kill the ssh
pid from another terminal to get my original terminal back.

thanks,
Sam
Comment 3 Jim Knoble 2002-07-13 02:35:48 AEST
> I start a program with "nohup <program> &" and
> upon returning to my native machine, the terminal
> is blank, frozen with no bash prompt.

Don't do that.

Do this instead:

  (nohup <program> &)

Note the order of the '&' and the enclosing parentheses.
Comment 4 Kerry Schwab 2002-07-13 04:04:54 AEST
(nohup ./script &) is a workaround of sorts,
but that's not really a good answer.

People expect ssh to be an rsh replacement,
and rsh doesn't require a double fork
to avoid the hang-on-exit.
Comment 5 Damien Miller 2003-01-07 18:14:05 AEDT
This is a variant of the infamous hang-on-exit bug.

*** This bug has been marked as a duplicate of 52 ***
Comment 6 Damien Miller 2004-04-14 12:24:18 AEST
Mass change of RESOLVED bugs to CLOSED