Bug 1463

Summary: Running nohup sleep 70 & and then exiting shell, hangs ssh
Product: Portable OpenSSH
Component: sshd
Version: 5.0p1
Hardware: SPARC
OS: Solaris
Status: CLOSED FIXED
Severity: normal
Priority: P2
Reporter: Jeff White <jwhite>
Assignee: Assigned to nobody <unassigned-bugs>
CC: djm, dtucker, jwhite
Bug Depends on:
Bug Blocks: 1452
Attachments:
Description                         Flags
Debug that Damien requested         none
Do not rely on isatty(ptymaster)    none

Description Jeff White 2008-05-13 06:27:00 AEST
Overview:

Testing to make sure that nohupping a process will allow me to disconnect with SSH (DOES work with telnet).

Steps to reproduce:

1) Install OpenSSH 5.0p1 on both machines
2) ssh user@machineA (from machineB)
3) Give user's password
4) When shell appears (ksh in my case), type nohup sleep 70 & and then type exit twice to TRY to log out.
5) Session will hang until the sleep finishes.
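
The symptom in step 5 can be illustrated outside of SSH. The following is a minimal Python sketch of one general mechanism, offered as an assumption rather than a diagnosis taken from this report: a background job inherits the shell's output descriptor, so a reader waiting for end-of-file keeps waiting after the shell itself has exited. The function name `time_shell_exit_vs_eof` is hypothetical and exists only for this illustration.

```python
import subprocess
import time


def time_shell_exit_vs_eof(sleep_seconds=1):
    """Return (seconds until the shell exited, seconds until EOF on its stdout)."""
    start = time.monotonic()
    # The shell backgrounds `sleep` and exits at once; `sleep` inherits stdout.
    proc = subprocess.Popen(
        ["sh", "-c", f"sleep {int(sleep_seconds)} & exit 0"],
        stdout=subprocess.PIPE,
    )
    proc.wait()  # the shell is gone almost immediately
    exited_after = time.monotonic() - start
    # read() returns only once every holder of the write end has closed it,
    # which here means once the background `sleep` has finished.
    proc.stdout.read()
    eof_after = time.monotonic() - start
    proc.stdout.close()
    return exited_after, eof_after


if __name__ == "__main__":
    exited_after, eof_after = time_shell_exit_vs_eof(1)
    print(f"shell exited after {exited_after:.2f}s")
    print(f"EOF seen after     {eof_after:.2f}s")
```

The shell's exit and the end-of-file are separate events, and a server that waits for the second one will appear to hang for as long as the background job runs.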

Actual result: Hanging...

Expected result: Disconnect like telnet, just drop the connection.

Build & platform:
$ ssh -V
OpenSSH_5.0p1, OpenSSL 0.9.8g 19 Oct 2007 on Solaris 10 8/07 (Update 4) SPARC

Additional Builds & platforms
$ ssh -V
OpenSSH_5.0p1, OpenSSL 0.9.8g 19 Oct 2007 on Solaris 10 5/08 (Update 5) x86
Comment 1 Jeff White 2008-05-13 06:32:04 AEST
This is holding us up moving to SSH exclusively. Telnet does not do this when someone nohups a process and throws it into the background. The shell warns you once and then allows you to exit if you type exit on the shell again. You are immediately returned to your host and the process is still running.

I do thank you for your time looking into this. Have a good day.
Comment 2 Damien Miller 2008-05-13 11:50:44 AEST
What vendor/version SSH server are you connecting to? If it is OpenSSH, could you please attach a debug trace from the server (/path/to/sshd -ddd)
Comment 3 Jeff White 2008-05-14 01:00:47 AEST
Created attachment 1501
Debug that Damien requested
Comment 4 Jeff White 2008-05-14 01:03:50 AEST
Version of sshd is:

OpenSSH_5.0p1, OpenSSL 0.9.8g 19 Oct 2007 compiled for and running on Solaris 10 x86. Solaris 10 version is Solaris 10 x86 (5/08). Thank you again.

Client I am connecting from is

OpenSSH_5.0p1, OpenSSL 0.9.8g 19 Oct 2007 compiled for and running on Solaris 10 SPARC.
Comment 5 Darren Tucker 2008-06-15 05:19:21 AEST
I can reproduce this on Solaris 10/x86. Running truss on sshd shows this:

20238:  close(10)                                       = 0
20238:  waitid(P_ALL, 0, 0x080478F0, WEXITED|WTRAPPED|WNOHANG) Err#10 ECHILD
20238:  lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
20238:  pollsys(0x080478D0, 3, 0x00000000, 0x00000000)  = 2
20238:  read(4, "\0", 1)                                = 1
20238:  read(4, 0x08047983, 1)                          Err#11 EAGAIN
20238:  lwp_sigmask(SIG_SETMASK, 0x00020000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
20238:  lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000) = 0xFFBFFEFF [0x0000FFFF]
20238:  write(5, " 6 wCAC0FD89 ,8DBE |96 %".., 48)      = 48
20234:  read(6, 0x080475A8, 4)          (sleeping...)
20238:  pollsys(0x080478D0, 3, 0x00000000, 0x00000000) (sleeping...)

Descriptor 5 is the socket connected to the client, 4 and 6 are pipes, presumably connected to the shell.
Comment 6 Darren Tucker 2008-06-15 15:36:21 AEST
Created attachment 1528
Do not rely on isatty(ptymaster)

Please try this patch against 5.0p1.

Damien figured it out. sshd relies on isatty() to determine whether or not a given channel is connected to a pty; however, this doesn't work on Solaris for a pty master (which is entirely reasonable, since it's not a tty). The resulting flag, in turn, controls whether or not the infamous hang-on-exit fixes are used.

This diff adds an extra parameter that controls whether or not a given channel is connected to a pty, rather than relying on undefined semantics.
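
The isatty() behaviour described above can be observed directly. What follows is a minimal Python sketch, not the OpenSSH patch itself: it opens a pty pair with `os.openpty()` and prints what `os.isatty()` reports for each end. The result for the master end depends on the platform (false on Solaris according to this bug, true on Linux, for example). The `Channel` class and `open_pty_channel` function are hypothetical names used only to show the idea of recording the fact explicitly when the pty is allocated.

```python
import os
from dataclasses import dataclass


@dataclass
class Channel:
    """Hypothetical stand-in for an sshd channel; not OpenSSH's real structure."""
    fd: int
    is_tty: bool  # set by the caller, who knows whether a pty was allocated


def open_pty_channel():
    """Allocate a pty pair and wrap the master end in a Channel.

    The caller asked for a pty, so it records that directly instead of
    inferring it later from isatty() on the master descriptor.
    """
    master_fd, slave_fd = os.openpty()
    return Channel(fd=master_fd, is_tty=True), slave_fd


if __name__ == "__main__":
    chan, slave_fd = open_pty_channel()
    try:
        # The slave end is the terminal device the shell would use.
        print("isatty(slave): ", os.isatty(slave_fd))
        # The master end gives a platform-dependent answer; do not rely on it.
        print("isatty(master):", os.isatty(chan.fd))
        print("explicit flag: ", chan.is_tty)
    finally:
        os.close(slave_fd)
        os.close(chan.fd)
```

Passing the flag explicitly keeps the decision with the code that allocated the pty, so it no longer depends on how a given platform answers isatty() for a master descriptor.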
Comment 7 Damien Miller 2008-06-16 08:02:00 AEST
A similar patch has been applied and will be in OpenSSH 5.1. Thanks for the report.
Comment 8 Jeff White 2008-06-18 08:00:30 AEST
Thanks for the update, guys. I appreciate your help. I will get these patches added when I can. I hope to have something by the end of this week, but I'm not sure. Anyway, I'll try your fix out and report back.
Comment 9 Jeff White 2008-06-20 02:31:36 AEST
This patch does work. It solves my problem. Thank you for your time. I need to test it on a SPARC install of Solaris 10, but the sshd I just compiled for Solaris 10 x86 works fine. Thanks for your help.
Comment 10 Jeff White 2008-06-20 03:21:24 AEST
This patch also works on SPARC Solaris 10. Thanks.
Comment 11 Damien Miller 2008-07-22 12:22:17 AEST
Mass update RESOLVED->CLOSED after release of openssh-5.1