Bug 396 - sshd orphans processes when no pty allocated
Summary: sshd orphans processes when no pty allocated
Status: ASSIGNED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: -current
Hardware: All All
Importance: P2 enhancement
Assignee: OpenSSH Bugzilla mailing list
URL:
Keywords: openbsd, patch
Depends on:
Blocks:
 
Reported: 2002-09-12 23:23 AEST by Darren Tucker
Modified: 2020-01-21 08:14 AEDT
CC List: 9 users

See Also:


Attachments
Send HUP to sshd child procs on exit (1.07 KB, patch)
2002-09-12 23:26 AEST, Darren Tucker
Updated patch for the Openssh 4.0p1 release (978 bytes, patch)
2005-07-02 02:03 AEST, Brian M. Rzycki
Modified patch for openSSH-4.6p1 (4.31 KB, patch)
2007-07-11 23:02 AEST, Matthieu Hautreux

Description Darren Tucker 2002-09-12 23:23:45 AEST
When using ssh to run a command that doesn't terminate, sshd will leave the 
process orphaned when it exits.  Using ssh -t to force a pty allocation allows 
the process to terminate when sshd does (presumably when the pty closes).  This 
has been observed on Solaris (7,8) and AIX, and probably occurs on others.

# ssh localhost nc localhost 22
SSH-2.0-OpenSSH_3.4p1
[kill ssh from another window]
# ps -eaf |grep nc
 dtucker  5919     1  0 21:05:08 ?        0:00 nc localhost 22

The following patch (against -cvs) sends a HUP to the child process(es) when 
sshd exits for protocols 1 and 2.  It assumes that there's only one session for 
v1. (Is that valid?)

It has been tested on Solaris 7 (including regression tests).
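
The mechanism can be imitated in plain shell for illustration. This is only a sketch of the idea (a parent passing HUP on to its child when it is told to stop), not the attached patch, which changes sshd itself in C. The function name run_and_forward_hup is hypothetical.

```shell
# Hypothetical helper: run a command in the background and, if this
# shell receives HUP or TERM while waiting, send HUP to that command.
run_and_forward_hup() {
    child=
    # The trap body is evaluated when the signal arrives, so it sees
    # the value of $child assigned below.
    trap 'kill -s HUP "$child" 2>/dev/null || :' HUP TERM
    "$@" &
    child=$!
    # wait returns early with a status above 128 if a trapped signal
    # arrives; otherwise it returns the command's own exit status.
    wait "$child"
    status=$?
    # Note: this resets HUP and TERM to their defaults, discarding any
    # traps the caller had installed for them.
    trap - HUP TERM
    return "$status"
}
```

Usage: run_and_forward_hup nc localhost 22. Such a wrapper only helps if something actually signals it, which is exactly what is missing when no pty is allocated.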
Comment 1 Darren Tucker 2002-09-12 23:26:08 AEST
Created attachment 145
Send HUP to sshd child procs on exit
Comment 2 Darren Tucker 2003-05-04 11:14:37 AEST
Does anyone object to this patch?  And if not, is it something that should go to 
OpenBSD?
Comment 3 Markus Friedl 2003-06-05 00:09:05 AEST
sending signals could be dangerous, depending on the permissions
of the sending process, e.g. a root-owned sshd sending to
a setuid process. but i'm not sure. we had similar code there before.

Comment 4 Darren Tucker 2003-06-05 00:40:01 AEST
Maybe I'm being thick but I don't see how this could be dangerous (or, at
least, any more dangerous than the SIGHUP it would get anyway if it had a
pty).
Comment 5 Markus Friedl 2003-06-05 00:49:26 AEST
hm, i'm not sure, but it might behave differently if it had a pty. but
perhaps i'm mixing things up.

in any case, the kill should be done from within session.c
on session cleanup, i think.
Comment 6 fdjsouthey 2003-06-18 07:56:52 AEST
Just for comparison, the man page for 'rsh' states:

"Interrupt, quit and terminate signals  are  propagated  to
 the  remote  command;"

It seems reasonable to me for ssh to behave like rsh in this respect.
Comment 7 Darren Tucker 2004-03-30 15:12:09 AEST
Take another swing at this for the next major release...
Comment 8 Damien Miller 2005-02-14 11:55:40 AEDT
Is this not another manifestation of the infamous bug #52?
Comment 9 Darren Tucker 2005-02-14 12:19:35 AEDT
(In reply to comment #8)
> Is this not another manifestation of the infamous bug #52?

I don't think so.

Bug #52 is sshd waiting for descriptors to close on a clean shutdown.  This one
is subprocesses not knowing that sshd (and thus their stdin/stdout) has gone
away on abnormal termination, usually because they don't check if their read()
calls on stdin return zero.

(This may be because some platforms return zero for reads in some cases even
when the descriptor hasn't closed.  Such behaviour would appear to be in
violation of POSIX.)
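
As a rough illustration of the end-of-file check described above (not code from sshd or from any attachment here): a remote command that reads stdin can stop as soon as the read reports end-of-file. The function name echo_until_eof is made up for this sketch.

```shell
# Hypothetical helper: echo each line of stdin and stop at end-of-file.
# `read` returns non-zero once the writing side has closed, which is
# the condition a well-behaved remote command should be watching for.
echo_until_eof() {
    while IFS= read -r line; do
        printf 'got: %s\n' "$line"
    done
    printf 'stdin closed, exiting\n'
}
```

Usage: printf 'a\nb\n' | echo_until_eof. A program that never reads stdin, or ignores a zero-length read, has no such exit path and stays behind when sshd goes away.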
Comment 10 Darren Tucker 2005-03-07 16:11:08 AEDT
Unfortunately this won't be making 4.0 either.
Comment 11 Brian M. Rzycki 2005-07-02 02:03:38 AEST
Created attachment 934
Updated patch for the Openssh 4.0p1 release

This patch is functionally equivalent to the previous patch for the 4.0p1 code
release.

I would like to see this bug considered for inclusion at a later date as this
causes zombie processes on clusters running apps such as mpich if the run is
aborted.  This can be difficult to clean up if the cluster is large.
Comment 12 Matthieu Hautreux 2007-07-11 23:02:19 AEST
Created attachment 1324
Modified patch for openSSH-4.6p1

Modified patch for the Openssh 4.6p1 release

This patch adds an option in sshd_config (RemoteCommandCleanup no|(yes)) that enables sshd to send a HUP signal to the child process group when no tty was allocated (remote command execution) and the session is closing. The signal must be sent to the process group because the child process is often the user shell invoked to launch the real command.

I think that this problem should be solved in the next OpenSSH release. It causes a lot of orphan processes on the server and wastes resources.
Furthermore, in a secure environment, you need a forwarded credential to access the file system. The credential is removed when sshd exits, and then the file system can no longer be accessed. There is no reason to let processes run if they are not allowed to access the FS.
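
To make the process-group point concrete, here is a small shell sketch of signalling a whole group rather than a single PID. It is not taken from the patch; the function name hup_process_group is hypothetical, and it assumes the caller already knows the group ID (for example from ps -o pgid=).

```shell
# Hypothetical helper: send HUP to every process in process group $1.
# A negative PID argument makes kill target the group, so both the
# user's shell and the command it launched receive the signal.
hup_process_group() {
    pgid=$1
    case $pgid in
        ''|*[!0-9]*)
            echo "usage: hup_process_group PGID" >&2
            return 2
            ;;
    esac
    # Refuse 0 and 1: "-0" would mean the caller's own group and "-1"
    # would mean every process the caller is allowed to signal.
    if [ "$pgid" -le 1 ]; then
        echo "hup_process_group: refusing PGID $pgid" >&2
        return 2
    fi
    # "--" stops option parsing so the negative number is read as a PID.
    kill -s HUP -- "-$pgid"
}
```

Signalling only the shell's PID would leave the real command running, which is the situation this comment describes.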
Comment 13 Martin d'Anjou 2008-12-13 13:37:21 AEDT
Looks like it has not made it to ssh 5 yet.
$ ssh localhost "trap 'echo got int' INT; sleep 40"
^CKilled by signal 2.
$ ps -ef | grep sleep
martin    5639     1  0 21:35 ?        00:00:00 bash -c trap 'echo got int' INT; sleep 40
martin    5642  5639  0 21:35 ?        00:00:00 sleep 40
martin    5644  5625  0 21:35 pts/5    00:00:00 grep --colour=auto sleep
$ ssh -v
OpenSSH_5.1p1, OpenSSL 0.9.8g 19 Oct 2007

Anyone got a patch for this version? Thanks.
Comment 14 Marc Herbert 2009-06-11 04:29:55 AEST
(In reply to comment #9)

> Bug #52 is sshd waiting for descriptors to close on a clean shutdown.  This one
> is subprocesses not knowing that sshd (and thus their stdin/stdout) has gone
> away on abnormal termination, usually because they don't check if their read()
> calls on stdin return zero.

People like me who are not comfortable patching their ssh server can probably take advantage of this workaround I wrote. This is just an "EOF to SIGHUP" converter implemented in 4 lines of shell script (it simply inserts a "cat" between sshd and the no longer orphan process).

TUBE=/tmp/myfifo.$$; mkfifo "$TUBE"
<"$TUBE"   yourApplicationIgnoringEOFMoreOftenThanNot   &  appPID=$!
# unlike the above, cat WILL notice the EOF and politely die
cat >"$TUBE"
kill -HUP -$appPID; rm "$TUBE"

Feedback is welcome; do not hesitate to email me.
Comment 15 Marc Herbert 2009-07-02 00:33:14 AEST
(In reply to comment #14)
>(it simply inserts a "cat" between sshd and the no longer orphan process).

Ahem, except sshd seems to create standard input in non-blocking mode, which makes cat intermittently fail:

  cat: -: Resource temporarily unavailable

See this discussion:
http://lists.mindrot.org/pipermail/openssh-unix-dev/2005-July/023090.html

I initially used a recent ssh version 5.1 and did NOT hit this non-blocking problem. Then I tried to use my cat trick with ssh version 3.4 and version 4.3 and DID intermittently hit this non-blocking problem. Could this be due to version 5.1 now passing 3 pipes for stdin/out/err as opposed to a socketpair in earlier versions?


Anyway, I switched to using socat instead of cat, and socat seems to deal with non-blocking stdin in a more robust fashion. Here is the updated workaround:

TUBE=/tmp/myfifo.$$; mkfifo "$TUBE"
<"$TUBE"   yourApplicationIgnoringEOFMoreOftenThanNot   &  appPID=$!
# unlike the above, socat WILL notice the EOF and politely die
socat -u STDIN "PIPE:$TUBE"
kill -HUP -$appPID; rm "$TUBE"
Comment 16 Andrew McNabb 2012-02-10 05:21:40 AEDT
I maintain pssh, a parallel SSH tool that runs openssh, and I've had several users complain about orphaned processes due to this bug. Adding a "-t" option is the only workaround I know of, but it has other effects, too. Having a RemoteCommandCleanup option seems very reasonable, and I notice that there have been various patches for a long time. Is there any hope of resolution in the near future?
Comment 17 Andrew McNabb 2012-02-22 11:34:22 AEDT
So it turns out that the "-t" workaround is not an option if stdin is not a terminal (which makes ssh ignore the "-t" option). Ouch. Is anyone aware of any other workaround?
Comment 18 Salvador Fandiño 2012-02-23 04:13:40 AEDT
does -tt work in that case?
Comment 19 Andrew McNabb 2012-02-23 04:18:58 AEDT
(In reply to comment #18)
> does -tt work in that case?

I've seen that option before but for some reason I had forgotten it. Thanks.

I'm still hoping for a real solution that doesn't have the side effects of "-t". :)
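
For reference, the -tt form suggested in comment 18 can be wrapped as below. This is a minimal sketch and the function name ssh_force_tty is made up. The repeated -t asks ssh to allocate a pty even when its own stdin is not a terminal.

```shell
# Hypothetical wrapper: always request a pty, even when the local
# stdin is a pipe or a file and a single -t would be ignored.
ssh_force_tty() {
    ssh -tt "$@"
}
```

Usage: echo | ssh_force_tty localhost 'sleep 40'. The side effects of having a pty on the remote side still apply.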
Comment 20 Tyler Riddle 2012-09-28 08:46:48 AEST
I just hit the condition of processes invoked using ssh remote command invocation not receiving the standard signals because no pty is involved. It's a problem for me like everyone else but forcing a pty to be allocated is causing the control characters for the terminal to mess with the app.

The patch here solves my problem, and has for some time, but unfortunately it doesn't ship in OpenSSH, almost certainly because changing this behavior at the granularity of an entire ssh daemon would cause lots of problems for anything expecting the original behavior. Suppose instead that the ssh client included a new option asking the ssh daemon it was connecting to, if possible, to set up the same signal deliveries that would happen if a pty were involved. If the default stayed the original behavior, the ssh daemon could reject the request, and this all fits in the protocol, then I think it's a safe solution that opens up the features required so that we cluster users can invoke remote commands in an inherently safe way.

Just had to throw in my own two cents to help move this along towards being solved.
Comment 21 Ulrich Sibiller 2018-11-22 08:14:05 AEDT
This bug has been pending for 16 years now, and the feature to kill the remote processes is still missing. So far I have not found a reason why. Please integrate that as an option.