When using ssh to run a command that doesn't terminate, sshd will leave the process orphaned when it exits. Using ssh -t to force a pty allocation allows the process to terminate when sshd does (presumably when the pty closes). This has been observed on Solaris (7, 8) and AIX, and probably occurs on others.

    # ssh localhost nc localhost 22
    SSH-2.0-OpenSSH_3.4p1
    [kill ssh from another window]
    # ps -eaf |grep nc
    dtucker 5919 1 0 21:05:08 ? 0:00 nc localhost 22

The following patch (against -cvs) sends a HUP to the child process(es) when sshd exits, for protocols 1 and 2. It assumes that there's only one session for v1. (Is that valid?) It has been tested on Solaris 7 (including regression tests).
Created attachment 145 [details] Send HUP to sshd child procs on exit
Does anyone object to this patch? And if not, is it something that should go to OpenBSD?
sending signals could be dangerous, depending on the permissions of the sending process, e.g. a root-owned sshd sending to a setuid process. but i'm not sure. we had similar code there before.
Maybe I'm being thick but I don't see how this could be dangerous (or, at least, any more dangerous than the SIGHUP it would get anyway if it had a pty).
hm, i'm not sure, but it might behave differently if it had a pty. but perhaps i'm mixing things up. in any case, the kill should be done from within session.c on session cleanup, i think.
Just for comparison, the man page for 'rsh' states: "Interrupt, quit and terminate signals are propagated to the remote command;" It seems reasonable to me for ssh to behave like rsh in this respect.
Take another swing at this for the next major release...
Is this not another manifestation of the infamous bug #52?
(In reply to comment #8)
> Is this not another manifestation of the infamous bug #52?

I don't think so. Bug #52 is sshd waiting for descriptors to close on a clean shutdown. This one is subprocesses not knowing that sshd (and thus their stdin/stdout) has gone away on abnormal termination, usually because they don't check if their read() calls on stdin return zero. (This may be because some platforms return zero for reads in some cases even when the descriptor hasn't closed. Such behaviour would appear to be in violation of POSIX.)
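To illustrate the point about read() returning zero: a well-behaved child notices that its stdin has closed and exits on its own. The following is a minimal shell sketch, not taken from any patch in this bug. The function name consume_until_eof is hypothetical, and the shell's read builtin failing at end-of-file stands in here for read() returning zero.

```shell
#!/bin/sh
# Hypothetical helper (name invented for illustration): consume stdin line
# by line and stop at end-of-file. The read builtin returns a non-zero
# status once the underlying read() reports EOF, which is what a child
# sees when the writer of its stdin has gone away.
consume_until_eof() {
    count=0
    while IFS= read -r _line; do
        count=$((count + 1))
    done
    # Reaching this point means stdin was closed by the other side.
    echo "stdin closed after $count lines"
}

# Usage: feed two lines, then EOF.
printf 'one\ntwo\n' | consume_until_eof
# prints: stdin closed after 2 lines
```

A program that loops like this never becomes an orphan in the sense discussed here; the problem is limited to programs that ignore stdin or never look at the EOF condition.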
Unfortunately this won't be making 4.0 either.
Created attachment 934 [details]
Updated patch for the Openssh 4.0p1 release

This patch is functionally equivalent to the previous patch, updated for the 4.0p1 code release. I would like to see this bug considered for inclusion at a later date, as it leaves orphaned processes behind on clusters running apps such as mpich if the run is aborted. This can be difficult to clean up if the cluster is large.
Created attachment 1324 [details]
Modified patch for openSSH-4.6p1

This patch adds an option to sshd_config (RemoteCommandCleanup no|(yes)) that makes sshd send a HUP signal to the child process group when no tty was allocated (remote command execution) and the session is closing. The signal must be sent to the process group because the child process is often the user's shell, invoked to launch the real command.

I think this problem should be solved in the next OpenSSH release. It causes a lot of orphan processes on the server and wastes resources. Furthermore, in a secure environment where a forwarded credential is needed to access the file system, the credential is removed when sshd exits and the file system can then no longer be accessed. There is no reason to let processes run if they are not allowed to access the file system.
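The reason for signalling the process group rather than a single PID can be shown outside sshd. The sketch below is not the patch's code; it assumes bash, because `set -m` is used to put the background job into its own process group, and the `sh -c` wrapper is only a stand-in for the user's shell launching the real command.

```shell
#!/bin/bash
# Sketch only, assumes bash: with job control enabled, each background job
# gets its own process group whose ID equals the PID reported by $!.
set -m

# Stand-in for the user's shell starting "the real command" as a child.
sh -c 'sleep 30 & wait' &
leader=$!

sleep 1   # give the inner sleep time to start

# Signalling only $leader would leave the inner sleep running.
# A negative PID addresses every process in the group instead.
kill -HUP -- "-$leader"

wait "$leader" 2>/dev/null
status=$?
echo "leader exit status: $status"
```

With the default SIGHUP disposition the wrapper shell is terminated by the signal, so the status reported by wait is 128 plus the signal number.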
Looks like it has not made it to ssh 5 yet.

    $ ssh localhost "trap 'echo got int' INT; sleep 40"
    ^CKilled by signal 2.
    $ ps -ef | grep sleep
    martin 5639 1 0 21:35 ? 00:00:00 bash -c trap 'echo got int' INT; sleep 40
    martin 5642 5639 0 21:35 ? 00:00:00 sleep 40
    martin 5644 5625 0 21:35 pts/5 00:00:00 grep --colour=auto sleep
    $ ssh -V
    OpenSSH_5.1p1, OpenSSL 0.9.8g 19 Oct 2007

Anyone got a patch for this version? Thanks.
(In reply to comment #9)
> Bug #52 is sshd waiting for descriptors to close on a clean shutdown. This one
> is subprocesses not knowing that sshd (and thus their stdin/stdout) has gone
> away on abnormal termination, usually because they don't check if their read()
> calls on stdin return zero.

People like me who are not comfortable with patching their ssh server can probably take advantage of this workaround I wrote. This is just an "EOF to SIGHUP" converter implemented in 4 lines of shell script (it simply inserts a "cat" between sshd and the no longer orphan process).

    TUBE=/tmp/myfifo.$$; mkfifo "$TUBE"
    <"$TUBE" yourApplicationIgnoringEOFMoreOftenThanNot & appPID=$!
    # unlike the above, cat WILL notice the EOF and politely die
    cat >"$TUBE"
    kill -HUP -$appPID; rm "$TUBE"

Feedback is welcome; do not hesitate to email me.
(In reply to comment #14)
> (it simply inserts a "cat" between sshd and the no longer orphan process).

Ahem, except sshd seems to create standard input in non-blocking mode, which makes cat intermittently fail:

    cat: -: Resource temporarily unavailable

See this discussion: http://lists.mindrot.org/pipermail/openssh-unix-dev/2005-July/023090.html

I initially used a recent ssh version, 5.1, and did NOT hit this non-blocking problem. Then I tried my cat trick with ssh versions 3.4 and 4.3 and DID intermittently hit it. Could this be due to version 5.1 now passing 3 pipes for stdin/out/err, as opposed to a socketpair in earlier versions?

Anyway, I switched to using socat instead of cat, and socat seems to deal with non-blocking stdin in a more robust fashion. Here is the updated workaround:

    TUBE=/tmp/myfifo.$$; mkfifo "$TUBE"
    <"$TUBE" yourApplicationIgnoringEOFMoreOftenThanNot & appPID=$!
    # unlike the above, socat WILL notice the EOF and politely die
    socat -u STDIN "PIPE:$TUBE"
    kill -HUP -$appPID; rm "$TUBE"
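For commands that never read stdin, the same EOF-to-SIGHUP idea can be sketched without the FIFO. This is a variant for illustration, not the script posted above: run_until_stdin_eof is a hypothetical name, it signals only the single PID it started rather than a process group, and because it still relies on cat it inherits the non-blocking stdin caveat described above.

```shell
#!/bin/sh
# Hypothetical wrapper (name invented for illustration): run "$@" in the
# background, block until our own stdin reaches end-of-file, then send
# SIGHUP to the command and report its exit status.
run_until_stdin_eof() {
    # The command gets /dev/null as stdin; the wrapper keeps the real one.
    "$@" </dev/null &
    app_pid=$!

    # cat returns once stdin reaches EOF; whatever it reads is discarded.
    cat >/dev/null

    # The command may already have exited, so ignore a failed kill.
    kill -HUP "$app_pid" 2>/dev/null
    wait "$app_pid"
}
```

A limitation of this sketch is that the wrapper keeps waiting for EOF even if the command finishes first, so it only suits commands that are expected to outlive the connection.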
I maintain pssh, a parallel SSH tool that runs openssh, and I've had several users complain about orphaned processes due to this bug. Adding a "-t" option is the only workaround I know of, but it has other effects, too. Having a RemoteCommandCleanup option seems very reasonable, and I notice that there have been various patches for a long time. Is there any hope of resolution in the near future?
So it turns out that the "-t" workaround is not an option if stdin is not a terminal (which makes ssh ignore the "-t" option). Ouch. Is anyone aware of any other workaround?
does -tt work in that case?
(In reply to comment #18)
> does -tt work in that case?

I've seen that option before but for some reason I had forgotten it. Thanks. I'm still hoping for a real solution that doesn't have the side effects of "-t". :)
I just hit the condition where processes invoked through ssh remote command execution do not receive the standard signals because no pty is involved. It's a problem for me like everyone else, but forcing a pty to be allocated causes the terminal's control characters to mess with the app.

Unfortunately, the patch here solves my problem, and has for some time, but it doesn't ship in OpenSSH, almost certainly because changing this behavior at the granularity of an entire ssh daemon is going to cause lots of problems for anything expecting the original behavior.

Suppose instead that the ssh client included a new option instructing the ssh daemon it was connecting to, if possible, to set up the same signal deliveries that would happen if a pty were involved. If the default were the original behavior, the ssh daemon could reject the request, and all of this fit in the protocol, I think it would be a safe solution that opens up the features required, so that we cluster users can invoke remote commands in an inherently safe way.

Just had to throw in my own two cents to help move this along towards being solved.
This bug has been pending for 16 years now, and the feature to kill the remote processes is still missing. So far I have not found a reason why. Please integrate it as an option.