SFTP receives SIGTERM and SIGCHLD signals at the same time, which may kill almost all processes of the linux system(kill SIGTERM -1). 1、A process sends SIGTERM to sftp 2、sigchld_handler is running,after executing "if (sshpid > 1)", sftp receive SIGCHLD static void killchild(int signo) { if (sshpid > 1) { <-------- Receive SIGCHLD signal kill(sshpid, SIGTERM); (void) waitpid(sshpid, NULL, 0); } _exit(1); } 3、subprocess(ssh) connection closed,it send SIGCHLD to sftp 4、sigchld_handler start execution on the same cpu,so sshpid =-1 static void sigchld_handler(int sig) { ....... (void)write(STDERR_FILENO, msg, sizeof(msg) - 1); sshpid = -1; <---- } 5、killchild continue to finish,and exec "kill(-1, SIGTERM);" I think add a patch diff --git a/sftp.c b/sftp.c index b66037f..833e034 100644 --- a/sftp.c +++ b/sftp.c @@ -2297,6 +2297,7 @@ static void connect_to_server(char *path, char **args, int *in, int *out) { int c_in, c_out; + struct sigaction kact,sact; #ifdef USE_PIPES int pin[2], pout[2]; @@ -2343,12 +2344,21 @@ connect_to_server(char *path, char **args, int *in, int *out) _exit(1); } - signal(SIGTERM, killchild); - signal(SIGINT, killchild); - signal(SIGHUP, killchild); - signal(SIGTSTP, suspchild); - signal(SIGTTIN, suspchild); - signal(SIGTTOU, suspchild); + kact.sa_handler = killchild; + sigemptyset(&kact.sa_mask); + sigaddset(&kact.sa_mask, SIGCHLD); + sigaction(SIGTERM, &kact, NULL); + sigaction(SIGINT, &kact, NULL); + sigaction(SIGHUP, &kact, NULL); + + sact.sa_handler = suspchild; + sigemptyset(&sact.sa_mask); + sigaddset(&sact.sa_mask, SIGCHLD); + sigaction(SIGTSTP, suspchild); + sigaction(SIGTTIN, suspchild); + sigaction(SIGTTOU, suspchild); + + signal(SIGCHLD, sigchld_handler); close(c_in); close(c_out);
Created attachment 3337 [details] test local copy of sshpid IMO this is a better fix - take a local copy of sshpid and test/kill that rather than using a racy test,kill sequence
Comment on attachment 3337 [details] test local copy of sshpid ok dtucker >+ pid_t pid; >+ >+ pid = sshpid; you could combine these into one line. As a separate thing I'm working on replacing all of the signal() calls with sigaction(), but that can go in later when it's ready.
and for the record: strictly speaking I don't think there's any guarantee that pid_t is an atomic type, although in practice it probably is the same as sig_atomic_t.
patch applied and will be in OpenSSH 8.2 - thanks!
close bugs that were resolved in OpenSSH 8.5 release cycle