Created attachment 2893 [details] add systemd bits Ever since systemd, there are problems with keeping track of the running daemon process [1] and errors [2]. We were trying to avoid this solution, but after several bugs and many failed attempts from various people, it looks like there is no other working way than to use their API to notify them that the daemon is running. Also Debian moved to this solution [3]. Unfortunately Colin didn't manage to send the patch upstream, even though it would be useful for others (adding to CC, sorry if it was intentional) so I am trying to do so. The code is very simple, guarded by the ifdefs and enabled only if --with-systemd switch is added to configure. If it will not get to OpenBSD, I thing carrying it in portable would certainly make a sense. I also added a sample systemd service file to contrib directory to reflect its usage (not particularly needed). [1] https://bugzilla.redhat.com/show_bug.cgi?id=1381997 [2] https://bugzilla.redhat.com/show_bug.cgi?id=1291172 [3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=778913
Created attachment 2896 [details] do not call daemon again when restarting on sighup I'm not wild about letting that particular camel's nose into the tent. How about we fix the sighup behaviour to not call daemon, keep the same pid and not touch the pidfile?
Sorry for a quite late answer. I got bitten by this while trying to rebase to the OpenSSH 7.4. Your patch got into this release in 7fc4766a, but was partially reverted in f2398eb7. But to my understanding, the daemonized() function returns 1 even for the first call running from systemd service file and therefore prevents call to daemon() and therefore systemd does not keep track of the service again. sshd[12655]: debug3: already daemonized systemd[1]: Starting OpenSSH server daemon... Reverting the chunk with the condition if (!(debug_flag || inetd_flag || no_daemon_flag || already_daemon)) { makes it working again.
Jakub: That's only a problem because for some reason Fedora doesn't use the -D option when running from systemd (if you did, then no_daemon_flag would be true so the value of already_daemon would be irrelevant). When you have a smart service supervisor, IMO -D makes much more sense.
Colin, the -D was omitted (and Type=Forking added) because systemd was unable to report failures (when configuration file had an error), as described in the Red Hat bugzilla [1]. Certainly, the -D makes more sense these days, but with that, typing systemctl restart sshd is not producing any error with -D option. Also no error goes into the journal (as in the description of the linked bug). The one solution is "letting that particular camel's nose into the tent" as used in Ubuntu/Debian, but I still hope, there is a better way to do that. Clearly, what I am trying to point out here is that the applied change is modifying the behavior not only for re-exec (reload), but also for the "start", which was not intentional to my understanding. [1] https://bugzilla.redhat.com/show_bug.cgi?id=1291172
The test if the process is daemon passes for every process that is run as a systemd service so it does not solve the problem for us. As the result with the pushed change the 7.4p1 will not write PID file when running as a systemd service. I would consider a bit different approach. We can remember if we already called daemon() using environment variable and prevent the repeated calls if this variable is already set. This finally made it working for me. The patch can look somehow like this: diff -up openssh-7.4p1/misc.c.daemon openssh-7.4p1/misc.c --- openssh-7.4p1/misc.c.daemon 2017-02-03 13:08:14.751282516 +0100 +++ openssh-7.4p1/misc.c 2017-02-03 13:08:14.778282474 +0100 @@ -1273,6 +1273,9 @@ daemonized(void) return 0; /* parent is not init */ if (getsid(0) != getpid()) return 0; /* not session leader */ + if (getenv("_SSH_DAEMONIZED") == NULL) + return 0; /* already reexeced */ + debug3("already daemonized"); return 1; } diff -up openssh-7.4p1/sshd.c.daemon openssh-7.4p1/sshd.c --- openssh-7.4p1/sshd.c.daemon 2017-02-03 13:08:14.755282510 +0100 +++ openssh-7.4p1/sshd.c 2017-02-03 13:09:29.765164356 +0100 @@ -1866,6 +1866,7 @@ main(int ac, char **av) if (daemon(0, 0) < 0) fatal("daemon() failed: %.200s", strerror(errno)); + setenv("_SSH_DAEMONIZED", "1", 1); disconnect_controlling_tty(); } /* Reinitialize the log (because of the fork above). */
Created attachment 2950 [details] fixed patch Never mind. Nothing from above resolves the race condition between systemd reading PID file and sshd after reexec writing it, except adding SD_NOTIFY code so I gave up. I updated the patch to move the SD_NOTIFY to the place where it will not be called by children for every connection (as currently used in Debian).
(In reply to Jakub Jelen from comment #6) > Created attachment 2950 [details] > fixed patch > > Never mind. Nothing from above resolves the race condition between > systemd reading PID file and sshd after reexec writing it, except > adding SD_NOTIFY code so I gave up. What about reading the pidfile first to see if it has the correct PID before rewriting it? I did something like that to work around a problem in pam_loginuid (LinuxPAM ticket #23, I'd link to it but fedorahosted.org seems to have imploded). Does systemd even need a pidfile?
(In reply to Darren Tucker from comment #7) > (In reply to Jakub Jelen from comment #6) > > Created attachment 2950 [details] > > fixed patch > > > > Never mind. Nothing from above resolves the race condition between > > systemd reading PID file and sshd after reexec writing it, except > > adding SD_NOTIFY code so I gave up. > > What about reading the pidfile first to see if it has the correct > PID before rewriting it? I did something like that to work around a > problem in pam_loginuid (LinuxPAM ticket #23, I'd link to it but > fedorahosted.org seems to have imploded). I don't think that would help the initial race condition, when systemd tries to read the PID file before it is written after the daemon(). > Does systemd even need a pidfile? From man systemd.service: > If this setting [Type=forking] is used, it is recommended to also use the PIDFile= option, so that systemd can identify the main process of the daemon. systemd will proceed with starting follow-up units as soon as the parent process exits. There is also GuessMainPID= option, but from the documentation in the same manual page I am not convinced that it is something that we would like to use in case we can specify the PID file reliably.
In a non-forking type=notify case systemd keeps track of the sshd service as it is a child process. Any termination of the child process is a signal to the system launching process and all operations are handled in this way. I wouldn't worry about the pid file at all.
Created attachment 3099 [details] notify systemd when daemon reloading I observed a race similar to restarts during SSH daemon reloads. Environment: SLES12 SP2 openssh-7.2p2-74.1 systemd-228-150.15.2 Reproducer: ssh $REMOTE_HOST systemctl reload sshd.service ; ssh $REMOTE_HOST Actual behavior: The second SSH connection fails in roughly 50% cases. Expected behavior: First SSH blocks until daemon reloads and second connection succeeds. I propose to extend the suggested patch in comment 6 with another sd_notify() call (attached) so that systemd can properly track reloading state of the daemon. The two patches fixed the behavior in my testcase (I admit I didn't check actual systemd handling of the notification).
I don't understand the problem here. Can you not arrange for systemd to run sshd with the -D flag to begin with? When sshd restarts for SIGHUP it will reexec itself with the same commandline argument and will therefore not daemonise. wrt reporting errors when sshd is run with the -D flag, I don't understand the problem there either: sshd reports config parsing errors to stderr, and this isn't affected by the presence of the -D flag at all.
(In reply to Damien Miller from comment #11) > I don't understand the problem here. > > Can you not arrange for systemd to run sshd with the -D flag to > begin with? When sshd restarts for SIGHUP it will reexec itself with > the same commandline argument and will therefore not daemonise. That is an option, yet it has other drawbacks and one needs to carefully step among all the combinations that have some kind of race condition or similar problems (those were actually the reason for the Debian/RH/SUSE issues referenced above). That said, putting in an ifdeffed init system integration doesn't need to be a bad idea (I intentionally avoided writing 'systemd', as I personally consider this to be a broader issue, which would be present with any other init system that would want to keep track of the state the services are in.) Would it help if the state reporting was init-system agnostic - say split into a separate file(s) the way auditing is? (From my point of view that would be the right thing.)