When in a non-master ssh session connected to a master session, ~. is not captured, so the connection cannot be terminated and the process cannot easily be closed. When a normal master ssh process is present, escapes can be used on it. This is doubly frustrating with ControlPersist, where the master connection is always in the background and never directly accessible. Open question: should ~. terminate the master connection, or should it simply terminate the ssh client?
I can't replicate this - ~? and ~. work fine for slave connections of a master running in ControlPersist mode. Are you sure you are running 5.8p1 on the client?
nickuj@nickuj:~$ ssh -V
OpenSSH_5.8p1 Debian-1ubuntu3, OpenSSL 0.9.8o 01 Jun 2010

It's the weirdest thing, I can't duplicate it any more either. I'll keep an eye out to see if I can figure out what changed.
Just hit again. Seems like if the network goes out, the master connection stalls, and the slave connection can't do escapes anymore.
Is there any chance you could catch the master where it is stalling with strace/ktrace/truss/gdb?
*** Bug 1938 has been marked as a duplicate of this bug. ***
I'm not a C developer, but if you provide me (link to) instructions how to run strace/ktrace/truss/gdb to get info for you - I'll do it when faced with this problem next time.
gdb:
1. Find the pid of the unresponsive ssh process using "ps auxww | grep ssh".
2. Attach to it with gdb: run "gdb /path/to/ssh", then "attach PID" (using the pid you found in step 1).
3. Capture a backtrace with "where".

The others depend on which platform you are on. Generally it is a matter of running "strace -p PID" (or similar for the others). ktrace is a little different, because you need to start the trace (using ktrace) and then dump the output later using kdump. Check the manpage for the tool on your system.
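The gdb steps above could be scripted roughly as follows. This is only a sketch: the function names and the output file name are made up for illustration, and it assumes pgrep and gdb are installed, that you are allowed to attach to the process (some systems restrict ptrace to root), and that the ssh binary has enough symbols for the backtrace to be useful.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of OpenSSH.

# List ssh client processes owned by the current user
# (a narrower alternative to "ps auxww | grep ssh").
list_ssh_pids() {
    pgrep -u "$(id -u)" -x ssh
}

# Attach gdb to PID, run "where", and save the output to a file.
# -batch makes gdb exit (and detach) once the -ex commands have run.
capture_backtrace() {
    pid=$1
    out=${2:-ssh-backtrace.$pid.txt}
    gdb -p "$pid" -batch -ex where > "$out" 2>&1
}

# Example (not run automatically):
#   list_ssh_pids
#   capture_backtrace 12345
```

Running it non-interactively keeps the time spent attached short, which matters if a keepalive timeout is about to kill the stalled process.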
I got a chance to catch an ssh client in this state the other day and connect to it with a debugger for a couple of minutes before ServerAliveInterval killed it. It turns out that the first ~. is actually killing the session, but for some reason the session isn't being released.

Notes for next time:
1) clear server_alive_count_max to get a longer debugging session
2) try attaching before issuing ~.
3) reset log channel to syslog and loglevel to debug3
I've been hitting this pretty frequently lately, sshing into a machine with ControlPersist turned on (so the master connection is in a forked-off background process), and then rebooting it. As soon as the connection's gone, the session is frozen and ~. won't help, you have to kill the master process manually.
Created attachment 2297 [details] allow ~. to abandon mux master channels

I think I figured this out.

When your network changes or goes away and you disconnect with ~., ssh sends a channel close. Normally this isn't a problem because the ssh goes away immediately thereafter. When you do it in a mux client, the mux client goes away but the mux master stays up. Normally that's not a problem either, because the mux master is similarly wedged and can be ~.'ed too.

That is, unless you also use ControlPersist. When all of these things happen together, the ssh mux master, which is backgrounded, hangs around waiting for the channel close confirmation from the server, which isn't going to happen because, hey, the network is busted. That wouldn't be a problem either, except that the backgrounded mux master won't exit until all its channels are closed, and until it exits the ControlMaster socket remains there, preventing you from making a new one. The net result is that you can't make any new connections until you find and kill the backgrounded mux master.

You can't just free the channel on ~. because in the case where the network is not broken you'll get a channel close from the server for a nonexistent channel and the mux master will fatal.

What this patch does is add a new "ABANDONED" state, which is basically the same as CLOSED or INPUT_DRAINING except it's not counted as an active channel. The ~. sequence then sends a close on the channel and puts it into this state. If the server confirmation comes back, the channel is freed as per normal, but if not it's just kept around but not used.

Please try the attached patch (it's against -current, I'll make an equivalent one against 6.2p2).
Created attachment 2298 [details] allow ~. to abandon mux master channels, diff vs 6.2p2
BTW, steps to reproduce:
1) start the master with ControlPersist and wait for it to log in.
2) start the slave.
3) disconnect your network.
4) in one session, type some keystrokes then quit it with ~.
5) do the same in the other
6) profit!

I put a lot of wear and tear on my laptop's ethernet port working on this...
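For anyone who wants to retry the reproduction, steps 1 and 2 and the check afterwards might look something like this. It is a sketch under assumptions: the helper names and the socket location are placeholders chosen for illustration, the host is whatever machine you test against, and the same options could equally be set in ssh_config. Steps 3 to 5 (pulling the network and typing ~.) still have to be done by hand.

```shell
#!/bin/sh
# Sketch only: helper names and the default socket path are placeholders.
CTL_PATH=${CTL_PATH:-$HOME/.ssh/mux-%r@%h:%p}

# Steps 1 and 2: run this in two terminals against the same host.
# The first call creates the backgrounded master; the second call
# finds the control socket and attaches to it as a slave.
open_session() {
    ssh -o ControlMaster=auto -o ControlPersist=yes \
        -o ControlPath="$CTL_PATH" "$1"
}

# After steps 3-5: ask whether a master still answers on the socket.
check_master() {
    ssh -o ControlPath="$CTL_PATH" -O check "$1"
}

# Example (not run automatically):
#   open_session user@host.example
#   check_master user@host.example
```

If check_master still reports a running master after both sessions have been closed with ~. and the network is down, that would match the stuck state described in this bug.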
Thanks. Patch applied, it will be in 6.3.
Set all RESOLVED bugs to CLOSED with release of OpenSSH 7.1