Bug 2572 - dead sessions aren't closed despite ClientAlive enabled
Summary: dead sessions aren't closed despite ClientAlive enabled
Status: CLOSED DUPLICATE of bug 2252
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: 6.9p1
Hardware: All Linux
Importance: P5 major
Assignee: Assigned to nobody
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-05-26 13:12 AEST by Christoph Anton Mitterer
Modified: 2016-08-02 10:41 AEST
CC List: 2 users

See Also:


Description Christoph Anton Mitterer 2016-05-26 13:12:57 AEST
Hi.

I'm experiencing the following every now and then:
An ssh session somehow gets stuck and is never closed, despite ClientAlive messages being enabled.

Unfortunately I do not know how to reproduce it, nor did I find any indicative log messages.
It happens with the Debian sid version of ssh, but I think I have been experiencing it since 6.9 (I think it wasn't happening in 6.8), though maybe I'm mixing things up here.
systemd is used, and sshd is run in daemon mode.

Among other things, I have the following set in sshd_config:
ClientAliveInterval     15
ClientAliveCountMax     8
TCPKeepAlive    no

AFAIU, ClientAlive messages should do more or less the same as TCP keepalives, just not at the TCP level but within the encrypted SSH connection. So if the connection is gone and the client no longer replies, I'd expect sshd to kill the connection.
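
For reference, the settings above imply that an unresponsive client should be dropped after roughly ClientAliveInterval x ClientAliveCountMax seconds. A minimal sketch of that arithmetic follows. The helper `client_alive_timeout` is hypothetical and not part of OpenSSH. It assumes plain integer seconds and whitespace-separated keywords, ignores Match blocks, and takes the defaults (interval 0, count max 3) from sshd_config(5).

```python
# Hypothetical helper, not part of OpenSSH: estimate how long sshd waits
# before dropping a client that stops answering ClientAlive messages.

CONFIG = """\
ClientAliveInterval     15
ClientAliveCountMax     8
TCPKeepAlive    no
"""


def client_alive_timeout(config_text):
    """Return interval * count in seconds, or None if checks are disabled."""
    # Defaults per sshd_config(5): interval 0 (checks disabled), count max 3.
    settings = {"clientaliveinterval": 0, "clientalivecountmax": 3}
    seen = set()
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and padding
        parts = line.split(None, 1)           # keyword, then its argument
        if len(parts) != 2:
            continue
        key = parts[0].lower()                # keywords are case-insensitive
        if key in settings and key not in seen:
            # For each keyword, the first value obtained is the one used.
            settings[key] = int(parts[1])
            seen.add(key)
    interval = settings["clientaliveinterval"]
    if interval <= 0:
        return None
    return interval * settings["clientalivecountmax"]


print(client_alive_timeout(CONFIG))  # 120
```

With the values quoted above this gives about 120 seconds, so sessions that have been idle for more than 38 hours are far beyond the point where sshd would be expected to disconnect them.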

A current example shows me:
# w
 05:08:19 up 2 days,  5:19,  3 users,  load average: 0,00, 0,05, 0,05
USER     TTY      FROM             LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    141.[snipsnap]   Tue14   39:08m  0.23s  0.23s -bash
root     pts/1    142.[snipsnap]   Tue14   38:04m  0.23s  0.23s -bash
root     pts/2    2001:[snipsnap]  01:36    1.00s  0.34s  0.00s w

The ones on pts/0 and pts/1 are dead (they were made from the same laptop that makes the connection on pts/2, just from another network, and the laptop has been rebooted several times since then).


# netstat --inet --inet6 -pn | grep ssh
tcp        0      0 85.[snipsnap]:22        141.[snipsnap]:34016     ESTABLISHED 15847/sshd: root@pt 
tcp        0      0 85.[snipsnap]:22        142.[snipsnap]:51726     ESTABLISHED 17000/sshd: root@pt 
tcp6       0    276 2a01:[snipsnap]:46538 ESTABLISHED 29362/sshd: root@pt 

Interestingly, the kernel doesn't kill off the connections either, despite them being definitely gone.


Any ideas how to further debug that?

Thanks,
Chris.
Comment 1 Darren Tucker 2016-05-27 08:54:42 AEST
If you have time based rekeying enabled, maybe this:
https://anongit.mindrot.org/openssh.git/commit/?id=988e429d903acfb298bfddfd75e7994327adfed0


Failing that, setting "LogLevel debug3" in sshd_config would give some clues (but would be very noisy).
Comment 2 Christoph Anton Mitterer 2016-06-04 05:13:29 AEST
By time-based rekeying, you mean e.g.:
/etc/ssh$ grep -i rekey *config
ssh_config:RekeyLimit		default 1h
sshd_config:RekeyLimit		default 1h
(which are also the values I set them to myself).

Apart from that, I'll try to produce the logs later. Unfortunately I cannot easily reproduce all the different kinds of situations in which this problem happens (maybe they're all the same problem, maybe not), but simply disconnecting the network seems to be one case.
Comment 3 Damien Miller 2016-06-07 23:09:12 AEST
Yes, that's time-based rekeying. The commit Darren mentioned should fix your problem.
Comment 4 Darren Tucker 2016-07-20 11:03:21 AEST
We believe this is a duplicate of bug#2252, the fix for which will be in the 7.3 release.  If 7.3 doesn't fix it (you could try a snapshot now) then please reopen this bug.

Thanks.

*** This bug has been marked as a duplicate of bug 2252 ***
Comment 5 Christoph Anton Mitterer 2016-07-20 11:21:43 AEST
Hey.

Sorry that I somehow completely forgot my promise to produce the logs with the patch :-(

I think it's best now to simply wait until 7.3 hits Debian, and in case I notice the issue again after that, I'll simply reopen :-)

Cheers and thanks,
Chris.
Comment 6 Damien Miller 2016-08-02 10:41:57 AEST
Close all resolved bugs after 7.3p1 release