I'm using OpenSSH v4.3p2 against a NetApp filer. It does NOT implement ServerAlive*, and correctly returns SSH2_MSG_UNIMPLEMENTED if sent a ServerAlive packet (as seen with ssh -v). ssh unfortunately however still disconnects the client after sending ServerAliveCountMax unanswered packets even though the server correctly told the client that it is not implemented. ssh should either disable ServerAlive* when the server says it is not implemented, or perhaps treat SSH2_MSG_UNIMPLEMENTED packets as a valid ServerAlive ack. (My vote is the former). (Who is also submitting an RFE to get ServerAlive implemented in NetApp's sshd, but isn't holding his breath).
Created attachment 1263 [details] Reset activity counters on SSH2_MSG_UNIMPLEMENTED and SSH2_IGNORE messages. SSH2_MSG_UNIMPLEMENTED is handled in packet.c does not make it back up to where it could reset the inactivity timer. This should probably be fixed (SSH2_MSG_IGNORE too). Please try the attached patch (against -current but will probably apply to 4.6p1 too).
Comment on attachment 1263 [details] Reset activity counters on SSH2_MSG_UNIMPLEMENTED and SSH2_IGNORE messages. >- case SSH2_MSG_UNIMPLEMENTED: >- seqnr = packet_get_int(); >- debug("Received SSH2_MSG_UNIMPLEMENTED for %u", >- seqnr); >- break; > default: > return type; I just realized that we could replace the break with /* FALLTHROUGH */ and save a few bytes by not having to duplicate the debug in serverloop.c and clientloop.c
Created attachment 1264 [details] Simplify patch #1263.
The patch looks generally OK, but I'm undecided on whether SSH2_MSG_IGNORE should tickle the keepalive watchdog. It comes down to whether the keepalive is a unidirectional "is the other end still alive?" or end-to-end thing "is the other end still responding to me?". If it is an end-to-end thing then I think SSH2_MSG_IGNORE should not restart the keepalive timer as these may be sent by the peer as their own keepalives, whereas SSH2_MSG_UNIMPLEMENTED will only be sent in response to something that the local end has send (fulfilling the end-to-end test requirement).
(In reply to comment #4) > The patch looks generally OK, but I'm undecided on whether > SSH2_MSG_IGNORE should tickle the keepalive watchdog. It comes down to > whether the keepalive is a unidirectional "is the other end still > alive?" or end-to-end thing "is the other end still responding to me?". i think it's a "does the other end still exist and send messages" thing.
(In reply to comment #4) > If it is an end-to-end thing then I think SSH2_MSG_IGNORE should not > restart the keepalive timer as these may be sent by the peer as their > own keepalives, whereas SSH2_MSG_UNIMPLEMENTED will only be sent in > response to something that the local end has send (fulfilling the > end-to-end test requirement). Unless I'm misreading the code, any global request from the other end (eg a port forward request) other than IGNORE or UNIMPLEMENTED will reset the keepalive timer, even if it's not in response to a keepalive packet. I think that it IGNORE messages should reset the keepalive as they're just as good an indication that the other end is alive as any other packet (more, if the peer is using them for its own keepalive).
Comment on attachment 1264 [details] Simplify patch #1263. >--- packet.c 17 Jan 2007 00:00:14 -0000 1.146 >+++ packet.c 17 Apr 2007 04:51:57 -0000 >@@ -967,9 +967,10 @@ packet_read_expect(int expected_type) > * packet_process_incoming. If so, reads the packet; otherwise returns > * SSH_MSG_NONE. This does not wait for data from the connection. > * >- * SSH_MSG_DISCONNECT is handled specially here. Also, >- * SSH_MSG_IGNORE messages are skipped by this function and are never returned >- * to higher levels. >+ * SSH_MSG_DISCONNECT is handled specially here. Also, SSH_MSG_IGNORE >+ * messages are skipped by this function and are never returned >+ * to higher levels, although SSH2_MSG_IGNORE are since they are needed >+ * for keepalives. This comment is a little wordy and confusing. Maybe just "SSH_MSG_IGNORE is passed up without processing for use as a keepalive"?
(In reply to comment #7) > This comment is a little wordy and confusing. Maybe just > "SSH_MSG_IGNORE is passed up without processing for use as a > keepalive"? It's wordy because it talks about SSH_MSG_IGNORE (Protocol 1) and SSH2_MSG_IGNORE (Protocol 2) separately because they are handled differently. It could probably be clearer. David, can you confirm whether or not patch #1264 resolves the problem for you? Thanks.
This has missed the window for 4.7 so target 4.8. David, could you please confirm that the patch resolves your problem? Thanks.
Created attachment 1424 [details] Hack sshd to provide testcase for this change. This patch is a nasty hack to make sshd behave the same as described here to test this change. Against the hacked sshd, an unpatched client disconnects with "Timeout, server not responding." after ServerAliveInterval * ServerAliveCountMax. A patched client keeps running. David, it would be appreciated if you could verify that patch #1264 resolves the problem with the actual device(s) where the behaviour was observed. Thanks.
Patch #1264 has been committed and will be in 4.8. Thanks.
This change causes a problem if the other end sends an ignore message while we're waiting in packet_read_expect(). See: http://marc.info/?l=openbsd-tech&m=120014160714738 I will attach an alternative patch.
Created attachment 1442 [details] Update keepalives inside packet.c instead.
Patch #1442 has been applied and will be in 4.8 and hopefully it will stay fixed this time. Thanks.
Fix shipped in 4.9/4.9p1 release.