Bug 2265 - ServerAlive{Interval,CountMax} ignored if using an active -R or -L tunnel
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: ssh
Version: -current
Hardware: All All
Importance: P5 normal
Assignee: Assigned to nobody
URL:
Keywords:
Depends on:
Blocks: V_8_4
 
Reported: 2014-08-26 07:34 AEST by openssh
Modified: 2023-01-13 13:27 AEDT

See Also:


Attachments
ServerAliveInterval doesn't work if client keeps trying to send data (3.09 KB, patch)
2020-06-26 13:54 AEST, Darren Tucker
Make ServerAlive behave correctly during client port forward activity (3.01 KB, patch)
2020-06-26 15:41 AEST, Darren Tucker
Move the ServerAlive scheduling into a helper function. (3.59 KB, patch)
2020-06-26 15:56 AEST, Darren Tucker
Move the ServerAlive scheduling into a helper function. (3.59 KB, patch)
2020-06-26 16:01 AEST, Darren Tucker

Description openssh 2014-08-26 07:34:27 AEST
Scenario:

1. Set up a local socket server that sends data slowly enough so that buffers would take hours to fill up:

  $ (until false; do echo -n X; sleep 2; done) | nc -l 8000 &

2. Connect through an unreliable link, asking ssh to detect a broken connection within 10 seconds (5-second "alive" signals, at most 2 missed):

  $ ssh -R 8001:127.0.0.1:8000 \
        -o 'ServerAliveInterval 5' -o 'ServerAliveCountMax 2' \
        -o 'ProxyCommand nc 127.0.0.1 22' \
        127.0.0.1 'telnet 127.0.0.1 8001'

(this assumes you can ssh into localhost using either a password or public key authentication)

3. Observe that indeed, you are getting 'X' printed every 2 seconds, through the ssh tunnel.

4. Suspend the intermediate proxy - in another terminal / screen session (or after backgrounding the ssh command above), do:

   $ pkill -STOP -xf 'nc 127.0.0.1 22'

5. Wait 10 seconds for ServerAlive detection to kick in. Or 10 hours. ServerAlive detection never actually kicks in.

6. Tear down everything (it is enough to Ctrl-C the ssh command).

7. Repeat steps 1-5, this time, with 'sleep 2' replaced by 'sleep 30'. This time, ServerAlive detection kicks in as expected.

This happens on every OpenSSH version I've tried (all on Linux: the versions shipped with Ubuntu 8.04, 10.04, 10.10, 12.04 and 14.04), and, from browsing the source code, it is still present in -current.

The problem is in the "ServerAlive" logic (and, I assume, also in the ClientAlive logic on the server side, though I haven't verified that yet): a connection is deemed "alive" if the select() waiting for data did not time out.

However, it should be deemed alive only if there has been data on the ssh connection itself - not the local ends of a -L / -R tunnel and whatever other local sockets might be waited upon with select(). 
As the above example shows, even though the connection to the server is effectively dead, it will not be detected.
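
For illustration only, here is a minimal sketch of the behaviour being asked for: the keepalive deadline is pushed out only by data read from the server connection, not by any wakeup of select(). The struct and function names (keepalive, keepalive_init, keepalive_wakeup) are hypothetical and are not taken from the OpenSSH source.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical keepalive state; not OpenSSH's actual data structures. */
struct keepalive {
	time_t interval;	/* seconds between probes, like ServerAliveInterval */
	time_t deadline;	/* absolute time at which the next probe is due */
};

void
keepalive_init(struct keepalive *ka, time_t interval, time_t now)
{
	assert(interval > 0);
	ka->interval = interval;
	ka->deadline = now + interval;
}

/*
 * Call after every wakeup from select().  Only data read from the
 * server connection postpones the deadline; activity on the local end
 * of a -L / -R tunnel does not.  Returns true when a probe is due.
 */
bool
keepalive_wakeup(struct keepalive *ka, time_t now, bool server_readable)
{
	if (server_readable) {
		ka->deadline = now + ka->interval;
		return false;
	}
	if (now >= ka->deadline) {
		/* Schedule the following probe and ask the caller to send one. */
		ka->deadline = now + ka->interval;
		return true;
	}
	return false;
}
```

With this shape, the 'X' trickling in from the local nc every 2 seconds wakes the loop but leaves the deadline alone, so a probe still goes out once the interval elapses without server traffic.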

This setup is artificial, and is easier to debug than a real world setting. It includes:

- the ssh server
- an intermediate pipe ('nc 127.0.0.1 22') that can be kill -STOPped without dropping the connection
- the ssh client
- a slow server that trickles data through a tunnel

In a real world scenario, the intermediate pipe is likely to be an unreliable network connection (e.g. an intermediate router somewhere along the way that is not directly connected to a client interface - and that stops routing traffic in the middle of the session). If this is the case, then eventually the ssh client will have a TCP timeout (2 mins, usually) and detect the broken connection -- which is why I suppose this was not previously reported. However, if there is no indication the intermediate connection died (like in the example I gave above), then the ssh client will hang forever, despite the "ServerAlive*" settings.

As I mentioned, this likely applies to sshd with ClientAliveInterval and ClientAliveCountMax as well, though I haven't verified it.
Comment 1 openssh 2014-09-03 19:02:44 AEST
Note that in some circumstances this can be leveraged into a denial-of-service attack - if an attacker is able to disconnect a remote connection and feed data locally at the same time, they can avoid new data coming in.

(I found this out while investigating what looked like a DoS but eventually wasn't)
Comment 2 jxraynor 2020-06-01 13:02:58 AEST
The patch sent to the mailing list here:

https://lists.mindrot.org/pipermail/openssh-unix-dev/2020-May/038522.html

...will fix this issue.  However, the patch is currently in limbo, neither accepted nor rejected.
Comment 3 Darren Tucker 2020-06-26 13:54:16 AEST
Created attachment 3417
ServerAliveInterval doesn't work if client keeps trying to send data

Patch in question for commenting.
Comment 4 Darren Tucker 2020-06-26 15:31:16 AEST
Comment on attachment 3417
ServerAliveInterval doesn't work if client keeps trying to send data

Looks mostly OK; there are a couple of long lines, and I have one comment:

>+		timeout_secs = server_alive_time - now;
>+		if (timeout_secs < 0)
>+			timeout_secs = 0;

This can be a MAXIMUM(..) which is shorter and consistent with the rest of the code.

I'll attach an updated patch shortly.
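
A rough sketch of the clamp suggested above, for readers without the tree at hand. The MAXIMUM macro here is a local stand-in defined for this example (OpenSSH ships its own), and the helper name secs_until is hypothetical; only server_alive_time and now come from the quoted hunk.

```c
#include <time.h>

/* Local stand-in for illustration; OpenSSH provides its own MAXIMUM. */
#define MAXIMUM(a, b)	(((a) > (b)) ? (a) : (b))

/*
 * Seconds remaining until server_alive_time, clamped at zero so that a
 * deadline already in the past yields an immediate timeout.
 * Hypothetical helper; equivalent to the three-line if-clamp above.
 */
time_t
secs_until(time_t server_alive_time, time_t now)
{
	return MAXIMUM(server_alive_time - now, 0);
}
```

This assumes a signed time_t, as on the usual platforms, so the subtraction can go negative before being clamped.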
Comment 5 Darren Tucker 2020-06-26 15:41:11 AEST
Created attachment 3419
Make ServerAlive behave correctly during client port forward activity
Comment 6 Darren Tucker 2020-06-26 15:56:49 AEST
Created attachment 3420
Move the ServerAlive scheduling into a helper function.

To me this is a bit easier to read.
Comment 7 Darren Tucker 2020-06-26 16:01:20 AEST
Created attachment 3421
Move the ServerAlive scheduling into a helper function.

fix typo
Comment 8 Darren Tucker 2020-07-03 15:10:46 AEST
(modified) patch applied and will be in the 8.4 release.  Thanks for the report and patch.
Comment 9 Damien Miller 2021-03-04 09:54:44 AEDT
close bugs that were resolved in OpenSSH 8.5 release cycle