Bug 1360 - Connection aborted on large data -R transfer
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: ssh
Version: 4.7p1
Hardware: All All
Importance: P2 major
Assignee: Assigned to nobody
Blocks: V_4_8
Reported: 2007-09-12 02:54 AEST by Tomas Mraz
Modified: 2008-03-31 15:21 AEDT

Attachments
Fix - undo one patch. (671 bytes, patch)
2007-09-18 05:42 AEST, Jan Kratochvil
Correctly set max packet size for remote TCP forwarded connections. (661 bytes, patch)
2007-12-29 04:43 AEDT, Darren Tucker
Set correct SSH packet size in remote TCP forwarding and agent forwarding. (1.01 KB, patch)
2007-12-29 07:33 AEDT, Darren Tucker
dtucker: ok?

Description Tomas Mraz 2007-09-12 02:54:56 AEST
When SSH port forwarding is used for mail transfers, large mails abort the connection.

Steps to Reproduce:
1. ssh -v -t -R 2852:127.0.0.1:852 REMOTE sleep-command
   specifically: ssh -v -t -L 2525:127.0.0.1:25 -R 2852:127.0.0.1:852 -R 2022:127.0.0.1:22 paulina.vellum.cz 'while :;do echo -n .;sleep 1m;done';sleep 1m;done
2. Start transfer at the REMOTE host to port 2852.

Actual results:
local ssh:
debug1: channel 3: new [127.0.0.1]
debug1: confirm forwarded-tcpip
debug3: channel 3: waiting for connection
debug1: channel 3: connected
...
debug2: channel 3: window 935334 sent adjust 1161818
debug2: channel 3: window 966656 sent adjust 1130496
buffer_get_string_ret: bad string length 557056
buffer_get_string: buffer error

It seems to be a new regression in 4.7p1.

See also https://bugzilla.redhat.com/show_bug.cgi?id=286181
and http://lists.mindrot.org/pipermail/openssh-unix-dev/2007-September/025661.html
Comment 1 Damien Miller 2007-09-12 14:07:53 AEST
What is the server doing at this time? The mailing list post had a fatal buffer error at that end too but it isn't clear where it is coming from.

It would be really helpful if someone who is able to reproduce this bug could capture debug output from the server at the time of the client crash (sshd -ddd) and ideally a stack trace.

To get a stack trace, replace the "cleanup_exit(255);" with "abort();" in fatal.c and recompile. 
Comment 2 Jan Kratochvil 2007-09-18 05:42:54 AEST
Created attachment 1349
Fix - undo one patch.

IMO the problem is due to the patch:
   - markus@cvs.openbsd.org 2007/06/11 09:14:00
     [channels.h]
     increase default channel windows; ok djm

The attached patch works around it (tested only briefly and only on the client side).

The problem is reported from packet.c:
                if (packet_length < 1 + 4 || packet_length > 256 * 1024) {
#ifdef PACKET_DEBUG
                        buffer_dump(&incoming_packet);
#endif
                        packet_disconnect("Bad packet length %u.", packet_length);
                }
and the check is correct: a size such as 557056 is definitely > 256KB.

Removing only this check does not help; the server then crashes with:
Sep 17 21:25:45 host1 sshd[4072]: fatal: buffer_append_space: len 1326080 not supported

The tested server (different than in my original bugreport) is:
openssh-4.3p2-19.fc6.i386

I hope there is now enough info for understanding the problem.
I expect you are aware of how window sizes are negotiated across SSH versions and of the maximum values permitted by the protocol.

Reproducer is:
$ nc -l 5000 >/dev/null & ssh -vvvv -R 5000:localhost:5000 REMOTE_HOST 'nc </dev/urandom localhost 5000'
(with local x86_64 openssh-4.7 under test and remote openssh-4.3p2-19.fc6.i386, running over an 11Mbit connection)
Comment 3 steven_parkes 2007-12-28 06:29:00 AEDT
Observed the same issue going between two Gentoo boxes, one 32-bit, the other 64-bit. The patch worked.

Can't do the debugging right now, but might be able to in the future, if necessary.
Comment 4 Darren Tucker 2007-12-29 04:43:32 AEDT
Created attachment 1425
Correctly set max packet size for remote TCP forwarded connections.

I believe I have found the root cause of this.

I could not reproduce the connection termination, but I suspect that's simply a function of the relative speed of the hosts and/or link.  Using the supplied command produced this gem in the debug output on the server side:

debug1: channel 3: new [forwarded-tcpip]
debug2: channel 3: open confirm rwindow 2097152 rmax 2097152

This means that, for this channel, the client is advertising a maximum (SSH) packet size of 2MB, which is silly because a hard-coded sanity check in packet.c limits the packet size to 256KB.

I believe the attached patch will resolve the problem, without resorting to reverting the patch that increases the channel window sizes (and thus performance on long, fat pipes).

Could someone who is having the problem please try the patch and let us know if it resolves the problem?

Thanks.
Comment 5 steven_parkes 2007-12-29 05:15:10 AEDT
Works for me.
Comment 6 Darren Tucker 2007-12-29 07:33:16 AEDT
Created attachment 1426
Set correct SSH packet size in remote TCP forwarding and agent forwarding.

It turns out there's a second instance where this can happen, in the agent forwarding code; however, it's extremely unlikely ever to be triggered.

I'm including the updated patch here in case anyone picks this up for a backport, in which case they may as well get the fixes for both.
Comment 7 Darren Tucker 2007-12-29 07:35:38 AEDT
Thanks, Steven.
Comment 8 Darren Tucker 2007-12-29 09:45:37 AEDT
The patch in attachment #1426 has just been committed and will be in 4.8.

Thanks.
Comment 9 Damien Miller 2008-03-31 15:21:14 AEDT
Fix shipped in 4.9/4.9p1 release.