When using SSH port forwarding for mail transfers, large mails abort the connection.

Steps to Reproduce:
1. ssh -v -t -R 2852:127.0.0.1:852 REMOTE sleep-command
   specifically:
   ssh -v -t -L 2525:127.0.0.1:25 -R 2852:127.0.0.1:852 -R 2022:127.0.0.1:22 paulina.vellum.cz 'while :;do echo -n .;sleep 1m;done';sleep 1m;done
2. Start transfer at the REMOTE host to port 2852.

Actual results:
local ssh:
debug1: channel 3: new [127.0.0.1]
debug1: confirm forwarded-tcpip
debug3: channel 3: waiting for connection
debug1: channel 3: connected
...
debug2: channel 3: window 935334 sent adjust 1161818
debug2: channel 3: window 966656 sent adjust 1130496
buffer_get_string_ret: bad string length 557056
buffer_get_string: buffer error

It seems to be a new regression in 4.7p1.

See also https://bugzilla.redhat.com/show_bug.cgi?id=286181 and http://lists.mindrot.org/pipermail/openssh-unix-dev/2007-September/025661.html

What is the server doing at this time? The mailing list post had a fatal buffer error at that end too but it isn't clear where it is coming from. It would be really helpful if someone who is able to reproduce this bug could capture debug output from the server at the time of the client crash (sshd -ddd) and ideally a stack trace. To get a stack trace, replace the "cleanup_exit(255);" with "abort();" in fatal.c and recompile.
Created attachment 1349
Fix - undo one patch.

IMO the problem is due to the patch:
  - markus@cvs.openbsd.org 2007/06/11 09:14:00
    [channels.h]
    increase default channel windows; ok djm

The attached patch works around it (tested only briefly and only on the client side).

The problem is reported from packet.c:
    if (packet_length < 1 + 4 || packet_length > 256 * 1024) {
#ifdef PACKET_DEBUG
        buffer_dump(&incoming_packet);
#endif
        packet_disconnect("Bad packet length %u.", packet_length);
    }
and the code is right: a size like 557056 is definitely > 256KB. Removing only this check does not help; the server then crashes on:
Sep 17 21:25:45 host1 sshd[4072]: fatal: buffer_append_space: len 1326080 not supported

The tested server (different from the one in my original bug report) is: openssh-4.3p2-19.fc6.i386

I hope there is now enough info for understanding the problem. I expect you are aware of the window size negotiation across SSH versions and the maximum values permitted by the protocol.

Reproducer is:
$ nc -l 5000 >/dev/null & ssh -vvvv -R 5000:localhost:5000 REMOTE_HOST 'nc </dev/urandom localhost 5000'
(with local x86_64 openssh-4.7 under test and remote openssh-4.3p2-19.fc6.i386, running over an 11Mbit connection)

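For readers following along, the bounds in the packet.c check quoted above can be exercised in isolation. This is only a minimal sketch: the names sketch_packet_length_ok, SKETCH_PACKET_MIN and SKETCH_PACKET_MAX are hypothetical and do not exist in OpenSSH; only the two bounds (1 + 4 and 256 * 1024) and the two failing lengths are taken from this report.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the numeric bounds come from the quoted check. */
#define SKETCH_PACKET_MIN (1u + 4u)        /* lower bound: 1 + 4 bytes */
#define SKETCH_PACKET_MAX (256u * 1024u)   /* upper bound: 262144 bytes */

/* Returns 1 if the length passes the quoted check, 0 if it would be rejected. */
int sketch_packet_length_ok(uint32_t packet_length)
{
    if (packet_length < SKETCH_PACKET_MIN || packet_length > SKETCH_PACKET_MAX)
        return 0;
    return 1;
}

/* Feeds the lengths seen in this report through the check. */
void sketch_packet_length_self_check(void)
{
    assert(!sketch_packet_length_ok(557056u));   /* length from the client log */
    assert(!sketch_packet_length_ok(1326080u));  /* length from the sshd fatal */
    assert(sketch_packet_length_ok(SKETCH_PACKET_MAX)); /* exactly 256KB passes */
}
```

Both lengths from the logs fall well above the 262144-byte bound, which is consistent with the observation that the check itself is behaving as written.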
Observed the same issue, going between two Gentoo boxes, one 32 bit, the other 64. The patch worked. Can't do the debugging right now, but might be able to in the future, if necessary.
Created attachment 1425
Correctly set max packet size for remote TCP forwarded connections.

I believe I have found the root cause of this. I could not reproduce the connection termination, but I suspect that's simply a function of the relative speed of the hosts and/or link.

Using the supplied command produced this gem in the debug output on the server side:
debug1: channel 3: new [forwarded-tcpip]
debug2: channel 3: open confirm rwindow 2097152 rmax 2097152

This means that, for this channel, the client is advertising a maximum (SSH) packet size of 2MB, which is silly as there is a hard-coded sanity check in packet.c which limits the packet size to 256KB.

I believe the attached patch will resolve the problem without resorting to reverting the patch that increases the channel window sizes (and thus performance on long, fat pipes).

Could someone who is having the problem please try the patch and let us know if it resolves the problem? Thanks.

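The mismatch described above can be illustrated with a small, self-contained sketch. It is not the attached patch: the struct, the function names and the SKETCH_* constants are all hypothetical, and the 32KB per-packet default is an assumption chosen so that 64 packets give the 2097152-byte window seen in the log. Only the 2097152 and 256KB figures come from this report.

```c
#include <assert.h>
#include <stdint.h>

/* All names are hypothetical; this models the idea, not OpenSSH's code. */
#define SKETCH_PACKET_LIMIT   (256u * 1024u)  /* receive-side check from packet.c */
#define SKETCH_PACKET_DEFAULT (32u * 1024u)   /* ASSUMED per-packet maximum */
#define SKETCH_WINDOW_DEFAULT (64u * SKETCH_PACKET_DEFAULT) /* 2097152, as in the log */

struct sketch_channel {
    uint32_t local_window;     /* advertised to the peer as rwindow */
    uint32_t local_maxpacket;  /* advertised to the peer as rmax */
};

/* Models the faulty case: the window size is reused as the max packet size. */
struct sketch_channel sketch_open_buggy(void)
{
    struct sketch_channel c = { SKETCH_WINDOW_DEFAULT, SKETCH_WINDOW_DEFAULT };
    return c;
}

/* Models the intended case: large window, but a bounded max packet size. */
struct sketch_channel sketch_open_fixed(void)
{
    struct sketch_channel c = { SKETCH_WINDOW_DEFAULT, SKETCH_PACKET_DEFAULT };
    return c;
}

/* Returns 1 if the peer is allowed packets larger than the receiver accepts. */
int sketch_peer_can_overrun(const struct sketch_channel *c)
{
    assert(c != 0);
    return c->local_maxpacket > SKETCH_PACKET_LIMIT;
}
```

In this model the buggy channel advertises rwindow 2097152 rmax 2097152, so a fast peer may legitimately send a packet that the receiving side then rejects, while the fixed channel keeps the larger window without ever inviting a packet above the limit.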
Works for me.
Created attachment 1426
Set correct SSH packet size in remote TCP forwarding and agent forwarding.

It turns out there's a second instance where this can happen, in the agent forwarding code; however, it's extremely unlikely ever to be triggered. I'm including the updated patch here in case anyone picks this up for a backport, in which case they may as well get the fixes for both.

Thanks, Steven.
The patch in attachment #1426 has just been committed and will be in 4.8. Thanks.
Fix shipped in 4.9/4.9p1 release.