Bug 277 - X11 forwarding problem behind Router/NAT box
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: -current
Hardware: SPARC Solaris
Importance: P2 normal
Assignee: OpenSSH Bugzilla mailing list
Reported: 2002-06-15 01:30 AEST by Bruce Allen
Modified: 2004-04-14 12:24 AEST

Description Bruce Allen 2002-06-15 01:30:54 AEST
I have a DSL line at home and want to use X11 forwarding to run X clients on a
machine at work. X11 forwarding works fine when the home laptop is connected
directly to the DSL modem. However, I use a router at home so that I can connect
several machines to the net via the same DSL line, and X11 forwarding does NOT
work when I connect to a Solaris host from behind the router.

The strange thing is that if I log into a different host (same version of sshd,
but running under Linux), X11 forwarding does work, even from behind the router.

This router does Network Address Translation (and is set up to forward port
22 to my laptop, so that I can also log into the laptop at home from my machine
at work).

So here is a summary:

without router:
  X11 forwarding from home laptop to Linux box WORKS
  X11 forwarding from home laptop to Solaris box WORKS
with router:
  X11 forwarding from home laptop to Linux box WORKS
  X11 forwarding from home laptop to Solaris box FAILS

I made a transcript using ssh -vX comparing a connection to the Solaris box
with and without the router.  The transcripts (apart from the dates and the
phantom DISPLAY values) are identical.

When I try to start an X client (say an xterm or xclock) the window freezes, and
I cannot use it any more.  I have to kill the shell in which I invoked ssh on
the laptop.

I am enclosing below a transcript of a failed session.  I'd be happy to do some
additional diagnostic work, but don't know where to go from here, and need
guidance.

Thanks!
    Bruce Allen

[ballen@dsl-65-187-169-17 /root]$ ssh -vX ballen@dirac.phys.uwm.edu
OpenSSH_3.1p1, SSH protocols 1.5/2.0, OpenSSL 0x0090603f
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Rhosts Authentication disabled, originating port will not be trusted.
debug1: restore_uid
debug1: ssh_connect: getuid 500 geteuid 500 anon 1
debug1: Connecting to dirac.phys.uwm.edu [129.89.57.19] port 22.
debug1: temporarily_use_uid: 500/500 (e=500)
debug1: restore_uid
debug1: temporarily_use_uid: 500/500 (e=500)
debug1: restore_uid
debug1: Connection established.
debug1: identity file /home/ballen/.ssh/identity type -1
debug1: identity file /home/ballen/.ssh/id_rsa type -1
debug1: identity file /home/ballen/.ssh/id_dsa type 2
debug1: Remote protocol version 1.99, remote software version OpenSSH_3.0.2p1
debug1: match: OpenSSH_3.0.2p1 pat OpenSSH*
Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_3.1p1
debug1: Credentials Expired
debug1: proxy expired: run grid-proxy-init or wgpi first 
        File=/tmp/x509up_u500
  Function:proxy_init_cred
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-cbc hmac-md5 none
debug1: kex: client->server aes128-cbc hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: dh_gen_key: priv key bits set: 129/256
debug1: bits set: 1632/3191
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Host 'dirac.phys.uwm.edu' is known and matches the RSA host key.
debug1: Found key in /home/ballen/.ssh/known_hosts2:14
debug1: bits set: 1629/3191
debug1: ssh_rsa_verify: signature correct
debug1: kex_derive_keys
debug1: newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: waiting for SSH2_MSG_NEWKEYS
debug1: newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: done: ssh_kex2.
debug1: send SSH2_MSG_SERVICE_REQUEST
debug1: service_accept: ssh-userauth
debug1: got SSH2_MSG_SERVICE_ACCEPT
debug1: authentications that can continue: publickey,password,keyboard-interactive
debug1: next auth method to try is publickey
debug1: try privkey: /home/ballen/.ssh/identity
debug1: try privkey: /home/ballen/.ssh/id_rsa
debug1: try pubkey: /home/ballen/.ssh/id_dsa
debug1: authentications that can continue: publickey,password,keyboard-interactive
debug1: next auth method to try is keyboard-interactive
debug1: authentications that can continue: publickey,password,keyboard-interactive
debug1: next auth method to try is password
ballen@dirac.phys.uwm.edu's password: 

debug1: packet_send2: adding 64 (len 60 padlen 4 extra_pad 64)
debug1: ssh-userauth2 successful: method password
debug1: channel 0: new [client-session]
debug1: send channel open 0
debug1: Entering interactive session.
debug1: ssh_session2_setup: id 0
debug1: channel request 0: pty-req
debug1: Requesting X11 forwarding with authentication spoofing.
debug1: channel request 0: x11-req
debug1: channel request 0: shell
debug1: fd 3 setting TCP_NODELAY
debug1: channel 0: open confirm rwindow 0 rmax 16384
Last login: Fri Jun 14 00:33:29 2002 from dsl-65-187-169-
Sun Microsystems Inc.   SunOS 5.8       Generic February 2000
Sun Microsystems Inc.   SunOS 5.8       Generic February 2000
You have mail.
ballen@dirac> xterm &
[1] 1617
ballen@dirac> debug1: client_input_channel_open: ctype x11 rchan 3 win 4096 max 2048
debug1: client_request_x11: request from 129.89.57.19 33305
debug1: fd 7 setting O_NONBLOCK
debug1: channel 1: new [x11]
debug1: confirm x11

This is where everything hangs.


I've also printed out the environment on the machine after I have connected. 
Here it is:
ballen@dirac> env
USER=ballen
LOGNAME=ballen
HOME=/home/ballen
PATH=/usr/ccs/bin:/usr/local/Office51/bin:/home/ballen/bin:/usr/openwin/bin:/opt/Acrobat4/bin:/usr/sbin:/usr/local/bin:/usr/dt/bin:/usr/openwin/bin:/opt/dt/bin:/opt/SUNWspro/bin:/opt/SUNWste/bin:/opt/SUNWneo/bin:/opt/SUNWste/bin:/opt/SUNWimap/bin:/opt/SUNWsmsjc/bin:/opt/SUNWicg/bin:/opt/SUNWvts/bin:/opt/SUNWsms/bin:/opt/SUNWcorba/bin:/opt/SUNWsymon/bin:/opt/SUNWrtvc/bin:/usr/local/X11/bin:.:/home/ballen:/bin:/usr/bin:/usr/ucb:/etc:.:/usr/ccs/bin:/usr/ccs/lib:/usr/local/mpi/bin:/usr/lib/lp/postscript:/home/ballen/rvplayer5.0:/opt/hpnp/bin
MAIL=/var/mail//ballen
SHELL=/bin/tcsh
TZ=US/Central
SSH_CLIENT=65.187.169.17 64439 22
SSH_TTY=/dev/pts/33
TERM=xterm
DISPLAY=dirac:28.0
HOSTTYPE=sun4
VENDOR=sun
OSTYPE=solaris
MACHTYPE=sparc
SHLVL=1
PWD=/home/ballen
GROUP=uwmlsc
HOST=dirac
REMOTEHOST=dsl-65-187-169-17.telocity.com
MOZILLA_HOME=/usr/local/netscape
EDITOR=/usr/openwin/bin/textedit
CVSROOT=/home/cvs/CVS_REPOSITORY/repository_GRASP
NNTPSERVER=news.uwm.edu
ENSCRIPT=-fTimes-Roman10
TG_HOME=/local/tgraph
TG_HOST=dirac.phys.uwm.edu
MANPATH=/usr/openwin/man:/opt/SUNWspro/man:/opt/SUNWste/license_tools/man:/usr/share/man:/usr/local/man:/usr/local/mpi/man:/opt/hpnp/man:
INFOPATH=/usr/local/info
TMPDIR=/tmp/
LD_LIBRARY_PATH=/usr/local/lib:/opt/hpnp/lib
PRINTER=hp2200_1
Comment 1 Kevin Steves 2002-06-15 14:19:25 AEST
I don't know what this is:

debug1: Credentials Expired
debug1: proxy expired: run grid-proxy-init or wgpi first
        File=/tmp/x509up_u500
  Function:proxy_init_cred

I don't have any guesses yet.  I would like to see the sshd -ddd output
on Solaris for the failing case.
Comment 2 Darren Tucker 2002-06-15 20:07:56 AEST
Here's an edited version from a previous (emailed) answer to this:
 
Short answer:
You probably have an MTU/fragmentation problem. For each network interface on
both client and server, set the MTU to 576, e.g. "ifconfig ethX mtu 576". If the
problem goes away, read on.

Long answer:
At each routing hop, IP packets bigger than the outgoing interface's MTU get
fragmented. Only the first fragment carries the TCP port numbers. Firewalls
usually drop everything but the first fragment, since the later fragments can't
be matched against the rulebase. Some NAT configurations (e.g. many-to-one NAT
or port address translation) can't match the later fragments against their
translation state tables.
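
To illustrate why only the first fragment can be matched on ports, here is a
minimal Python sketch that reads the fragment fields of an IPv4 header. The
helper names (parse_fragment_fields, has_tcp_ports, make_header) are made up for
this example, the header is built by hand with its checksum left at zero, and a
20-byte header without IP options is assumed.

```python
import struct

def parse_fragment_fields(ip_header: bytes):
    """Return (more_fragments, offset_bytes) from a 20-byte IPv4 header."""
    # Bytes 6-7 hold 3 flag bits followed by a 13-bit fragment offset,
    # which is counted in units of 8 bytes.
    (flags_and_offset,) = struct.unpack("!H", ip_header[6:8])
    more_fragments = bool(flags_and_offset & 0x2000)
    offset_bytes = (flags_and_offset & 0x1FFF) * 8
    return more_fragments, offset_bytes

def has_tcp_ports(ip_header: bytes) -> bool:
    """Only the fragment at offset 0 starts with the TCP header (and ports)."""
    _, offset_bytes = parse_fragment_fields(ip_header)
    return offset_bytes == 0

def make_header(more_fragments: bool, offset_bytes: int) -> bytes:
    """Build a minimal IPv4 header for illustration (checksum left at zero)."""
    flags_and_offset = (0x2000 if more_fragments else 0) | (offset_bytes // 8)
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0x1234,        # version/IHL, TOS, total length, ID
        flags_and_offset, 64, 6, 0,  # flags+offset, TTL, protocol (TCP), checksum
        bytes([129, 89, 57, 19]),    # source address
        bytes([65, 187, 169, 17]),   # destination address
    )

first = make_header(more_fragments=True, offset_bytes=0)
second = make_header(more_fragments=False, offset_bytes=1464)
print(has_tcp_ports(first), has_tcp_ports(second))
```

A filter that keys its rules or NAT state on ports has nothing to look at in the
second header, which is why such fragments tend to be dropped.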

Logging in and using the shell will normally generate relatively small packets.
However, if you run something that generates a lot of data (e.g. cat'ing a big
file or starting an X app), you may generate a packet bigger than the MTU.

Let's say it's a 1500 byte IP packet and the router has 2 different MTUs (say
1500 & 1484) and no firewall. When the router goes to forward it, the packet is
too big for the interface MTU (1484), so the router breaks it into 2 fragments,
0 and 1. Assuming a 20-byte IP header, fragment 0 is 1484 bytes long (including
the TCP source and dest ports) and fragment 1 carries the remaining 16 bytes of
data behind its own IP header. Both fragments are sent on to the destination.
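
The arithmetic of this example can be checked with a short Python sketch. The
function name and the tuple layout are invented for illustration, and a 20-byte
IP header without options is assumed; real stacks also handle options and the
"don't fragment" flag, which are ignored here.

```python
def fragment(total_len: int, mtu: int, header_len: int = 20):
    """Split an IPv4 packet of total_len bytes for a link with the given MTU.

    Returns a list of (offset_bytes, payload_len, fragment_total_len).
    """
    payload = total_len - header_len
    if total_len <= mtu:
        return [(0, payload, total_len)]  # fits, no fragmentation needed
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_payload = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_payload, payload - offset)
        fragments.append((offset, size, size + header_len))
        offset += size
    return fragments

for frag in fragment(1500, 1484):
    print(frag)
```

With these assumptions the 1500 byte packet splits into a 1484 byte fragment at
offset 0 and a small second fragment carrying the last 16 bytes of data.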

When the first fragment reaches its target, it's held by the IP stack until the 
remaining fragments arrive, at which time the IP packet is reassembled and 
passed up the stack to TCP. If all fragments are not received by the timeout, 
the entire IP packet is discarded and an ICMP "timeout during reassembly" error 
is sent back.

Now add your firewall, which drops fragment 1. Your 1500 byte IP packet times 
out during reassembly and TCP retries, by sending another 1500 byte packet. 
Repeat. Eventually, TCP will time out and you'll get a connection termination.
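
A toy Python model of this failure mode, under the same 20-byte header
assumption. The "firewall" here is a hypothetical filter that passes only
fragments at offset 0; it is not modelled on any particular product.

```python
def fragment_offsets(total_len: int, mtu: int, header_len: int = 20):
    """Byte offsets of the fragments a router would emit for this packet."""
    payload = total_len - header_len
    if total_len <= mtu:
        return [0]
    step = (mtu - header_len) // 8 * 8  # payload bytes per full fragment
    return list(range(0, payload, step))

def firewall(offsets):
    """Hypothetical filter: only the fragment with the TCP ports gets through."""
    return [offset for offset in offsets if offset == 0]

def delivered(total_len: int, mtu: int) -> bool:
    """Reassembly succeeds only if every fragment arrives."""
    sent = fragment_offsets(total_len, mtu)
    return firewall(sent) == sent

for size in (576, 1484, 1500):
    print(size, delivered(size, mtu=1484))
```

Since TCP retransmits the same oversized segment, the large packet fails the
same way on every retry, while the small packets of an interactive shell are
never fragmented and keep working.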

IP stack parameters (such as Path MTU Discovery) and external variables (such as
the MTUs of all the hops between hosts) can also affect whether or not a given
connection will be affected.
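
If you want to find the largest packet size that survives the path rather than
settling on 576, a binary search is one way to do it. The Python sketch below
assumes you supply a probe(size) function that reports whether a packet of that
size gets through unfragmented; how the probe is implemented is left open, and
the search assumes that if a size works, every smaller size works too.

```python
def largest_working_size(probe, low: int = 576, high: int = 1500):
    """Binary search for the largest size in [low, high] where probe(size) is True.

    Returns None if even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid                # mid works, look for something larger
        else:
            high = mid - 1           # mid fails, the answer is smaller
    return low

# Stand-in probe for a path whose smallest MTU is 1484 (purely illustrative).
print(largest_working_size(lambda size: size <= 1484))
```

The search needs about ten probes to cover the 576 to 1500 range, which is far
fewer than stepping the MTU down by hand.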

Maybe I ought to submit this to the FAQ maintainer....
Comment 3 Bruce Allen 2002-06-15 23:47:46 AEST
Darren -- you were correct -- it was fragmented packets not getting forwarded
by the NAT box.   I am closing out the bug report.  Details follow.

Thank you!

The following command on the Solaris box:
  ifconfig hme0 mtu 576
solved the problem.  Unfortunately this Solaris box has some NFS mounted
partitions.  These small MTU values really clobber NFS performance so I'll
probably need to reset the MTU value each time I want to do X11 forwarding.
Sigh.

I'll experiment to find the largest acceptable MTU value.
I don't know where the packets are getting fragmented -- probably by my DSL
provider.  And I agree that you should add this to the FAQ -- I read the FAQ
closely before posting my bug report so if I had seen your posting in the FAQ
it would have saved everyone's time and bandwidth!

Thanks again!  I still can't believe how well the open-source model works
when the developers are committed to their products.

Bruce



******************************************

Kevin -- the thing that you didn't recognize is a (failed) certificate-based
authentication attempt.  This is there because I use some Globus Grid resources
which use strictly certificate-based authentication.  I don't know if this
is part of the standard ssh client or if mine has been linked against some
Globus-enhanced libraries.  In any case, it's not the source of my problem,
which Darren correctly identified.
Comment 4 Damien Miller 2004-04-14 12:24:18 AEST
Mass change of RESOLVED bugs to CLOSED