Bug 1086 - X forwarding won't start when a command is executed in background
Summary: X forwarding won't start when a command is executed in background
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd (show other bugs)
Version: 4.2p1
Hardware: All Linux
: P2 normal
Assignee: Damien Miller
URL:
Keywords:
Depends on:
Blocks: 1047
  Show dependency treegraph
 
Reported: 2005-09-20 07:38 AEST by Tomas Mraz
Modified: 2006-10-07 11:42 AEST (History)
0 users

See Also:


Attachments
Probable fix (13.28 KB, patch)
2005-09-24 11:47 AEST, Damien Miller
no flags Details | Diff
Better patch (5.93 KB, patch)
2005-09-26 08:24 AEST, Damien Miller
no flags Details | Diff
Improved fix (5.94 KB, patch)
2005-09-26 21:43 AEST, Damien Miller
dtucker: ok+
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tomas Mraz 2005-09-20 07:38:43 AEST
This is actually a regression against openssh-4.1p1.

Try: ssh -Y user@host "(sleep 5; xterm) &"

After 5 seconds it will print: xterm Xt error: Can't open display: localhost:10.0

It will work fine on openssh-4.1p1 server.
Comment 1 Darren Tucker 2005-09-20 11:45:09 AEST
It appears to be fallout from this change (channels.c rev 1.200 and session.c
rev 1.302):

   - djm@cvs.openbsd.org 2005/07/17 06:49:04
     [channels.c channels.h session.c session.h]
     Fix a number of X11 forwarding channel leaks:
     1. Refuse multiple X11 forwarding requests on the same session
     2. Clean up all listeners after a single_connection X11 forward, not just
        the one that made the single connection
     3. Destroy X11 listeners when the session owning them goes away
     testing and ok dtucker@

It's probably (3) since the "()&" causes the child of sshd to exit immediately.
 Maybe it needs to wait until the session close completes before destroying the
X11 listeners? 
Comment 2 Damien Miller 2005-09-23 18:57:08 AEST
hm, this is tricky: the /session/ is closed immediately upon child exit, but the
/channel/ persists after this time (until read EOF or error). 

The X11 listeners need to be shut down when the /channel/ goes away, but we
don't have any way to mark the listeners as dependant on the channel. The
choices are:

1. Move x11_chanids from "struct Session" into "struct Channel", so we can
deallocate them with a cleanup function. (ugly layering violation, but we
already have several of these)

2. Don't session_close() a session until its undelying channel goes away
entirely. This may be a good way to do it - just don't deallocate the channel in
session_exit_message(), instead let the channel close callback take care of it
(include X11 closure)

3. Make all sessions share a single X11 listener. This could be simple, but it
would move us further away from supporting multiple X11 forwarding in the future
(if we ever chose to do it)

Comment 3 Damien Miller 2005-09-23 19:36:49 AEST
4. Continue tracking the x11_chanids in session.c in a separate structure to the
session and detach them using a channel callback registered after the normal one
is detached in session_exit_message()

(the obvious approach [2] turns out not to work for reasons that aren't entirely
clear to me).
Comment 4 Damien Miller 2005-09-24 11:47:31 AEST
Created attachment 964 [details]
Probable fix

This is approach #4, creating a new list of X11 channels and the session
channels that they depend upon in session.c.

Please test - we need tests of this particular bug, but also regress and
general usage.
Comment 5 Damien Miller 2005-09-24 12:00:39 AEST
approach #2 didn't work because I wasn't able to arrange for the correct
sequence of "send exit message, close session channel and then close dependent
X11 channels" to occur. If anyone wants to try this, make sure you run the
regression tests, particularly t-exit-status.sh 
Comment 6 Damien Miller 2005-09-26 08:24:21 AEST
Created attachment 965 [details]
Better patch

This is a better (more simple) fix, making approach #2 work right.
Comment 7 Damien Miller 2005-09-26 21:43:38 AEST
Created attachment 967 [details]
Improved fix

This patch is better, it doesn't leak sessions after they are closed.
Comment 8 Tomas Mraz 2005-09-27 03:27:11 AEST
I've tested the patch from comment 7 and the reported problem is fixed.

However it makes it fail the dynamic-forward.sh test. I haven't tried skipping
it so I don't know if later test succeed.
Comment 9 Tomas Mraz 2005-09-27 03:29:53 AEST
I've forgot to put here the test output:
run test dynamic-forward.sh ...
Waiting for forwarded connections to terminate...
The following connections are open:
  #1 direct-tcpip: listening port 4243 for 127.0.0.1 port 4242, connect from
127.0.0.1 port 35822 (t4 r3 i0/0 o3/0 fd 10/10 cfd -1)
nc: unexpected reply size 31 (expected 10)
ssh_exchange_identification: Connection closed by remote host
cmp: EOF on /home/mraz/rhcvs/openssh/FC-4/openssh-4.2p1/regress/ls.copy
corrupted copy of /bin/ls
failed dynamic forwarding
Comment 10 Damien Miller 2005-09-27 08:02:52 AEST
I am sure that you are seeing a bug in nc that was fixed between OpenBSD 3.7 and
3.8. Could you try updating it and running the test again.
Comment 11 Tomas Mraz 2005-09-27 21:57:32 AEST
Actually it was a bug in a patch in Fedora Core 4 nc RPM package. After fixing
it the patched openssh passes all tests fine.
Comment 12 Darren Tucker 2005-10-03 15:02:15 AEST
Comment on attachment 967 [details]
Improved fix

Attachment id #967 looks and tests OK here.
Comment 13 Damien Miller 2005-10-10 20:24:01 AEST
Fix committed - will be in 4.3
Comment 14 Darren Tucker 2006-10-07 11:42:08 AEST
Change all RESOLVED bug to CLOSED with the exception of the ones fixed post-4.4.