Bug 160 - Race condition in clientloop.c?
Summary: Race condition in clientloop.c?
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: ssh
Version: -current
Hardware: All All
Importance: P2 major
Assignee: OpenSSH Bugzilla mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2002-03-13 08:04 AEDT by Nicolas Williams
Modified: 2004-04-14 12:24 AEST

Attachments
Debug output, lsof output, etc... (43.79 KB, text/plain)
2002-03-13 09:10 AEDT, Nicolas Williams

Description Nicolas Williams 2002-03-13 08:04:07 AEDT
We sometimes see scp exit while leaving a hung ssh process behind. The ssh
process is stuck in poll()/select(), waiting for the SSH connection socket to
become readable.

Nothing in the -v -v -v output is untoward. The server-side sshd -d -d -d output
is the same whether the client hangs or not. In either case the client and
server both close and free the last open channel (the session channel).

See the "scp completes but ssh subprocess in deadlock with sshd" thread on the
openssh-unix-dev mailing list.

I will attach a tar file containing ssh -vvv and sshd -ddd output, lsof output,
etc... for good scps and hanging sshs. Note that a select() wrapper was
LD_PRELOADed into ssh; it prints the list of file descriptors being selected
for in every call to select(). Its source will be attached.
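
For readers who have not used this kind of tracing, the following is a minimal sketch of the logging step only. It is not the attached wrapper's source. format_fd_set() and traced_select() are hypothetical names introduced here for illustration. In a real LD_PRELOAD library the wrapper would itself be named select() and would typically obtain the real function with dlsym(RTLD_NEXT, "select"); that part is left out so the sketch stays a plain, self-contained C file.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/select.h>

/*
 * Hypothetical helper: write the descriptors that are set in `set`
 * (and below nfds) into buf as a space-separated list, e.g. "3 5 7".
 * Returns the number of descriptors written.
 */
static int format_fd_set(int nfds, fd_set *set, char *buf, size_t len)
{
    int count = 0;
    size_t used = 0;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    if (set == NULL)
        return 0;

    for (int fd = 0; fd < nfds && fd < FD_SETSIZE; fd++) {
        if (!FD_ISSET(fd, set))
            continue;
        int n = snprintf(buf + used, len - used, "%s%d",
                         count ? " " : "", fd);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0';   /* drop the partially written entry */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}

/*
 * Hypothetical stand-in for the interposed select(): log the read and
 * write sets to stderr, then forward the call unchanged.
 */
static int traced_select(int nfds, fd_set *rd, fd_set *wr, fd_set *ex,
                         struct timeval *tv)
{
    char rbuf[256], wbuf[256];

    format_fd_set(nfds, rd, rbuf, sizeof rbuf);
    format_fd_set(nfds, wr, wbuf, sizeof wbuf);
    fprintf(stderr, "select(nfds=%d) read=[%s] write=[%s]\n",
            nfds, rbuf, wbuf);
    return select(nfds, rd, wr, ex, tv);
}
```

A trace of this kind shows which descriptors the hung ssh is still waiting on.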

This bug appears to be a race condition in the client. Versions of OpenSSH
affected apparently include 2.9p2, 3.0.2p1 and 3.1p1.

See these openssh-unix-dev posts:

http://marc.theaimsgroup.com/?l=openssh-unix-dev&m=101588612615615&w=2
http://marc.theaimsgroup.com/?l=openssh-unix-dev&m=101596073221780&w=2
Comment 1 Nicolas Williams 2002-03-13 09:10:10 AEDT
Created attachment 40
Debug output, lsof output, etc...
Comment 2 Nicolas Williams 2002-03-14 08:25:23 AEDT
Aha!

Yes, there is a race. It's there in 2.9p2, but apparently not in 3.0.2p1.

Essentially, the

"if (compat20 && session_closed && !channel_still_open())"

check at the top of the client loop is not close enough to the
call to select() in client_wait_until_can_do_something(). In fact,
client_wait_until_can_do_something() calls channel_prepare_select(),
which calls channel_handler(), which may well call chan_is_dead().
That can leave no channels open, and yet
client_wait_until_can_do_something() will still go into the
select().
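
The ordering problem described above can be modeled in a few lines of C. This is a simplified sketch, not OpenSSH code and not the fix that was actually applied. The struct, the *_model functions and the two *_iteration functions are hypothetical stand-ins for the real client state, for channel_still_open(), for channel_prepare_select(), and for one pass through the client loop. "Blocking in select()" is represented by a return value so the example does not actually hang.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified client state (not OpenSSH's real structures). */
struct client_state {
    bool compat20;
    bool session_closed;
    int  open_channels;
    bool channel_dying;   /* last channel will be reaped while preparing select */
};

enum outcome { LOOP_EXITS, BLOCKS_IN_SELECT };

static bool channel_still_open_model(const struct client_state *s)
{
    assert(s->open_channels >= 0);
    return s->open_channels > 0;
}

/* Models the exit check quoted from the top of the client loop. */
static bool should_exit(const struct client_state *s)
{
    return s->compat20 && s->session_closed && !channel_still_open_model(s);
}

/* Stand-in for channel_prepare_select(): may free a dead channel. */
static void prepare_select_model(struct client_state *s)
{
    if (s->channel_dying && s->open_channels > 0) {
        s->open_channels--;
        s->channel_dying = false;
    }
}

/* Racy ordering: the exit check runs only before select preparation. */
static enum outcome racy_iteration(struct client_state *s)
{
    if (should_exit(s))
        return LOOP_EXITS;
    prepare_select_model(s);      /* may close the last channel ... */
    return BLOCKS_IN_SELECT;      /* ... yet we still wait for input */
}

/* One possible ordering that closes the window: re-check before waiting. */
static enum outcome rechecking_iteration(struct client_state *s)
{
    if (should_exit(s))
        return LOOP_EXITS;
    prepare_select_model(s);
    if (should_exit(s))
        return LOOP_EXITS;
    return BLOCKS_IN_SELECT;
}
```

Starting from a closed session with one dying channel, racy_iteration() reports that the loop would block with zero channels open, which matches the hang described in this report if the server sends nothing further. rechecking_iteration() exits from the same starting state.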
Comment 3 Damien Miller 2004-04-14 12:24:18 AEST
Mass change of RESOLVED bugs to CLOSED