Bug 595 - lots of simultaneous ssh's cause sporadic failures
Status: CLOSED INVALID
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: ssh
Version: -current
Hardware: All All
Importance: P2 normal
Assignee: OpenSSH Bugzilla mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-06-17 06:24 AEST by Jason Duell
Modified: 2004-04-14 12:24 AEST (History)
Description Jason Duell 2003-06-17 06:24:07 AEST
OpenSSH seems to fail sporadically if you issue lots of simultaneous
ssh commands.  Take the following program:

    #!/bin/sh

    for NUM in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do
        ssh foo.bar.com echo $NUM &
    done

So, we're running 16 ssh commands at once, each of which just prints out a
different number.

When I run this program, several of the ssh commands fail with 

    ssh_exchange_identification: Connection closed by remote host

Interestingly, when I run 10 or fewer ssh commands, they all work OK, at
least on my linux box (I'm using OpenSSH 3.5p1-6 on Redhat Linux 9).  On 
some other platforms the number is different:  OpenSSH 3.2.3p1 on an IBM 
SP doesn't like more than 8 simultaneous ssh's in the background, while
OpenSSH_3.6.1p1 on Tru64 does around 9 max.

There doesn't seem to be any pattern in terms of which ssh's get
killed: the first, second and third jobs (i.e., those that print 0, 1, and
2) generally run OK, but which of the subsequent ones die seems to be
random.

This smells like some kind of race condition.

Why on earth would I want to run a dozen ssh jobs simultaneously?  I'm
writing a compiler that needs to ship some files and run some commands 
on a remote server as part of the compilation process.  The latency for
doing this is rather high, so I want to allow users to do a 'make -j' to
parallelize the build, in order to hide the network latency.

I guess for now I'll tell users to run 'make -j N' with N < 6 or so
(which is probably not a bad idea anyway).  But I can't imagine I'll be
the last person to pound on ssh like this...
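A common client-side workaround for this kind of sporadic drop is to retry the command a few times with a short backoff.  The sketch below is a hypothetical helper, not part of OpenSSH; the name `retry` and its behavior are my own assumptions:

```shell
#!/bin/sh
# Hedged sketch: retry a command up to N times with a growing pause,
# as a client-side workaround when the server drops some connections.
# "retry" is a hypothetical helper name, not anything shipped with OpenSSH.
retry() {
    tries=$1; shift
    i=1
    while :; do
        "$@" && return 0            # succeeded: stop retrying
        [ "$i" -ge "$tries" ] && return 1   # out of attempts
        sleep "$i"                  # back off a little longer each time
        i=$((i + 1))
    done
}

# usage in the loop above: retry 3 ssh foo.bar.com echo "$NUM" &
```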
Comment 1 Jason Duell 2003-06-17 10:08:36 AEST
Oops.  This is not a bug...

Sounds like someone is not reading the manpages. <smile>

man sshd_config
[..]
     MaxStartups
             Specifies the maximum number of concurrent unauthenticated
             connections to the sshd daemon.  Additional connections will be
             dropped until authentication succeeds or the LoginGraceTime
             expires for a connection.  The default is 10.

             Alternatively, random early drop can be enabled by specifying the
             three colon separated values ``start:rate:full'' (e.g.,
             "10:30:60").  sshd will refuse connection attempts with a
             probability of ``rate/100'' (30%) if there are currently
             ``start'' (10) unauthenticated connections.  The probability
             increases linearly and all connection attempts are refused if
             the number of unauthenticated connections reaches ``full'' (60).
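As an aside, the linear ramp the excerpt describes can be sketched numerically.  The formula below is inferred from the manpage's wording (probability rate% at ``start``, rising linearly to 100% at ``full``), not copied from the sshd source, and uses the example "10:30:60" values:

```shell
#!/bin/sh
# Hedged sketch: estimated drop probability (percent) for a given number
# of unauthenticated connections, per the MaxStartups "start:rate:full"
# description.  Formula inferred from the manpage, not from sshd itself.
start=10; rate=30; full=60

drop_prob() {
    n=$1
    if [ "$n" -lt "$start" ]; then echo 0; return; fi     # below start: never drop
    if [ "$n" -ge "$full" ]; then echo 100; return; fi    # at full: always drop
    echo $(( rate + (100 - rate) * (n - start) / (full - start) ))
}

drop_prob 10   # prints 30  (30% at "start")
drop_prob 35   # prints 65  (halfway up the ramp)
drop_prob 60   # prints 100 (all connections refused at "full")
```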

 
The default is '10'.  I bumped it up to 20 and your script works fine.

- Ben
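For reference, the change Ben describes is a one-line sshd_config edit (the path below is the common default and may differ per system; sshd must be told to reload its configuration afterwards):

```
# /etc/ssh/sshd_config -- common default path; adjust for your system
MaxStartups 20
```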
Comment 2 Damien Miller 2004-04-14 12:24:19 AEST
Mass change of RESOLVED bugs to CLOSED