Bug 1700

Summary: ssh-agent dies under high load
Product: Portable OpenSSH
Reporter: PFudd <kernel>
Component: ssh-agent
Assignee: Assigned to nobody <unassigned-bugs>
Status: CLOSED INVALID
Severity: minor
CC: t8m
Priority: P2
Version: 5.1p1
Hardware: ix86
OS: Linux

Description PFudd 2010-01-21 17:52:18 AEDT
Hi..

I've found that under high load, the ssh-agent process dies.

The high load environment:
Home computer:  Fedora 10,  openssh-5.1p1
Gateway computer: Fedora 11, openssh-5.2p1
Cluster head node: MacOSX 10.6, openssh-5.2p1
6 Child nodes: MacOSX 10.4, 10.5, 10.6, various versions

The program:
On the cluster head node, run "ssh node### blastall -p blastp ..." hundreds of times (80 jobs simultaneously, using 'make -j 80'), where node### is a random child node.

The crash:
The ssh-agent process running on the home computer dies after a few minutes of this, so the saved key is lost and new ssh connections prompt for my private key's passphrase.  Running 'ssh-agent tcsh' and 'ssh-add' fixes the problem, at least for the children of one terminal window.  Logging out and back in restores full functionality.

Observations:
Running 40 jobs simultaneously works.
I should probably be using an ssh-agent on the cluster head node, or some other method that doesn't use ssh-agent.
I didn't see any rate-limiting code in ssh-agent for openssh-5.1p1, so this is probably a bug, not a feature.

Thanks!
Comment 1 Tomas Mraz 2010-01-21 18:17:14 AEDT
Are you sure that the ssh-agent running on the home computer (Fedora 10) is a real ssh-agent and not for example gnome-keyring-daemon emulating ssh-agent?

echo $SSH_AUTH_SOCK
will tell you more.
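
A rough way to read that output, sketched as a small shell helper. This is only a heuristic: the function name classify_agent_socket is made up for illustration, and the path patterns are assumptions (OpenSSH's ssh-agent typically creates a socket like /tmp/ssh-XXXXXX/agent.<pid>, while gnome-keyring sockets usually have "keyring" somewhere in the path). It guesses from the path alone and does not inspect the process behind the socket.

```shell
# Heuristic only: guess which agent implementation owns a socket path.
# The patterns below are assumptions about typical default locations.
classify_agent_socket() {
    case "$1" in
        "")               echo "none" ;;           # no agent socket set
        *keyring*)        echo "gnome-keyring" ;;  # e.g. /tmp/keyring-XXXXXX/...
        */ssh-*/agent.*)  echo "ssh-agent" ;;      # e.g. /tmp/ssh-XXXXXX/agent.1234
        *)                echo "unknown" ;;
    esac
}

# Classify whatever the current session points at.
classify_agent_socket "${SSH_AUTH_SOCK:-}"
```

If the result is "unknown", checking which process has the socket open is the more reliable test.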
Comment 2 PFudd 2010-01-21 18:50:19 AEDT
(In reply to comment #1)
> Are you sure that the ssh-agent running on the home computer (Fedora
> 10) is a real ssh-agent and not for example gnome-keyring-daemon
> emulating ssh-agent?
> 
> echo $SSH_AUTH_SOCK
> will tell you more.

Ah, I see; it is the gnome-keyring-daemon.  I'll go file this bug with them.

Thanks!
Comment 3 PFudd 2010-01-21 19:15:08 AEDT
Apparently it's already been fixed in a later version.

I'm using gnome-keyring-2.24.1-1.fc10.i386.rpm; the bug was reported in https://bugzilla.gnome.org/show_bug.cgi?id=580068 and apparently fixed around August 2009.
Comment 4 Damien Miller 2010-04-16 15:49:40 AEST
Mass move of bugs RESOLVED->CLOSED following the release of openssh-5.5p1