Bug 2576 - ssh-agent enters busy loop when running out of fds
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: ssh-agent
Version: 7.2p1
Hardware: Other Linux
Importance: P5 minor
Assignee: Damien Miller
URL:
Keywords:
Depends on:
Blocks: V_7_8
Reported: 2016-05-30 21:37 AEST by Jakub Jelen
Modified: 2018-10-19 17:17 AEDT

Attachments
avoid busy-wait on per-process fd exhaustion (4.19 KB, patch)
2016-05-31 11:45 AEST, Damien Miller
updated to current (3.63 KB, patch)
2018-04-13 14:36 AEST, Damien Miller
dtucker: ok+

Description Jakub Jelen 2016-05-30 21:37:38 AEST
>  Lennart Poettering 2016-05-04 18:28:09 CEST

ssh-agent starts eating 100% CPU if it gets bombarded by connections and runs out of file descriptors. Looking at strace, it cycles in a select() loop: the listening AF_UNIX socket is reported as readable, so ssh-agent invokes accept(), which fails with EMFILE. It then immediately invokes select() again and stays in a busy loop from then on.

I figure ssh-agent should enforce a limit on concurrent connections (one much lower than RLIMIT_NOFILE) and quickly terminate further incoming connections when that limit is hit. Most internet software handles it this way, and I figure ssh-agent should do the same for incoming local clients.

I noticed this while creating a ton of ssh connections to my local system in a tight loop, each of which uses the ssh keyring.

(When ssh-agent is in this mode and you start further ssh instances with the & suffix in a shell (to background them), they also enter a busy loop handling SIGTTOU. I don't have further details about this, though; I was too lazy to figure out what is really going on there.)
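
The connection cap suggested above could look roughly like the following Python sketch (ssh-agent itself is written in C; Python is used here only for brevity). It accepts each pending connection and closes it straight away when the cap is reached, which drains the listen backlog so the listening socket stops reporting as readable. `MAX_CLIENTS` and `accept_with_cap` are hypothetical names and the cap value is an arbitrary assumption. The approach only works if the cap leaves at least one spare descriptor below RLIMIT_NOFILE, so that accept() itself can still succeed.

```python
import socket

MAX_CLIENTS = 2  # hypothetical cap; must sit well below RLIMIT_NOFILE


def accept_with_cap(listener, clients, max_clients=MAX_CLIENTS):
    """Accept one pending connection and keep it only if under the cap.

    Returns the accepted socket, or None if it was rejected and closed.
    """
    conn, _addr = listener.accept()
    if len(clients) >= max_clients:
        # Over the cap: close at once so the peer sees EOF instead of
        # sitting in the backlog and keeping the listener readable.
        conn.close()
        return None
    clients.append(conn)
    return conn
```

A caller would invoke `accept_with_cap(listener, clients)` whenever the listening socket is reported readable, and remove sockets from `clients` as they are closed.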

>  Jakub Jelen 2016-05-26 17:01:26 CEST 

I was trying to burn my virtual box with a lot of requests to ssh-agent, but with only partial success. Still, the behavior you describe sounds plausible.

My test case:

  eval `ulimit -n 10; ssh-agent`
  ssh-add rsa
  cat rsa.pub >> .ssh/authorized_keys
  for i in `seq 1 128`; do ssh localhost id & done
  ls /proc/$SSH_AGENT_PID/fd/ | wc -w

and I am left with a few cycling ssh processes in some cases, or with ssh-agent live-locked.

-----------------------------------------------------------------------------

Copied from RHBZ#1333105 [1]. I can hack around this somehow, but an upstream fix with proper evaluation would make more sense, if this is considered an issue.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1333105
Comment 1 Damien Miller 2016-05-31 11:45:17 AEST
Created attachment 2818
avoid busy-wait on per-process fd exhaustion

This patch should fix the bad behaviour on per-process fd exhaustion, but AFAIK ssh-agent will still spin if file descriptors are exhausted system-wide.
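
One general way to avoid spinning on per-process exhaustion is to stop watching the listening socket while the process is close to its own descriptor limit. The Python sketch below illustrates that idea only; it is not the attached patch. `RESERVED_FDS`, `fd_budget`, `should_watch_listener` and `current_budget` are hypothetical names, and the reserve value is an assumption. Note that the check relies solely on the process's own count and RLIMIT_NOFILE, so it cannot see system-wide exhaustion.

```python
import resource

RESERVED_FDS = 10  # assumption: headroom for stdio, the listener, key files, etc.


def fd_budget(soft_limit, reserved=RESERVED_FDS):
    """Number of client descriptors allowed before the listener is ignored."""
    return max(soft_limit - reserved, 0)


def should_watch_listener(open_client_fds, budget):
    """True while accept() would still have a spare descriptor to use."""
    return open_client_fds < budget


def current_budget(reserved=RESERVED_FDS):
    """Client budget derived from this process's soft RLIMIT_NOFILE."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return fd_budget(soft, reserved)
```

The main loop would call `should_watch_listener(len(clients), budget)` before each wait and leave the listening socket out of the watched set when it returns False. Pending connections then wait in the backlog until a client disconnects.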
Comment 2 Jakub Jelen 2016-05-31 17:18:20 AEST
Thank you for the prompt comment and patch. I built a package and tested it successfully. I no longer see any busy loop or hang.
Comment 3 Damien Miller 2016-07-22 14:10:51 AEST
retarget unfinished bugs to next release
Comment 4 Damien Miller 2016-07-22 14:14:40 AEST
retarget unfinished bugs to next release
Comment 5 Damien Miller 2016-07-22 14:15:44 AEST
retarget unfinished bugs to next release
Comment 6 Damien Miller 2016-07-22 14:17:13 AEST
retarget unfinished bugs to next release
Comment 7 Damien Miller 2016-12-16 14:31:12 AEDT
OpenSSH 7.4 release is closing; punt the bugs to 7.5
Comment 8 Damien Miller 2017-06-30 13:43:13 AEST
Move incomplete bugs to openssh-7.6 target since 7.5 shipped a while back.

To calibrate expectations, there's little chance all of these are going to make 7.6.
Comment 9 Damien Miller 2017-06-30 13:44:28 AEST
remove 7.5 target
Comment 10 Damien Miller 2018-04-06 13:12:18 AEST
Move to OpenSSH 7.8 tracking bug
Comment 11 Damien Miller 2018-04-13 14:36:35 AEST
Created attachment 3142
updated to current

I rewrote ssh-agent's main loop from select(2) to poll(2) a little while ago. It makes this diff quite a bit simpler.
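
With poll(2) the set of watched descriptors is an explicit list rebuilt on each iteration, so ignoring the listening socket amounts to not adding its entry. The Python sketch below illustrates that general idea using `select.poll`; it is not a description of attachment 3142. `ready_fds` and `max_clients` are hypothetical names.

```python
import select
import socket


def ready_fds(listener, clients, max_clients, timeout_ms=0):
    """Poll once; the listener is registered only while below max_clients."""
    poller = select.poll()
    for c in clients:
        poller.register(c.fileno(), select.POLLIN)
    if len(clients) < max_clients:
        # Spare capacity: allow new connections to wake the loop.
        poller.register(listener.fileno(), select.POLLIN)
    # When at capacity, a pending connection stays in the backlog and
    # does not cause poll() to return, so the loop cannot spin on it.
    return [fd for fd, _events in poller.poll(timeout_ms)]
```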
Comment 12 Damien Miller 2018-05-11 13:39:32 AEST
Fix committed and will be in OpenSSH 7.8 - thanks
Comment 13 Damien Miller 2018-10-19 17:17:19 AEDT
Close RESOLVED bugs with the release of openssh-8.0