Bug 2615 - LoginGraceTime bypass (DoS)
Summary: LoginGraceTime bypass (DoS)
Status: NEW
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: 7.3p1
Hardware: SPARC Solaris
Importance: P5 normal
Assignee: Assigned to nobody
Reported: 2016-09-15 22:49 AEST by Tomas Kuthan
Modified: 2016-09-16 10:10 AEST

Attachments
watchdog process backing-up login_grace_time alarm (4.78 KB, patch)
2016-09-15 22:56 AEST, Tomas Kuthan

Description Tomas Kuthan 2016-09-15 22:49:36 AEST
On one of our internal systems we have encountered an issue that we believe could be exploited to mount a denial-of-service attack on sshd.

It was reported that the ssh service on that host was occasionally refusing new connections:

$ ssh triassic date
ssh_exchange_identification: Connection closed by remote host

It turned out that there were multiple processes hanging in the kernel, trying to access the user's authorized_keys file in an NFS home directory.

The reason for the NFS hang is not that important (the user's home directory had been moved to another server, but the directory was never unmounted, so the NFS client was still trying to access the old server, where the nfs/server service had already been disabled).

The monitor processes are blocked in open(), called from auth_openfile(), called from user_key_allowed():

  core 'core.sshd.699975' of 699975:      /usr/lib/ssh/sshd -R 
     00007ff5c3658fbe __systemcall6 () + 1e 
     00007ff5c3622d4a __open () + 1a 
     00007ff5c363dbee open () + 12e 
     000000000045a20d auth_openfile () + 3d 
     0000000000465ccc user_key_allowed () + 3fc 
     000000000046999b mm_answer_keyallowed () + 45b 
     000000000046bf08 monitor_read () + 118 
     000000000046c2f8 monitor_child_preauth () + 308 
     000000000044cba0 main () + 1eb0 
     00000000004492d3 _start () + 43

NFS blocks most signals, including SIGALRM, for the duration of the over-the-wire call. The alarm implementing login_grace_time was queued but never delivered to the process. As a result, the sshd process stayed unauthenticated for much longer than LoginGraceTime seconds. The user tried to ssh in multiple times, eventually using up the soft limit of MaxStartups connections. After that, sshd started probabilistically dropping connections from other users.
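To illustrate the "queued, but never delivered" behaviour, here is a minimal Python sketch. It is only an analogy: it blocks SIGALRM with pthread_sigmask to stand in for the kernel holding the signal during the NFS call, and uses time.sleep() in place of the stuck open(). The function name and the timings are made up for the example. It needs a Unix system and must run in the main thread.

```python
import signal
import time


def alarm_while_blocked(grace=0.05, stall=0.2):
    """Arm a grace timer, then stall with SIGALRM blocked.

    Returns (pending_during_stall, fired_during_stall, fired_after_unblock).
    """
    fired = []
    old_handler = signal.signal(
        signal.SIGALRM, lambda signum, frame: fired.append(signum)
    )
    # Stand-in for NFS holding signals during the over-the-wire call.
    signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGALRM})
    try:
        signal.setitimer(signal.ITIMER_REAL, grace)  # the "grace time" alarm
        time.sleep(stall)  # stand-in for open() stuck on NFS
        pending = signal.SIGALRM in signal.sigpending()
        fired_during_stall = bool(fired)
    finally:
        # Once the call returns, the queued signal is finally delivered.
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGALRM})
    fired_after_unblock = bool(fired)
    signal.signal(signal.SIGALRM, old_handler)
    return pending, fired_during_stall, fired_after_unblock
```

In the hang described above the call never returns, so the unblock step is never reached and the handler never runs.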

In this case it happened by accident.
But an attacker who controls their NFS home directory could use this to mount a DoS attack on sshd. All they would need to do is stop the nfs/server service on their home directory server and try to ssh in using public key authentication in a loop. Eventually the hung connections would reach the hard limit of MaxStartups, and all subsequent ssh attempts would be dropped.
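For readers unfamiliar with MaxStartups, the sketch below shows how the drop probability grows with the number of unauthenticated connections, following the start:rate:full description in sshd_config(5). It assumes that "soft limit" above means start and "hard limit" means full, and it uses 10:30:100 as example values. The function name is hypothetical, and this is a float approximation rather than the code sshd actually runs.

```python
def drop_probability(unauthenticated, start=10, rate=30, full=100):
    """Approximate chance that sshd refuses a new connection.

    Below `start` nothing is dropped; at `start` the chance is rate%;
    it then rises linearly to 100% at `full`.
    """
    if unauthenticated < start:
        return 0.0
    if unauthenticated >= full:
        return 1.0
    base = rate / 100.0
    return base + (1.0 - base) * (unauthenticated - start) / (full - start)
```

With these example values, each hung pubkey attempt beyond the tenth raises the chance that an unrelated user's connection is refused, until every attempt is refused at 100.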

To reproduce:

----------------------------------
NFS server (servera)
----
zfs create rpool/nora
zfs set share.nfs=on rpool/nora
useradd -u 6378 -d /rpool/nora nora
chown -R nora /rpool/nora
# disable NFS to create hang:
svcadm disable nfs/server

SSH server (serverb)
----
useradd -u 6378 -d /home/nora nora
echo 'nora servera:/rpool/nora' >>/etc/auto_home

SSH client (can be the same host as NFS server)
----
# create key for pubkey auth:
ssh-keygen -t rsa
ssh nora@serverb

Result
----
The ssh command hangs indefinitely.
There will be an sshd process on the SSH server hanging in NFS.
----------------------------------
Comment 1 Tomas Kuthan 2016-09-15 22:56:57 AEST
Created attachment 2875
watchdog process backing-up login_grace_time alarm

I have implemented and successfully tested a candidate fix: a single-purpose watchdog process backing up the login_grace_time alarm in the main process. If the main process doesn't authenticate or exit within login_grace_time seconds, the watchdog kills it with SIGTERM (or, eventually, SIGKILL). Patch attached.

I have rejected several other fix ideas:
- threads - unlikely to be accepted upstream
- main sshd process keeping track of unauthenticated children
    - too much logic in process listening for new connection
- allow preauth child to send signal to the monitor
    - too much privs to unprivileged process
    - wouldn't work w/o privilege separation
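The watchdog idea can be sketched roughly as follows. This is not the attached patch, just a minimal Python illustration of the SIGTERM-then-SIGKILL escalation. The names watchdog and demo, the timings, and the child that ignores SIGTERM are all assumptions made for the example. A real implementation would also have to stop the watchdog once authentication succeeds, which is not shown here.

```python
import os
import signal
import time


def watchdog(target_pid, grace, kill_after):
    """Wait `grace` seconds, then SIGTERM the target; escalate to SIGKILL."""
    time.sleep(grace)
    for sig, wait in ((signal.SIGTERM, kill_after), (signal.SIGKILL, 0)):
        try:
            os.kill(target_pid, sig)
        except ProcessLookupError:
            return  # target already exited, nothing left to do
        time.sleep(wait)


def demo(grace=0.3, kill_after=0.2):
    """Fork a process that ignores SIGTERM, plus a watchdog for it."""
    stuck = os.fork()
    if stuck == 0:
        try:
            # Stand-in for a process that does not react to the polite signal.
            signal.signal(signal.SIGTERM, signal.SIG_IGN)
            time.sleep(30)
        finally:
            os._exit(0)
    dog = os.fork()
    if dog == 0:
        try:
            watchdog(stuck, grace, kill_after)
        finally:
            os._exit(0)
    _, status = os.waitpid(stuck, 0)
    os.waitpid(dog, 0)
    # Report which signal ended the stuck process, if any.
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None
```

Calling demo() returns the signal number that terminated the stand-in process. Because that process ignores SIGTERM, the watchdog has to fall back to SIGKILL.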
Comment 2 Darren Tucker 2016-09-16 10:10:05 AEST
The downside is that the extra process makes it a bit easier to DoS it by pid exhaustion, although admittedly LoginGraceTime is much more likely to be the limiting factor.  I kinda hope there's another way to do this but I can't think of one offhand :-(