Bug 787 - Minor security problem due to use of deprecated NGROUPS_MAX in uidswap.c (sshd)
Summary: Minor security problem due to use of deprecated NGROUPS_MAX in uidswap.c (sshd)
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd (show other bugs)
Version: 3.7.1p2
Hardware: All All
: P2 normal
Assignee: OpenSSH Bugzilla mailing list
URL:
Keywords:
Depends on:
Blocks: 793
  Show dependency treegraph
 
Reported: 2004-01-13 21:55 AEDT by Holger van Lengerich
Modified: 2004-04-14 12:24 AEST (History)
1 user (show)

See Also:


Attachments
Use sysconf where available for NGROUPS_MAX. (3.11 KB, patch)
2004-02-06 20:44 AEDT, Darren Tucker
no flags Details | Diff
NGROUPS patch (3.55 KB, patch)
2004-02-20 13:01 AEDT, Tim Hockin
no flags Details | Diff
new NGROUPS patch (3.60 KB, patch)
2004-02-24 08:33 AEDT, Tim Hockin
no flags Details | Diff
NGROUPS patch with our own get_ngroups() (3.92 KB, patch)
2004-02-24 11:26 AEDT, Tim Hockin
no flags Details | Diff
Size group array using sysconf, if available (3.46 KB, patch)
2004-02-24 12:13 AEDT, Damien Miller
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Holger van Lengerich 2004-01-13 21:55:37 AEDT
I am able to produce the following behaviour by sshd on Solaris 2.8 and Linux
2.4.23, when NGROUPS_MAX at runtime is larger than at compile-time:

On both systems "fatal: getgroups: Invalid argument" gets logged via syslog   
and the sshd is terminating before any authentication is attempted. I located
the problem in uidswap.c, where deprecated NGROUPS_MAX is used.

NGOUPS_MAX is defined in limits.h and tells the maximum number  
of groups which an account can be member of. As NGROUPS_MAX is determined at
compile-time, this limit gets hardcoded into the resulting binary. As
NGROUPS_MAX may be larger at runtime than at compile-time it should be
regarded as deprecated and sysconf(_SC_NGROUPS_MAX) should be used instead.   
(see APUE 2.4.5 also)

In uidswap.c, line 41 NGROUPS_MAX is used to initialize static arrays in
global context. These 2 occurances of NGROUPS_MAX cannot be substituted
through sysconf(_SC_NGROUPS_MAX) easily as memory has to be allocated at runtime.

In the same file NGROUPS_MAX is referenced in line 72 an 81. These occurances
can be replaced easily, once memory for the arrays is allocated according to  
sysconf(_SC_NGROUPS_MAX).

This bug also constitutes a minor security problem as it may be exploited to  
remotely enumerate accounts, which are member of more then NGROUPS_MAX at
compile-time.
Comment 1 Darren Tucker 2004-02-06 19:44:42 AEDT
I will look at this.
Comment 2 Darren Tucker 2004-02-06 20:44:15 AEDT
Created attachment 539 [details]
Use sysconf where available for NGROUPS_MAX.

Please try this patch.	Note: treat with *extreme* suspicion: it contains some
pointer-fu and compiled first try without warnings, so Murphy can't be far
away...
Comment 3 Damien Miller 2004-02-06 23:28:19 AEDT
Comment on attachment 539 [details]
Use sysconf where available for NGROUPS_MAX.

>Index: defines.h
>===================================================================
>RCS file: /usr/local/src/security/openssh/cvs/openssh_cvs/defines.h,v
>retrieving revision 1.109
>diff -u -p -r1.109 defines.h
>--- defines.h	27 Jan 2004 05:40:35 -0000	1.109
>+++ defines.h	6 Feb 2004 09:27:45 -0000
>@@ -541,6 +541,10 @@ struct winsize {
> # define SSH_SYSFDMAX 10000
> #endif
> 
>+#ifdef HAVE_SYSCONF
>+# undef NGROUPS_MAX
>+# define NGROUPS_MAX    (sysconf(_SC_NGROUPS_MAX))

I think that should be:

#if defined(HAVE_SYSCONF) && defined(_SC_NGROUPS_MAX)

We also need to check for sysconf returning -1

I'm wary of this change for 3.8. 

Perhaps a static check for gid >= NGROUPS_MAX?
Comment 4 Tim Hockin 2004-02-20 13:01:01 AEDT
Created attachment 548 [details]
NGROUPS patch
Comment 5 Tim Hockin 2004-02-20 13:01:54 AEDT
The proposed fix is wrong, I think.  Even with sysconf, the NGROUPS_MAX could be
REALLY large.  You don't really want the max possible, you want the current
number, right?

Patch attached.  The only really interesting part is that the bit where it
passes a fake gid_t array to getgrouplist() works, but gets a SEGV on return
from that function (on my RH9 Linux box, anyway).  Changing fake to an array[4]
and setting ngroups to 0 works, though.  Haven't investigated further.  Does
this look more correct, though?
Comment 6 Darren Tucker 2004-02-20 14:31:11 AEDT
Comment on attachment 548 [details]
NGROUPS patch

>diff -u -u -r1.42 uidswap.c
>+	saved_egroups = xrealloc(saved_egroups,
>+	    saved_egroupslen * sizeof(*saved_egroups));

I get a fatal here during authentication:
debug1: temporarily_use_uid: 500/500 (e=0/0)
xrealloc: zero size
Comment 7 Damien Miller 2004-02-23 15:42:26 AEDT
Comment on attachment 548 [details]
NGROUPS patch

There are two changes in this patch. Making the groups_byname array dynamic is
OK, but I don't know about this:

>+		} else {
>+			char gidstr[32];
>+
>+			logit("getgrgid: unknown group id: %d",
>+			    (int)groups_bygid[i]);
>+			snprintf(gidstr, sizeof(gidstr), "%d",
>+			    (int)groups_bygid[i]);
>+			groups_byname[i] = xstrdup(gidstr);
>+		}

I not sure whether it makes sense, but is it a change in behaviour and should
be a separate patch.
Comment 8 Tim Hockin 2004-02-24 06:59:40 AEDT
Would you rather the "unknown group id" string bit be dumped altogether. 
Previously (well, currently, I guess) the code just discards the byname entry
for any groups without a name.  This results in the bygid[] and byname[] arrays
not being parallel for every index.

If you want them to stay parallel, you need a filler for the gids without names.
 Alternatively, you could make them NULL and just catch that case in any code
that touches groups_byname[].

So would you rather I:
a) do it the old way and just drop out unnamed groups (and be non-parallel)
b) stay parallel but make the unnamed entries be NULL (and fix ga_match() and
ga_free())
c) stay parallel but add an informative error message (as it is now)

I'll be happy to resubmit a patch post-haste if someone would care to make that
decision.  Probably a or b sounds fine to me.  On further consideration, c isn't
so tasteful. :)

I'd like top get this patch off my plate, so I can wrap up the NGROUPS stuff. 
Anything else I need to do to make this go in in the near future?
Comment 9 Ben Lindstrom 2004-02-24 07:06:00 AEDT
I think we are in a lock for 3.8.. However I'm more in favor of the existing 
behavior.

The only other issue is the one pointed out by Darren.  "No secondary groups" 

- Ben
Comment 10 Tim Hockin 2004-02-24 07:19:13 AEDT
I've got the "no secondary groups" thing fixed in my tree (I think).  I'll go
for existing behavior on this and repost the patch.  I notice that this bug is
now block #793 (3.8p1 release) so that is fine for me.  As long as the patch is
"approved" it is off my plate and onto yours :)
Comment 11 Tim Hockin 2004-02-24 08:33:59 AEDT
Created attachment 549 [details]
new NGROUPS patch

This should work if you have no supplementary groups.  It also keeps existing
behavior for unnamed groups in groups_byname[].

Issue:	the first call to getgrouplist() in groupaccess.c:ga_init().

On my unpatched RH9 box, this segfaults.  On my RHEL3 box (should be just like
RH9) it works.	Based on stack examination, the getgroupslist() function on my
RH9 box writes the gid list to the stack, heedless of the ngroups parameter. 
The RHEL box seems to do the right thing, except for wantonly assuming that at
least ONE gid will be available and that ngroups is at least 1.

So I work around the case of requiring one gid (not too gross), but what can be
done about it ignoring the ngroups param on RH9?  Nothing that seems
reasonable.  Fix glibc.

So I think this is correct, or as correct as can be.  More testers to confirm
that would be nice.
Comment 12 Damien Miller 2004-02-24 09:55:53 AEDT
We could use OpenBSD's:

$ wc -l /usr/src/lib/libc/gen/getgrouplist.c
      94 /usr/src/lib/libc/gen/getgrouplist.c

but from reading the manpage, it looks like the behaviour of returning the
number of groups is not guaranteed - OpenBSD's implementation will return up to
will return up to *ngroups, but not more. 

We may need OpenBSD's implementation anyway - getgrouplist isn't POSIX, it comes
from 4.4BSD.

I think we will need to feed getgrouplist with the maximum available, i.e:

int ngroups;
gid_t *gidlist

ngroups = NGROUPS_MAX;
#if defined(HAVE_SYSCONF) && defined(_SC_NGROUPS_MAX)
ngroups = MAX(ngroups, sysconf(_SC_NGROUPS_MAX));
#endif

gidlist = xmalloc(sizeof(*gidlist) * ngroups));

if (getgrouplist(user, group, gidlist, &ngroups) == -1)
    fatal

xfree(gidlist);
Comment 13 Tim Hockin 2004-02-24 10:36:40 AEDT
NGROUPS_MAX might be INT_MAX.  You *can't* use it as an array size.

We could replace getgrouplist() with a hand-rolled:

int get_ngroups(const char *user);

That would avoid the reliance on using getgrouplist() with a short list.  How
hard is it to walk the getgr* functions to count how many groups you are in? We
can add the 'base' group, too and filter that out from the getgr* results.  I
bet it's 20 lines of code.
Comment 14 Damien Miller 2004-02-24 11:13:34 AEDT
> NGROUPS_MAX might be INT_MAX. 

That would be utterly idiotic.
Comment 15 Tim Hockin 2004-02-24 11:22:37 AEDT
Not if there really is NO LIMIT to the number of groups.  Or it might be 256k. 
Is it really worth a 256 to a meg of stack space for this?
Comment 16 Tim Hockin 2004-02-24 11:26:22 AEDT
Created attachment 550 [details]
NGROUPS patch with our own get_ngroups()

Since getgrouplist() is stupid when we just want it to count the groups, we can
roll our own.

This might not be needed.  The prior (-4) version of this patch SHOULD work on
anything that supports getgrouplist() (modulo buffer overflow bugs in
getgrouplist() which this is happy to trigger), which must be everything that
openssh supports, since we use it.
Comment 17 Darren Tucker 2004-02-24 12:08:45 AEDT
Comment on attachment 550 [details]
NGROUPS patch with our own get_ngroups()

>Index: uidswap.c
[...]
>+	saved_egroupslen = getgroups(0, NULL);

Should use get_ngroups here too, no?

>Index: groupaccess.c
[...]
>+		struct group *gr = getgrent();

If you're using getgrent() shouldn't you setgrent() first?
Comment 18 Damien Miller 2004-02-24 12:12:27 AEDT
All the manpages that I see state that sysconf(SC_NGROUPS_MAX) is the canonical
place to determine the maximum number of groups that an account can be a member
of. Furthermore, the Linux 2.6.1 and glibc -current headers define NGROUPS_MAX
as 32.

Why would anyone set a *runtime* limit higher than the actual max number of
groups in use for a single account?

Comment 19 Damien Miller 2004-02-24 12:13:29 AEDT
Created attachment 551 [details]
Size group array using sysconf, if available
Comment 20 Damien Miller 2004-02-24 12:16:07 AEDT
Darren,

>+	saved_egroupslen = getgroups(0, NULL);

This does the right thing: it returns the number of supplemental groups that the
process is a member of. 
Comment 21 Darren Tucker 2004-02-24 12:20:17 AEDT
Ah, OK, misread it.
Comment 22 Tim Hockin 2004-02-24 12:28:28 AEDT
Darren: getgroups() works properly, but yeah setgrent() is probably correct.

Damien: Linux 2.6.3 will change NGROUPS_MAX to 64k.  You *really* do not care
what _SC_NGROUPS_MAX says - it is the MAXIMUM (actually, it's the minimum
maximum, says POSIX.  The actual maximum may be higher than sysconf() reports,
if I recall correctly).  All you care about is the ACTUAL count.  You can get
the actual count from a getgrent loop or from a proper getgrouplist() (which
work on all systems, and not just HAVE_SYSCONF systems).

I am pretty sure that either version will work (with a call to setgrent() to be
pedantic in the get_ngroups() version) properly on the systems we care about. 
sysconf() will probably work but is sub-optimal and (to be pedantic) potentially
wrong.
Comment 23 Damien Miller 2004-02-24 12:41:28 AEDT
SuSv3 says _SC_NGROUPS_MAX: "Maximum number of simultaneous supplementary group
IDs per process". Sizing arrays using NGROUPS_MAX and/or sysconf is a very
common idiom. Even the Linux manpages recommend this.

Most systems don't even have anything like 64k groups, let alone accounts which
are members of all of them. Have you not heard of optimising for the common case?

What you propose means more complexity for every piece of downstream software
that uses supplemental groups. 
Comment 24 Tim Hockin 2004-02-24 13:05:46 AEDT
Well, there are very few bits of code that need hacking to work with 64k groups,
so I have to discount the bit about extra complexity.

Speaking of optimizing for the common case: this is called ONCE (unless I
misread) per process.  The real optimization is to use only as much memory as is
strictly needed, though neither you nor I are actually optimizing anything at
all.  The runtime of this code is so far away from the fast path of anything
that it's dumb to be arguing about.

I should also mention that sooner or later _SC_NGROUPS_MAX may end up as an
actual tunable in Linux.  Again, you don't care what the maximum is, just what
the actual number is.  Further, since the patch(es) I proposed are VERY simple
and work reliably, why would you opt AGAINST them, for something that is less
precise AND might not be available on a platform (thereby falling back on
today's buggy behavior).  I can't see the reason for arguing that as a win.

But, in the end, it's not my project, right?
Comment 25 Damien Miller 2004-02-24 13:10:33 AEDT
Simpler patch applied. If necessary, we can revisit this under a different bug
post-release if necessary.

Tim, for a long time software has sized arrays using NGROUPS_MAX and/or sysconf.
By changing NGROUPS_MAX, you break binary compat in scary ways. By making the
baseline _SC_NGROUPS_MAX so high, you waste memory everywhere and force everyone
else to do the work in cleaning up after you.

Hopefully the glibc people will try to shelter userland from this silly and
gratuitous change.
Comment 26 Tim Hockin 2004-02-24 13:16:17 AEDT
I did not find as many examples of people (mis)using NGROUPS_MAX as you'd like
to believe.  Customers demanded more groups.  So it had to grow.

As I keep arguing - you DON'T want to know the maximum groups (in 99% of cases).
 You want to know the ACTUAL groups.  And I am doing what I can to find apps
like openssh that are broken, and help them to see the light.  Once it is fixed,
it's fixed.

Cheers.
Comment 27 Damien Miller 2004-04-14 12:24:20 AEST
Mass change of RESOLVED bugs to CLOSED