Bug 3398 - sshd 8.9p1 regressed on ppc32
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: 8.9p1
Hardware: PPC Linux
Importance: P5 enhancement
Assignee: Assigned to nobody
URL:
Keywords:
Depends on:
Blocks: V_9_2
Reported: 2022-03-08 01:41 AEDT by Alexander Kanavin
Modified: 2023-03-17 13:41 AEDT

See Also:


Attachments
autoconf log (48.02 KB, text/plain)
2022-03-08 19:27 AEDT, Alexander Kanavin
generated config.h (57.91 KB, text/plain)
2022-03-08 19:27 AEDT, Alexander Kanavin
autoconf config.log (43.25 KB, application/x-xz)
2022-03-08 19:30 AEDT, Alexander Kanavin

Description Alexander Kanavin 2022-03-08 01:41:00 AEDT
All attempts to connect get rejected:

root@qemuppc:~# ssh localhost
Connection reset by ::1 port 22

with the following in logs:

Mar  7 12:28:10 qemuppc auth.info sshd[714]: ssh_dispatch_run_fatal: Connection from ::1 port 60712: Invalid argument [preauth]

I bisected this, and this is the commit where it started:
https://github.com/openssh/openssh-portable/commit/6582a31c388968f4073af2bd8621880735c3d42b

This does look odd, especially given that ppoll is a core API and ssh 8.9 works fine on other platforms (the Yocto project also tests arm32/64, mips32/64 and x86/64). I'm willing to debug further; I just need some hints.
Comment 1 Alexander Kanavin 2022-03-08 01:41:53 AEDT
root@qemuppc:~# uname -a
Linux qemuppc 5.15.26-yocto-standard #1 PREEMPT Wed Mar 2 21:12:55 UTC 2022 ppc ppc ppc GNU/Linux

FWIW
Comment 2 Alexander Kanavin 2022-03-08 02:51:11 AEDT
I ran sshd under strace, this is where things seem to go awry:

set_robust_list(0xa7d98090, 12)         = 0
close(6)                                = 0
close(7)                                = 0
getpid()                                = 715
getpid()                                = 715
getrandom("\xe7\x45\x7a\x54\xec\xef\xdf\x78\x4b\x73\x30\xac\xe3\x51\xfc\xfe\x84\x7c\xe0\x10\x3d\x1d\xc0\x6a\x54\x26\xd9\x27\x69\x09\x39\x70", 32, 0) = 32
getpid()                                = 715
getpid()                                = 715
getrandom("\x73\x65\x0a\x2a\x8d\xe8\xd7\x06\x5a\x07\xfe\xed\x38\x8c\x45\xd6\x46\x8f\xab\x52\xcd\x3c\x96\xf4\xde\x8d\x49\xbe\x70\x4f\x36\x24", 32, 0) = 32
getpid()                                = 715
getpid()                                = 715
brk(0xc88000)                           = 0xc88000
chroot("/var/run/sshd")                 = 0
chdir("/")                              = 0
getpid()                                = 715
setgroups(1, [995])                     = 0
getuid()                                = 0
getgid()                                = 0
getpid()                                = 715
setresgid(995, 995, 995)                = 0
setresuid(996, 996, 996)                = 0
setgid(0)                               = -1 EPERM (Operation not permitted)
setresgid(-1, 0, -1)                    = -1 EPERM (Operation not permitted)
getgid()                                = 995
getegid()                               = 995
setuid(0)                               = -1 EPERM (Operation not permitted)
setresuid(-1, 0, -1)                    = -1 EPERM (Operation not permitted)
getuid()                                = 996
geteuid()                               = 996
prlimit64(0, RLIMIT_FSIZE, {rlim_cur=0, rlim_max=0}, NULL) = 0
prlimit64(0, RLIMIT_NOFILE, {rlim_cur=0, rlim_max=0}, NULL) = 0
prlimit64(0, RLIMIT_NPROC, {rlim_cur=0, rlim_max=0}, NULL) = 0
getpid()                                = 715
getpid()                                = 715
getpid()                                = 715
getpid()                                = 715
getpid()                                = 715
write(4, "\0\0\4\24\t\24?\345\251\262\371\245\3712;6\336\227\"\372yQ\0\0\1\tcurve2"..., 1048) = 1048
ppoll([{fd=4, events=POLLIN}], 1, NULL, NULL, 8) = -1 EINVAL (Invalid argument)
getpid()                                = 715
write(8, "\0\0\0Y\0\0\0\3\0\0\0\0\0\0\0Mssh_dispatch_run"..., 93) = 93
getpid()                                = 715
exit_group(255)                         = ?
+++ exited with 255 +++


In particular, is calling prlimit64() with zeroes expected?
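
For context on the zeroed limits in the trace, here is a minimal sketch of what confining a process with resource limits looks like. It is not OpenSSH code: the helper name child_is_confined_by_rlimits is hypothetical, and the choice of /dev/null as the file to open is an assumption made for illustration. A forked child zeroes the same three limits that appear in the strace (RLIMIT_FSIZE, RLIMIT_NOFILE, RLIMIT_NPROC) and then checks that it can no longer allocate a new descriptor.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not OpenSSH code): fork a child, zero the three
 * limits seen in the strace, and check that opening a new descriptor fails.
 * Returns 1 if the child was confined as expected, 0 if it was not,
 * and -1 if the check itself could not be carried out.
 */
int
child_is_confined_by_rlimits(void)
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		struct rlimit zero = { 0, 0 };

		/* Zeroing the hard limit too makes the change irreversible. */
		if (setrlimit(RLIMIT_FSIZE, &zero) == -1 ||
		    setrlimit(RLIMIT_NOFILE, &zero) == -1 ||
		    setrlimit(RLIMIT_NPROC, &zero) == -1)
			_exit(2);
		/* With RLIMIT_NOFILE at 0 no new descriptor can be allocated. */
		if (open("/dev/null", O_RDONLY) == -1 && errno == EMFILE)
			_exit(0);
		_exit(1);
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) == 2)
		return -1;
	return WEXITSTATUS(status) == 0;
}
```

The limits are applied in a child so that the calling process keeps its own limits; descriptors that were already open before the limits were lowered stay usable.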
Comment 3 Alexander Kanavin 2022-03-08 03:08:23 AEDT
I compared the configure logs, this difference seems notable:

x86:
             Privsep sandbox style: seccomp_filter

ppc:
             Privsep sandbox style: rlimit
Comment 4 Alexander Kanavin 2022-03-08 03:18:18 AEDT
--with-sandbox=no is the workaround, for now.
Comment 5 Darren Tucker 2022-03-08 12:13:52 AEDT
How did you configure and compile this?  That particular problem should have been caught by this change to configure prior to 8.9:

commit bc16667b4a1c3cad7029304853c143a32ae04bd4
Author: Darren Tucker <dtucker@dtucker.net>
Date:   Tue Feb 22 15:29:22 2022 +1100

    Extend select+rlimit sandbox test to include poll.
    
    POSIX specifies that poll() shall fail if "nfds argument is greater
    than {OPEN_MAX}".  The setrlimit sandbox sets this to effectively zero
    so this causes poll() to fail in the preauth privsep process.
    
    This is likely the underlying cause for the previously observed similar
    behaviour of select() on platforms where it is implemented in userspace on
    top of poll().
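
The failure described in that commit message can be reproduced outside sshd. The following is a rough sketch, not the actual configure test: the function name poll_errno_with_zero_nofile is hypothetical, and only the soft limit is lowered (an assumption made so the limit can be restored afterwards, whereas the sandbox in the strace above zeroes both soft and hard limits). It polls a single descriptor that was opened before the limit was lowered.

```c
#include <errno.h>
#include <poll.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical probe (not the configure test itself): lower the soft
 * RLIMIT_NOFILE to 0, poll one already-open descriptor, restore the limit.
 * Returns 0 if poll() succeeded, the errno value if poll() failed,
 * or -1 if the probe could not be set up.
 */
int
poll_errno_with_zero_nofile(void)
{
	struct rlimit saved, lowered;
	struct pollfd pfd;
	int fds[2], r, saved_errno;

	if (pipe(fds) == -1)
		return -1;
	if (getrlimit(RLIMIT_NOFILE, &saved) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	lowered = saved;
	lowered.rlim_cur = 0;	/* soft limit only, so it can be raised again */
	if (setrlimit(RLIMIT_NOFILE, &lowered) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	pfd.fd = fds[0];
	pfd.events = POLLIN;
	pfd.revents = 0;
	r = poll(&pfd, 1, 0);	/* nfds = 1 now exceeds the limit of 0 */
	saved_errno = errno;

	(void)setrlimit(RLIMIT_NOFILE, &saved);
	close(fds[0]);
	close(fds[1]);
	return r == -1 ? saved_errno : 0;
}
```

On Linux the poll(2) manual page documents EINVAL when nfds exceeds RLIMIT_NOFILE, which matches the ppoll() result in the strace; other systems may let the call succeed, in which case the function returns 0.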
Comment 6 Alexander Kanavin 2022-03-08 19:26:11 AEDT
This is cross-compiled using Yocto for powerpc on an x86 build machine. I'll attach the autoconf logs and config.h in a second.
Comment 7 Alexander Kanavin 2022-03-08 19:27:03 AEDT
Created attachment 3575 [details]
autoconf log
Comment 8 Alexander Kanavin 2022-03-08 19:27:28 AEDT
Created attachment 3576 [details]
generated config.h
Comment 9 Alexander Kanavin 2022-03-08 19:30:01 AEDT
Created attachment 3577 [details]
autoconf config.log
Comment 10 Darren Tucker 2022-03-08 19:37:53 AEDT
(In reply to Alexander Kanavin from comment #6)
> This is cross-compiled using Yocto for powerpc on an x86 build
> machine. I'll attach the autoconf logs and config.h in a second.

That's the reason the test didn't catch it: it relies on being able to run the test via AC_RUN_IFELSE.  We might have to assume the problem always exists when cross compiling.
Comment 11 Alexander Kanavin 2022-03-08 19:47:15 AEDT
Why didn't the test error out though? Not being able to run something to make a decision should be treated as a failure (or there should be a fallback for that scenario).
Comment 12 Darren Tucker 2022-03-08 20:01:47 AEDT
(In reply to Alexander Kanavin from comment #11)
> Why didn't the test error out though? Not being able to run
> something to make a decision should be treated as a failure (or
> there should be a fallback for that scenario).

There is a fallback:

AC_MSG_CHECKING([if select and/or poll works with descriptor rlimit])
AC_RUN_IFELSE(
        ...
        [AC_MSG_WARN([cross compiling: assuming yes])
         select_works_with_rlimit=yes]

In the past this was usually fine because on most systems select(2) didn't have this problem, but for poll(2) having the problem seems to be the common case.
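
To check the select(2) side of that statement on a given system, a probe along the following lines could be used. This is only a sketch under stated assumptions: the function name select_errno_with_zero_nofile is hypothetical, it is not the program that configure runs, and it lowers only the soft limit so that the limit can be restored afterwards.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical probe: lower the soft RLIMIT_NOFILE to 0, call select()
 * on one already-open descriptor, then restore the limit.
 * Returns 0 if select() succeeded, the errno value if select() failed,
 * or -1 if the probe could not be set up.
 */
int
select_errno_with_zero_nofile(void)
{
	struct rlimit saved, lowered;
	struct timeval tv = { 0, 0 };	/* return immediately */
	fd_set rset;
	int fds[2], r, saved_errno;

	if (pipe(fds) == -1)
		return -1;
	if (fds[0] >= FD_SETSIZE ||
	    getrlimit(RLIMIT_NOFILE, &saved) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	lowered = saved;
	lowered.rlim_cur = 0;	/* soft limit only, so it can be raised again */
	if (setrlimit(RLIMIT_NOFILE, &lowered) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	FD_ZERO(&rset);
	FD_SET(fds[0], &rset);
	r = select(fds[0] + 1, &rset, NULL, NULL, &tv);
	saved_errno = errno;

	(void)setrlimit(RLIMIT_NOFILE, &saved);
	close(fds[0]);
	close(fds[1]);
	return r == -1 ? saved_errno : 0;
}
```

A return value of 0 means select() was unaffected by the lowered limit on that system. A probe like this has to run on the target, which is why a cross-compiled build has to fall back on an assumed answer.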
Comment 13 Darren Tucker 2022-03-08 20:11:09 AEDT
I've changed the default for cross-compiling:
https://anongit.mindrot.org/openssh.git/commit/?id=8cf5275452a950869cb90eeac7d220b01f77b12e

I think that should fix this problem.
Comment 14 Darren Tucker 2022-11-07 11:29:50 AEDT
As far as we know this is fixed, please reopen if that's not the case.
Comment 15 Damien Miller 2023-03-17 13:41:35 AEDT
OpenSSH 9.3 has been released. Closing resolved bugs.