| Summary: | Fix race conditions in forwarding tests | | |
|---|---|---|---|
| Product: | Portable OpenSSH | Reporter: | Colin Watson <cjwatson> |
| Component: | Regression tests | Assignee: | Assigned to nobody <unassigned-bugs> |
| Status: | CLOSED FIXED | | |
| Severity: | enhancement | CC: | djm, dtucker |
| Priority: | P5 | | |
| Version: | 7.4p1 | | |
| Hardware: | Other | | |
| OS: | Linux | | |
| Bug Depends on: | | | |
| Bug Blocks: | 2647 | | |
| Attachments: |
|
||||||||
|
Description
Colin Watson
2017-01-03 02:32:57 AEDT
That seems to have indeed improved matters in my test runners. The remaining failure I see should be fixed by the patch in https://bugzilla.mindrot.org/show_bug.cgi?id=2660.

Applied, thanks.

This change is causing hangs in forwarding.sh in OpenBSD -current.

(In reply to Damien Miller from comment #3)
> This change is causing hangs in forwarding.sh in OpenBSD -current.

It didn't when I committed it and doesn't with binaries from Jan 24. I did see a problem when I just did a make && make install in the ssh dir, however those problems also went away when I did a clean build. Can you reproduce with a clean build? if not, maybe it's fallout from the libssl churn.

Yes, it fails with a clean build on OpenBSD and on Linux too. I'll see if I can bisect to see what broke it.

Fails on OpenBSD with usr.bin/ssh updated to 20170123, so the problem is likely elsewhere.

It looks like the client is crashing:
debug1: Local connections to LOCALHOST:3304 forwarded to remote address 127.0.0.1:4242
debug3: channel_setup_fwd_listener_tcpip: type 2 wildcard 0 addr NULL
debug1: Local forwarding listening on 127.0.0.1 port 3304.
debug2: fd 7 setting O_NONBLOCK
debug3: fd 7 is O_NONBLOCK
debug1: channel 2: new [port listener]
debug1: Local forwarding listening on ::1 port 3304.
debug2: fd 8 setting O_NONBLOCK
debug3: fd 8 is O_NONBLOCK
debug1: channel 3: new [port listener]
Could not request local forwarding.
FAIL: connection failed, should not

(In reply to Damien Miller from comment #6)
> Could not request local forwarding.

err, not crashing. That is a fatal()

Created attachment 2932
slightly less broken
This fixes the hangs. The test was opening a login session with stdout/err diverted.
It still fails in the "exit on -L/-R forward failure" bits:
env SUDO="" "MALLOC_OPTIONS=CFGJRSUX" sh /usr/src/regress/usr.bin/ssh/test-exec.sh /usr/src/regress/usr.bin/ssh/obj /usr/src/regress/usr.bin/ssh/forwarding.sh
generate keys
wait for sshd
start forwarding, fork to background
transfer over forwarded channels and check result
exit on -L forward failure, proto 2
connection failed, should not
exit on -R forward failure, proto 2
connection failed, should not
simple clear forwarding proto 2
clear local forward proto 2
local forwarding not cleared
clear remote forward proto 2
remote forwarding not cleared
stdio forwarding proto 2
config file: start forwarding, fork to background
config file: transfer over forwarded channels and check result
transfer over chained unix domain socket forwards and check result
wait for sshd to exit
failed local and remote forwarding
*** Error 1 in /usr/src/regress/usr.bin/ssh (Makefile:186 't-forwarding')
Figured it out, here's the essence of the fix:
- ${SSH} -S $CTL -O exit somehost
+ ${SSH} -F $OBJ/ssh_config -S $CTL -O exit somehost
The "transfer over forwarded channels and check result" section started an ssh mux master and tried to stop it at the end. Unfortunately, the $SSH invocation used to kill the mux master would pick up the ~/.ssh/config of the user running the test because it didn't specify an explicit configuration. This would result in the mux master persisting to the next block.
Previously, this would work because of two coincidences: 1) the $CTL socket would be carried over to the next test block (ExitOnForwardFailure) and 2) the forwarding specification is the same in each block, so there was no collision between them.
When this change added the "rm -f $CTL" to the ExitOnForwardFailure test block, this invalidated coincidence #1 and the forwarding attempts there would get EADDRINUSE because the ports were already busy (forwarded by the lingering mux master from the previous block).
Anyway, I've committed the fix
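
To illustrate the explanation above, here is a minimal sketch of stopping a mux master with an explicit configuration and then confirming it is really gone before the next test block starts. The helper name `stop_mux` and its argument order are hypothetical and are not part of the regress suite. It also assumes the master removes its control socket when it exits, and the 10-second limit is arbitrary.

```shell
# Hypothetical helper for illustration; not part of the OpenSSH regress suite.
# Usage: stop_mux SSH_BINARY CONFIG_FILE CONTROL_SOCKET HOST
stop_mux() {
    ssh_bin=$1 cfg=$2 ctl=$3 host=$4

    # -F pins the configuration file so the user's ~/.ssh/config is not used;
    # -S names the control socket; -O exit asks the master to shut down.
    "$ssh_bin" -F "$cfg" -S "$ctl" -O exit "$host" >/dev/null 2>&1

    # Assumption: the master unlinks its control socket on exit.
    # Poll until the socket path is gone, giving up after about 10 seconds.
    tries=0
    while [ -e "$ctl" ] && [ "$tries" -lt 10 ]; do
        sleep 1
        tries=$((tries + 1))
    done

    # Succeed only if the control socket has disappeared.
    [ ! -e "$ctl" ]
}
```

In a test block this could be called as `stop_mux "$SSH" "$OBJ/ssh_config" "$CTL" somehost || fail "mux master still running"`, so that a lingering master shows up at the point where it was supposed to exit rather than as an EADDRINUSE failure in a later block.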
closing resolved bugs as of 8.6p1 release |