When outputting filenames to the terminal, scp doesn't filter out non-printable characters. Example: $ touch "ab`tput clear`cd" $ ls ab* ab?[H?[2Jcd $ scp ab* localhost:/tmp clears the screen.
Created attachment 2678 [details] similar fix as escaping banner message Just fast note. This is evaluated only in progressmeter, where it is printed out not-escaped, but only control characters for terminal. The issue can be fixed by this simple patch, quite similar way as the banner is printed out. Patch may require some more handling of allocated memory. But here is the basic idea.
Or better way would be use solution proposed in bug discussing banner [1] (which is lying there for over one year untouched). We use that solution for banner for some time in RH so applying it for scp should not cause more pain. It is hardcoding table, but probably best solution we have now. It will only need some adjustments of previous patch to be usable also from scp, but nothing impossible. [1] https://bugzilla.mindrot.org/show_bug.cgi?id=2058#c10
Created attachment 2858 [details] proposed patch using new utf8.h AFAIK, we should fix this the same way the bug #2058. Tested with the original reproducer against current git master and it resolves the problem.
Put on the list for 7.4.
That patch isn't correct unfortunately; we had one similar in commit 0e059cdf5fd that had to be backed out: that code is called from a SIGALARM handler but isn't safe to be run in that context
OpenSSH 7.4 release is closing; punt the bugs to 7.5
TL;DR signals are hard, lets go do some neurosurgery Looking at this again, this is quite difficult to fix so long as our progress meter runs in signal context. Even if we got rid of the malloc calls in utf8.c, that code can never be safe to run in a signal handler - none of the mb*/wc* standard library functions are signal-safe. Some alternatives: 1. Use a thread Nope. 2. Make utf8.c signal-safe This would mean ditching use of mb*/wc* and redoing it longhand. Not impossible but big and brittle. Probably a non-starter. 3. Arrange for the formatting (at least of the filename) to happen in a non-signal context Perhaps we could do it in scpio somehow? The problem here is that it is only called at present for complete writes from atomicio, but perhaps we could add some heuristic that allowed it to be called when the underlying read/write was interrupted by a signal too? We'd still need to be careful though - we couldn't naively update the string that gets written by the progress meter code as a SIGALRM could come along while we're updating it. I think we could do it by doing something like: replacing the filenames with a short array of filenames, a sig_atomic_t index that points to the one that is safe to write to and a way to update the index. 4. Allow safe truncation of the filename in signal context. The main thing the progress meter code needs to do with the filename is truncate it appropriately when it doesn't fit the terminal's columns. The problem this presents for multibyte locales is picking the spots where we can safely split it. We could have start_progress_meter() record an array of "split points" and have refresh_progress_meter() pick the one that makes the sanitised filename fit. This might be the easiest to do, but we'd need a new API in utf8.c
(In reply to Damien Miller from comment #7) [...] > 3. Arrange for the formatting (at least of the filename) to happen > in a non-signal context this one gets my vote. > Perhaps we could do it in scpio somehow? The problem here is that it > is only called at present for complete writes from atomicio, but > perhaps we could add some heuristic that allowed it to be called > when the underlying read/write was interrupted by a signal too? the atomicio6 function provides callback hooks which are used for bandwidth limitation, I think that should be sufficient for this. I'm looking at it.
Created attachment 2942 [details] Simplify population of file name display patch 1 of 2
Created attachment 2943 [details] Generate display name outside of sighandler patch 2 of 2. applies on top of attachment #2942 [details].
Comment on attachment 2942 [details] Simplify population of file name display >+ displayname[0] = '\0'; >+ if (file_len > 0) >+ snprintf(displayname, file_len, "%*s ", file_len * -1, file); I don't this will give good output if it chops file in the middle of a multibyte character.
(In reply to Damien Miller from comment #11) > I don't this will give good output if it chops file in the middle of > a multibyte character. The first patch does not handle multibyte characters, it's a strict simplification of what currently exists. For multibyte handling you want attachment #2943 [details].
oops, yes. What happens if a window size change happens between update_progress_displayname() and update_progress_meter()? It looks like it will cause the filename to only be updated after both have been called. If this is the case, could you move setscreensize() out of update_progress_meter() (where it is in signal context and strictly not safe) and into update_progress_displayname(). Also: > + strnvis(buf, file, sizeof buf, VIS_SAFE); I think smprintf() will fallback to vis(3) internally so you shouldn't need this case.
(In reply to Damien Miller from comment #13) > oops, yes. What happens if a window size change happens between > update_progress_displayname() and update_progress_meter()? The display will be a bit off for one cycle then it'll correct itself. If the window got wider, the line will be too narrow briefly. If the window got narrower then the line will be too long but otherwise OK. In the latter case I think the exact behaviour will vary depending on the terminal, but gnome terminal here doesn't line feed so it also recovers ok. > It looks like it will cause the filename to only be updated after > both have been called. I don't follow. displayname is an array of 2 strings, one of which should be good as soon as update_progress_displayname() returns. > If this is the case, could you move > setscreensize() out of update_progress_meter() (where it is in > signal context and strictly not safe) and into > update_progress_displayname(). I actually tried that and it caused the filename to be missing initially although I did not immediately see why. > I think smprintf() will fallback to vis(3) internally so you > shouldn't need this case. it doesn't. It gets to the first escape char then it stops writing to the output, sets the output characters param then returns -1 leaving the string unterminated, which will cause corrupted output, head scratching and debug printfs. Hypothetically.
(In reply to Darren Tucker from comment #14) > > I think smprintf() will fallback to vis(3) internally so you > > shouldn't need this case. > > it doesn't. It gets to the first escape char then it stops writing > to the output, sets the output characters param then returns -1 > leaving the string unterminated, which will cause corrupted output, > head scratching and debug printfs. Hypothetically. :) I don't see this behaviour though and it's definitely a bug if it is happening. I've tested that mprintf doesn't choke on control-chars in ssh, ssh-keygen and sftp though we don't use the column-limit feature in any of them AFAIK. Hacking that in to ssh-keygen: +{ char b[80]; int x = 20; +snmprintf(b, sizeof(b), &x, "\x12XXX"); +printf("%s\n", b); +return 0; +} produces the expected "\022XXX" output too for both LC_CTYPE set to UTF-8 and C...
(In reply to Damien Miller from comment #15) > produces the expected "\022XXX" output too for both LC_CTYPE set to > UTF-8 and C... interesting, your test code behaves as expected on OpenBSD but not on Linux (Fedora 24), both using the portable code. char b[80]; int x = 20; snmprintf(b, sizeof(b), &x, "ab\x12XXX"); printf("len %d, '%s'\n", strlen(b), b); return 0; openbsd-current: len 9, 'ab\022XXX' fedora24: len 2, 'ab'
*** Bug 2193 has been marked as a duplicate of this bug. ***
FYI I've fixed mprintf's truncation of escape characters in commit 011c8ffbb027
Move incomplete bugs to openssh-7.6 target since 7.5 shipped a while back. To calibrate expectations, there's little chance all of these are going to make 7.6.
remove 7.5 target
Comment on attachment 2942 [details] Simplify population of file name display >This keeps the padded and/or truncated displayname in its own variable >which will allow later use of non-signal-safe functions like snmprintf >to compose it. ... >- snprintf(buf + strlen(buf), win_size - strlen(buf), >- " %3d%% ", percent); >+ snprintf(buf, sizeof(buf), "\r%s %3d%% ", displayname, percent); AFAIK it's not safe to truncate arbitrary strings using char* (as opposed to wchar_t*) functions when the charset is not US-ASCII or UTF-8. Could you do the truncation using snmprintf() instead?
Move to OpenSSH 7.8 tracking bug
Retarget remaining bugs planned for 7.8 release to 7.9
Retarget unfinished bugs to OpenSSH 8.0
Created attachment 3228 [details] Move progressmeter formatting out of signal handler. This is moves the formatting entirely out of signal handler context and into code called by the atomicio callback. It changes atomicio call the callback on EINTR and EAGAIN so that SIGALRM will interrupt the read or write and the callback will update the progressmeter at the appropriate time.
Patch has been applied and will be in the 8.0 release. Thanks.
close bugs that were resolved in OpenSSH 8.5 release cycle