Bug 3534 - probable underflow calculating display width of file name
Summary: probable underflow calculating display width of file name
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: scp (show other bugs)
Version: -current
Hardware: amd64 Linux
: P5 normal
Assignee: Assigned to nobody
URL:
Keywords:
Depends on:
Blocks: V_9_3
  Show dependency treegraph
 
Reported: 2023-02-07 05:34 AEDT by programmerjake
Modified: 2023-03-17 13:43 AEDT (History)
4 users (show)

See Also:


Attachments
backtrace from snprintf call with invalid maxlen (1.71 KB, text/plain)
2023-02-10 16:28 AEDT, programmerjake
no flags Details
temporary edits applied for debugging (858 bytes, patch)
2023-02-10 16:36 AEDT, programmerjake
no flags Details | Diff
shell script for reproduction (146 bytes, text/plain)
2023-02-10 19:51 AEDT, programmerjake
no flags Details
Used snmprintf() for window width handling (3.20 KB, patch)
2023-02-13 15:50 AEDT, Damien Miller
no flags Details | Diff
reproducer script that checks multiple widths (210 bytes, text/plain)
2023-02-15 13:10 AEDT, programmerjake
no flags Details
output of reproducer 3674 for patch 3670 with \r replaced by ^M (1.59 KB, text/plain)
2023-02-15 13:16 AEDT, programmerjake
no flags Details
Revised diff (3.36 KB, patch)
2023-02-15 20:16 AEDT, Damien Miller
no flags Details | Diff
different approach (4.88 KB, patch)
2023-02-17 16:29 AEDT, Damien Miller
dtucker: ok+
Details | Diff
final(?) diff (5.01 KB, patch)
2023-02-20 14:46 AEDT, Damien Miller
dtucker: ok+
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description programmerjake 2023-02-07 05:34:08 AEDT
I first found this on Termux on AArch64 Android, but am able to replicate on x86-64 Ubuntu 20.04.

running:
touch ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.txt
scp ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~.txt jacob@192.168.1.141:

gives:
FORTIFY: vsnprintf: size 18446744073709551610 > SSIZE_MAX
Aborted

afaict this is terminal-width-dependent, my terminal has $COLUMNS set to 120, if i increase my terminal width to 205, then it completes successfully.

Afaict this bug also occurs in sftp, i was able to crash it by running the corresponding `put` command from interactive sftp.

I was able to reproduce the scp bug on Ubuntu 20.04.5 LTS on x86-64 where it apparently just prints garbage instead of aborting:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~�

Ubuntu's ssh version:
ssh -V
OpenSSH_8.2p1 Ubuntu-4ubuntu0.5, OpenSSL 1.1.1f  31 Mar 2020
Termux's ssh version:
OpenSSH_9.2p1, OpenSSL 3.0.7 1 Nov 2022
Comment 1 Damien Miller 2023-02-10 13:31:53 AEDT
I'm not able to replicate this. Are you able to get a stacktrace?

Failing that, what are your LOCALE and LC_* environment variables set to?
Comment 2 programmerjake 2023-02-10 16:28:57 AEDT
Created attachment 3665 [details]
backtrace from snprintf call with invalid maxlen

(In reply to Damien Miller from comment #1)
> I'm not able to replicate this. Are you able to get a stacktrace?

yes. I recompiled from source on commit 6dfb65de949cdd0a5d198edee9a118f265924f33 and edited progressmeter.c, changing can_output to always return 1, and editing setscreensize to always set win_size to 121. line numbers should still match the unedited version.

I built it with:
configure --prefix=/home/jacob/projects/openssh/dbg CFLAGS='-g -O0' --with-privsep-path=/run/sshd --disable-strip

created a file:
touch ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
and debugged the built scp with args:
/home/jacob/projects/openssh/dbg/bin/scp ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ localhost:

Attached the backtrace

> 
> Failing that, what are your LOCALE and LC_* environment variables
> set to?

LANG=en_US.UTF-8
LC_COLLATE=C
Comment 3 programmerjake 2023-02-10 16:36:00 AEDT
Created attachment 3666 [details]
temporary edits applied for debugging
Comment 4 Darren Tucker 2023-02-10 19:24:40 AEDT
I also was not able to replicate it on x86-64 (Fedora) or aarch64 (Debian).  Could you put the repo case into a shell script?  I'm wondering if we're losing something with the cut and paste (possibly related to EscapeChar?).
Comment 5 programmerjake 2023-02-10 19:51:00 AEDT
Created attachment 3667 [details]
shell script for reproduction

the reproducer script doesn't need "temporary edits applied for debugging" patch

note that I only ever got the abort on Termux, since Ubuntu's snprintf doesn't check for nonsensical size inputs.
Comment 6 programmerjake 2023-02-10 20:01:36 AEDT
I'm also able to reproduce on debian 10 x86-64 with:
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1n  15 Mar 2022

$ ./reproducer.sh
Agent pid 1567
Enter passphrase for /home/jacob/.ssh/id_rsa:
Identity added: /home/jacob/.ssh/id_rsa (jacob@jacob-build-server)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ï½
$

oh, also, i have the computer i'm running scp from on the authorized_keys list
Comment 7 Damien Miller 2023-02-13 15:50:51 AEDT
Created attachment 3670 [details]
Used snmprintf() for window width handling

Firstly, thanks a heap for the debugging and reproduction work you've done. It makes our lives a lot easier.

It looks like what is happening is that snmprintf() (our UTF-8-aware string formatting function) can write more than win_size characters to the buffer. At first glance this shouldn't happen for two reasons: 

1. file_len, which is used as the column limit to snmprintf() is less than win_size
2. The field with for the "%-*s" format string is passed as file_len too.

Neither of these turns out to be an effective limit. For #2, this is the width of the field (i.e how much to pad) rather than the precision (where to chop).

For #1, the column limit applies to the visual width of the string as displayed and not the number of characters used. These will differ if the string contains UTF-8 multibyte characters as this reproducer does.

So strlen(buf) can easily exceed win_size for UTF-8 strings and this will lead to the negative snprintf() size later when we calculate is as "win_size - strlen(buf)".

IMO the best solution to this is to let snmprintf() do more of the work in trying to hit win_size, as attempting to do it via strlen() is never going to work for UTF-8. This patch implements this, adjusting the first snmprintf() call to return how much padding we need to hit file_len. The padded string should be visually correct, but I added a final snmprintf() call to ensure that it's truncated in a UTF-8 safe way at win_size.
Comment 8 programmerjake 2023-02-15 13:10:48 AEDT
Created attachment 3674 [details]
reproducer script that checks multiple widths

(In reply to Damien Miller from comment #7)
> Created attachment 3670 [details]
> Used snmprintf() for window width handling

I tested the patch, it works great for terminal widths >= 36, but exceeds the terminal width and prints garbage at the beginning of the line for terminals narrower than that.
Comment 9 programmerjake 2023-02-15 13:16:46 AEDT
Created attachment 3675 [details]
output of reproducer 3674 for patch 3670 with \r replaced by ^M
Comment 10 Damien Miller 2023-02-15 20:16:07 AEDT
Created attachment 3676 [details]
Revised diff

Please try this one
Comment 11 programmerjake 2023-02-15 20:30:21 AEDT
(In reply to Damien Miller from comment #10)
> Created attachment 3676 [details]
> Revised diff
> 
> Please try this one

tried it, looks good to me! Thanks for the extra effort to fix it on tiny terminals too!
Comment 12 Damien Miller 2023-02-17 16:29:28 AEDT
Created attachment 3679 [details]
different approach

Here's a different approach that uses asmprintf() and xextendf() to build the string, avoiding all the strlen() math.

It's probably safer but makes a few more syscalls and heap operations per progressmeter update. IMO safety wins here.
Comment 13 Damien Miller 2023-02-17 16:29:55 AEDT
Comment on attachment 3679 [details]
different approach

Darren made me do this so he gets to look at it
Comment 14 Darren Tucker 2023-02-17 20:28:54 AEDT
Comment on attachment 3679 [details]
different approach

Looks ok with a couple of nits.

>+	static char buf[((sizeof(off_t) * 8 * 4 * 2) / 10) + 16];

What's the reasoning behind the numbers?

>+		if (hours != 0) {
>+			xextendf(&buf, NULL, "%d:%02d:%02d",
>+			    hours, minutes, seconds);
>+		} else {
>+			xextendf(&buf, NULL, "  %02d:%02d", minutes, seconds);
>+		}

The rest of this function doesn't have brackets around single-line if/else statements.
Comment 15 Damien Miller 2023-02-20 14:35:08 AEDT
(In reply to Darren Tucker from comment #14)
> Comment on attachment 3679 [details]
> different approach
> 
> Looks ok with a couple of nits.
> 
> >+	static char buf[((sizeof(off_t) * 8 * 4 * 2) / 10) + 16];
> 
> What's the reasoning behind the numbers?

calculating the length of integer rendered as a string using nbits*log2(10) / 10

8 bits per byte
4 > 3.321 ~= log2(10)
2 because there are two variables being formatted
Comment 16 Damien Miller 2023-02-20 14:46:19 AEDT
Created attachment 3680 [details]
final(?) diff

Updated the diff to move the math to a #define and fix the braces
Comment 17 Darren Tucker 2023-02-20 18:30:04 AEDT
(In reply to Damien Miller from comment #13)
> Darren made me do this so he gets to look at it

It's true, my main contribution to this project is nersniping Damien into implementing things I'm too lazy or incompetent to do.
Comment 18 Damien Miller 2023-02-22 15:07:30 AEDT
This has been applied and will be in the OpenSSH 9.3 release, due in a few months. Thanks again for your assistance in debugging this and testing patches!
Comment 19 Damien Miller 2023-03-17 13:43:01 AEDT
OpenSSH 9.3 has been released. Close resolved bugs