Bug 3547 - sftp crash with 'invalid multibyte character' when pressing Tab to complete specific Chinese filenames
Status: NEW
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sftp
Version: 8.4p1
Hardware: amd64 Linux
Importance: P5 trivial
Assignee: Assigned to nobody
Reported: 2023-03-08 19:59 AEDT by nebclllo0444
Modified: 2023-03-16 08:40 AEDT
Description nebclllo0444 2023-03-08 19:59:32 AEDT
I'm using the sftp bundled with OpenSSH 8.4p1-150300.3.15.4 on openSUSE Leap 15.4. I encountered a problem when a directory contains files with certain Chinese filenames. For example:
(using bash with LANG=zh_CN.utf8)
touch （一）
touch （二）
# Create two files whose names contain a character representing a number. They share the same prefix and suffix. The parentheses are also CJK characters (not ASCII).
sftp 127.0.0.1
ls （
# Type the first character (the common prefix of the two filenames)
# and press Tab to auto-complete
The client exits immediately after printing the filenames, leaving the terminal unusable without a `reset`:
（一） （二）
invalid multibyte character
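
The test files can also be created from a script, which avoids any dependence on the shell's input method. This is only a sketch: the helper name `make_test_files` is hypothetical, and the names are written as UTF-8 bytes so the on-disk filenames do not depend on the current locale.

```python
import os
import tempfile


def make_test_files(directory, names):
    """Create empty files whose names are the UTF-8 encoding of `names`.

    Returns the sorted directory listing as raw bytes.
    """
    dir_bytes = os.fsencode(directory)
    for name in names:
        path = os.path.join(dir_bytes, name.encode("utf-8"))
        # Opening in write mode creates an empty file, like `touch`.
        with open(path, "wb"):
            pass
    return sorted(os.listdir(dir_bytes))


if __name__ == "__main__":
    target = tempfile.mkdtemp(prefix="sftp-repro-")
    make_test_files(target, ["啊一啊", "啊二啊"])
    print("created test files in", target)
```

Afterwards, `cd` into the printed directory from within sftp and try the Tab completion there.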

It seems that only certain combinations make the client crash. If the files are "（一）" and "（啊）", the client does not crash.

Prefixes and suffixes other than CJK parentheses also make the client crash, though:

Input 'ls 啊' and press Tab. With the files "啊一啊" and "啊二啊", the client crashes. With "啊一啊" and "啊哦啊", it does not.
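
One pattern fits all four cases above, although this report does not confirm it as the root cause: in UTF-8, "一" and "二" share their first byte (0xE4), while "啊" and "哦" begin with 0xE5. If the completion prefix were computed byte by byte (an assumption), the crashing pairs would yield a prefix that ends in the middle of a character. The sketch below only checks that pattern. The function names are hypothetical and it does not reproduce the sftp or libedit code.

```python
def common_byte_prefix(names):
    """Longest common prefix of the UTF-8 encodings, compared byte by byte."""
    encoded = [name.encode("utf-8") for name in names]
    prefix = encoded[0]
    for item in encoded[1:]:
        i = 0
        while i < len(prefix) and i < len(item) and prefix[i] == item[i]:
            i += 1
        prefix = prefix[:i]
    return prefix


def is_valid_utf8(data):
    """True if `data` decodes as UTF-8, i.e. no character is cut in half."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # \uff08 and \uff09 are the fullwidth (CJK) parentheses.
    cases = [
        ["\uff08一\uff09", "\uff08二\uff09"],  # reported to crash
        ["\uff08一\uff09", "\uff08啊\uff09"],  # reported not to crash
        ["啊一啊", "啊二啊"],                  # reported to crash
        ["啊一啊", "啊哦啊"],                  # reported not to crash
    ]
    for names in cases:
        prefix = common_byte_prefix(names)
        print(names, prefix.hex(), "valid" if is_valid_utf8(prefix) else "INVALID")
```

In this sketch the two crashing pairs produce an invalid prefix and the two non-crashing pairs produce a valid one.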

Switching the sftp subsystem from the external sftp server to 'internal-sftp' in sshd_config does not fix it.

The same problem also exists on Debian Bullseye (with OpenSSH 1:8.4p1-5+deb11u1)

This is my first bug report for OpenSSH, and I'm not good at English. Sorry for any inconvenience.
Comment 1 Darren Tucker 2023-03-08 21:07:00 AEDT
Can you reproduce the problem with a stock OpenSSH built from the source from openssh.com?  We cannot help you with vendor-supplied (usually modified) binaries; you will need to seek help from the vendor that supplied them.

A couple of suggestions:
 - run sftp under gdb and see where it crashes ("gdb --args /usr/bin/sftp localhost", "run", then when it crashes use "bt" to get a backtrace).
 - I vaguely recall utf8 crashing bugs with some versions of libedit, maybe check if there's an update for that.  

I made a quick attempt to reproduce here on Fedora (libedit-3.1-41.20210910cvs.fc36.x86_64) with both -current and 8.4p1 but it worked as expected (although it's possible I'm doing something wrong, I don't normally use non-ASCII).
Comment 2 Darren Tucker 2023-03-08 23:14:10 AEDT
I did manage to reproduce this on an ARM SBC running a Debian 10.13 derivative.

$ gdb -q --args /usr/bin/sftp localhost
Reading symbols from /usr/bin/sftp...(no debugging symbols found)...done.
(gdb) run
[...]
Connected to localhost.
sftp> cd /tmp/t
sftp> ls 啊
啊一啊  啊二啊  
invalid multibyte character
[Inferior 1 (process 26809) exited with code 0377]

So it's not a crash as in segfault, it prints an error then exits.
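
gdb prints the exit status in octal, so "code 0377" is exit status 255. The distinction between an ordinary exit and death by signal can be checked from a script as well. This is a minimal sketch, assuming a POSIX system where Python's `subprocess` reports termination by signal as a negative return code. The helper name `describe_exit` is hypothetical.

```python
import subprocess
import sys


def describe_exit(returncode):
    """Classify a subprocess return code as a signal death or a normal exit."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        return f"killed by signal {-returncode}"
    return f"exited with code {returncode} (octal {returncode:#o})"


if __name__ == "__main__":
    # gdb's "0377" is octal notation for 255.
    print(int("0377", 8))

    # Stand-in child process that exits with status 255, as sftp did here.
    child = subprocess.run([sys.executable, "-c", "import sys; sys.exit(255)"])
    print(describe_exit(child.returncode))
```

A segfault would be classified as a signal death (SIGSEGV is signal 11 on Linux), not as an exit code.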

The system has libedit 3.1-20181209-1.  If I build OpenSSH against a locally-built libedit-20221030-3.1 it doesn't do this (it doesn't autocomplete either, but that seems reasonable).  I think you're looking at a libedit bug.
Comment 3 Darren Tucker 2023-03-08 23:19:11 AEDT
(In reply to Darren Tucker from comment #2)
> The system has libedit 3.1-20181209-1.

... and if I build OpenSSH 8.4p1 against the system libedit, it also fails, but if I build 8.4p1 against the newest libedit it doesn't.
Comment 4 nebclllo0444 2023-03-16 02:50:07 AEDT
libedit on Linux is ported from NetBSD, so I installed NetBSD 9.3. However, I found little documentation about localization, and CJK characters in the terminal (both over ssh and in Xfce4 Terminal) are displayed as '?', although Xfce4 itself seems to work normally.
Should I keep on trying to reproduce the problem on NetBSD?
Comment 5 Darren Tucker 2023-03-16 08:40:35 AEDT
(In reply to nebclllo0444 from comment #4)
> Should I keep on trying to reproduce the problem on NetBSD?

Based on the libedit version numbers, my guess is that the problem was fixed in NetBSD sometime between 2018 and 2020.  If that's correct, you won't be able to reproduce it on any modern NetBSD version, so not being able to reproduce on 9.3 is not surprising.