Created attachment 2205 [details] banner_message utf-8 support Hello, This bug report is with reference to the discussion in the following link. http://www.gossamer-threads.com/lists/openssh/dev/54481 Logging the defect with fix for consideration. Description: The banner message displayed on the screen contain octal values instead of korean chars. Prior to ssh 5.1 the banner message would display the charaters properly. I understand that starting from 5.1 the message is passed through strnvis() function. I looked into documentation on strnvis and found that it does not support multibyte chars and doesnt work well with international chars.
Unfortunately, allowing arbitrary characters also allows terminal reprogramming attacks. A good solution to this would be to detect that the terminal can handle UTF-8 characters and to validate that the characters coming from the server consitute valid UTF-8 runes and not VT terminal reprogramming codes.
I'm seeing this issue as well, and I'd like it to be addressed. The issue here is that it is often requested by lawyers to have a warning message before log in. Obviously, ASCII cannot not fit every language, so that is going to be an issue for many. English is not the world language. In some places, like China, it is even mandatory for software to support the local charset, and for good reason (GB18030 for China, but that maps to Unicode). The risk here is that by trying to avoid a specific security issue, pragmatic users will have no other choice than to use older versions, or hack in their own patches, both of which might create bigger issues than the one avoided. And of course, the SSH IETF draft standard itself mandates using Unicode for the banner, so being non standard is a problem too. I think that absent a perfect solution that checks the output is perfectly valid UTF-8, there should be a user-configurable flag to allow it if needed.
I second the last comment. Would love to see a fix here.
RFC4252 says banner support is a SHOULD, and filtering control characters is also a SHOULD: If the 'message' string is displayed, control character filtering, discussed in [SSH-ARCH], SHOULD be used to avoid attacks by sending terminal control characters. The text it refers to in RFC4251 is: 9.2. Control Character Filtering When displaying text to a user, such as error or debug messages, the client software SHOULD replace any control characters (except tab, carriage return, and newline) with safe sequences to avoid attacks by sending terminal control characters. so the current behaviour is compliant. Whether or not is possible to safely display utf8 is a separate question.
I was referring to the the section 5.4. Banner Message of RFC4252: byte SSH_MSG_USERAUTH_BANNER string message in ISO-10646 UTF-8 encoding [RFC3629] string language tag [RFC3066] But anyway, I do not want to be dragged into a standard reading competition. I almost regret citing it. In my understanding, as a non-English speaker, that's what that section 5.4 of RFC4252 recognizes. Since it also refers to control-character filtering after asking to send a banner in UTF-8, I think it does not expect an implementation to convert everything to ASCII. That is the easy way out (and please, let's not say that ASCII being a strict subset of UTF-8, the message full of octal codes is still UTF-8). My main point was: for billions of people, ASCII is insufficient as a character set. So please, there should be no hiding behind the letter of a standard. Even if OpenSSH were following it to the letter, it'd still be wrong, because the need to display more than ASCII is real. Not doing it could be a way to get less security as a whole, not more.
I think a reasonable answer is to decide whether the user's terminal is UTF-8 capable (probably using $TERM, locale and/or platform) and, if so, prepare the strings for display using stringprep (http://www.ietf.org/rfc/rfc3454.txt) For everything else, continue to use strnvis(3) I have the beginnings of a minimal stringprep implementation, but I'll need some guidance on how to reliably decide whether UTF-8 is safe for output (or some way to render it for the user's current terminal).
Created attachment 2277 [details] Apply RFC3454 stringprep to banners when possible This patch implements RFC3454 stringprep processing for banners to remove control characters and other undesirable things. Stringprep processing is used instead of strnvis() when the user's output locale has an UTF-8 encoding.
Created attachment 2280 [details] stringprep v.2, with unicode 6.2 assigned characters Improved patch. This skips unassigned characters using the set of assigned characters in Unicode 6.2. It also splits the UTF-8 encoding into a separate function.
Is the patch going to be incorporated to the next release? The patch works fine for me without any apparent issues.
Created attachment 2440 [details] latest version of diff This is the latest version of my diff, simplifying the utf8 bits a little. I haven't moved forward with this because I was told that stringprep is deprecated and I'm not sure what its replacement is. If someone who knows Unicode can verify the allowed/denied tables then we can probably move forward with this.
Thanks for the update. I've tried to apply the patch and I've got some errors: - lib/Makefile doesn't exists - there's conflict in sshconnect2.c, line 43 - 45 There's probably a typo in the whitelist table in stringprep-tables.c: - { 0x09, 0x00 }, + { 0x09, 0x09 },
Different fix committed (using libc mb functions). This will be in openssh-7.3
Close all resolved bugs after 7.3p1 release