Bug 2058 - SSH Banner message displays UTF-8 multibyte char incorrrectly
Summary: SSH Banner message displays UTF-8 multibyte char incorrrectly
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: ssh (show other bugs)
Version: 6.1p1
Hardware: All All
: P5 normal
Assignee: Damien Miller
URL:
Keywords:
Depends on:
Blocks: V_7_3
  Show dependency treegraph
 
Reported: 2013-01-07 20:13 AEDT by balu9463
Modified: 2016-08-02 10:42 AEST (History)
5 users (show)

See Also:


Attachments
banner_message utf-8 support (2.97 KB, application/octet-stream)
2013-01-07 20:13 AEDT, balu9463
no flags Details
Apply RFC3454 stringprep to banners when possible (12.64 KB, patch)
2013-05-30 12:07 AEST, Damien Miller
no flags Details | Diff
stringprep v.2, with unicode 6.2 assigned characters (25.28 KB, patch)
2013-05-30 16:27 AEST, Damien Miller
no flags Details | Diff
latest version of diff (23.25 KB, patch)
2014-06-11 17:29 AEST, Damien Miller
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description balu9463 2013-01-07 20:13:06 AEDT
Created attachment 2205 [details]
banner_message utf-8 support

Hello,

This bug report is with reference to the discussion in the following link.
http://www.gossamer-threads.com/lists/openssh/dev/54481

Logging the defect with fix for consideration.

Description:
The banner message displayed on the screen contain octal values 
instead of korean chars. Prior to ssh 5.1 the banner message would 
display the charaters properly. 
I understand that starting from 5.1 the message is passed through 
strnvis() function. 
I looked into documentation on strnvis and found that it does not 
support multibyte chars and doesnt work well with international chars.
Comment 1 Damien Miller 2013-02-08 10:53:04 AEDT
Unfortunately, allowing arbitrary characters also allows terminal reprogramming attacks. A good solution to this would be to detect that the terminal can handle UTF-8 characters and to validate that the characters coming from the server consitute valid UTF-8 runes and not VT terminal reprogramming codes.
Comment 2 Laurent 2013-04-24 02:28:46 AEST
I'm seeing this issue as well, and I'd like it to be addressed.

The issue here is that it is often requested by lawyers to have a warning message before log in. Obviously, ASCII cannot not fit every language, so that is going to be an issue for many. English is not the world language.
In some places, like China, it is even mandatory for software to support the local charset, and for good reason (GB18030 for China, but that maps to Unicode). 

The risk here is that by trying to avoid a specific security issue, pragmatic users will have no other choice than to use older versions, or hack in their own patches, both of which might create bigger issues than the one avoided.

And of course, the SSH IETF draft standard itself mandates using Unicode for the banner, so being non standard is a problem too.

I think that absent a perfect solution that checks the output is perfectly valid UTF-8, there should be a user-configurable flag to allow it if needed.
Comment 3 Henrik Grindal Bakken 2013-04-26 06:06:34 AEST
I second the last comment.  Would love to see a fix here.
Comment 4 Darren Tucker 2013-04-26 10:55:46 AEST
RFC4252 says banner support is a SHOULD, and filtering control characters is also a SHOULD:

   If the 'message' string is displayed, control character filtering,
   discussed in [SSH-ARCH], SHOULD be used to avoid attacks by sending
   terminal control characters.

The text it refers to in RFC4251 is:

9.2.  Control Character Filtering

   When displaying text to a user, such as error or debug messages, the
   client software SHOULD replace any control characters (except tab,
   carriage return, and newline) with safe sequences to avoid attacks by
   sending terminal control characters.

so the current behaviour is compliant.  Whether or not is possible to safely display utf8 is a separate question.
Comment 5 Laurent 2013-04-27 00:45:26 AEST
I was referring to the the section 5.4.  Banner Message of RFC4252:

      byte      SSH_MSG_USERAUTH_BANNER
      string    message in ISO-10646 UTF-8 encoding [RFC3629]
      string    language tag [RFC3066]


But anyway, I do not want to be dragged into a standard reading competition. I almost regret citing it.

In my understanding, as a non-English speaker, that's what that section 5.4 of RFC4252 recognizes. Since it also refers to control-character filtering after asking to send a banner in UTF-8, I think it does not expect an implementation to convert everything to ASCII. That is the easy way out (and please, let's not say that ASCII being a strict subset of UTF-8, the message full of octal codes is still UTF-8).

My main point was:
for billions of people, ASCII is insufficient as a character set. So please, there should be no hiding behind the letter of a standard.
Even if OpenSSH were following it to the letter, it'd still be wrong, because the need to display more than ASCII is real.
Not doing it could be a way to get less security as a whole, not more.
Comment 6 Damien Miller 2013-04-27 16:25:06 AEST
I think a reasonable answer is to decide whether the user's terminal is UTF-8 capable (probably using $TERM, locale and/or platform) and, if so, prepare the strings for display using stringprep (http://www.ietf.org/rfc/rfc3454.txt)

For everything else, continue to use strnvis(3)

I have the beginnings of a minimal stringprep implementation, but I'll need some guidance on how to reliably decide whether UTF-8 is safe for output (or some way to render it for the user's current terminal).
Comment 7 Damien Miller 2013-05-30 12:07:39 AEST
Created attachment 2277 [details]
Apply RFC3454 stringprep to banners when possible

This patch implements RFC3454 stringprep processing for banners to remove control characters and other undesirable things. Stringprep processing is used instead of strnvis() when the user's output locale has an UTF-8 encoding.
Comment 8 Damien Miller 2013-05-30 16:27:20 AEST
Created attachment 2280 [details]
stringprep v.2, with unicode 6.2 assigned characters

Improved patch. This skips unassigned characters using the set of assigned characters in Unicode 6.2.

It also splits the UTF-8 encoding into a separate function.
Comment 9 Petr Lautrbach 2014-06-10 21:19:56 AEST
Is the patch going to be incorporated to the next release? The patch works fine for me without any apparent issues.
Comment 10 Damien Miller 2014-06-11 17:29:44 AEST
Created attachment 2440 [details]
latest version of diff

This is the latest version of my diff, simplifying the utf8 bits a little.

I haven't moved forward with this because I was told that stringprep is deprecated and I'm not sure what its replacement is. If someone who knows Unicode can verify the allowed/denied tables then we can probably move forward with this.
Comment 11 Petr Lautrbach 2014-06-13 00:30:57 AEST
Thanks for the update.

I've tried to apply the patch and I've got some errors:
- lib/Makefile doesn't exists
- there's conflict in sshconnect2.c, line 43 - 45

There's probably a typo in the whitelist table in stringprep-tables.c:
- 	{ 0x09, 0x00 },
+ 	{ 0x09, 0x09 },
Comment 12 Damien Miller 2016-07-17 14:32:12 AEST
Different fix committed (using libc mb functions). This will be in openssh-7.3
Comment 13 Damien Miller 2016-08-02 10:42:43 AEST
Close all resolved bugs after 7.3p1 release