Bug 1337 - SCP performance twice as slow as RCP
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: scp
Version: 3.8.1p1
Hardware: Other AIX
Importance: P3 enhancement
Assignee: Assigned to nobody
Reported: 2007-07-10 05:04 AEST by Jeffery Martinez
Modified: 2010-04-16 15:50 AEST

Description Jeffery Martinez 2007-07-10 05:04:02 AEST
Our testing shows a significant performance reduction when using SCP instead of the insecure RCP protocol to transfer files: SCP delivers a transfer rate of 16.35 MB/sec while RCP delivers 39.06 MB/sec. We have tested on a few different systems, and while overall speed improves on more powerful servers, the relative performance remains approximately the same, with SCP roughly 2 times slower than RCP. Has any work been done to improve this transfer speed in newer versions of OpenSSH? If not, I would like to request an enhancement to improve the SCP transfer rate.
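
For reference, the ratio implied by the two rates quoted in the description can be checked with a short awk calculation. This is only a sketch; the helper name rate_ratio is made up for illustration.

```shell
# rate_ratio FAST SLOW: print how many times faster FAST is than SLOW.
# (Hypothetical helper name; the rates are the ones quoted in the report.)
rate_ratio() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.2f\n", fast / slow }'
}

rate_ratio 39.06 16.35   # RCP MB/sec vs SCP MB/sec
```

With these figures the call prints 2.39, so RCP is a little more than twice as fast in this test.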
Comment 1 Darren Tucker 2007-07-10 22:59:27 AEST
Because scp does fundamentally more work (encryption, MAC) and sends more data (the MAC), you probably won't get scp as fast as rcp as long as those remain. Given enough CPU horsepower, in theory you might be able to get close, though.

There are a few things that you can do: for any version of OpenSSH you should choose a cipher and MAC that are fast on your hardware. You can test them with the "openssl speed" tool. Unless you have odd hardware or crypto accelerators, these are likely to be arcfour and hmac-md5-96.
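
A rough sketch of that workflow, assuming a stock scp and openssl: benchmark first, then pass the chosen cipher and MAC to scp with -c and -o MACs=. The helper name fast_scp_cmd, the file name and the host name are placeholders. arcfour and hmac-md5-96 may not be offered by more recent OpenSSH builds; on versions that support it, "ssh -Q cipher" and "ssh -Q mac" list what a given build accepts.

```shell
# Step 1 (run by hand): compare raw cipher and digest speed on this machine.
#   openssl speed rc4 md5
# Step 2: build the scp command line for the chosen cipher and MAC.
# fast_scp_cmd is a hypothetical helper; it only prints the command so it
# can be inspected before being run.
fast_scp_cmd() {
    cipher=$1 mac=$2 src=$3 dst=$4
    printf 'scp -c %s -o MACs=%s %s %s\n' "$cipher" "$mac" "$src" "$dst"
}

fast_scp_cmd arcfour hmac-md5-96 bigfile remotehost:/tmp/
```

Printing the command rather than running it keeps the sketch safe to try on a machine without a reachable server.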

There are also some speed improvements in the current development version that haven't made a release yet:
1) channel window size increase. Only likely to help if you have a long, fat pipe.
2) a new, faster MAC: umac-64@openssh.com (see http://fastcrypto.org/umac/)
3) MAC contexts are cleared and reused rather than deallocating/reallocating them, saving ~10% of CPU time.

There's another change pending that has not gone in yet:
4) increasing the scp buffer size and polling rather than busy-waiting in scp when writes stall.  This is waiting for some slacker to review and test (oh, wait, that would be me :-).

#1 and #4 are somewhat similar to the changes in the hpn-ssh patch at PSC, although different in implementation. 

Normally I would just ask you to try a snapshot from ftp.openbsd.org but they seem to be stale at the moment.  If you would like to try it out I have put up a temporary snap at http://www.zip.com.au/~dtucker/tmp/openssh-SNAP_DT-20070710.tar.gz .  If you do try it, please let us know how it goes!
Comment 2 Pádraig Brady 2007-07-13 20:47:03 AEST
I had to comment on such a nice bug number :)

Darren, in relation to your point 4 above, will it help the following?
I noticed that for scp if you disable compression (yes disable),
then transfer rate increases a lot. I looked at it very quickly
and it seemed worse the more the data compressed.
Also Protocol=1 seems much better:

dd bs=1M count=50 if=/dev/zero of=50MB_zeros
dd bs=1M count=50 if=/dev/urandom of=50MB_random

$ scp localhost:50MB_random /tmp
50MB_random                                   100%   50MB  10.0MB/s   00:05

$ scp -C localhost:50MB_random /tmp
50MB_random                                   100%   50MB   4.6MB/s   00:11

$ scp -C localhost:50MB_zeros /tmp
50MB_zeros                                    100%   50MB   2.3MB/s   00:22

#setting Protocol=1 in sshd_config I get the following speed:
$ scp -C localhost:50MB_zeros /tmp #Includes time to type password!
50MB_zeros                                    100%   50MB  12.5MB/s   00:04
Comment 3 Darren Tucker 2007-07-13 21:23:52 AEST
(In reply to comment #2)
> I had to comment on such a nice bug number :)
> 
> Darren, in relation to your point 4 above, will it help the following?
> I noticed that for scp if you disable compression (yes disable),
> then transfer rate increases a lot. I looked at it very quickly
> and it seemed worse the more the data compressed.

Like many things, the answer is "it depends".  Compression is not free, and is not automatically a win.  It gets better throughput when the time taken to compress and decompress the data is less than the amount of time saved by transferring the smaller amount of data across the pipe.

If you have a fast CPU, slow link and compressible data, enabling compression is probably a win.  On the other hand, since TCP connections to localhost are relatively fast, enabling compression on a localhost will probably slow things down.  Ditto for enabling compression on uncompressible data such as your random output: you will spend a lot of  CPU trying (and failing) to compress the data.  Enabling it for a localhost copy of compressed data pretty much guarantees that it will be slower.
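
The trade-off can be sketched as a back-of-the-envelope calculation. The model below is a simplification that treats compression and transfer as serial steps (a real transfer overlaps them), and every number fed to it is a made-up example, not a measurement. The helper name compression_wins is hypothetical.

```shell
# compression_wins SIZE_MB RATIO LINK_MBPS CODEC_MBPS
#   SIZE_MB     data size in MB
#   RATIO       compressed size / original size (1.0 = incompressible)
#   LINK_MBPS   link throughput in MB/s
#   CODEC_MBPS  combined compress+decompress throughput in MB/s
# Prints "yes" if the time spent compressing is less than the transfer
# time saved, otherwise "no". All inputs are hypothetical examples.
compression_wins() {
    awk -v s="$1" -v r="$2" -v link="$3" -v codec="$4" 'BEGIN {
        cost  = s / codec              # seconds spent in the compressor
        saved = s * (1 - r) / link     # seconds of transfer avoided
        answer = (cost < saved) ? "yes" : "no"
        print answer
    }'
}

compression_wins 50 0.1 1 20     # compressible data, slow link
compression_wins 50 1.0 100 20   # random data, fast local link
```

The first call prints "yes" and the second prints "no", matching the slow-link and localhost cases described above.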

> Also Protocol=1 seems much better:
[...]
> #setting Protocol=1 in sshd_config I get the following speed:

You don't need to do that. Assuming your server has "Protocol 2,1" then you can switch between the two with "scp -o Protocol=1" (or 2).

> $ scp -C localhost:50MB_zeros /tmp #Includes time to type password!

Including user interaction in the timing adds a potentially large variation in the results.  Use pubkey authentication for more consistent results.

If you want to draw any kind of conclusion you should reduce the number of variables.  Remove the compression, then compare Protocol 1 and 2 for an identical dataset.  If protocol 1 is consistently faster by any significant margin then that might be worth looking into.

Then, if you still want to play with compression, test again with it on, but remember that in Protocol 1 the compression level is variable between 1 and 9 whereas in Protocol 2 it's fixed (in OpenSSH, at "6"), so make sure you're comparing apples with apples.
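
A sketch of such a test matrix, assuming an OpenSSH of this era that still accepts the Protocol and CompressionLevel options (later releases dropped Protocol 1). BatchMode=yes makes scp fail rather than prompt, so no typing ends up in the timing. The helper name proto_test_cmd is hypothetical, and it only prints the commands.

```shell
# proto_test_cmd PROTOCOL COMPRESSION [LEVEL]
# Prints an scp command for one test run. Hypothetical helper; the file
# name is the one used earlier in this thread.
#   BatchMode=yes     fail instead of prompting, so no typing is timed
#   Compression       yes or no
#   CompressionLevel  only applied for Protocol 1 with compression on
proto_test_cmd() {
    proto=$1 comp=$2 level=${3:-6}
    opts="-o BatchMode=yes -o Protocol=$proto -o Compression=$comp"
    if [ "$proto" = 1 ] && [ "$comp" = yes ]; then
        opts="$opts -o CompressionLevel=$level"
    fi
    printf 'scp %s localhost:50MB_zeros /tmp\n' "$opts"
}

# First pass: no compression, same file, both protocols.
proto_test_cmd 1 no
proto_test_cmd 2 no
# Second pass: compression on, with Protocol 1 pinned to level 6.
proto_test_cmd 1 yes 6
proto_test_cmd 2 yes
```

Each printed command can then be run under "time" several times to see how much the results vary.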
Comment 4 Pádraig Brady 2007-07-14 00:24:20 AEST
Thanks for the reply Darren.

I understand about the trade off with compression :)
but the issue I think is due to buffer sizes,
and inappropriate sleeping. Have a look at
just 2 of the results again:

$ scp -C localhost:50MB_random /tmp
50MB_random                                   100%   50MB   4.6MB/s   00:11

$ scp -C localhost:50MB_zeros /tmp
50MB_zeros                                    100%   50MB   2.3MB/s   00:22

So compressing zeros (easy), and transferring much less data,
takes twice as long?

Looking a little bit deeper, shows that it only
uses 15% of the CPU while doing this! (11% for sshd and 4% for scp).
Note it's not waiting on the disk, because I tested
with a ram disk with the same results, like:

mount -t tmpfs tmpfs /mnt/rd
scp -C localhost:/mnt/rd/20MB /mnt/rd/20MB.copy
Comment 5 Darren Tucker 2007-07-14 11:20:56 AEST
(In reply to comment #4)
[...]
> So compressing zeros (easy), and transferring much less data,
> takes twice as long?

What kind of system is that?  (CPU, memory, OS, anything else relevant?)

I attempted to reproduce your result on a Linux box (1.8GHz Celeron) but got the opposite result, which is more along the lines of what you'd expect. The first is a file of zeros, the second is from /dev/urandom:

$ scp -C /tmp/50m localhost:/dev/null
50m                                     100%   50MB   7.1MB/s   00:07
$ scp -C /tmp/50mr localhost:/dev/null
50mr                                    100%   50MB   2.0MB/s   00:25
Comment 6 Darren Tucker 2007-07-14 11:30:37 AEST
(In reply to comment #5)
> What kind of system is that?  (CPU, memory, OS, anything else
> relevant?)

Also, which protocol and cipher were you using? My test was with protocol 2 and arcfour, but I got very similar results with aes128-cbc.
Comment 7 Darren Tucker 2007-07-14 11:48:03 AEST
(In reply to comment #2)
> #setting Protocol=1 in sshd_config I get the following speed:
> $ scp -C localhost:50MB_zeros /tmp #Includes time to type password!
> 50MB_zeros                                    100%   50MB  12.5MB/s   00:04

Which cipher was that with? I suspect it's blowfish, which I was surprised to see is faster than arcfour on my system.

$ openssl speed blowfish rc4 des
[...]
type         16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
rc4          24346.86k    25894.12k    24912.30k    26152.62k    27861.13k
des cbc      6074.91k     6242.49k     6238.12k     6265.58k     6251.94k
des ede3     1863.85k     1873.69k     1878.37k     1877.22k     1876.41k
blowfish cbc 6135.08k    48098.93k    48808.53k    48968.36k    48846.54k

If you want to compare the speeds of Protocols 1 and 2, make sure you're using the same cipher on both. The only ciphers that they have in common are 3des/3des-cbc and blowfish/blowfish-cbc, and even then the two protocols use Blowfish with different key lengths.
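
To put a number on the gap, the 8192-byte column of the "openssl speed" rows above can be compared directly. This is a minimal sketch; the helper name cipher_ratio is hypothetical, and it assumes the row labels "rc4" and "blowfish cbc" exactly as printed above.

```shell
# cipher_ratio: read "openssl speed" result rows on stdin and print the
# blowfish-to-rc4 ratio for the last (largest block size) column.
# "$NF + 0" turns a field such as "27861.13k" into the number 27861.13.
cipher_ratio() {
    awk '
        /^rc4/          { rc4 = $NF + 0 }
        /^blowfish cbc/ { bf  = $NF + 0 }
        END { if (rc4 > 0) printf "%.2f\n", bf / rc4 }
    '
}

cipher_ratio <<'EOF'
rc4          24346.86k    25894.12k    24912.30k    26152.62k    27861.13k
blowfish cbc 6135.08k    48098.93k    48808.53k    48968.36k    48846.54k
EOF
```

For the figures quoted above this prints 1.75, that is, blowfish is about 75% faster than rc4 at that block size on that machine.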
Comment 8 Pádraig Brady 2007-07-16 23:51:58 AEST
OK I upgraded my ssh and it looks like this is already fixed.
The original results were for openssh-3.9p1-7 (with only 15% CPU used):

$ scp -C localhost:50MB_zeros /tmp
50MB_zeros                                    100%   50MB   2.3MB/s  
$ scp -C localhost:50MB_random /tmp
50MB_random                                   100%   50MB   4.6MB/s


With openssh-4.5p1-6.fc7 I get 100% of the CPU used:

$ scp -C localhost:50MB_zeros /tmp
50MB_zeros                                    100%   50MB  16.7MB/s
$ scp -C localhost:50MB_random /tmp
50MB_random                                   100%   50MB   5.6MB/s


Comment 9 D. Hugh Redelmeier 2007-07-18 07:02:53 AEST
Compression not only reduces transmission time, it reduces encryption and decryption time. My guess is that this is a net win with a slow cipher like 3DES. I offer no measurements so take this as only a hypothesis.
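
The hypothesis can at least be illustrated with a rough estimate. The sketch below is not a measurement: it assumes a serial model (compress, then encrypt and send only the compressed bytes), takes about 1.88 MB/s for 3DES from the "des ede3" row in comment 7, and uses invented values for the link speed, compressor speed and compression ratio. The helper name transfer_time is hypothetical.

```shell
# transfer_time SIZE_MB RATIO CIPHER_MBPS LINK_MBPS CODEC_MBPS
# Serial estimate in seconds: compress everything, then encrypt and send
# only the compressed bytes. RATIO=1.0 with CODEC_MBPS=0 means
# "no compression". All inputs except the cipher speed are made up.
transfer_time() {
    awk -v s="$1" -v r="$2" -v cipher="$3" -v link="$4" -v codec="$5" 'BEGIN {
        t = s * r / cipher + s * r / link   # encrypt + transmit
        if (codec > 0) t += s / codec       # add compression time
        printf "%.1f\n", t
    }'
}

transfer_time 50 1.0 1.88 100 0    # 3DES-like cipher, no compression
transfer_time 50 0.5 1.88 100 20   # same, data compressed to half size
```

Under these assumptions the estimates are 27.1 and 16.0 seconds, so halving the data outweighs the compression cost when the cipher is this slow; with a fast cipher the encryption term shrinks and the balance can tip the other way.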
Comment 10 Darren Tucker 2010-01-13 11:33:32 AEDT
I neglected to close this, and it has long since been fixed.  Closing.
Comment 11 Damien Miller 2010-04-16 15:50:09 AEST
Mass move of bugs RESOLVED->CLOSED following the release of openssh-5.5p1