Bug 105 - scp protocol 2 over a hippi interface takes 6 times longer
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: scp
Version: -current
Hardware: MIPS IRIX
Importance: P2 normal
Assignee: OpenSSH Bugzilla mailing list
 
Reported: 2002-02-07 03:49 AEDT by Paul Smith
Modified: 2004-04-14 12:24 AEST

Attachments
like this (3.27 KB, patch)
2002-02-13 10:00 AEDT, Markus Friedl

Description Paul Smith 2002-02-07 03:49:36 AEDT
scp of a 25Mb file over a hippi interface takes 159s with protocol 2 and 22s
with protocol 1. Over fast ethernet it takes 20s with protocol 2 and 22s
with protocol 1. Snooping the hippi interface for protocol 1 shows packets like

16:33:54.465436 hartree-hippi0 -> hodgkin TCP D=22 S=57066 Ack=851535706 Seq=1272028889 Len=61440 Win=40767
16:33:54.550683 hodgkin -> hartree-hippi0 TCP D=57066 S=22 Ack=1272090329 Seq=851535706 Len=0 Win=19188

~60k packet size (Len=61440)

and for protocol 2 
16:33:19.024288 hodgkin -> hartree-hippi0 TCP D=57060 S=22 Ack=2852340565 Seq=845515278 Len=0 Win=61440
16:33:19.040910 hodgkin -> hartree-hippi0 TCP D=57060 S=22 Ack=2852356997 Seq=845515278 Len=48 Win=61440
16:33:19.047481 hartree-hippi0 -> hodgkin TCP D=22 S=57060 Ack=845515326 Seq=2852356997 Len=16432 Win=40767

~16k packet size.
Comment 1 Markus Friedl 2002-02-07 06:54:27 AEDT
could you please try this without scp? e.g.

cat file | ssh -1 -c 3des 'cat > f'
cat file | ssh -2 -c 3des-cbc 'cat > f2'

thanks.
Comment 2 Paul Smith 2002-02-07 20:51:51 AEDT
time `cat lapack.ibm.tar.gz | ssh -1 -c 3des hodgkin 'cat > f' `
10.6u 0.6s 0:24 45% 954+790k 0+0io 0pf+0w

time `cat lapack.ibm.tar.gz | ssh -2 -c 3des-cbc hodgkin  'cat > f2' `
8.9u 0.7s 2:40 6% 929+631k 0+0io 0pf+0w

My colleague has done some investigating and has found something up with select.

"I have added some diagnostics to
packet.c and clientloop.c.  It is clear that the slow select calls
are not working properly - in particular, they are returning after
about 0.2 seconds WITHOUT having set a descriptor.  Subsequent
calls work.  It seems to be input from the connexion that is the
problem.

I suspect a failure to communicate from the HiPPI driver, which
then triggers a timeout."
Comment 3 Paul Smith 2002-02-08 04:31:56 AEDT
This is a problem with the Nagle algorithm and the delayed ACK timer.
http://www.rs6000.ibm.com/support/sp/perf/nagle21.html

describes the problem over an IBM switch, which again is a network with a large
MTU. (The same problem occurs using scp over this type of network.)

The best solution to this is to be able to have larger packets for networks that
can support them. 
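
For reference, the usual per-socket way to sidestep the Nagle/delayed-ACK interaction described above is the TCP_NODELAY socket option. The following is only a minimal sketch in C and is not the change made in this bug (the rest of the thread tunes the channel window instead); the helper name disable_nagle is hypothetical.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: turn off the Nagle algorithm on a TCP socket so
 * that small writes are sent without waiting for outstanding ACKs.
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 */
int
disable_nagle(int fd)
{
	int on = 1;

	return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}
```

This trades fewer stalls for more small segments on the wire, so it is a different remedy from sending larger packets.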
Comment 4 Paul Smith 2002-02-09 02:09:04 AEDT
Changing channels.h
#define CHAN_SES_WINDOW_DEFAULT (32*1024)
#define CHAN_TCP_WINDOW_DEFAULT (32*1024)
to
#define CHAN_SES_WINDOW_DEFAULT (256*1024)
#define CHAN_TCP_WINDOW_DEFAULT (256*1024)
Fixes the buffer problem. 
Scp is still 8 times slower than rcp. The time isn't being spent in the CPU, so
there is still scope for improvement.
Comment 5 Paul Smith 2002-02-11 21:30:07 AEDT
time `cat lapack.ibm.tar.gz | ssh -1 -c 3des hodgkin 'cat > f' `
9.2u 0.7s 0:22 44% 887+761k 0+0io 0pf+0w
time `cat lapack.ibm.tar.gz | ssh -2 -c 3des-cbc hodgkin  'cat > f2' `
8.7u 0.7s 0:22 41% 888+630k 0+0io 0pf+0w
time `cat lapack.ibm.tar.gz | rsh  hodgkin  'cat > f2' `
0.0u 0.0s 0:01 2% 77+214k 0+0io 0pf+0w
Comment 6 Markus Friedl 2002-02-12 03:30:17 AEDT
hm, i think
#define CHAN_SES_WINDOW_DEFAULT (256*1024)
#define CHAN_TCP_WINDOW_DEFAULT (256*1024)
generates packets > 32k, but i have to cross check.
does 64*1024 help?

what about using a faster cipher in your tests? :)
e.g. blowfish?
Comment 7 Paul Smith 2002-02-12 04:08:58 AEDT
time `cat lapack.ibm.tar.gz | ssh -2 -c blowfish-cbc hodgkin  'cat > f2' `
2.6u 0.6s 0:06 46% 736+532k 0+0io 0pf+0w
Yep, much faster, but still less than half of the time is spent in the CPU.

Yep, packet size is a function of these values:
#define CHAN_SES_PACKET_DEFAULT (CHAN_SES_WINDOW_DEFAULT/2)
#define CHAN_TCP_PACKET_DEFAULT (CHAN_TCP_WINDOW_DEFAULT/2)

The networks I saw the problem on, hippi and the IBM SP switch, both have an MTU
of 64k, so I wanted these values to be at least 128k and went for 256k to be sure.
Comment 8 Paul Smith 2002-02-12 04:43:13 AEDT
with 64 (giving a 32k packet):
time `cat lapack.ibm.tar.gz | ssh -2 -c 3des-cbc -p 1025 hodgkin  'cat > f2' `
8.9u 0.7s 1:47 8% 887+629k 0+0io 0pf+0w
with 128 (giving a 64k packet):
time `cat lapack.ibm.tar.gz | ssh -2 -c 3des-cbc -p 1025 hodgkin  'cat > f2' `
9.0u 0.6s 0:23 41% 895+633k 0+0io 0pf+0w
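
The relationship being tested here can be written out directly. This is a small sketch in C, assuming the channels.h definitions quoted in comment 7 (the packet default is half the window default) and the 64k MTU mentioned there; the helper names and the LARGE_MTU constant are illustrative and do not come from the OpenSSH source.

```c
#include <stddef.h>

/* Assumed from comment 7: both affected networks have an MTU of 64k. */
#define LARGE_MTU (64 * 1024)

/* Mirrors CHAN_SES_PACKET_DEFAULT (CHAN_SES_WINDOW_DEFAULT/2). */
size_t
packet_default(size_t window_default)
{
	return window_default / 2;
}

/* Returns 1 if the channel packet size is at least one MTU, else 0. */
int
packet_fills_mtu(size_t window_default)
{
	return packet_default(window_default) >= LARGE_MTU;
}
```

Under these assumptions, packet_default(64*1024) is 32768 and does not reach the MTU, while packet_default(128*1024) is 65536 and does, which lines up with the 1:47 and 0:23 timings above.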
Comment 9 Markus Friedl 2002-02-13 07:56:47 AEDT
hm, ok, let's try this.

keep CHAN_SES_PACKET_DEFAULT fixed:

#define CHAN_SES_PACKET_DEFAULT (16*1024)
and change the _window_ size to

#define CHAN_SES_WINDOW_DEFAULT (CHAN_SES_PACKET_DEFAULT*4)

you can try to increase the 4.

this means the ssh client will send 4 packets before
waiting for an ACK from the server.
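
As a rough illustration of the suggestion above, the number of full packets that can be sent before the window is used up is the window size divided by the packet size. A minimal sketch in C; the function name is hypothetical, and the model ignores packet headers and partial packets.

```c
#include <stddef.h>

#define CHAN_SES_PACKET_DEFAULT (16*1024)

/* Hypothetical helper: number of full packets that fit in one window. */
size_t
packets_per_window(size_t window, size_t packet)
{
	if (packet == 0)
		return 0;	/* avoid division by zero */
	return window / packet;
}
```

With the proposed definitions, packets_per_window(CHAN_SES_PACKET_DEFAULT*4, CHAN_SES_PACKET_DEFAULT) is 4, whereas the earlier scheme (packet = window/2) always gives 2.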

Comment 10 Markus Friedl 2002-02-13 10:00:23 AEDT
Created attachment 23
like this
Comment 11 Paul Smith 2002-02-13 22:51:52 AEDT
time `cat lapack.ibm.tar.gz | local/bin/ssh -2 -c 3des-cbc -p 10222 hodgkin 
'cat > f2' `
8.8u 0.8s 0:23 41% 671+630k 0+0io 137pf+0w
with these changes.
Comment 12 Markus Friedl 2002-02-14 10:20:25 AEDT
so, this helps, too?

what happens if you

#define CHAN_SES_WINDOW_DEFAULT (CHAN_SES_PACKET_DEFAULT*20)

Comment 13 Paul Smith 2002-02-15 21:10:57 AEDT
hartree_a [4] time `cat lapack.ibm.tar.gz | local/bin/ssh -2 -c 3des-cbc -p 1222
hodgkin 'cat > f2' `
8.8u 0.6s 0:23 40% 672+633k 0+0io 140pf+0w

No effect going from 4 to 20. Basically anything that increases the window
default above 32k helps.
Comment 14 Markus Friedl 2002-02-18 08:43:40 AEDT
what about this?

Index: channels.c
===================================================================
RCS file: /cvs/openssh_cvs/channels.c,v
retrieving revision 1.138
diff -u -r1.138 channels.c
--- channels.c  8 Feb 2002 11:07:17 -0000       1.138
+++ channels.c  17 Feb 2002 21:34:48 -0000
@@ -1227,7 +1227,7 @@
 static int
 channel_handle_rfd(Channel *c, fd_set * readset, fd_set * writeset)
 {
-       char buf[16*1024];
+       char buf[64*1024];
        int len;

        if (c->rfd != -1 &&
Comment 15 Paul Smith 2002-02-18 21:07:22 AEDT
time `cat lapack.ibm.tar.gz | local/bin/ssh -2 -c 3des-cbc -p 10222 hodgkin 'cat
> f2' `
8.8u 0.7s 0:29 33% 681+701k 0+0io 139pf+0w

That didn't seem to help. I checked for reproducibility.
Comment 16 Markus Friedl 2002-02-19 04:24:29 AEDT
there should be no difference between protocol 1 and 2 now.
Comment 17 Damien Miller 2004-04-14 12:24:17 AEST
Mass change of RESOLVED bugs to CLOSED