Bug 772 - Corrupted MAC on input
Summary: Corrupted MAC on input
Status: CLOSED WORKSFORME
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: ssh
Version: 3.7.1p1
Hardware: All Linux
Importance: P2 normal
Assignee: OpenSSH Bugzilla mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2003-12-13 14:21 AEDT by James Mackie
Modified: 2004-04-14 12:24 AEST (History)

See Also:


Attachments
make tests results.. (9.68 KB, text/plain)
2003-12-13 23:51 AEDT, James Mackie
netstat -s output (2.40 KB, text/plain)
2003-12-14 01:51 AEDT, James Mackie

Description James Mackie 2003-12-13 14:21:07 AEDT
I have run into this on a couple of my systems during backups over ssh. I am 
tarring the full server directory structure, piping it over ssh, and 
redirecting it to a tar file on our backup server.

this does not happen on every server.. however the servers that it DOES happen 
to.. it is repeatable and happens EVERY time.. 

on double checking the architectures.. it would appear that the 2 servers that 
I KNOW for sure do this.. are dual AMDs.. 

the backup server is a p4..
Comment 1 James Mackie 2003-12-13 14:22:45 AEDT
Actually one is a dual p3.. sorry.. 
Comment 2 Darren Tucker 2003-12-13 16:09:04 AEDT
You don't happen to have a LinkSys router between client and server, do you?  If
so see bug #510.
Comment 3 James Mackie 2003-12-13 18:45:25 AEDT
Nope.. no linksys router.. 

These are internet webservers.. they are connected to switches of 3Com and/or 
Cisco.. 

Comment 4 Darren Tucker 2003-12-13 20:03:15 AEDT
So the problem only occurs with a dual-CPU client?  What Linux distribution and
kernel version?  Can you run OpenSSH's regression tests ("make tests")?

It would also be interesting to know if you can reproduce this with an OpenSSH
*and* OpenSSL compiled without optimization.
Comment 5 Damien Miller 2003-12-13 20:37:30 AEDT
Please provide some detail on your OS platform and OpenSSL version. Did you
compile OpenSSL yourself? If you used a vendor OpenSSL, is it optimised for a
particular CPU architecture?

These errors are usually OpenSSL issues.

Comment 6 Damien Miller 2003-12-13 20:40:27 AEDT
Also check "netstat -s" for packets with bad IP/TCP checksums. IP and TCP
checksums are pretty short, so bad packets can occasionally make it through to
the application layer (where they will be detected by the MAC).
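
To illustrate the point about short checksums, here is a minimal Python sketch (not from the original report). It computes an RFC 1071 style 16-bit ones' complement checksum, the kind used by IP and TCP, over a small payload and over a corrupted copy in which two 16-bit words have been swapped. The payload, the key, and the choice of HMAC-SHA1 are assumptions made purely for illustration: real TCP checksums also cover a pseudo-header, and SSH negotiates its MAC algorithm and derives its own keys per session.

```python
import hashlib
import hmac
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def mac(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA1 over the data (algorithm chosen for illustration only)."""
    return hmac.new(key, data, hashlib.sha1).digest()


# Hypothetical payload and key, used only for this demonstration.
original = b"backup-data-0123"
key = b"hypothetical-session-key"

# Simulate corruption that reorders the first two 16-bit words.
corrupted = original[2:4] + original[0:2] + original[4:]

checksum_matches = internet_checksum(original) == internet_checksum(corrupted)
mac_matches = hmac.compare_digest(mac(key, original), mac(key, corrupted))

print("payload changed:   ", original != corrupted)
print("checksum unchanged:", checksum_matches)
print("MAC unchanged:     ", mac_matches)
```

Because the checksum is a plain sum of 16-bit words, reordering words leaves it unchanged, so this kind of corruption passes the TCP-level check. The MAC covers every bit of the data under a secret key, so the same corruption is caught there, which is when ssh would report "Corrupted MAC on input".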
Comment 7 James Mackie 2003-12-13 23:11:54 AEDT
self compiled.. openssh 3.7.1p1 and openssl 0.9.6b.. redhat 7.3 on the most 
recently found server with this error.. it is using the redhat kernel 
2.4.18-3smp.. 

openssl i remember installing a later version.. and same with the kernel.. 
however this is how it is currently situated.. (possibly forgot to remove the 
fallback in lilo or something)

let me recompile a few things now that i see that i have more to do before 
submitting this report.. my apologies.. I will let you know what happens after 
i recompile everything up to date..  


Comment 8 James Mackie 2003-12-13 23:51:13 AEDT
Created attachment 509
make tests results.. 

This is the tests results after relinking openssh-3.7.1p1 with openssl-0.9.7b
(still gives the Corrupted MAC on input error)
Comment 9 Darren Tucker 2003-12-14 00:44:08 AEDT
Well, the output shows that the regression tests pass.  Did "netstat -s" show
any errors?  Can you replicate the error using just the loopback?  eg

# ssh root@127.0.0.1 "tar cf - /" >/dev/null
or
# ssh -o Compression=no root@127.0.0.1 "dd if=/dev/zero bs=1k count=1m" >/dev/null

And it definitely only occurs on SMP boxes?

Also note that 3.7.1p1 has some security issues WRT PAM, if you're using PAM you
should upgrade: http://www.openssh.com/txt/sshpam.adv
Comment 10 James Mackie 2003-12-14 01:51:53 AEDT
Created attachment 510
netstat -s output

netstat output.. still getting the same error after kernel upgrade to 2.4.23
Comment 11 James Mackie 2003-12-14 14:05:43 AEDT
Just an update.. i think i have tracked it down to actually being on the backup 
(receiving) server.. running the command piped thru locally gives this error on 
the backup server.. but not on the clients..  

it is running 3.7.1p1 (no we don't use PAM so we didn't do the p2 upgrade.) 
linked against openssl 0.9.6b (it would appear that redhat puts ssl in a totally 
different place than the compiled version and configure finds the old version.. 
meaning that i am going to have to recompile every server to tell it to use the 
newer version.. but that's another issue) 

running compiled kernel 2.4.20 on a p4

I will relink the openssh against the newer openssl and let you guys know the 
results.. 

Comment 12 James Mackie 2003-12-15 16:44:53 AEDT
ok.. close this report.. 

i am going to chalk this up to bad ram in the server.. recompiling openssh and 
the kernel did all sorts of weird things that clearly point to memory problems 
on the server.. 

thanks for your help, and my apologies for the false report.. :( 

James
Comment 13 Damien Miller 2003-12-15 16:50:54 AEDT
Thanks - we'll add "bad ram" to the list of things that can cause this.
Comment 14 Damien Miller 2004-04-14 12:24:20 AEST
Mass change of RESOLVED bugs to CLOSED