Bug 1007 - sftp client hangs on tru64 5.1A
Summary: sftp client hangs on tru64 5.1A
Status: CLOSED WORKSFORME
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: ssh
Version: 4.0p1
Hardware: Alpha Tru64
Importance: P2 normal
Assignee: OpenSSH Bugzilla mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-03-30 11:09 AEST by Paul Stepowski
Modified: 2016-08-02 10:40 AEST

See Also:


Attachments
results of ssh-rand-help -vvv (4.40 KB, text/plain)
2005-03-30 12:16 AEST, Paul Stepowski
sftp wait() changeset from 4.0 (1.27 KB, patch)
2005-03-30 20:27 AEST, Darren Tucker
results of ssh-rand-help -vvv (2) (5.31 KB, text/plain)
2005-04-06 15:04 AEST, Paul Stepowski
Make ssh-rand-helper close all fds above STDERR (585 bytes, patch)
2008-01-01 04:57 AEDT, Darren Tucker

Description Paul Stepowski 2005-03-30 11:09:07 AEST
Platform: Compaq Tru64 UNIX V5.1A (Rev. 1885)

When building:

openssh-4.0p1 against openssl-0.9.7f and zlib-1.2.2

The sftp client hangs when attempting to log into the remote host. e.g.

---snip---
# sftp -vvv root@css-ps
Connecting to css-ps
OpenSSH_4.0p1, OpenSSL 0.9.7f 22 Mar 2005
debug1: Reading configuration data /usr/local/ssh/etc/ssh_config
debug3: Seeding PRNG from /usr/local/ssh/libexec/ssh-rand-helper


---snip---

I had the same problem when building:

openssh-3.9p1 against openssl-0.9.7f and zlib-1.2.2

When I built:

openssh-4.0p1 against openssl-0.9.7e and zlib-1.2.2

I do not have this problem.

So it appears that some change between openssl-0.9.7e and openssl-0.9.7f has 
broken the sftp client.

For more info, please see secure-shell mailing list:

http://marc.theaimsgroup.com/?l=secure-shell&m=111167722024411&w=2
http://marc.theaimsgroup.com/?l=secure-shell&m=111210986628577&w=2
Comment 1 Darren Tucker 2005-03-30 11:16:49 AEST
Could you please run ssh-rand-helper on its own in debug mode and attach the
output using "Create New Attachment"?  ie

$ /usr/local/ssh/libexec/ssh-rand-helper -vvv

Comment 2 Darren Tucker 2005-03-30 11:23:40 AEST
BTW does it hang with regular ssh commands too or is it specific to sftp?
Comment 3 Paul Stepowski 2005-03-30 12:16:48 AEST
Created attachment 859 [details]
results of ssh-rand-help -vvv

This is the current output of ssh-rand-helper.  Here's the weird thing.  I
rebuilt openssh-4.0p1 with the latest openssl and zlib and sftp works (i.e.
I can no longer reproduce the problem), even though there's no reason why the
rebuild would change anything.

It must have something to do with the ssh-rand-helper command.  I tested this
when I was having the problem, though, and ssh-rand-helper appeared to run fine
from the command line.  (See:
http://marc.theaimsgroup.com/?l=secure-shell&m=111167722024411&w=2 for my
test.)

Is there any difference between running ssh-rand-helper from the command line
and the way sftp runs it?
Comment 4 Paul Stepowski 2005-03-30 15:25:54 AEST
(In reply to comment #2)
> BTW does it hang with regular ssh commands too or is it specific to sftp?

Specific to sftp.
Comment 5 Darren Tucker 2005-03-30 20:27:48 AEST
Created attachment 860 [details]
sftp wait() changeset from 4.0

I don't think this is it, but it's the only thing I could come up with that
might have an effect anything like what you're reporting.  It also doesn't
explain why it only occurs with openssl-0.9.7f or why it apparently stopped
after a recompile.

If you manage to reproduce it again, try reverting this patch (patch -R) and
retesting.
Comment 6 Darren Tucker 2005-03-31 23:08:48 AEST
One other thing: do you have prngd or egd?
Comment 7 Paul Stepowski 2005-04-04 09:03:26 AEST
(In reply to comment #5)
> Created an attachment (id=860) [edit]
> sftp wait() changeset from 4.0
> I don't think this is it, but it's the only thing I could come up with that
> might  have an effect anything like what you're reporting.  It also doesn't
> explain why it only occurs with openssl-0.9.7f or why it apparently stopped
> after a recompile.
> If you manage to reproduce it again, try reverting this patch (patch -R) and
> retesting.

I'll do this when the problem comes up again.  We have a number of Tru64 boxes 
running OpenSSH so I dare say we haven't seen the last of this problem.

(In reply to comment #6)
> One other thing: do you have prngd or egd?

No.
Comment 8 Darren Tucker 2005-04-04 10:54:35 AEST
Is it dependent on the version of the server you're connecting to?
Comment 9 Paul Stepowski 2005-04-06 14:44:02 AEST
(In reply to comment #8)
> Is it dependent on the version of the server you're connecting to?

No.  I've tried sftp'ing to various different ssh servers.  Linux, Tru64, 
FreeBSD with OpenSSH and Tru64 with commercial ssh.
Comment 10 Paul Stepowski 2005-04-06 14:52:33 AEST
(In reply to comment #5)
> Created an attachment (id=860) [edit]
> sftp wait() changeset from 4.0
> I don't think this is it, but it's the only thing I could come up with that
> might  have an effect anything like what you're reporting.  It also doesn't
> explain why it only occurs with openssl-0.9.7f or why it apparently stopped
> after a recompile.
> If you manage to reproduce it again, try reverting this patch (patch -R) and
> retesting.

When I try:

# patch -R < openssh-sftp-sigchld.patch.txt

I get:

Hmm...  I can't seem to find a patch in there anywhere.
Comment 11 Paul Stepowski 2005-04-06 15:04:21 AEST
Created attachment 869 [details]
results of ssh-rand-help -vvv (2)

The problem is back on the same machine as before.  I've included debug output
of ssh-rand-helper and it all looks OK.  I couldn't get the patch you posted to
apply, but I think we've ruled out ssh-rand-helper as the problem.
Comment 12 Darren Tucker 2005-04-06 15:11:31 AEST
(In reply to comment #10)
> Hmm...  I can't seem to find a patch in there anywhere.

It's a unified diff, and some patch programs don't understand that format.  If
you have GNU patch available then try that, otherwise I can recreate the diff in
another format.  (Alternatively you could apply it by hand; it's only a couple
of lines if you ignore the scp.c part.)
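To illustrate the format difference discussed here, the following is a minimal sketch using Python's standard difflib module, which can emit the same change in both unified and context format. The file names and the two-version snippet are made up for the illustration and have nothing to do with the actual attachment. A patch program that only knows the context format looks for the `***` headers and will not recognise the `@@` hunks of a unified diff.

```python
import difflib

# Hypothetical before/after contents, used only to show the two formats.
old = ["int main(void)\n", "{\n", "    return 1;\n", "}\n"]
new = ["int main(void)\n", "{\n", "    return 0;\n", "}\n"]

# Unified format: "---"/"+++" file headers followed by "@@ ... @@" hunks.
unified = list(difflib.unified_diff(old, new, "a/demo.c", "b/demo.c"))

# Context format: "***"/"---" file headers followed by "*** n,m ****" hunks.
context = list(difflib.context_diff(old, new, "a/demo.c", "b/demo.c"))

print("".join(unified))
print("".join(context))
```

If GNU patch is not available, regenerating the diff in context format (as offered above) is usually the simpler route.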
Comment 13 Paul Stepowski 2005-04-06 15:47:29 AEST
(In reply to comment #12)
> (In reply to comment #10)
> > Hmm...  I can't seem to find a patch in there anywhere.
> It's a unified diff, some patch programs don't understand them.  If you have
> GNU patch available then try that, otherwise I can recreate the diff in
> another format.  (Alternatively you could apply by hand, it's only a couple
> of lines if you ignore the scp.c part).

Yeah, it turns out the Tru64(Sucks)UNIX diff/patch utilities don't support
unified diffs.  So I applied it by hand (i.e. commented out the call to
waitpid), but the problem is still there.


Comment 14 Damien Miller 2007-06-15 10:55:36 AEST
If you are still experiencing this bug, could you please try to truss or strace (or whatever the Tru64 equivalent is) the sshd and ssh-rand-helper processes, and attach the results?

Alternately, you could use a debugger or insert a bunch of printf("still here"); statements into entropy.c:seed_rng() to see where it chokes.
Comment 15 Paul Stepowski 2007-06-15 14:11:06 AEST
Hi Damien,

Casting my mind back I believe I worked around the issue by editing:

/usr/local/ssh/etc/ssh_prng_cmds

And commenting out:

---snip---
#"netstat -an" /usr/sbin/netstat 0.05
#"netstat -in" /usr/sbin/netstat 0.05
#"netstat -rn" /usr/sbin/netstat 0.02
#"netstat -pn" undef 0.02
#"netstat -ia" /usr/sbin/netstat 0.05
#"netstat -s" /usr/sbin/netstat 0.02
#"netstat -is" /usr/sbin/netstat 0.07
#"arp -a -n" /sbin/arp 0.02
---snip---

It seems these commands were taking a long time to complete and delaying the PRNG seed by the sftp client.  From memory, it was *not* DNS resolution (netstat -n) that was causing the delay in netstat.  It must have been some other issue.  I don't recall any more details I could give you.

Let me know if you need any more detail and I'll see what I can dig up.

Thanks,

Paul
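One way to find out which entries in ssh_prng_cmds are slow is to time each enabled command on its own. The sketch below is an assumption-laden illustration, not part of OpenSSH: the helper names `parse_prng_cmds` and `time_entry` are hypothetical, the 10 second limit is arbitrary, and the line format (quoted command line, path, entropy estimate) is inferred from the excerpt above.

```python
import shlex
import subprocess
import time


def parse_prng_cmds(text):
    """Return (argv, path, rate) for each enabled entry.

    Lines starting with '#' and entries whose path is 'undef' are skipped.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = shlex.split(line)
        if len(fields) != 3 or fields[1] == "undef":
            continue
        cmdline, path, rate = fields
        entries.append((shlex.split(cmdline), path, float(rate)))
    return entries


def time_entry(argv, path, limit=10.0):
    """Run one command and return its elapsed seconds, or None on timeout."""
    start = time.monotonic()
    try:
        subprocess.run(
            argv,
            executable=path,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=limit,
        )
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start


if __name__ == "__main__":
    # Path taken from the comment above; adjust for the local install.
    with open("/usr/local/ssh/etc/ssh_prng_cmds") as fh:
        for argv, path, _rate in parse_prng_cmds(fh.read()):
            elapsed = time_entry(argv, path)
            print(" ".join(argv), "timed out" if elapsed is None else f"{elapsed:.2f}s")
```

Commands reported as timed out, or taking several seconds, are the natural candidates for commenting out.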
Comment 16 Darren Tucker 2008-01-01 04:57:07 AEDT
Created attachment 1431 [details]
Make ssh-rand-helper close all fds above STDERR

Random thought: I wonder if some commands are making unwarranted assumptions about the descriptors they inherit?

The downside is that on systems that don't have a native closefrom or equivalent, the fallback close calls are going to be relatively slow (but they are only required once).
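The idea of closing every inherited descriptor above stderr, and the cost of doing so without a native closefrom, can be sketched in Python. This is only an illustration of the technique, not the attached patch: the function name `close_fds_above_stderr`, its `upper` parameter, and the 1024 fallback are assumptions made for the example.

```python
import os
import resource


def close_fds_above_stderr(upper=None):
    """Close every descriptor from 3 up to (but not including) upper.

    Without a closefrom-style primitive, the only portable option is to
    walk the whole descriptor range, which is why a high open-file limit
    makes this slow.
    """
    if upper is None:
        soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        # Assumption: use 1024 when the limit is reported as unlimited.
        upper = 1024 if soft == resource.RLIM_INFINITY else soft
    # os.closerange silently skips descriptors that are not open.
    os.closerange(3, upper)
```

A helper like this would be called once in the child process, after fork and before running any entropy commands, so that those commands inherit only stdin, stdout and stderr.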
Comment 17 Damien Miller 2009-08-18 11:12:32 AEST
Can't replicate this and can't diagnose without a truss or similar; ssh-rand-helper includes subcommand timeout logic that should avoid this. Please reopen if it recurs and you can trace it.
Comment 18 Damien Miller 2012-11-04 22:20:44 AEDT
As I said a few years ago, we can't diagnose this without more information. Please reopen this bug if the problem is still occurring and you are able to provide it.
Comment 19 Damien Miller 2016-08-02 10:40:44 AEST
Close all resolved bugs after 7.3p1 release