Bug 651 - SCO 3.2v4.2 and OpenSSH 3.7.1p1 --> connection hangs and does not close (ssh2 only)
Status: CLOSED WONTFIX
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: 3.7p1
Hardware: All Other
Importance: P2 major
Assignee: OpenSSH Bugzilla mailing list
Reported: 2003-09-17 19:03 AEST by Vikash Badal
Modified: 2004-04-14 12:24 AEST

Attachments
compressed config.log (50.49 KB, application/octet-stream)
2003-12-30 22:06 AEDT, Vikash Badal
compressed config.h (9.48 KB, application/octet-stream)
2003-12-30 22:49 AEDT, Vikash Badal

Description Vikash Badal 2003-09-17 19:03:21 AEST
When executing a remote command or exiting from a shell,
the ssh connection hangs indefinitely. The only way to close the session
is [control]+[c]; even [~][.] does not work.

This problem only exists when using ssh2 connections.

Server-side debug output (-d -d -d):

debug1: Received SIGCHLD.
debug2: channel 0: read failed
debug2: channel 0: close_read
debug2: channel 0: input open -> drain
debug2: channel 0: ibuf_empty delayed efd 12/(0)
debug2: notify_done: reading
debug2: channel 0: read 0 from efd 12
debug2: channel 0: closing read-efd 12
debug2: channel 0: ibuf empty
debug2: channel 0: send eof
debug2: channel 0: input drain -> closed
--->hangs<---
--------------------------------------------

below is a backtrace from gdb:
(gdb) s
326             ret = select((*maxfdp)+1, *readsetp, *writesetp, NULL, tvp);
(gdb) l
321                     tv.tv_usec = 1000 * (max_time_milliseconds % 1000);
322                     tvp = &tv;
323             }
324
325             /* Wait for something to happen, or the timeout to expire. */
326             ret = select((*maxfdp)+1, *readsetp, *writesetp, NULL, tvp);
327
328             if (ret == -1) {
329                     memset(*readsetp, 0, *nallocp);
330                     memset(*writesetp, 0, *nallocp);
(gdb) bt
#0  wait_until_can_do_something (readsetp=0x7ffff8d8, writesetp=0x7ffff8d4,
    maxfdp=0x7ffff8d0, nallocp=0x7ffff8cc, max_time_milliseconds=0)
    at serverloop.c:326
#1  0x8bfc in server_loop2 (authctxt=0x42eaf8) at serverloop.c:770
#2  0x104bb in do_authenticated2 (authctxt=0x42eaf8) at session.c:2152
#3  0xcc19 in do_authenticated (authctxt=0x42eaf8) at session.c:216
#4  0x2eaa in main (ac=6, av=0x7ffffe1c) at sshd.c:1506
(gdb) s

I have no idea how to resolve this.
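
For readers following the backtrace: max_time_milliseconds=0 suggests that select() was called without a timeout, in which case it returns only when a descriptor becomes ready or a signal arrives. The C sketch below illustrates that behaviour in isolation. It is not OpenSSH code: wait_readable and probe_pipe are hypothetical names, and treating 0 as "no timeout" is an assumption inferred from the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait until fd is readable.
 * max_time_ms == 0 is taken to mean "no timeout" (an assumption).
 * Returns whatever select() returns.
 */
int
wait_readable(int fd, long max_time_ms)
{
	fd_set readset;
	struct timeval tv, *tvp = NULL;

	assert(fd >= 0 && fd < FD_SETSIZE);
	if (max_time_ms != 0) {
		tv.tv_sec = max_time_ms / 1000;
		tv.tv_usec = 1000 * (max_time_ms % 1000);
		tvp = &tv;
	}
	FD_ZERO(&readset);
	FD_SET(fd, &readset);

	/* With tvp == NULL this blocks until fd is ready or a signal arrives. */
	return select(fd + 1, &readset, NULL, NULL, tvp);
}

/*
 * Hypothetical demo: create a pipe, optionally close its write end,
 * and report what select() says about the read end.
 */
int
probe_pipe(int close_writer, long max_time_ms)
{
	int fds[2], ret;

	if (pipe(fds) == -1)
		return -1;
	if (close_writer)
		close(fds[1]);
	ret = wait_readable(fds[0], max_time_ms);
	close(fds[0]);
	if (!close_writer)
		close(fds[1]);
	return ret;
}
```

In this sketch probe_pipe(1, 0) returns 1 at once, because a pipe whose write end is closed becomes readable at EOF. probe_pipe(0, 50) returns 0 after roughly 50 ms, because the write end is still open and no data arrives; with no timeout that call would block indefinitely.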
Comment 1 Vikash Badal 2003-10-01 22:00:34 AEST
After spending some more time on this, I have discovered that under gdb
execution always hangs at ret = select((*maxfdp)+1, *readsetp, *writesetp, NULL, tvp);
even on 3.5p1.

After some more tracing, it seems that connection_closed never changes.
The only place that could possibly change connection_closed is process_input()
in serverloop.c.

I am not sure how this section works. Below is a diff that forces
connection_closed to be set to 1 if SIGCHLD is received:
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
--- serverloop.c.orig   Wed Oct  1 09:04:00 2003
+++ serverloop.c        Wed Oct  1 13:58:24 2003
@@ -82,6 +82,7 @@
 static int connection_closed = 0;      /* Connection to client closed. */
 static u_int buffer_high;      /* "Soft" max buffer size. */
 static int client_alive_timeouts = 0;
+int kill_session = 0;

 /*
  * This SIGCHLD kludge is used to detect when the child exits.  The server
@@ -144,6 +145,7 @@
        int save_errno = errno;
        debug("Received SIGCHLD.");
        child_terminated = 1;
+       kill_session = 1;
 #ifndef _UNICOS
        mysignal(SIGCHLD, sigchld_handler);
 #endif
@@ -345,6 +347,11 @@
 {
        int len;
        char buf[16384];
+
+        /* set connection_closed to 1 if received SIGCHLD */
+        if ( kill_session == 1 ) {
+           connection_closed = 1;
+        }

        /* Read and buffer any input data from the client. */
        if (FD_ISSET(connection_in, readset)) {
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Is this approach incorrect?
Comment 2 Damien Miller 2003-10-02 14:02:36 AEST
Are you sure that this is not bug #52?
Comment 3 Vikash Badal 2003-10-02 14:23:49 AEST
I'm not sure; 3.4p1 and 3.5p1 worked fine.
Bug #52 is "ssh hangs on exit after running commands that fork".
Here ssh hangs on exit every time, whether I use bash, ksh, sh, or csh,
and whether I execute a remote command, scp, or an interactive shell.

Sorry if I'm not very helpful. SCO is a really crappy OS, but I still have
50+ to support and maintain.
Comment 4 Vikash Badal 2003-12-30 16:39:21 AEDT
Tried openssh-SNAP-20031223.

It does not compile:
        (cd openbsd-compat && make)
        gcc -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -I. -I.. -I. -I./.. -I/usr/local/ssl/include  -Dftruncate=chsize -I/usr/local/include -DHAVE_CONFIG_H -c bsd-arc4random.c
In file included from ../includes.h:170,
                 from bsd-arc4random.c:25:
../defines.h:146: #error "8 bit int type not found."
../defines.h:158: #error "16 bit int type not found."
../defines.h:167: #error "32 bit int type not found."
../defines.h:183: #error "8 bit int type not found."
../defines.h:195: #error "16 bit int type not found."
../defines.h:204: #error "32 bit int type not found."
../defines.h:243: warning: `SIZE_T_MAX' redefined
../defines.h:237: warning: this is the location of the previous definition
In file included from ../openbsd-compat/getrrsetbyname.h:57,
                 from ../openbsd-compat/openbsd-compat.h:40,
                 from ../includes.h:173,
                 from bsd-arc4random.c:25:
/usr/include/arpa/nameser.h:48: warning: `/*' within comment
In file included from ../openbsd-compat/openbsd-compat.h:128,
                 from ../includes.h:173,
                 from bsd-arc4random.c:25:
../openbsd-compat/bsd-waitpid.h:42: warning: `WEXITSTATUS' redefined
/usr/include/sys/wait.h:82: warning: this is the location of the previous definition
../openbsd-compat/bsd-waitpid.h:43: warning: `WTERMSIG' redefined
/usr/include/sys/wait.h:84: warning: this is the location of the previous definition
../openbsd-compat/bsd-waitpid.h:45: warning: `WCOREDUMP' redefined
/usr/include/sys/wait.h:94: warning: this is the location of the previous definition
*** Error code 1
*** Error code 1
[root@sco]: /usr/home/dev/openssh # ls ..


please advise
Comment 5 Darren Tucker 2003-12-30 20:41:11 AEDT
Which gcc version?
Comment 6 Vikash Badal 2003-12-30 20:47:13 AEDT
gcc version 2.7.2.3
Comment 7 Darren Tucker 2003-12-30 21:56:32 AEDT
Please check config.log for the results of the "checking for intXX_t types" test.
Comment 8 Vikash Badal 2003-12-30 22:06:09 AEDT
Created attachment 518
compressed config.log

configure:11392: checking for intXX_t types
configure:11411: gcc -c -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -I/usr/local/ssl/include  -Dftruncate=chsize -I/usr/local/include conftest.c >&5
configure: In function `main':
configure:11404: `int8_t' undeclared (first use this function)
configure:11404: (Each undeclared identifier is reported only once
configure:11404: for each function it appears in.)
configure:11404: parse error before `a'
configure:11404: `int16_t' undeclared (first use this function)
configure:11404: `int32_t' undeclared (first use this function)
configure:11404: `a' undeclared (first use this function)
configure:11404: `b' undeclared (first use this function)
configure:11404: `c' undeclared (first use this function)
configure:11414: $? = 1

I've attached the config.log (compressed) if more information is required.
Comment 9 Darren Tucker 2003-12-30 22:20:39 AEDT
The fragments of defines.h look like this:
# if (SIZEOF_CHAR == 1)
typedef char int8_t;
# else
#  error "8 bit int type not found."
# endif

Is SIZEOF_CHAR defined in config.h?  (I see it's in config.log.)
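
To make the failure mode concrete: in a #if expression the C preprocessor evaluates an undefined macro as 0, so if SIZEOF_CHAR is missing from config.h the test (SIZEOF_CHAR == 1) is false and the #error branch fires. The sketch below mirrors the pattern of the defines.h fragment under stated assumptions. The sketch_* type names and sketch_sizes_ok are hypothetical, and the #ifndef stand-ins only simulate what configure would normally write into config.h.

```c
#include <assert.h>
#include <limits.h>

/* Stand-ins for values that configure would normally put in config.h. */
#ifndef SIZEOF_CHAR
# define SIZEOF_CHAR 1
#endif
#ifndef SIZEOF_SHORT_INT
# define SIZEOF_SHORT_INT 2
#endif
#ifndef SIZEOF_INT
# define SIZEOF_INT 4
#endif

/* Same shape as the defines.h fragment, with hypothetical type names. */
#if (SIZEOF_CHAR == 1)
typedef char sketch_int8_t;
#else
# error "8 bit int type not found."
#endif

#if (SIZEOF_SHORT_INT == 2)
typedef short int sketch_int16_t;
#else
# error "16 bit int type not found."
#endif

#if (SIZEOF_INT == 4)
typedef int sketch_int32_t;
#else
# error "32 bit int type not found."
#endif

/* Returns 1 if the chosen types really have the expected widths. */
int
sketch_sizes_ok(void)
{
	assert(CHAR_BIT == 8);
	return sizeof(sketch_int8_t) == 1 &&
	    sizeof(sketch_int16_t) == 2 &&
	    sizeof(sketch_int32_t) == 4;
}
```

Removing the three stand-in blocks reproduces the "8 bit int type not found." family of errors, since each comparison then becomes 0 == N.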
Comment 10 Vikash Badal 2003-12-30 22:49:25 AEDT
Created attachment 519
compressed config.h

from config.h :
/* The size of a `char', as computed by sizeof. */
/* #undef SIZEOF_CHAR */

I've attached the config.h (compressed).

After looking through config.h, it seems that nothing is defined.
I am not sure what is causing this.
Using 3.5p1 (the last version that works), the following are defined:
#define _CONFIG_H
#define BROKEN_SYS_TERMIO_H 1
#define HAVE_SECUREWARE 1
#define LOGIN_PROGRAM_FALLBACK "/bin/login"
#define HAVE_ACCRIGHTS_IN_MSGHDR 1
#define HAVE_SYS_ERRLIST 1
#define HAVE_SYS_NERR 1
#define USE_PIPES 1
#define ENTROPY_TIMEOUT_MSEC 200
#define SSH_PRIVSEP_USER "sshd"
#define HAVE_OPENSSL 1
#define HAVE_STRUCT_TIMEVAL 1
#define HAVE_PID_IN_UTMP 1
#define HAVE_TYPE_IN_UTMP 1
#define HAVE_ID_IN_UTMP 1
#define HAVE_EXIT_IN_UTMP 1
#define HAVE_TIME_IN_UTMP 1
#define DISABLE_UTMPX 1
#define DISABLE_WTMPX 1
#define HAVE___FUNCTION__ 1
#define DISABLE_SHADOW 1
#define MAIL_DIRECTORY ""
#define HAVE_U_INT 1
#define HAVE_U_CHAR 1
#define HAVE_SIZE_T 1
#define HAVE_CLOCK_T 1
#define HAVE_MODE_T 1
#define HAVE_PID_T 1
#define USER_PATH "/usr/bin:/bin:/usr/sbin:/sbin:/usr/home/work/bin/bin"
#define _PATH_SSH_PIDDIR "/usr/home/work/bin/etc"
#define BROKEN_SAVED_UIDS 1
#define BROKEN_ONE_BYTE_DIRENT_D_NAME 1
#define GETPGRP_VOID 1
#define HAVE_BCOPY 1
#define HAVE_CLOCK 1
#define HAVE_ENDUTENT 1
#define HAVE_GETCWD 1
#define HAVE_GETLUID 1
#define HAVE_GETOPT 1
#define HAVE_GETTIMEOFDAY 1
#define HAVE_GETUTENT 1
#define HAVE_GETUTID 1
#define HAVE_GETUTLINE 1
#define HAVE_GLOB 1
#define HAVE_INET_NTOA 1
#define HAVE_LIBSOCKET 1
#define HAVE_LIBZ 1
#define HAVE_LIMITS_H 1
#define HAVE_LOGWTMP 1
#define HAVE_MEMMOVE 1
#define HAVE_MEMORY_H 1
#define HAVE_NETDB_H 1
#define HAVE_NETINET_IN_SYSTM_H 1
#define HAVE_PUTUTLINE 1
#define HAVE_SETEGID 1
#define HAVE_SETEUID 1
#define HAVE_SETGROUPS 1
#define HAVE_SETLUID 1
#define HAVE_SETREUID 1
#define HAVE_SETSID 1
#define HAVE_SETUTENT 1
#define HAVE_SETVBUF 1
#define HAVE_SHADOW_H 1
#define HAVE_SIGACTION 1
#define HAVE_SIG_ATOMIC_T 1
#define HAVE_STDDEF_H 1
#define HAVE_STDLIB_H 1
#define HAVE_STRERROR 1
#define HAVE_STRFTIME 1
#define HAVE_STRINGS_H 1
#define HAVE_STRING_H 1
#define HAVE_SYSCONF 1
#define HAVE_SYS_SELECT_H 1
#define HAVE_SYS_STAT_H 1
#define HAVE_SYS_STROPTS_H 1
#define HAVE_SYS_SYSMACROS_H 1
#define HAVE_SYS_TIME_H 1
#define HAVE_SYS_TYPES_H 1
#define HAVE_TCGETPGRP 1
#define HAVE_TIME 1
#define HAVE_TIME_H 1
#define HAVE_UNISTD_H 1
#define HAVE_UTIL_H 1
#define HAVE_UTIME_H 1
#define HAVE_UTMPNAME 1
#define HAVE_UTMP_H 1
#define HAVE_WAITPID 1
#define PACKAGE_BUGREPORT ""
#define PACKAGE_NAME ""
#define PACKAGE_STRING ""
#define PACKAGE_TARNAME ""
#define PACKAGE_VERSION ""
#define SIZEOF_CHAR 1
#define SIZEOF_INT 4
#define SIZEOF_LONG_INT 4
#define SIZEOF_LONG_LONG_INT 8
#define SIZEOF_SHORT_INT 2
#define STDC_HEADERS 1
#define socklen_t int

I will try adding the #defines from the config.log to see what happens.
Comment 11 Vikash Badal 2003-12-30 23:03:30 AEDT
After adding the #defines from config.log and from 3.5p1 (duplicates removed),
compilation now fails at:
getrrsetbyname.c: In function `getrrsetbyname':
getrrsetbyname.c:164: warning: initialization from incompatible pointer type
getrrsetbyname.c:193: dereferencing pointer to incomplete type
getrrsetbyname.c:193: warning: implicit declaration of function `res_init'
getrrsetbyname.c:209: warning: implicit declaration of function `res_query'
getrrsetbyname.c:212: `h_errno' undeclared (first use this function)
getrrsetbyname.c:212: (Each undeclared identifier is reported only once
getrrsetbyname.c:212: for each function it appears in.)
getrrsetbyname.c:267: `T_SIG' undeclared (first use this function)
getrrsetbyname.c: In function `parse_dns_response':
getrrsetbyname.c:368: `HFIXEDSZ' undeclared (first use this function)
getrrsetbyname.c: In function `parse_dns_qsection':
getrrsetbyname.c:439: warning: implicit declaration of function `dn_expand'
*** Error code 1
*** Error code 1
Comment 12 Darren Tucker 2003-12-30 23:13:16 AEDT
OK, we still don't know why config.h was not generated properly.

The error you're having now is because DNS support is now always enabled and
your OS doesn't support the DNS functionality SSH now needs.  We have had
reports of people working around this on earlier HP-UXes by using the BIND9
package, see:
http://marc.theaimsgroup.com/?m=106994993504962
Comment 13 Vikash Badal 2004-03-24 03:34:54 AEDT
There are just too many things wrong with SCO 3.2 to
waste more time here.

Comment 14 Damien Miller 2004-04-14 12:24:19 AEST
Mass change of RESOLVED bugs to CLOSED