| Summary: | OpenSSH 3.6.1p2 ON SCO 3.2v4.2 + STRICTMODES -->yes (broken dirname in libgen) | ||
|---|---|---|---|
| Product: | Portable OpenSSH | Reporter: | Vikash Badal <vikashb> |
| Component: | sshd | Assignee: | OpenSSH Bugzilla mailing list <openssh-bugs> |
| Status: | CLOSED FIXED | ||
| Severity: | major | ||
| Priority: | P2 | ||
| Version: | 3.6.1p2 | ||
| Hardware: | ix86 | ||
| OS: | Other | ||
| Bug Depends on: | |||
| Bug Blocks: | 627 | ||
| Attachments: | |||
|
Description
Vikash Badal
2003-07-10 21:38:34 AEST
After some more effort on my part, I have been able to determine that the problem is in the configure script. SCO 3.2.4.2 has a broken dirname function in libgen. The configure script from 3.5p1 detects this; however, the configure script from 3.6p2 does not detect that dirname is broken. I am clueless as to how to resolve this.

Created attachment 361 [details]
Gzipped configure from 3.6.1p2 rebuilt with autoconf-2.53

Bug #558 also relates to a broken dirname test. There was another bug where configure did not detect something on HP-UX, which turned out to be some problem with the version of autoconf used for the 3.6.1p2 release. Can you try the attached configure and see if it behaves itself?

Tried the attached configure script; the 'new' configure script still sets HAVE_DIRNAME.

Created attachment 362 [details]
Gzipped configure from 3.6.1p2 minus libgen.h test.

It seems that this is due to the libgen.h test being added before the dirname test. Please try attached configure.
$ cvs log configure.ac
[snip]
revision 1.107
date: 2003/02/24 01:47:16; author: djm; state: Exp; lines: +3 -3
- (djm) Most of Bug #499: Cygwin compile fixes for new progressmeter

I'm wondering if there should be two tests: HAVE_DIRNAME and BROKEN_DIRNAME? The existing test is fairly complex.

Created attachment 363 [details]
GZ-zipped config.log of latest configure script
The newer configure script still fails:
checking for library containing basename... -lgen
checking whether strsep is declared... no
checking for dirname... yes
checking libgen.h usability... no
checking libgen.h presence... no
From config.log:
configure:7269: checking for dirname
configure:7312: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib conftest.c -lgen -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
configure:7315: $? = 0
configure:7318: test -s conftest
configure:7321: $? = 0
configure:7331: result: yes
I've downloaded the newest configure script three times, still the same.

After reading bug #558 it looks like even if we detect a broken dirname, the next problem will be a type conflict for dirname ("char *" versus "const char *" arguments). The easy way to fix that is to remove "const" from openbsd-compat/dirname.[ch]. Comments, anyone?

Created attachment 387 [details]
Move libgen test after dirname test
Looked at this again, I think the reason it's not working is libgen has already
been detected before the dirname test, and that upsets the delicate logic in
that test.
Attachment is patch to configure.ac, will attach a rebuilt configure for
testing.
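
The kind of runtime check that separates a working dirname from a broken one can be sketched as below. This is only an illustration, not the test from configure.ac: the helper name `dirname_looks_sane` is made up, and it assumes the breakage shows up as a wrong answer for a simple path such as "/etc" (the thread does not say exactly how SCO's dirname misbehaves).

```c
#include <libgen.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from configure.ac): returns 1 when the
 * system dirname() maps "/etc" to "/", and 0 otherwise.
 * POSIX allows dirname() to modify its argument, so a writable copy of
 * the path is passed instead of a string literal.
 */
int dirname_looks_sane(void)
{
    char buf[32];
    char *dir;

    strncpy(buf, "/etc", sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    dir = dirname(buf);
    return dir != NULL && strcmp(dir, "/") == 0;
}
```

A configure-style probe would wrap a check like this in a small program whose exit status reports the result, so the link step alone (which is all the "checking for dirname... yes" line reflects) is not what decides HAVE_DIRNAME.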
Created attachment 388 [details]
rebuilt gzipped configure

To test, please download today's snapshot:
ftp://ftp.ca.openbsd.org/pub/OpenBSD/OpenSSH/portable/snapshot/openssh-SNAP-20030905.tar.gz
then replace configure with this attachment, then "./configure && make"

I downloaded the configure (id=388) file as well as the snapshot (20030905).
./configure works fine and config.h does not define HAVE_DIRNAME
however, make fails with the following:
gcc -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -I. -I.. -I. -I./..
-I/usr/local/ssl/include -Dftruncate=chsize -I/usr/local/include
-DHAVE_CONFIG_H -c bsd-arc4random.c
In file included from ../includes.h:34,
from bsd-arc4random.c:25:
/usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126:
warning: `struct timeb' declared inside parameter list
/usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126:
warning: its scope is only this definition or declaration,
/usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126:
warning: which is probably not what you want.
In file included from ../openbsd-compat/openbsd-compat.h:127,
from ../includes.h:173,
from bsd-arc4random.c:25:
../openbsd-compat/bsd-misc.h:97: parse error before `('
*** Error code 1
*** Error code 1
Not sure if this is related, though it appears that the original problem is
solved ( i.e broken dirname is marked as broken )
and an interesting note:
The configure script that was part of the snapshot (20030905) also does not
define HAVE_DIRNAME and make also fails at:
gcc -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -I. -I.. -I. -I./..
-I/usr/local/ssl/include -Dftruncate=chsize -I/usr/local/include
-DHAVE_CONFIG_H -c bsd-arc4random.c
In file included from ../includes.h:170,
from bsd-arc4random.c:25:
../defines.h:146: #error "8 bit int type not found."
../defines.h:158: #error "16 bit int type not found."
../defines.h:167: #error "32 bit int type not found."
../defines.h:183: #error "8 bit int type not found."
../defines.h:195: #error "16 bit int type not found."
../defines.h:204: #error "32 bit int type not found."
In file included from ../openbsd-compat/openbsd-compat.h:128,
from ../includes.h:173,
from bsd-arc4random.c:25:
../openbsd-compat/bsd-waitpid.h:42: warning: `WEXITSTATUS' redefined
/usr/include/sys/wait.h:82: warning: this is the location of the previous definition
../openbsd-compat/bsd-waitpid.h:43: warning: `WTERMSIG' redefined
/usr/include/sys/wait.h:84: warning: this is the location of the previous definition
../openbsd-compat/bsd-waitpid.h:45: warning: `WCOREDUMP' redefined
/usr/include/sys/wait.h:94: warning: this is the location of the previous definition
*** Error code 1
*** Error code 1
Not sure if this new problem should be part of this bug, please advise.
Created attachment 390 [details]
Updated configure.gz from 3.6.1p2
Let's try to cut down the variables here: there's been a bunch of changes
recently in openbsd-compat/. Does the attached configure work with vanilla
3.6.1p2? It has the following lines moved below the broken dirname test:
+AC_CHECK_FUNC(getspnam, , AC_CHECK_LIB(gen, getspnam, LIBS="$LIBS -lgen"))
+AC_SEARCH_LIBS(nanosleep, rt posix4, AC_DEFINE(HAVE_NANOSLEEP))
+AC_SEARCH_LIBS(basename, gen, AC_DEFINE(HAVE_BASENAME))
Also, which header file defines "struct timeb"? You can try adding a #include
for that file immediately before "#include <time.h>" in includes.h.
Applied configure (id=390) to vanilla 3.6p2:
configure is fine, HAVE_DIRNAME not defined in config.log
compiled cleanly
problem is resolved
The timeb struct is defined in sys/timeb.h

The getspnam line was there in 3.5p1 too, so I think it's safe to leave it and the nanosleep line where they are and just move the basename line, which I'll do unless someone objects.

What's at time.h line 126, and why does it break with -current but not 3.6.1p2? Is it inside an #ifdef or something?
time.h (/usr/include)
125 #if !defined(_XOPEN_SOURCE) && !defined(_POSIX_SOURCE) && !__STDC__
126 extern void ftime ( struct timeb * );
127 extern char * nl_cxtime( long *, char * );
128 extern char * nl_ascxtime( struct tm *, char * );
129 #endif
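
The "declared inside parameter list" warning appears when a struct tag is first seen inside a prototype, so the type's scope ends with that prototype. A minimal sketch of the fix suggested earlier (making the struct visible before the prototype) follows. The names `demo_timeb` and `demo_ftime` are stand-ins invented for illustration; on the real system the definition would come from including sys/timeb.h ahead of time.h.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for struct timeb. Defining (or merely declaring)
 * the struct at file scope BEFORE the prototype is what avoids the
 * "declared inside parameter list" warning.
 */
struct demo_timeb {
    long time;
    unsigned short millitm;
};

/* The tag is already known here, so this refers to the file-scope type. */
void demo_ftime(struct demo_timeb *tp);

/* Trivial body so the sketch links: it only zeroes the fields. */
void demo_ftime(struct demo_timeb *tp)
{
    if (tp != NULL) {
        tp->time = 0;
        tp->millitm = 0;
    }
}
```

If the struct definition were moved below the prototype, gcc would treat the parameter's type as a separate, prototype-local struct, which is the situation the warnings at time.h line 126 describe.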
-current breaks with :
#ifndef HAVE_TCSENDBREAK
int tcsendbreak(int,int);
in "openbsd-compat/bsd-misc.h" line 96
and "openbsd-compat/bsd-misc.c" line 183
I don't understand the problem here,
SCO 3.2v4 has tcsendbreak in <termios.h> { int tcsendbreak (fildes, duration) }
from config.log:
configure:6033: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized
-Dftruncate=chsize -I/usr/local/include -L/usr/local/lib conftest.c -lintl -lz
-lsocket -los -lprot -lx -ltinfo -lm >&5
undefined first referenced
symbol in file
tcsendbreak /usr/tmp/cca153591.o
ld fatal: Symbol referencing errors. No output written to conftest
Is configure missing the -lc?
the sco man page for tcsendbreak() states
cc . . . -lc
I looked at config.h of 3.6p2 and there is no TCSENDBREAK,
on -current, HAVE_TCSENDBREAK is undefined in config.h
if I define HAVE_TCSENDBREAK then the make stops at
gcc -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -I. -I.
-I/usr/local/ssl/include -Dftruncate=chsize -I/usr/local/include
-DSSHDIR=\"/etc/ssh\" -D_PATH_SSH_PROGRAM=\"/usr/local/bin/ssh\"
-D_PATH_SSH_ASKPASS_DEFAULT=\"/usr/local/libexec/ssh-askpass\"
-D_PATH_SFTP_SERVER=\"/usr/local/libexec/sftp-server\"
-D_PATH_SSH_KEY_SIGN=\"/usr/local/libexec/ssh-keysign\"
-D_PATH_SSH_PIDDIR=\"/etc/ssh\" -D_PATH_PRIVSEP_CHROOT_DIR=\"/var/empty\"
-DSSH_RAND_HELPER=\"/usr/local/libexec/ssh-rand-helper\" -DHAVE_CONFIG_H -c
ssh-keygen.c
In file included from /usr/local/include/sys/time.h:34,
from includes.h:34,
from ssh-keygen.c:14:
/usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126:
warning: `struct timeb' declared inside parameter list
/usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126:
warning: its scope is only this definition or declaration,
/usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126:
warning: which is probably not what you want.
ssh-keygen.c: In function `do_change_comment':
ssh-keygen.c:740: warning: implicit declaration of function `fdopen'
ssh-keygen.c:740: warning: assignment makes pointer from integer without a cast
ssh-keygen.c: In function `main':
ssh-keygen.c:798: `PATH_MAX' undeclared (first use this function)
ssh-keygen.c:798: (Each undeclared identifier is reported only once
ssh-keygen.c:798: for each function it appears in.)
ssh-keygen.c:826: warning: implicit declaration of function `gethostname'
ssh-keygen.c:1118: warning: assignment makes pointer from integer without a cast
ssh-keygen.c:798: warning: unused variable `out_file'
*** Error code 1
The above problem is resolved by the following:
#diff defines.h.org defines.h
52a53,55
> #ifndef PATH_MAX
> # define PATH_MAX 64
> #endif
I am not sure if the figure of 64 is safe!
I can now get the code to compile, and if I execute ./sshd -p 5000 -d -d -d
all seems well, i have not done a complete test yet, but i can login as root
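
On the question of whether a hard-coded PATH_MAX of 64 is safe: one more cautious fallback, sketched here as an assumption rather than as what was eventually committed, is to prefer the POSIX minimum from limits.h and to ask the system at run time with pathconf(). The helper name `path_limit` is hypothetical.

```c
#include <limits.h>
#include <unistd.h>

/*
 * Compile-time fallback, used only when the platform headers do not
 * define PATH_MAX. _POSIX_PATH_MAX is the minimum a POSIX system must
 * support; 64 is kept as the last resort from the diff above.
 */
#ifndef PATH_MAX
# ifdef _POSIX_PATH_MAX
#  define PATH_MAX _POSIX_PATH_MAX
# else
#  define PATH_MAX 64
# endif
#endif

/*
 * Hypothetical helper: returns the path length limit reported for dir,
 * falling back to PATH_MAX when pathconf() reports an error or no limit
 * (both are signalled by -1).
 */
long path_limit(const char *dir)
{
    long n = pathconf(dir, _PC_PATH_MAX);

    return n > 0 ? n : (long)PATH_MAX;
}
```

Whether SCO 3.2v4.2 provides _POSIX_PATH_MAX or pathconf() is not stated in this report, so the sketch would need checking on that system.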
I think the tcsendbreak is just from a redefinition of it, since it wasn't detected correctly. I dunno about the -lc thing, I thought all C programs would get linked against libc. What happens if you do "./configure --with-ldflags=-lc"? Tim is looking at the PATH_MAX thing.

Executing "./configure --with-ldflags=-lc" does not cause tcsendbreak to be detected.
I don't fully understand how the configure script works, though it seems to me that <termios.h> is missing from confdefs.h. The config.log shows:
configure:5996: checking for tcsendbreak
configure:6033: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib conftest.c -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
undefined first referenced
symbol in file
tcsendbreak /usr/tmp/cca116141.o
ld fatal: Symbol referencing errors. No output written to conftest
configure:6036: $? = 1
I edited the configure script and added "#include <termios.h>" after line 6006, and the following is revealed from config.log:
configure:5996: checking for tcsendbreak
configure:6033: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib -lc conftest.c -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
configure:6013: macro `tcsendbreak' used without args
configure:6036: $? = 1
Okay, at this point I am lost; I do not understand enough to take this further. Forgive my ignorance (not really a programmer (yet)).

>configure:6033: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib -lc conftest.c -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
>configure:6013: macro `tcsendbreak' used without args
This implies to me that tcsendbreak() is not a function call, but is a macro.
Check that header you added for tcsendbreak() as a macro.
That would seem odd to me.. but <shrug>
Created attachment 392 [details]
Test for tcsendbreak as a macro (I hope!)
Don't worry, you're doing fine. It looks like we need to test for the
possibility of tcsendbreak() being a macro. Does AC_CHECK_DECL detect macros?
If so, we can do something like the attached.
Will attach a rebuilt configure containing this and the libgen change.
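
Why a function-like macro defeats a link-only check, and how a preprocessor test can still find it, can be sketched as follows. Everything named `demo_*` or `HAVE_DEMO_SENDBREAK` is invented for illustration; the sketch does not reproduce SCO's <termios.h> or the attached patch.

```c
/*
 * Hypothetical platform: the "function" is really a function-like macro
 * that forwards to another routine, so no symbol called demo_sendbreak
 * exists for the linker to find.
 */
static int demo_ioctl_break(int fd, int duration)
{
    return (fd >= 0 && duration >= 0) ? 0 : -1;
}
#define demo_sendbreak(fd, duration) demo_ioctl_break((fd), (duration))

/*
 * A preprocessor test sees the macro even though the linker cannot.
 * This mirrors the idea of checking for a declaration or macro in
 * addition to checking for a library symbol.
 */
#ifdef demo_sendbreak
# define HAVE_DEMO_SENDBREAK 1
#else
# define HAVE_DEMO_SENDBREAK 0
#endif

/* Callers must use the macro with arguments, exactly like a function call. */
int call_demo_sendbreak(int fd)
{
    return demo_sendbreak(fd, 0);
}
```

A probe that mentions the bare name without arguments is what produces the "macro used without args" message from the older preprocessor quoted above.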
Created attachment 393 [details]
openssh-SNAP-sco.patch.gz: compressed diff against SNAP-20030906
This is a gzipped patch against openssh-SNAP-20030906.tar.gz containing all of
the changes since then plus attachments #391 & #392. (It's relatively large
but most of the diff is from rebuilding machine-generated files). Please test.
BTW, anyone know why the snapshots aren't updating?
Applied the patch set (id=393) to openssh-SNAP-20030906
configure worked, HAVE_TCSENDBREAK is defined
make fails at :
/bin:/bin:/usr/sbin:/sbin|/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin|g'
${manpage} > scp.1.out; \
fi
gawk: fatal: can't open source file "./mdoc2man.awk" for reading (No such file
or directory)
*** Error code 2
retrieved mdoc2man.awk from cvs repo
recompiled okay, can login as root
however, exiting from a shell or remote command results in
the session hanging at:
debug1: channel 0: obuf empty
debug1: channel 0: close_write
debug1: channel 0: output drain -> closed
part of the problem seems fixed, only the hanging on exit or
command completion remains.
Thanks, I've learnt a lot.
Whoops, forgot about mdoc2man.awk.

Is that debug from the server or client? Could you please attach a full server-side debug (eg "sshd -ddd -p 2022")? I can't think of anything that might be causing the sessions to hang, though.

Created attachment 394 [details]
Complete cut and paste of ./sshd -p 5000 -d -d -d
The hanging on exit of shell or remote command is only applicable when
connecting via ssh2
I've been attempting to locate the problem; so far it appears to be
in serverloop.c { rev 1.110 }
It seems to me that the connection_closed (line 783) is not being set.
from my understanding, the process_input function should set
connection_closed, how do i determine what's stopping this and why ?
I messed around a bit:
server side debug:
debug1: Received SIGCHLD.
debug2: notify_done: reading
debug2: channel 0: read<=0 rfd 10 len 0
debug2: channel 0: read failed
debug2: channel 0: close_read
debug2: channel 0: input open -> drain
debug2: channel 0: read 0 from efd 12
debug2: channel 0: closing read-efd 12
debug3: Vix --> entering process_input
debug2: channel 0: ibuf empty
debug2: channel 0: send eof
debug2: channel 0: input drain -> closed
debug3: Vix --> entering process_input
code :
process_input(fd_set * readset)
{
int len;
char buf[16384];
debug3("Vix --> entering process_input");
/* Read and buffer any input data from the client. */
if (FD_ISSET(connection_in, readset)) {
....
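
The general pattern being asked about, a read loop that sets a "closed" flag when the peer goes away, can be sketched like this. It is not sshd's process_input(); the helper `read_once` and its parameters are hypothetical, and the only assumption is the standard behaviour that read() returns 0 at end-of-file.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from serverloop.c: performs one read()
 * on fd and sets *closed to 1 when read() returns 0, i.e. when the other
 * end has closed the descriptor. Errors (-1) leave the flag untouched.
 */
ssize_t read_once(int fd, char *buf, size_t cap, int *closed)
{
    ssize_t len = read(fd, buf, cap);

    if (len == 0)
        *closed = 1;
    return len;
}
```

If the flag is never set, one thing to check is whether the descriptor is ever reported readable after the peer closes, since a helper like this only runs its read() when the select loop says there is something to read.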
Some additional info:
when using ssh1:
server side debug:
debug1: Received SIGCHLD.
debug3: Vix --> Leaving process_input
debug2: notify_done: reading
debug3: Vix --> entering process_input
debug3: Vix --> Leaving process_input
debug1: End of interactive session; stdin 0, stdout (read 295, sent 295), stderr 263 bytes.
Disconnecting: wait: No child processes
debug1: Calling cleanup 0x25078(0x0)

The fix for the original problem has been committed; the current snapshots should work without changes.
20030911
- (dtucker) [configure.ac] Bug #588, #615: Move other libgen tests to after the dirname test, to allow a broken dirname to be detected correctly. Based partially on patch supplied by alex.kiernan at thus.net. ok djm@

I don't know about the session hang problem. If I had to guess I'd say it was something stopping the pty from closing. You can see when the variable changes by using GDB and setting a "watchpoint". You'll get a break whenever something touches the variable. I've only ever done that once and it was really slow, and if you're debugging sshd you'll probably want to put UsePrivilegeSeparation=no into sshd's args.

If we can't resolve this quickly, I'm going to ask you to close this bug and open a new one, since the original problem (and the one after that!) has been solved.

I am not familiar with gdb, will probably have to install it first. Any pointers?
I'll close this bug and log another (after I learn about gdb).
Thanks

Mass change of RESOLVED bugs to CLOSED |