I have compiled OpenSSH-3.6.1p2 on SCO 3.2v4.2 and the following problem occurs: I am unable to log in as root using public key authentication when StrictModes is set to yes.
Output of debug:
    Failed none for root from 192.168.1.1 port 1199 ssh2
    debug1: userauth-request for user root service ssh-connection method publickey
    debug1: attempt 1 failures 1
    debug2: input_userauth_request: try method publickey
    debug1: test whether pkalg/pkblob are acceptable
    debug1: trying public key file //.ssh/authorized_keys
    debug3: secure_filename: checking '/.ssh'
    debug3: secure_filename: checking ''
    Authentication refused: bad ownership or modes for directory
    debug1: trying public key file //.ssh/authorized_keys2
    debug3: secure_filename: checking '/.ssh'
    debug3: secure_filename: checking ''
    Authentication refused: bad ownership or modes for directory
It seems that the final check is searching for a non-existent directory (the empty string ''). With OpenSSH 3.5p1 this problem does not exist.
OS: SCO 3.2v4.2
gcc: 2.7.2.3
After some more effort on my part, I have been able to determine that the problem is in the configure script. SCO 3.2v4.2 has a broken dirname function in libgen. The configure script from 3.5p1 detects this; however, the configure script from 3.6.1p2 does not detect that dirname is broken. I am clueless as to how to resolve this.
Created attachment 361 [details] Gzipped configure from 3.6.1p2 rebuilt with autoconf-2.53
Bug #558 also relates to a broken dirname test. There was another bug where configure did not detect something on HP-UX, which turned out to be a problem with the version of autoconf used for the 3.6.1p2 release. Can you try the attached configure and see if it behaves itself?
Tried the attached configure script, the 'new' configure script still sets HAVE_DIRNAME
Created attachment 362 [details] Gzipped configure from 3.6.1p2 minus libgen.h test.
It seems that this is due to the libgen.h test being added before the dirname test. Please try the attached configure.
    $ cvs log configure.ac
    [snip]
    revision 1.107
    date: 2003/02/24 01:47:16; author: djm; state: Exp; lines: +3 -3
    - (djm) Most of Bug #499: Cygwin compile fixes for new progressmeter
I'm wondering if there should be two tests: HAVE_DIRNAME and BROKEN_DIRNAME? The existing test is fairly complex.
Created attachment 363 [details] GZ-zipped config.log of latest configure script
The newer configure script still fails:
    checking for library containing basename... -lgen
    checking whether strsep is declared... no
    checking for dirname... yes
    checking libgen.h usability... no
    checking libgen.h presence... no
From config.log:
    configure:7269: checking for dirname
    configure:7312: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib conftest.c -lgen -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
    configure:7315: $? = 0
    configure:7318: test -s conftest
    configure:7321: $? = 0
    configure:7331: result: yes
I've downloaded the newest configure script three times; still the same.
After reading bug #558 it looks like even if we detect a broken dirname, the next problem will be a type conflict for dirname ("char *" versus "const char *" arguments). The easy way to fix that is to remove "const" from openbsd-compat/dirname.[ch]. Comments, anyone?
Created attachment 387 [details] Move libgen test after dirname test
Looked at this again. I think the reason it's not working is that libgen has already been detected before the dirname test, and that upsets the delicate logic in that test. The attachment is a patch to configure.ac; I will attach a rebuilt configure for testing.
Created attachment 388 [details] rebuilt gzipped configure To test, please download today's snapshot: ftp://ftp.ca.openbsd.org/pub/OpenBSD/OpenSSH/portable/snapshot/openssh-SNAP-20030905.tar.gz then replace configure with this attachment, then "./configure && make"
I downloaded the configure (id=388) file as well as the snapshot (20030905). ./configure works fine and config.h does not define HAVE_DIRNAME; however, make fails with the following:
    gcc -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -I. -I.. -I. -I./.. -I/usr/local/ssl/include -Dftruncate=chsize -I/usr/local/include -DHAVE_CONFIG_H -c bsd-arc4random.c
    In file included from ../includes.h:34,
                     from bsd-arc4random.c:25:
    /usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126: warning: `struct timeb' declared inside parameter list
    /usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126: warning: its scope is only this definition or declaration,
    /usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126: warning: which is probably not what you want.
    In file included from ../openbsd-compat/openbsd-compat.h:127,
                     from ../includes.h:173,
                     from bsd-arc4random.c:25:
    ../openbsd-compat/bsd-misc.h:97: parse error before `('
    *** Error code 1
    *** Error code 1
Not sure if this is related, though it appears that the original problem is solved (i.e. the broken dirname is marked as broken).
An interesting note: the configure script that was part of the snapshot (20030905) also does not define HAVE_DIRNAME, and make also fails at:
    gcc -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -I. -I.. -I. -I./.. -I/usr/local/ssl/include -Dftruncate=chsize -I/usr/local/include -DHAVE_CONFIG_H -c bsd-arc4random.c
    In file included from ../includes.h:170,
                     from bsd-arc4random.c:25:
    ../defines.h:146: #error "8 bit int type not found."
    ../defines.h:158: #error "16 bit int type not found."
    ../defines.h:167: #error "32 bit int type not found."
    ../defines.h:183: #error "8 bit int type not found."
    ../defines.h:195: #error "16 bit int type not found."
    ../defines.h:204: #error "32 bit int type not found."
    In file included from ../openbsd-compat/openbsd-compat.h:128,
                     from ../includes.h:173,
                     from bsd-arc4random.c:25:
    ../openbsd-compat/bsd-waitpid.h:42: warning: `WEXITSTATUS' redefined
    /usr/include/sys/wait.h:82: warning: this is the location of the previous definition
    ../openbsd-compat/bsd-waitpid.h:43: warning: `WTERMSIG' redefined
    /usr/include/sys/wait.h:84: warning: this is the location of the previous definition
    ../openbsd-compat/bsd-waitpid.h:45: warning: `WCOREDUMP' redefined
    /usr/include/sys/wait.h:94: warning: this is the location of the previous definition
    *** Error code 1
    *** Error code 1
Not sure if this new problem should be part of this bug, please advise.
Created attachment 390 [details] Updated configure.gz from 3.6.1p2
Let's try to cut down the variables here: there's been a bunch of changes recently in openbsd-compat/. Does the attached configure work with vanilla 3.6.1p2? It has the following lines moved below the broken dirname test:
    +AC_CHECK_FUNC(getspnam, , AC_CHECK_LIB(gen, getspnam, LIBS="$LIBS -lgen"))
    +AC_SEARCH_LIBS(nanosleep, rt posix4, AC_DEFINE(HAVE_NANOSLEEP))
    +AC_SEARCH_LIBS(basename, gen, AC_DEFINE(HAVE_BASENAME))
Also, which header file defines "struct timeb"? You can try adding a #include for that file immediately before "#include <time.h>" in includes.h.
Applied configure (id=390) to vanilla 3.6.1p2. configure is fine, HAVE_DIRNAME is not defined in config.log, and it compiled cleanly. The problem is resolved.
the timeb struct is defined in sys/timeb.h
The getspnam line was there in 3.5p1 too, so I think it's safe to leave it and the nanosleep line where they are and just move the basename line, which I'll do unless someone objects. What's at time.h line 126, and why does it break with -current but not 3.6.1p2? Is it inside an #ifdef or something?
time.h (/usr/include):
    125 #if !defined(_XOPEN_SOURCE) && !defined(_POSIX_SOURCE) && !__STDC__
    126 extern void ftime ( struct timeb * );
    127 extern char * nl_cxtime( long *, char * );
    128 extern char * nl_ascxtime( struct tm *, char * );
    129 #endif
-current breaks with:
    #ifndef HAVE_TCSENDBREAK
    int tcsendbreak(int,int);
in "openbsd-compat/bsd-misc.h" line 96 and "openbsd-compat/bsd-misc.c" line 183. I don't understand the problem here; SCO 3.2v4 has tcsendbreak in <termios.h> { int tcsendbreak (fildes, duration) }.
From config.log:
    configure:6033: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib conftest.c -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
    undefined                       first referenced
     symbol                             in file
    tcsendbreak                         /usr/tmp/cca153591.o
    ld fatal: Symbol referencing errors. No output written to conftest
Is configure missing the -lc? The SCO man page for tcsendbreak() states:
    cc . . . -lc
I looked at config.h of 3.6.1p2 and there is no TCSENDBREAK; on -current, HAVE_TCSENDBREAK is undefined in config.h. If I define HAVE_TCSENDBREAK then the make stops at:
    gcc -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -I. -I. -I/usr/local/ssl/include -Dftruncate=chsize -I/usr/local/include -DSSHDIR=\"/etc/ssh\" -D_PATH_SSH_PROGRAM=\"/usr/local/bin/ssh\" -D_PATH_SSH_ASKPASS_DEFAULT=\"/usr/local/libexec/ssh-askpass\" -D_PATH_SFTP_SERVER=\"/usr/local/libexec/sftp-server\" -D_PATH_SSH_KEY_SIGN=\"/usr/local/libexec/ssh-keysign\" -D_PATH_SSH_PIDDIR=\"/etc/ssh\" -D_PATH_PRIVSEP_CHROOT_DIR=\"/var/empty\" -DSSH_RAND_HELPER=\"/usr/local/libexec/ssh-rand-helper\" -DHAVE_CONFIG_H -c ssh-keygen.c
    In file included from /usr/local/include/sys/time.h:34,
                     from includes.h:34,
                     from ssh-keygen.c:14:
    /usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126: warning: `struct timeb' declared inside parameter list
    /usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126: warning: its scope is only this definition or declaration,
    /usr/local/lib/gcc-lib/i386-unknown-sco3.2v4.2/2.7.2.3/include/time.h:126: warning: which is probably not what you want.
    ssh-keygen.c: In function `do_change_comment':
    ssh-keygen.c:740: warning: implicit declaration of function `fdopen'
    ssh-keygen.c:740: warning: assignment makes pointer from integer without a cast
    ssh-keygen.c: In function `main':
    ssh-keygen.c:798: `PATH_MAX' undeclared (first use this function)
    ssh-keygen.c:798: (Each undeclared identifier is reported only once
    ssh-keygen.c:798: for each function it appears in.)
    ssh-keygen.c:826: warning: implicit declaration of function `gethostname'
    ssh-keygen.c:1118: warning: assignment makes pointer from integer without a cast
    ssh-keygen.c:798: warning: unused variable `out_file'
    *** Error code 1
The above problem is resolved by the following:
    # diff defines.h.org defines.h
    52a53,55
    > #ifndef PATH_MAX
    > # define PATH_MAX 64
    > #endif
I am not sure if the figure of 64 is safe! I can now get the code to compile, and if I execute ./sshd -p 5000 -d -d -d all seems well. I have not done a complete test yet, but I can log in as root.
I think the tcsendbreak is just from a redefinition of it, since it wasn't detected correctly. I dunno about the -lc thing, I thought all C programs would get linked against libc. What happens if you do "./configure --with-ldflags=-lc"? Tim is looking at the PATH_MAX thing.
Executing "./configure --with-ldflags=-lc" does not cause tcsendbreak to be detected. I don't fully understand how the configure script works, though it seems to me that <termios.h> is missing from confdefs.h. The config.log shows:
    configure:5996: checking for tcsendbreak
    configure:6033: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib conftest.c -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
    undefined                       first referenced
     symbol                             in file
    tcsendbreak                         /usr/tmp/cca116141.o
    ld fatal: Symbol referencing errors. No output written to conftest
    configure:6036: $? = 1
I edited the configure script and added "#include <termios.h>" after line 6006, and the following is revealed from config.log:
    configure:5996: checking for tcsendbreak
    configure:6033: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib -lc conftest.c -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
    configure:6013: macro `tcsendbreak' used without args
    configure:6036: $? = 1
Okay, at this point I am lost; I do not understand enough to take this further. Forgive my ignorance (not really a programmer (yet)).
> configure:6033: gcc -o conftest -g -O2 -Wall -Wpointer-arith -Wno-uninitialized -Dftruncate=chsize -I/usr/local/include -L/usr/local/lib -lc conftest.c -lintl -lz -lsocket -los -lprot -lx -ltinfo -lm >&5
> configure:6013: macro `tcsendbreak' used without args
This implies to me that tcsendbreak() is not a function call, but is a macro. Check the header you added for tcsendbreak() as a macro. That would seem odd to me, but <shrug>
Created attachment 392 [details] Test for tcsendbreak as a macro (I hope!) Don't worry, you're doing fine. It looks like we need to test for the possibility of tcsendbreak() being a macro. Does AC_CHECK_DECL detect macros? If so, we can do something like the attached. Will attach a rebuilt configure containing this and the libgen change.
Created attachment 393 [details] openssh-SNAP-sco.patch.gz: compressed diff against SNAP-20030906 This is a gzipped patch against openssh-SNAP-20030906.tar.gz containing all of the changes since then plus attachments #391 & #392. (It's relatively large but most of the diff is from rebuilding machine-generated files). Please test. BTW, anyone know why the snapshots aren't updating?
Applied the patch set (id=393) to openssh-SNAP-20030906. configure worked and HAVE_TCSENDBREAK is defined. make fails at:
    /bin:/bin:/usr/sbin:/sbin|/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin|g' ${manpage} > scp.1.out; \
    fi
    gawk: fatal: can't open source file "./mdoc2man.awk" for reading (No such file or directory)
    *** Error code 2
Retrieved mdoc2man.awk from the CVS repo and recompiled okay; I can log in as root. However, exiting from the shell or a remote command results in the session hanging at:
    debug1: channel 0: obuf empty
    debug1: channel 0: close_write
    debug1: channel 0: output drain -> closed
Part of the problem seems fixed; only the hang on exit or command completion remains. Thanks, I've learnt a lot.
Whoops, forgot about mdoc2man.awk. Is that debug from the server or client? Could you please attach a full server-side debug (eg "sshd -ddd -p 2022")? I can't think of anything that might be causing the sessions to hang, though.
Created attachment 394 [details] Complete cut and paste of ./sshd -p 5000 -d -d -d
The hang on exit of a shell or remote command only occurs when connecting via ssh2. I've been attempting to locate the problem; so far it appears to be in serverloop.c { rev 1.110 }. It seems to me that connection_closed (line 783) is not being set. From my understanding, the process_input function should set connection_closed. How do I determine what's stopping this, and why? I messed around a bit.
Server side debug:
    debug1: Received SIGCHLD.
    debug2: notify_done: reading
    debug2: channel 0: read<=0 rfd 10 len 0
    debug2: channel 0: read failed
    debug2: channel 0: close_read
    debug2: channel 0: input open -> drain
    debug2: channel 0: read 0 from efd 12
    debug2: channel 0: closing read-efd 12
    debug3: Vix --> entering process_input
    debug2: channel 0: ibuf empty
    debug2: channel 0: send eof
    debug2: channel 0: input drain -> closed
    debug3: Vix --> entering process_input
Code:
    process_input(fd_set * readset)
    {
        int len;
        char buf[16384];
        debug3("Vix --> entering process_input");
        /* Read and buffer any input data from the client. */
        if (FD_ISSET(connection_in, readset)) {
        ....
Some additional info. When using ssh1, server side debug:
    debug1: Received SIGCHLD.
    debug3: Vix --> Leaving process_input
    debug2: notify_done: reading
    debug3: Vix --> entering process_input
    debug3: Vix --> Leaving process_input
    debug1: End of interactive session; stdin 0, stdout (read 295, sent 295), stderr 263 bytes.
    Disconnecting: wait: No child processes
    debug1: Calling cleanup 0x25078(0x0)
The fix for the original problem has been committed; the current snapshots should work without changes.
    20030911
     - (dtucker) [configure.ac] Bug #588, #615: Move other libgen tests to after the dirname test, to allow a broken dirname to be detected correctly. Based partially on patch supplied by alex.kiernan at thus.net. ok djm@
I don't know about the session hang problem. If I had to guess I'd say it was something stopping the pty from closing. You can see when the variable changes by using GDB and setting a "watchpoint". You'll get a break whenever something touches the variable. I've only ever done that once and it was really slow, and if you're debugging sshd you'll probably want to put UsePrivilegeSeparation=no into sshd's args.
If we can't resolve this quickly, I'm going to ask you to close this bug and open a new one, since the original problem (and the one after that!) has been solved.
I am not familiar with gdb and will probably have to install it first. Any pointers? I'll close this bug and log another (after I learn about gdb). Thanks.
Mass change of RESOLVED bugs to CLOSED