Created attachment 2165 [details] sftp resume support (using offset) Hi, I have to copy huge files using sftp, and i thought that it would be cool to get sftp support. My diff has some arbitrary choices: 1) If user specifies -a (for resume), and file doesn't exist, it creates it and starts from offset 0. 2) If destination file is bigger than source file, it aborts. The assumption being that the user specified a wrong destination file. 3) -a to use resume and "REGET" in interactive mode. Feedback welcomed as usual.
Created attachment 2168 [details] truncate file support
Created attachment 2170 [details] updated high-water offset
ping :-) ?
We've considered doing this in the past but was worried about the wording in section 6.1 of http://www.openssh.com/txt/draft-ietf-secsh-filexfer-02.txt allowing the server to reorder the responses from the block transfers in such a way that the client got a wrong idea of the high water mark. Upon re-reading the spec, the wording is still unclear. The first paragraph seems to unequivocally ban such a reordering, but then the second paragraph seems to soften that. An implementor might read both and conclude "it's okay for me to _send_ the responses in any order, so long as the blocks hit the disk in the order they were requested". IMO this would be stretching hard against the spirit of the spec and I don't know of any implementation that would do this. Perhaps we should just treat them as broken...
Very interesting. Maybe the rfc needs to be updated to clear this point, and prevent implementors from getting the wrong idea. In any case, If someone has some doubts, he can still check openssh as it's the sane implementation. Maybe this could be added to the man page for the "REGET" section to document the pitfall. I find SFTP resume very useful, particularly for big files.
Created attachment 2198 [details] resume diff -ftruncate file to 0 when blocks are detected to be out of order.
Created attachment 2199 [details] Resume diff with copyright Added ISC-modified license to make the distribution terms clear & mention sponsoring organisation.
Comment on attachment 2199 [details] Resume diff with copyright >Index: sftp-client.c >=================================================================== >RCS file: /cvs/openssh/sftp-client.c,v >retrieving revision 1.108 >diff -u -p -r1.108 sftp-client.c >--- sftp-client.c 2 Jul 2012 12:15:39 -0000 1.108 >+++ sftp-client.c 11 Dec 2012 19:00:53 -0000 >@@ -15,6 +15,24 @@ > * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > */ > >+/* >+ * Copyright (c) 2012 Loganaden Velvindron >+ * >+ * Permission to use, copy, modify, and distribute this software for any >+ * purpose with or without fee is hereby granted, provided that the above >+ * copyright notice and this permission notice appear in all copies. >+ * >+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES >+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF >+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR >+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES >+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN >+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF >+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. >+ * >+ * Sponsored by AfriNIC. >+ */ Normally we wouldn't add a whole copyright notice for a change of a few dozen lines. I'm happy to credit AfriNIC in the commit message. >@@ -1050,11 +1069,36 @@ do_download(struct sftp_conn *conn, char > return(-1); > } > >- local_fd = open(local_path, O_WRONLY | O_CREAT | O_TRUNC, >- mode | S_IWRITE); >+ if (resume) { >+ if (stat(local_path, &st) == -1) { >+ offset = 0; >+ highwater = 0; >+ local_fd = open(local_path, O_WRONLY | O_CREAT, >+ mode | S_IWRITE); >+ } I thing it would be safer and clearer to do: local_fd = open(local_path, O_WRONLY | O_CREAT | S_IWRITE | mode | resume ? 0 : O_TRUNC) fstat(local_fd, &st) offset = highwater = st.st_size // test bigger, etc. >@@ -1139,6 +1183,12 @@ do_download(struct sftp_conn *conn, char > write_error = 1; > max_req = 0; > } >+ else if (req->offset <= highwater) >+ highwater = req->offset + len; >+ else if (req->offset > highwater) { >+ ftruncate(local_fd, 0); >+ printf("reordered blocks detected"); This will spam the user for every block transferred and break the download of a file that would have otherwise downloaded fine (by truncating it). I think it would be better to just leave highwater at 0 for this case and do a debug() call on the first run through the loop. >+ } > progress_counter += len; > xfree(data); > >@@ -1187,6 +1237,11 @@ do_download(struct sftp_conn *conn, char > /* Sanity check */ > if (TAILQ_FIRST(&requests) != NULL) > fatal("Transfer complete, but requests still in queue"); >+ /* Truncate at highest contiguous point to avoid holes on interrupt */ >+ if (read_error || write_error || interrupted) { >+ debug("truncating at %llu", highwater); >+ ftruncate(local_fd, highwater); Here you should check if highwater == 0 to detect reordered blocks and warn the user: logit("server reordered requests: %s download cannot be resumed", local_path) or something similar. >@@ -1233,6 +1288,7 @@ download_dir_internal(struct sftp_conn * > SFTP_DIRENT **dir_entries; > char *filename, *new_src, *new_dst; > mode_t mode = 0777; >+ int resume = 0; > > if (depth >= MAX_DIR_DEPTH) { > error("Maximum directory depth exceeded: %d levels", depth); >@@ -1284,7 +1340,7 @@ download_dir_internal(struct sftp_conn * > ret = -1; > } else if (S_ISREG(dir_entries[i]->a.perm) ) { > if (do_download(conn, new_src, new_dst, >- &(dir_entries[i]->a), pflag) == -1) { >+ &(dir_entries[i]->a), pflag, resume) == -1) { why use a variable here and not just 0?
Created attachment 2302 [details] resume diff with corrections
(In reply to Damien Miller from comment #11) > Comment on attachment 2199 [details] > Resume diff with copyright > > >Index: sftp-client.c > >=================================================================== > >RCS file: /cvs/openssh/sftp-client.c,v > >retrieving revision 1.108 > >diff -u -p -r1.108 sftp-client.c > >--- sftp-client.c 2 Jul 2012 12:15:39 -0000 1.108 > >+++ sftp-client.c 11 Dec 2012 19:00:53 -0000 > >@@ -15,6 +15,24 @@ > > * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > > */ > > > >+/* > >+ * Copyright (c) 2012 Loganaden Velvindron > >+ * > >+ * Permission to use, copy, modify, and distribute this software for any > >+ * purpose with or without fee is hereby granted, provided that the above > >+ * copyright notice and this permission notice appear in all copies. > >+ * > >+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES > >+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF > >+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR > >+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES > >+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN > >+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF > >+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. > >+ * > >+ * Sponsored by AfriNIC. > >+ */ > > Normally we wouldn't add a whole copyright notice for a change of a > few dozen lines. I'm happy to credit AfriNIC in the commit message. The diff I uploaded the last time was incomplete. The complete diff is a bit long. In Any case, I'm fine with it :-) > > >@@ -1050,11 +1069,36 @@ do_download(struct sftp_conn *conn, char > > return(-1); > > } > > > >- local_fd = open(local_path, O_WRONLY | O_CREAT | O_TRUNC, > >- mode | S_IWRITE); > >+ if (resume) { > >+ if (stat(local_path, &st) == -1) { > >+ offset = 0; > >+ highwater = 0; > >+ local_fd = open(local_path, O_WRONLY | O_CREAT, > >+ mode | S_IWRITE); > >+ } > > I thing it would be safer and clearer to do: > > local_fd = open(local_path, O_WRONLY | O_CREAT | S_IWRITE | mode | > resume ? 0 : O_TRUNC) > fstat(local_fd, &st) > offset = highwater = st.st_size > // test bigger, etc. Yep, agreed as well. > > > >@@ -1139,6 +1183,12 @@ do_download(struct sftp_conn *conn, char > > write_error = 1; > > max_req = 0; > > } > >+ else if (req->offset <= highwater) > >+ highwater = req->offset + len; > >+ else if (req->offset > highwater) { > >+ ftruncate(local_fd, 0); > >+ printf("reordered blocks detected"); > > This will spam the user for every block transferred and break the > download of a file that would have otherwise downloaded fine (by > truncating it). I think it would be better to just leave highwater > at 0 for this case and do a debug() call on the first run through > the loop. That makes sense. > > >+ } > > progress_counter += len; > > xfree(data); > > > >@@ -1187,6 +1237,11 @@ do_download(struct sftp_conn *conn, char > > /* Sanity check */ > > if (TAILQ_FIRST(&requests) != NULL) > > fatal("Transfer complete, but requests still in queue"); > >+ /* Truncate at highest contiguous point to avoid holes on interrupt */ > >+ if (read_error || write_error || interrupted) { > >+ debug("truncating at %llu", highwater); > >+ ftruncate(local_fd, highwater); > > Here you should check if highwater == 0 to detect reordered blocks > and warn the user: logit("server reordered requests: %s download > cannot be resumed", local_path) or something similar. Added as you requested. > > >@@ -1233,6 +1288,7 @@ download_dir_internal(struct sftp_conn * > > SFTP_DIRENT **dir_entries; > > char *filename, *new_src, *new_dst; > > mode_t mode = 0777; > >+ int resume = 0; > > > > if (depth >= MAX_DIR_DEPTH) { > > error("Maximum directory depth exceeded: %d levels", depth); > >@@ -1284,7 +1340,7 @@ download_dir_internal(struct sftp_conn * > > ret = -1; > > } else if (S_ISREG(dir_entries[i]->a.perm) ) { > > if (do_download(conn, new_src, new_dst, > >- &(dir_entries[i]->a), pflag) == -1) { > >+ &(dir_entries[i]->a), pflag, resume) == -1) { > > why use a variable here and not just 0? do_download() needs to later pass on the resume value. It can be 1 as well.
Thanks - looks good.
Created attachment 2304 [details] sftp.1 man page diff I also included a description similar to that of get. I replaced retrieve with resume. What do you guys think ? Should we elaborate more ?
Created attachment 2305 [details] sftp man page diff Updated so that it's sorted alphabetically.
Created attachment 2313 [details] Combined diff with tweaks This diff (source and manual) has a couple of tweaks: - Add "get -a" to attempt resumption - Make "reget" a synonym for "get -a" - Fix some code formatting - Allow -a and -r to coexist: make download_dir() pass the resume flag on - A couple of logic tweaks in do_download() I think this is ready to go in.
Created attachment 2314 [details] another tweak This simplifies the logic for server that reorder requests a bit more. It only displays the warning message if it actually makes a difference (i.e. the transfer did not complete successfully).
I've tested your latest diff against an anoncvs copy. It works fine for both single files and recursive downloads. However, the diff is slightly out of date, and patch(1) warned about fuzzing for certain lines. I've attached a diff against the latest version from the cvs server.
Created attachment 2316 [details] djm_latest_diff_anoncvs
committed - will be in openssh-6.3. Thanks :) > CVSROOT: /cvs > Module name: src > Changes by: djm@cvs.openbsd.org 2013/07/24 18:56:52 > > Modified files: > usr.bin/ssh : sftp-client.c sftp-client.h sftp.1 sftp.c > > Log message: > sftp support for resuming partial downloads; patch mostly by Loganaden > Velvindron/AfriNIC with some tweaks by me; feedback and ok dtucker@ > "Just be careful" deraadt@
Close all resolved bugs after 7.3p1 release