Bug 897 - scp doesn't clean up forked children when processing multiple files
Summary: scp doesn't clean up forked children when processing multiple files
Status: CLOSED WORKSFORME
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: scp
Version: 3.8p1
Hardware: All
OS: All
Importance: P2 normal
Assignee: OpenSSH Bugzilla mailing list
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-07-20 23:28 AEST by Stephen Riehm
Modified: 2006-10-07 11:36 AEST
CC List: 0 users

See Also:


Attachments
patch to clean up forked processes before starting new ones (2.47 KB, patch)
2004-07-20 23:33 AEST, Stephen Riehm
problem only affects tolocal(). Previous patch broke on toremote copies (850 bytes, patch)
2004-07-21 00:19 AEST, Stephen Riehm
3rd attempt at a bug-free patch :-( (2.30 KB, patch)
2004-07-21 01:47 AEST, Stephen Riehm
...and now for something completely diff. The same again for -u (1.82 KB, patch)
2004-07-22 16:35 AEST, Stephen Riehm

Description Stephen Riehm 2004-07-20 23:28:59 AEST
scp forks one ssh process for each file to be copied, but doesn't wait for the processes until all files 
have been processed.

Problem: if you are copying large numbers of files, scp collects one zombie per file copied. If the 
number of files to be copied in one scp command exceeds the number of processes a user may have, 
the user gets the message: "fork: resource temporarily unavailable"

See the attached patch (very simple - it just moves the waitpid call into the respective for loops). My empirical tests worked 
fine (Mac OS X 10.3.4).
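
[Editorial illustration] A minimal, self-contained C sketch of the idea behind the patch - not the actual scp/tolocal() source; the per-file loop and the sleeping child standing in for ssh are illustrative assumptions. The point is simply that each child is reaped with waitpid() inside the loop rather than after all files have been processed:

/*
 * Sketch only - not the real scp code.  The child just sleeps briefly
 * to stand in for the ssh process that scp forks for each file.
 */
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
	const char *files[] = { "file1", "file2", "file3" };
	int i, status;
	pid_t pid;

	for (i = 0; i < 3; i++) {
		pid = fork();
		if (pid < 0) {
			perror("fork");
			exit(1);
		}
		if (pid == 0) {
			/* child: stand-in for "ssh ... <file>" */
			sleep(1);
			_exit(0);
		}
		/*
		 * Parent: reap this child before starting the next copy.
		 * Waiting here, instead of once after the loop, is the
		 * essence of the patch - no zombies ever accumulate.
		 */
		if (waitpid(pid, &status, 0) < 0)
			perror("waitpid");
		else
			printf("%s: child %ld exited with status %d\n",
			    files[i], (long)pid, WEXITSTATUS(status));
	}
	return 0;
}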
Comment 1 Stephen Riehm 2004-07-20 23:33:22 AEST
Created attachment 686 [details]
patch to clean up forked processes before starting new ones
Comment 2 Stephen Riehm 2004-07-21 00:19:13 AEST
Created attachment 687 [details]
problem only affects tolocal(). Previous patch broke on toremote copies

The previous patch changed both the tolocal and toremote routines. toremote
doesn't suffer the same problem (because there's only one target?) and the
previous patch actually broke remote copies.
This is the minimal patch required to clean up child processes when copying to
the local host - all other operations are unaffected.
Comment 3 Stephen Riehm 2004-07-21 01:47:17 AEST
Created attachment 688 [details]
3rd attempt at a bug-free patch :-(

After testing with production scripts, I found another slight bug. I think I've got it
now.
Comment 4 Ben Lindstrom 2004-07-21 10:18:27 AEST
I don't see the behavior you claim on OS/X.  Nor have I seen this behavior on any of the platforms I have 
around me.
Comment 5 Damien Miller 2004-07-21 11:20:32 AEST
Could you please redo the patch as a unified diff ("diff -u")? Context diffs are
unreadable :)
Comment 6 Stephen Riehm 2004-07-22 16:35:32 AEST
Created attachment 695 [details]
...and now for something completely diff. The same again for -u
Comment 7 Stephen Riehm 2004-07-22 16:55:08 AEST
Hi Ben,
you will only see the behaviour I mention if you copy multiple files from a remote directory to a local 
one. You should also make sure the files are large enough to keep scp busy for a while; as soon as scp 
has finished, the OS cleans up and there's no record that anything was amiss.

Try something like this:

scp user@host:largefile1 user@host:largefile2 user@host:largefile3 user@host:largefile4 user@host:thebiggestfileyouvegot ~/tmp

then watch your processes with ps -u from another terminal - by the time scp is copying 
thebiggestfileyouvegot you will see 4 "(scp)" processes. These are terminated child processes (they 
finished their act properly and have left the building) which are waiting for the parent (scp) to pick them 
up at the backstage door (and check their exit code etc etc). These things are called zombies and are 
nothing more than an entry in the process table. The problem for me was that OS X (Panther) only 
allows 100 processes per user by default - copy about 80 files in a single scp command and you'll get 
the error: "fork: resource temporarily unavailable".
I hit this because I'm syncing an image library of around 150,000 images...
The patch waits for each ssh child before starting the next one, thus preventing the accumulation of 
zombies.
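
[Editorial illustration] To see the zombie accumulation described above without scp at all, the following stand-alone C demo (an illustrative assumption, not part of the OpenSSH sources) forks a batch of children and deliberately never waits for them; while it sleeps, `ps -u $USER` in another terminal shows one defunct entry per child:

/*
 * Demo only: fork children that exit immediately and do not wait()
 * for them.  Each finished child remains in the process table as a
 * zombie until the parent reaps it (or exits), which is exactly the
 * growth scp showed when copying many files.
 */
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
	int i;

	for (i = 0; i < 10; i++) {
		pid_t pid = fork();
		if (pid < 0) {
			perror("fork");
			exit(1);
		}
		if (pid == 0)
			_exit(0);	/* child finishes immediately */
		/* parent deliberately does NOT wait() here */
	}
	printf("10 children forked and already exited; check ps now\n");
	sleep(30);	/* window in which to observe the <defunct> entries */

	/* reap everything before exiting */
	while (wait(NULL) > 0)
		;
	return 0;
}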

If you still don't see this behaviour, I'd be most interested in your exact environment. This handling of 
child processes is perfectly normal unix behaviour.
Comment 8 Stephen Riehm 2004-08-03 20:02:29 AEST
um... hello? The first responses were extremely quick. Now that I've uploaded the patch the way you 
like it and provided a detailed explanation of my motives, everything has gone veeewwwyy quiet.

Did I step on someone's toes?
Comment 9 Damien Miller 2005-04-21 16:00:23 AEST
Does 4.0p1 fix your problem? There were some changes in this area (see bug #950).
Comment 10 Stephen Riehm 2005-04-21 18:31:06 AEST
(In reply to comment #9)
> Does 4.0p1 fix your problem? There were some changes in this area (see bug #950).

Thanks for the notification! I'll download it, but it will be a while before I can really test it. The systems 
here are currently using my patched version so the bug has gone away for us now. I'll be upgrading to 
OS-X 10.4 soon and I'll have a look then.

Cheers,

Steve
Comment 11 Damien Miller 2005-06-21 13:08:16 AEST
Have you had a chance to retest yet? I'd like to close this bug if possible.
Comment 12 Damien Miller 2006-03-13 15:46:34 AEDT
9 months and no reply = closed bug
Comment 13 Damien Miller 2006-03-13 15:48:03 AEDT
9 months and no reply = closed bug
Comment 14 Darren Tucker 2006-10-07 11:36:34 AEST
Change all RESOLVED bugs to CLOSED with the exception of the ones fixed post-4.4.