Bug 1831 - Repeatable crash of softflowd on high PPS collector?
Summary: Repeatable crash of softflowd on high PPS collector?
Status: CLOSED INVALID
Alias: None
Product: softflowd
Classification: Unclassified
Component: softflowd
Version: -current
Hardware: amd64 Linux
Importance: P2 normal
Assignee: Damien Miller
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-11-03 05:34 AEDT by Peter Wood
Modified: 2022-02-25 13:55 AEDT
0 users

See Also:


Description Peter Wood 2010-11-03 05:34:01 AEDT
Good Evening,

First of all, thanks for your efforts. We've been using softflowd here at Lancaster University for some time and we love it. It has previously been running happily on a FreeBSD 7.X machine.

We've obtained a monitoring server on which we're setting up our installation again. Due to hardware and NIC limitations we now have to use Ubuntu Linux, in this case 10.04 on AMD64. The issue occurs with softflowd 0.9.8 both from packages and built from source. This server is currently receiving around 40kpps, full payload.

I can run up softflowd and after a short period (fairly random) the following happens:

Nov  2 17:49:09 packet softflowd[2533]: softflowd v0.9.8 starting data collection
Nov  2 17:49:09 packet softflowd[2533]: Exporting flows to [127.0.0.1]:12001
Nov  2 17:49:18 packet softflowd[2533]: Shutting down after pcap EOF
Nov  2 17:49:18 packet softflowd[2533]: Shutting down on user request

I've traced this through the softflowd code, and it appears to be softflowd.c:1870 at "fault":

                        } else if (r == 0) {
                                logit(LOG_NOTICE, "Shutting down after pcap EOF");
                                graceful_shutdown_request = 1;
                                break;
                        }

r is the return value from pcap_dispatch. According to the pcap_dispatch man page, during live capture a return of 0 can simply mean that no packets were available for the pcap consumer to read. Commenting out this section results in a completely usable version of softflowd, which we are currently running.
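As a rough illustration of the distinction described above, the following sketch treats a return of 0 as end-of-input only when reading from a capture file, and as "nothing to read yet" during live capture. It is not taken from softflowd: the enum, the function name classify_dispatch_result and the reading_file flag are hypothetical, and the numeric values -1 and -2 are written out on the assumption that they match libpcap's PCAP_ERROR and PCAP_ERROR_BREAK so that the snippet compiles without pcap.h.

```c
/* Hypothetical helper, not part of softflowd: decide what a capture loop
 * should do with the value returned by pcap_dispatch(). */
enum dispatch_action {
    DISPATCH_CONTINUE,  /* keep looping */
    DISPATCH_SHUTDOWN,  /* input exhausted or loop deliberately broken */
    DISPATCH_ERROR      /* pcap reported an error */
};

/* r:            return value of pcap_dispatch()
 * reading_file: nonzero when packets come from a savefile rather than
 *               a live interface (assumed to be tracked by the caller) */
enum dispatch_action
classify_dispatch_result(int r, int reading_file)
{
    if (r == -1)                /* assumed equal to PCAP_ERROR */
        return DISPATCH_ERROR;
    if (r == -2)                /* assumed equal to PCAP_ERROR_BREAK */
        return DISPATCH_SHUTDOWN;
    if (r == 0 && reading_file) /* no more packets in the savefile */
        return DISPATCH_SHUTDOWN;
    /* r > 0: packets were processed.
     * r == 0 on a live capture: no packets were read this time,
     * which is not EOF, so keep going. */
    return DISPATCH_CONTINUE;
}
```

With a check along these lines, the "Shutting down after pcap EOF" branch would only be taken when softflowd is reading from a file, which matches the behaviour we see after commenting the block out for live capture.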

I've seen comments around the code base suggesting that there are issues with timeouts. Perhaps for some reason execution is reaching this branch when there is simply no data to process? I apologise that there's no patch here to fix this. I'll look at what I can do, but right now I have to complete the rest of the setup.

Kind regards,

Peter.
Comment 1 Damien Miller 2019-01-23 20:06:01 AEDT
softflowd is no longer in this bugtracker
Comment 2 Damien Miller 2022-02-25 13:55:23 AEDT
closing bugs resolved before the openssh-8.9 release