fork process to handle fifo input

Discussion in 'Perl Misc' started by Ole, Oct 29, 2003.

  1. Ole

    Ole Guest

hi everybody,
i've got a little problem with a program i'm working on. The program's
purpose is to collect log entries generated by iptables through
syslog-ng. I have redirected the logs coming from iptables to a
named pipe (or fifo) ( "/dev/ipt_fifo" ). The program then forks
processes that collect the log entries coming through the fifo and
prepare them to be saved for further analysis.

Now i have the following code:
( the program forks 5 processes that work on the input of the named
pipe. after having collected a certain amount of information, the
processes save the data to a file. )



##########################################################################
#!/usr/bin/perl -w
use POSIX;
use strict;

my $pid;
my $childId;
my $child_pids = [ ];
our $zombies = 0;

##########################################################################

for ( my $num = 0; $num < 5; $num++ ) {
    if (( $pid = fork ( ) ) > 0 ) {       # parent process
        push ( @$child_pids, $pid );
    } elsif ( $pid == 0 ) {               # child process
        my $FIFO;
        my $i = 1;
        my $reports = [];
        my $my_pid = getpid ( ) || die "Cannot getpid ( ) : $!";
        while ( 1 ) {
            open ( $FIFO, "</dev/ipt_fifo" );
            while ( <$FIFO> ) {
                # analyze log data... ( left here ) #
                push ( @$reports, $_ );
                print $my_pid . " : " . $i . "--Size of \$reports\t:\t" .
                      scalar( @$reports ) . "\n";
                if (( $i % 1024 ) == 0 ) {
                    saveReport ( $reports, $my_pid );
                    undef ( $reports );
                    print "OK\n";
                    $i = 0;
                }
                $i++;
            }
            close ( $FIFO );
        }
    }
}

wait_zombies ( $pid );


sub wait_zombies {
    my $pid = $_[ 0 ];
    if ( $pid > 0 ) {
        my $chId;
        do {
            $chId = waitpid( -1, 0 );
            print "Caught child no $chId\n";
        } until ( $chId == -1 );
    }
}


sub saveReport {
    my $reports = $_[ 0 ];
    my $my_pid  = $_[ 1 ];
    open ( FH, ">>log_fork_childno_$my_pid.log" );
    print "saving reports [PID = $my_pid]";
    foreach ( @$reports ) {
        print ".";
        print FH $_;
    }
    close ( FH );
}
##################################################################

Now i have 2 problems:

1.) How do I trap e.g. a "kill -9" command directed to the parent
    process, in order to collect the child processes before quitting?
    Right now the child processes get init as their parent.

2.) I recognize a strange behaviour: for a while, the processes
    collect data from syslog through the fifo as they should. But
    then, it seems that syslog is generating approx. 200 entries per
    second. But i think that it has to do with process
    synchronization.

Questions:

1.) Is it possible that, if two processes try to read simultaneously
    from the named pipe or fifo, they lock themselves somehow ?

2.) How can I manage to have only one child process trying to read
    from the fifo ?

I would be very thankful if someone might help me out here.
I need this filter daemon, because it is the topic of the final
examination of my training.

Thanks in advance for your interprocess communication!

    Greetings Ole Viaud-Murat.
     
    Ole, Oct 29, 2003
    #1

  2. Ben Morrow

    Ben Morrow Guest

    (Ole) wrote:

    > hi everybody,


    > Now i have the following code: ( the program forks 5 processes that
    > work on the input of the named pipe. after having collected a
    > certain amount of information, the processes save the data to a
    > file. )


    I have to confess that I don't really understand the description
    above; however, I shall offer what help I can...

    > #!/usr/bin/perl -w
    > use POSIX;
    > use strict;


You want
    use subs qw/wait_zombies saveReport/;
here. You also want to choose one of under_scores and studlyCaps and
stick to it.

    > my $pid;
    > my $childId;
    > my $child_pids = [ ];
    > our $zombies = 0;


Why 'our'? The only reason is if you need to get at this var from
outside this file... as far as I can see, you have only one file.

    You want
    $\ = "\n";
    here: for why see below where I deal with your print()s.

    > ##########################################################################
    >
    > for ( my $num = 0; $num < 5; $num++ ) {


    This is C. Perl would be:

for my $num (0..4) {

    > if (( $pid = fork ( ) ) > 0 ) { # parent process


    Where do you check if the fork failed?

    Better:
    $pid = fork;
    defined $pid or die "can't fork: $!";
    if($pid) { # parent

    > push ( @$child_pids, $pid );
    > }elsif ( $pid == 0 ) {


    ....in which case this can just be
    } else { # kid

    > my $FIFO;
    > my $i = 1;
    > my $reports = [];
    > my $my_pid = getpid ( ) ||die "Cannot getpid ( ) :
    > $!";


    It is clearer to use 'or' rather than ||, as then you do not need the
    () for getpid.

    The current process's pid is available in $$: see perldoc perlvar.

    > while ( 1 ) {


    Where do you exit this loop?

    > open ( $FIFO, "</dev/ipt_fifo" );


    No need for the brackets.
    As Tad would say :) always, yes *always* check the return of open.

    open $FIFO, "</dev/ipt_fifo" or die "can't open fifo: $!";

    > while ( <$FIFO> ) {
    > # analyze log data... ( left here ) #
    > push( @$reports, $_ );
    > print $my_pid . " : " . $i . "--Size of \$reports\t:\t" .
    > scalar( @$reports ) . "\n";


    print "$$ : $i--Size of \$reports\t:\t" . @$reports;

    No need for \n as you set $\ above: see perldoc perlvar.

    > if (( $i % 1024 ) == 0 ) {


$i will never be greater than 1024: no need for modulus.

    if($i == 1024) {

    > saveReport ( $reports,$my_pid );
    > undef ( $reports );
    > print "OK\n";
    > $i = 0;
    > }
    > $i++;
    > }
    > close ( $FIFO );


    close can fail as well.

close $FIFO or die "closing fifo failed: $!";

    > }
    > }
    > }
    >
    > wait_zombies ( $pid );


    $pid holds the pid of the last child you forked. Why are you passing
    only this?

    > sub wait_zombies {
    > my $pid = $_[ 0 ];


    my $pid = shift;

    > if ( $pid > 0 ) {


    Is this an attempt to make up for not checking the return value of
    fork()? Otherwise it is pointless...

    > my $chId;
    > do {
    > $chId = waitpid( -1, 0 );
    > print "Caught child no $chId\n";
    > }until ( $chId == - 1 );
    > }
    > }
    >
    >
    >
    > sub saveReport {
    > my $reports = $_[ 0 ];
    > my $my_pid = $_[ 1 ];


    my $reports = shift;
    my $my_pid = shift;

    or

    my ($reports, $my_pid) = @_;

    according to taste.

    > open ( FH, ">>log_fork_childno_$my_pid.log" );


    open FH, ">>log_fork_childno_${my_pid}.log"
    or die "can't open log no. $my_pid: $!";

    > print "saving reports [PID = $my_pid]";
    > foreach ( @$reports ) {


    /Please/ try and be consistent. Do you spell it 'for' or 'foreach'?

    > print ".";
    > print FH $_;
    > }
    > close ( FH );


    .... or die "can't close log $my_pid: $!";

    > }
    > ##################################################################
    >
    > Now i have 2 problems:


> 1.) How do I trap e.g. a "kill -9" command, directed to the parent
> process, in order to collect the child processes before quitting? Now
> the child processes get init as their parent.


    You can't. That's the point of SIGKILL: it can't be trapped. If you
    want to trap other signals, look at %SIG in perlvar.
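For illustration, here is a minimal, self-contained sketch of installing a handler via %SIG (my own example, not from the thread); note again that SIGKILL never reaches such a handler:

```perl
#!/usr/bin/perl
# Minimal sketch of trapping a catchable signal via %SIG (see perlipc).
# SIGKILL (kill -9) is never delivered to a handler, so this only works
# for signals like TERM, INT or HUP.
use strict;
use warnings;

my $got_term = 0;
$SIG{TERM} = sub { $got_term = 1 };   # runs when the process receives SIGTERM

kill 'TERM', $$;                      # send ourselves a SIGTERM to demonstrate
print "caught SIGTERM, cleaning up\n" if $got_term;
```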

    > 2. ) I recognize a strange behaviour: for a while, the processes
    > collect data from syslog through the fifo as they should. But then,
> it seems that syslog is generating approx 200 entries per
    > second. But i think that it has to do with process synchronization.


    I'm afraid I don't understand what you mean... but you could be right:
    you certainly have sync problems.

    > Questions:
    >
    > 1. ) Is it possible that, if two processes try to read
    > simultaneously from the named pipe or fifo, they lock themselves
    > somehow ?


    Each byte fed into the pipe comes out exactly once, to exactly one of
    the processes reading it. Absolutely no guarantees about which. It is
    generally a bad idea to have more than one process on either end of a
    fifo.

> 2. ) How can I manage to have only one child process trying to read
    > from the fifo ?


    a. Use a lockfile to synchronise access.
    b. Have a 'reader' process that reads data from the fifo and passes it
    to the other children for processing.
    c. Don't fork()... :)
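Option (a) could be sketched roughly like this; the lock file path is an assumption for illustration, and the fifo-reading step is left as a comment:

```perl
#!/usr/bin/perl
# Sketch of option (a): children serialise access to the fifo by taking
# an exclusive advisory lock on a separate lock file first.
# The lock file path is an assumption, not from the original code.
use strict;
use warnings;
use Fcntl qw(:flock);

open my $lock, '>', '/tmp/ipt_fifo.lock' or die "can't open lock file: $!";

flock $lock, LOCK_EX or die "can't lock: $!";    # blocks until we own the lock
# ... open /dev/ipt_fifo here and read one batch of log lines ...
flock $lock, LOCK_UN or die "can't unlock: $!";  # let the next child read
```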

    > I would be very thankful if someone might help me out here. I need
    > this filter daemon, because it is the topic of the final examination
    > of my training.


    Well, you must examine your own conscience about that... I would
    suggest you need to at least read the question a bit more carefully to
    find out what is *actually* required, in particular, what these
    multiple child processes are supposed to do.

    Ben

    --
    Like all men in Babylon I have been a proconsul; like all, a slave ... During
    one lunar year, I have been declared invisible; I shrieked and was not heard,
    I stole my bread and was not decapitated.
    ~ ~ Jorge Luis Borges, 'The Babylon Lottery'
     
    Ben Morrow, Oct 29, 2003
    #2
    1. Advertising

  3. I'm replying more than once since there are a number of distinct issues here.

    (Ole) writes:
    >
    > 1.) How do I trap e.g. a "kill -9" command, directed to the parent
    > process


You can't trap "kill -9"; that's the whole point of the "-9"!

    If you want to give a process a chance to clean up after itself then
    you don't send it SIGKILL, you send it SIGINT or SIGTERM or SIGQUIT.

    This, of course, has nothing particular to do with Perl.

    You can trap trappable signals using %SIG.

> Now the child processes get init as their parent.


    Your children can poll their parent pid to detect this or you can have
    a FIFO running from the children to the parent so that they get a
    SIGPIPE when the parent dies.

    This, of course, has nothing particular to do with Perl.
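A sketch of the polling idea: the child remembers getppid() at start-up and notices when it changes, since an orphan is reparented (to init, pid 1, on classic Unix). The parent_alive helper is a name invented here for illustration:

```perl
#!/usr/bin/perl
# Sketch: a child detects its parent's death by polling getppid().
# parent_alive() is an illustrative helper, not from the original code.
use strict;
use warnings;

# true while the process that forked us is still our parent
sub parent_alive {
    my ($original_ppid) = @_;
    return getppid() == $original_ppid;
}

# a child's main loop might then look like:
#   my $ppid = getppid();
#   while ( parent_alive($ppid) ) { ... do some work ...; sleep 1; }
```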

    > 2. ) I recognize a strange behaviour: for a while, the processes
    > collect data from syslog through the fifo as they should. But then,
> it seems that syslog is generating approx 200 entries per
    > second. But i think that it has to do with process synchronization.
    >
    > Questions:
    >
    > 1. ) Is it possible that, if two processes try to read
    > simultaneously from the named pipe or fifo, they lock themselves
    > somehow ?


Unlikely. At worst they'll get garbled data, but if the writing
application is not splitting messages across multiple write()
syscalls then this shouldn't happen.

    This, of course, has nothing particular to do with Perl.

> 2. ) How can I manage to have only one child process trying to read
    > from the fifo ?


A semaphore ( e.g. a lock on a file, or even the FIFO itself if your
OS allows that ).

    This, of course, has nothing particular to do with Perl.

    --
    \\ ( )
    . _\\__[oo
    .__/ \\ /\@
    . l___\\
    # ll l\\
    ###LL LL\\
     
    Brian McCauley, Oct 29, 2003
    #3
  4. (Ole) writes:

    > #!/usr/bin/perl -w
    > use POSIX;
    > use strict;


Unless you need backward compatibility you should probably "use
warnings" rather than have a -w in the shebang. But I know you don't
need backward compatibility as you are using a feature that postdates
the warnings pragma elsewhere in your script.

    > my $pid;
    > my $childId;
    > my $child_pids = [ ];
    > our $zombies = 0;


Always declare all variables as lexically scoped in the smallest
applicable scope unless there is a positive reason to do otherwise.
This keeps your code a lot tighter. This does not only apply in Perl.

Do not think of declarations as "a way to keep strict quiet" and shove
them all up the top. Apart from introducing some bugs it also means
you find yourself declaring variables that are no longer used.

    > for ( my $num = 0; $num < 5; $num++ ) {


In Perl that is more idiomatically written as

    for my $num ( 0 .. 4 ) {

Or, since you never use $num, you may want to let for() use its
default iterator

    for ( 1 .. 5 ) {

    > if (( $pid = fork ( ) ) > 0 ) { # parent process
    > push ( @$child_pids, $pid );
    > }elsif ( $pid == 0 ) {


In Perl (unlike C) fork() does not return a negative number on
failure.

To see what it does return on failure: perldoc -f fork

    > my $FIFO;
    > my $i = 1;
    > my $reports = [];
    > my $my_pid = getpid ( ) ||die "Cannot getpid ( ) : $!";
    > while ( 1 ) {
    > open ( $FIFO, "</dev/ipt_fifo" );


You should always check open() succeeded. If you really think it's not
gonna fail, so can't be bothered with any elegant recovery code, the
very least you should do is die. BTW: now would be a good time to
have declared $FIFO.

    open ( my $FIFO, "</dev/ipt_fifo" ) or die $!;


    > while ( <$FIFO> ) {
    > # analyze log data... ( left here ) #
    > push( @$reports, $_ );
    > print $my_pid . " : " . $i . "--Size of \$reports\t:\t" .
    > scalar( @$reports ) . "\n";
    > if (( $i % 1024 ) == 0 ) {


    Since $i can never be >1024 that is more simply written as

    if ( $i == 1024 ) {

    > saveReport ( $reports,$my_pid );
    > undef ( $reports );


    Earlier you initialised $reports to [], now you reinitialize to undef.

Either will do, but it's neater to be consistent.

    > print "OK\n";
    > $i = 0;
    > }
    > $i++;
    > }


    Don't you want to call saveReport here too to save the tail?

    > close ( $FIFO );


If you hadn't declared $FIFO in too wide a scope in the first place you
wouldn't have needed the explicit close() because $FIFO would go out
of scope here anyhow.

    > }
    > }


    You have forgotten the final else block on your if .. elsif .. else to
    handle the 3 possible return states from fork().
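For what it's worth, a sketch of the complete three-way check might look like this (variable names follow the original post):

```perl
#!/usr/bin/perl
# Sketch: all three possible fork() outcomes handled explicitly:
# undef = failure, a positive pid = parent, 0 = child.
use strict;
use warnings;

my @child_pids;
my $pid = fork;
if ( !defined $pid ) {      # fork failed
    die "can't fork: $!";
}
elsif ( $pid ) {            # parent: remember the child's pid
    push @child_pids, $pid;
}
else {                      # child: do the work, then leave
    # ... read the fifo here ...
    exit 0;
}
```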

    > }
    >
    > wait_zombies ( $pid );


Why are you passing $pid here? What if the 5th fork failed?

    I think you mean:

    wait_zombies() if @$child_pids;

    Or maybe just

    wait_zombies();

    > sub wait_zombies {
    > my $pid = $_[ 0 ];
    > if ( $pid > 0 ) {
    > my $chId;
    > do {
    > $chId = waitpid( -1, 0 );
    > print "Caught child no $chId\n";
    > }until ( $chId == - 1 );
    > }
    > }


As a general rule it's wasteful to wrap a loop inside an "if"
construct that simply tests for a condition under which the loop would
exit immediately anyhow.

sub wait_zombies {
    while ( ( my $chId = waitpid( -1, 0 ) ) != -1 ) {
        print "Caught child no $chId\n";
    }
}

    > sub saveReport {
    > my $reports = $_[ 0 ];
    > my $my_pid = $_[ 1 ];


Fetching subroutine arguments is more idiomatically written as:

    my $reports = shift;
    my $my_pid = shift;

    Or:

    my ($reports,$my_pid) = @_;

    > open ( FH, ">>log_fork_childno_$my_pid.log" );


    Hey! You knew about using lexically scoped file handles above but now
    you've forgotten.

    open ( my $FH, ">>log_fork_childno_$my_pid.log" ) or die $!;

    > print "saving reports [PID = $my_pid]";
    > foreach ( @$reports ) {
    > print ".";
    > print FH $_;
    > }
    > close ( FH );


If you'd used a lexically scoped file handle you wouldn't need the
explicit close.

     
    Brian McCauley, Oct 29, 2003
    #4
  5. Ole

    Ole Guest

Ben Morrow <> wrote in message news:<bnoto5$k9a$>...

[ long quote of Ben's code review snipped ]
    > > 1.) How do I trap e.g. a "kill -9" command, directed to the parent
> > process, in order to collect the child processes before quitting? Now
> > the child processes get init as their parent.

    >
    > You can't. That's the point of SIGKILL: it can't be trapped. If you
    > want to trap other signals, look at %SIG in perlvar.
    >


    So, if i "use wait_zombies", the child processes are terminated, er ..
    waited for automatically or do i have to define a reaper function ?


    > > 2. ) I recognize a strange behaviour: for a while, the processes
    > > collect data from syslog through the fifo as they should. But then,
> > it seems that syslog is generating approx 200 entries per
    > > second. But i think that it has to do with process synchronization.

    >
    > I'm afraid I don't understand what you mean... but you could be right:
    > you certainly have sync problems.
    >
    > > Questions:
    > >
    > > 1. ) Is it possible that, if two processes try to read
    > > simultaneously from the named pipe or fifo, they lock themselves
    > > somehow ?

    >
    > Each byte fed into the pipe comes out exactly once, to exactly one of
    > the processes reading it. Absolutely no guarantees about which. It is
    > generally a bad idea to have more than one process on either end of a
    > fifo.
    >
> > 2. ) How can I manage to have only one child process trying to read
    > > from the fifo ?

    >
    > a. Use a lockfile to synchronise access.


But what if syslog wants to write to the fifo while i have the pipe
locked ?


    > b. Have a 'reader' process that reads data from the fifo and passes it
    > to the other children for processing.
    > c. Don't fork()... :)


But how can i have children without fork ?

                fifo
                 |
                 |  parent reads from
        _________|_________
       | parent ( reader ) |
       |___________________|
                /\
               /  \   parent sends data to children
              /    \
        child1 .. childn   ( children process data and save it )

    >
    > > I would be very thankful if someone might help me out here. I need
    > > this filter daemon, because it is the topic of the final examination
    > > of my training.

    >
    > Well, you must examine your own conscience about that... I would
    > suggest you need to at least read the question a bit more carefully to
    > find out what is *actually* required, in particular, what these
    > multiple child processes are supposed to do.

There is no question. Here in Germany the system lacks structured
education. You must ( or may ) choose what you want to do for the
final examination. BTW, i'm not studying computer science ( not yet ).
I'm training as "Fachinformatiker Systemintegration", some kind of
network plugger.
So i decided to implement a traffic analyzing system, running on
linux, iptables, syslog-ng and perl. Initially i thought it would
suffice to have a cron execute a perl script every 10 minutes in
order to extract logfile information and save it away to postgres.
But i saw that the logfiles were growing too fast ( ~ 0.5M / sec ).
So i decided to implement a daemon that will attach to a pipe,
filtering the logfiles "on the fly".

Thanks for MUCH inspiration to enhance the code ( or to make it
reasonable ).

Greetings Ole .

    >
    > Ben
     
    Ole, Oct 30, 2003
    #5
  6. Please trim quoted material leaving just what is relevant to provide context.

    (Ole) writes:

    > Ben Morrow <> wrote in message news:<bnoto5$k9a$>...
    > > (Ole) wrote:
    > >

    [ snip 160 lines that were not necessary to give context ]

    > >
    > > > 1.) How do I trap e.g. a "kill -9" command, directed to the parent
> > > process, in order to collect the child processes before quitting? Now
> > > the child processes get init as their parent.

    > >
    > > You can't. That's the point of SIGKILL: it can't be trapped. If you
    > > want to trap other signals, look at %SIG in perlvar.
    > >

    >
    > So, if i "use wait_zombies", the child processes are terminated, er ..
    > waited for automatically or do i have to define a reaper function ?


    In general you only have to define a reaper if you want timely destruction of
    zombies.
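A reaper along those lines might be sketched like this (the @reaped array is only there so the example is observable; a real daemon might just log):

```perl
#!/usr/bin/perl
# Sketch of a SIGCHLD reaper: zombies are collected as soon as they
# appear, instead of waiting for an explicit wait() in the main flow.
use strict;
use warnings;
use POSIX qw(WNOHANG);

our @reaped;
$SIG{CHLD} = sub {
    # WNOHANG: reap every child that has already exited, without blocking
    while ( ( my $pid = waitpid( -1, WNOHANG ) ) > 0 ) {
        push @reaped, $pid;
    }
};
```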

[ snip 19 more irrelevant quoted lines ]

> > > 2. ) How can I manage to have only one child process trying to read
    > > > from the fifo ?

    > >
    > > a. Use a lockfile to synchronise access.

    >
> But what if syslog wants to write to the fifo while i have the pipe
> locked ?


syslog neither knows nor cares what mechanisms your processes are
using to synchronise their actions.

    > > c. Don't fork()... :)

    >
> But how can i have children without fork ?


You don't. By "don't fork" Ben was saying to just have a single
process. There's no apparent reason for using a multiprocess model
here.


    > > b. Have a 'reader' process that reads data from the fifo and passes it
    > > to the other children for processing.


    >
>                 fifo
>                  |
>                  |  parent reads from
>         _________|_________
>        | parent ( reader ) |
>        |___________________|
>                 /\
>                /  \   parent sends data to children
>               /    \
>         child1 .. childn   ( children process data and save it )


    Yes that's Ben's (b).

    > > > I would be very thankful if someone might help me out here. I need
    > > > this filter daemon, because it is the topic of the final examination
    > > > of my training.

    > >
    > > Well, you must examine your own conscience about that... I would
    > > suggest you need to at least read the question a bit more carefully to
    > > find out what is *actually* required, in particular, what these
    > > multiple child processes are supposed to do.

    >
> There is no question. Here in Germany the system lacks structured
> education.
> You must ( or may ) choose what you want to do for the final
> examination. BTW, i'm not studying computer science ( not yet ). I'm
> training as "Fachinformatiker Systemintegration", some kind of
> network plugger.
    > So i decided to implement a traffic analyzing system, running on
    > linux, iptables and syslog-ng and perl. Initially i thought it would
    > suffice
    > to have a cron execute a perl script every 10 minutes in order to
> extract logfile information and save it away to postgres. But i saw
    > that the logfiles were growing too fast ( ~ 0.5M / sec ). So i
    > decided to implement a daemon that
    > will attach to a pipe, filtering the logfiles "on the fly".


    I think you are missing the point here.

When Ben said "find out what is *actually* required, in particular,
what these multiple child processes are supposed to do", he was (I
believe) saying that, as far as you've shown us, there is no reason
for using more than a single process to implement your daemon.

Is this running on an 8-CPU box? If not, your processes will all be
contending for the same CPUs and so will most likely be slower than a
single process.

Even if your box has 8 CPUs, do you really want this daemon to be
able to occupy multiple CPUs?

     
    Brian McCauley, Oct 30, 2003
    #6
  7. Ben Morrow

    Ben Morrow Guest

    [brian has already alluded to the virtues of proper snipping]

    (Ole) wrote:

    > Ben Morrow <> wrote in message
    > news:<bnoto5$k9a$>...
    > > (Ole) wrote:
    > > > 1.) How do I trap e.g. a "kill -9" command, directed to the parent
> > > process, in order to collect the child processes before quitting? Now
> > > the child processes get init as their parent.

    > >
    > > You can't. That's the point of SIGKILL: it can't be trapped. If you
    > > want to trap other signals, look at %SIG in perlvar.

    >
    > So, if i "use wait_zombies", the child processes are terminated, er ..
    > waited for automatically or do i have to define a reaper function ?


    No, no... if your parent process is killed with SIGKILL, you have
    absolutely no choice about what happens next. Your parent process will
    terminate, and the children will be picked up by init and be
    automatically reaped when they exit. (Note: this has nothing to do
    with Perl, it is simply how SIGKILL works under Unix).

If, on the other hand, you kill the parent process with SIGTERM, which
is the accepted way to ask a process to terminate, you can install a
handler in $SIG{TERM} to kill the children as well. Read perlipc and
the description of %SIG in perlvar. I'm not going to say this again :).
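A sketch of such a handler, assuming the parent keeps its children's pids in a @child_pids array as in the original script:

```perl
#!/usr/bin/perl
# Sketch: on SIGTERM the parent forwards the signal to its children,
# collects them with waitpid, and only then exits itself.
# @child_pids is assumed to hold the pids remembered after each fork.
use strict;
use warnings;

our @child_pids;
$SIG{TERM} = sub {
    kill 'TERM', @child_pids;          # ask every child to shut down
    waitpid $_, 0 for @child_pids;     # collect each one
    exit 0;                            # now the parent may leave
};
```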

> > > 2. ) How can I manage to have only one child process trying to read
    > > > from the fifo ?

    > >
    > > a. Use a lockfile to synchronise access.

    >
> But what if syslog wants to write to the fifo while i have the pipe
> locked ?


    1. Perl's locks (and Unix locks generally) are advisory. If syslog
    wishes to write to the fifo, it will ignore the lock and write.

    2. The lock is to synchronise reading, not writing. There is only one
    process writing, so there's no need to sync. My suggestion is that
    you use a separate file created solely for the purpose rather than
    attempting to lock the fifo itself, as I've no idea how well
    locking a fifo works. See http://perl.plover.com/yak/flock/.

    > > b. Have a 'reader' process that reads data from the fifo and passes it
    > > to the other children for processing.
    > > c. Don't fork()... :)

    >
    > But how have children without fork ?


    I think Brian's dealt with this.

> There is no question. Here in Germany the system lacks structured
> education.
> You must ( or may ) choose what you want to do for the final
> examination.


    In that case, I will change my recommendation to 'I would suggest you
    attempt to implement something you understand a little better'. But
    far be it from me to discourage you from learning... You need to
    understand the issues of multiprocessing before you try this, though,
    including:

    1. What benefits it can and cannot bring. You should have a clear idea
    of why you want to multiprocess, and of what the jobs of the
    different processes are, before you start.

    2. The need for synchronisation, including how file locks work, what
    they can and cannot prevent you from doing, and to avoid races.

    3. A better understanding of Unix signals, in this case particularly
    SIGKILL, SIGTERM and SIGCHLD.

    Most of these issues are dealt with to some degree in perlipc; a more
    detailed exposition should be available in any introductory book about
    programming Unix (the same concepts apply regardless of language).

    Ben

    --
    don't get my sympathy hanging out the 15th floor. you've changed the locks 3
    times, he still comes reeling though the door, and soon he'll get to you, teach
    you how to get to purest hell. you do it to yourself and that's what really
    hurts is you do it to yourself just you, you and noone else *
     
    Ben Morrow, Oct 30, 2003
    #7
  8. Ole

    Ole Guest

i have ( tried to ) take into consideration your improvement
suggestions ( or better: your hints to render my project possible ).
BTW, excuse my bad english!!
I hope you ( both ) are willing to criticize these last ( almost
final ) perl scripts. Apart from some "start-stop-daemon" issues ( i
think those are debian GNU / Linux specific ), the programs are
working fine. And, apart from what you hopefully will find and point
out, i have to render the scripts able to initialize the key for the
message queue(s) ids dynamically.

You may surely use these scripts if you like
( take that as a joke ).

I hope the examinations committee will not decapitate me.

Thanks for any further hints and for those already given.

Ole Viaud-Murat.



    I have left out the functions that store and save the data.
    1.) the process that prepares the data and sends it to 2.):



    #!/usr/bin/perl
    ##############################################################################
    # File  : ipt_input_logger.pl
    # Descr : Process that reads data coming from a fifo. The data is then
    #         analyzed and put into a data package. This package is then
    #         put into a message queue to be read by a client process that
    #         prepares the data to be saved into a DB
    ##############################################################################
    use POSIX;
    use IPC::SysV qw( IPC_NOWAIT S_IRWXU );    # S_IRWXU comes from IPC::SysV
    use IPC::Msg;
    use constant KEY => 1234;
    use warnings;
    use strict;


    # check if the process is already running, or if the pid file still
    # exists from a previous run

    if ( -f "/var/run/ipt_input_logger.pid" ) {
        die "I'm already running. Check /var/run !\n";
    }else {

    # open logfile handle

    open ( my $LOGFH, ">>/var/log/ipt_input_log" )
        or die "cannot open logfile: $!";

    # fork to become daemon

    my $pid = fork ( );
    die "fork failed: $!" unless defined $pid;    # note: undef == 0 is true,
                                                  # so test definedness first
    if ( $pid == 0 ) {
        print "starting ipt_input_logger...\n";
        # attach to the message queue to which the log data will be sent
        my $msg_queue = IPC::Msg->new( KEY, S_IRWXU )
            or die "cannot attach to message queue: $!";
        local $| = 1;
        local $SIG{ INT }  = \&clean_all;
        local $SIG{ TERM } = \&clean_all;
        my $fifo_path = "</dev/ipt_input";    # "<" is the 2-arg open mode
        my @message_buffer;
        save_pid ( $$ );    # fork() returned 0 here; $$ is the child's real pid

        eval {
            # enter loop that will attach to the fifo when possible
            while ( open ( my $fifo, $fifo_path ) or die "no fifo here" ) {
                # read what comes
                while ( <$fifo> ) {
                    # if the message queue is unavailable due to msgmax,
                    # save the entry to a buffer; otherwise send it and
                    # flush the buffer to the queue
                    # ( i assume unwisely that msgmax is the only reason
                    # for error ) -- snd() returns false on failure rather
                    # than dying, so check its return value
                    if ( $msg_queue->snd( 1, $_, IPC_NOWAIT ) ) {
                        # flush buffered entries in arrival order
                        shift ( @message_buffer )
                            while @message_buffer
                              and $msg_queue->snd( 1, $message_buffer[ 0 ],
                                                   IPC_NOWAIT );
                    }else {
                        push ( @message_buffer, $_ );
                        # if the buffer grows too large, shout panic to logfile
                        if ( @message_buffer > 1024 ) {
                            my $msg = sprintf(
                                "[%s] : PANIC! someone is stressing this machine !!!\n",
                                scalar localtime );
                            $msg .= sprintf(
                                "\@message_buffer is growing too large ( size = %d )\n",
                                scalar @message_buffer );
                            print $LOGFH $msg;
                        }
                    }
                }
            }
        };
        catch_error ( $LOGFH, $@ );
    }else {
        # parent: the child carries on as the daemon, nothing left to do
        exit(0);
    }
    }

    sub clean_all {
        my $sig = shift;    # a signal handler receives the signal name,
                            # not a filehandle
        if ( -f "/var/run/ipt_input_logger.pid" ) {
            unlink( "/var/run/ipt_input_logger.pid" );
        }
        exit( 0 );
    }

    sub catch_error {
        my $LOGFH     = shift;
        my $error_msg = shift;
        if ( $error_msg ) {
            print $LOGFH "[ " . localtime ( ) . " ]: " . $error_msg;
            exit(-1);
        }
        exit(0);
    }

    sub save_pid {
        my $pid = shift;
        open ( my $FH, ">/var/run/ipt_input_logger.pid" )
            or die "Cannot open pid file: $!";
        print $FH $pid . "\n";
        close ( $FH );
    }

    ##############################################################################
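
    A side note on the pid-file check at the top of ipt_input_logger.pl:
    a flock()-based variant never suffers from a stale pid file, because
    the lock disappears automatically when the holding process dies. A
    minimal sketch ( the /tmp path is illustrative only ):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw( :flock );
use IO::Handle;

# flock()-based single-instance check: unlike testing -f on the pid
# file, the lock vanishes when the process dies, so a stale pid file
# left over from a crash cannot block a restart.
my $pid_file = "/tmp/ipt_input_logger.pid";
open ( my $PIDFH, '+>', $pid_file ) or die "Cannot open pid file: $!";
flock ( $PIDFH, LOCK_EX | LOCK_NB ) or die "I'm already running.\n";
$PIDFH->autoflush ( 1 );
print $PIDFH "$$\n";
# keep $PIDFH open for the lifetime of the process:
# closing it would release the lock
```

    A second copy of the program hitting the same flock() call fails with
    LOCK_NB set and dies immediately, which is exactly the behaviour the
    -f test was trying to approximate.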

    2.) the process that awaits the prepared data:

    #!/usr/bin/perl
    ##############################################################################
    # File : ipt_client.pl
    # Descr : Process that reads the log data put into a message queue
    # and prepares the data so that it can be saved into the DB.
    ##############################################################################
    use POSIX;
    use IPC::SysV qw( IPC_CREAT );
    use IPC::Msg;
    use Ptrack_DB;
    use constant INTERVAL => 1;
    use constant KEY => 1234;
    use constant SAVEINTERVAL => 60;
    use warnings;
    use strict;

    my $months = {
        'Jan' => 1, 'Feb' => 2, 'Mar' => 3, 'Apr' => 4,  'May' => 5,  'Jun' => 6,
        'Jul' => 7, 'Aug' => 8, 'Sep' => 9, 'Oct' => 10, 'Nov' => 11, 'Dec' => 12,
    };

    my $pid = fork ( );
    die "fork failed: $!" unless defined $pid;    # undef == 0 is true, so
                                                  # test definedness first
    if ( $pid == 0 ) {
        my $store     = {};
        my $ptrack    = Ptrack_DB->connect ( "iptrack" );
        my $msg_queue = IPC::Msg->new( KEY, IPC_CREAT );
        my $start     = time ( );
        my $end;
        open ( my $LOGFH, ">>/var/log/ipt_client.log" )
            or die "Cannot open logfile: $!";
        local $| = 1;
        # daemonize
        chdir("/");
        umask(0);
        setsid();
        while ( $msg_queue->rcv( my $buffer, 1024 ) ) {
            $end = time ();
            if ( defined ( $buffer ) ) {
                my ( $year, $month, $day, $time, $src, $dst, $direction );
                my $tcp_header_content = {};
                if ( $buffer =~ /((^[JFMASOND]..)(\s?\d+)(\s?(..):(..):(..)))/ ) {
                    $year  = (localtime( ))[ 5 ] + 1900;
                    $month = $$months{ $2 };
                    $day   = $3;
                    $time  = $4;
                }
                if ( $buffer =~ /(FW(\s)(\w+))/ ) {
                    $direction = $3;
                }
                $buffer =~ s/((^(...) (.)+ (..):(..):(..) 2pk kernel:)|(FW (\w+):)|(DF)|(ACK)|(DYN)|(SYN)|(PSH)|(FIN)|(RST))//g;
                for my $field ( split ( " ", $buffer ) ) {
                    my ( $flag, $value ) = split( "=", $field );
                    $tcp_header_content->{ $flag } = $value;
                }
                put_into_store ( $tcp_header_content, $store, $direction );
            }
            if ( $end - $start > SAVEINTERVAL ) {
                save_data ( $store, $ptrack, $LOGFH );
                $store = { };
                $start = time ( );
                $end   = time ( );
            }
        }
    }else {
        exit(0);
    }
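
    For anyone who wants to exercise the queue handling in isolation,
    here is a minimal IPC::Msg round trip ( a sketch only; IPC_PRIVATE is
    used so it cannot collide with the hard-coded key 1234 above ):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IPC::SysV qw( IPC_PRIVATE IPC_CREAT S_IRWXU );
use IPC::Msg;

# create a private SysV message queue, send one message, read it back
my $queue = IPC::Msg->new( IPC_PRIVATE, S_IRWXU | IPC_CREAT )
    or die "msgget failed: $!";
$queue->snd( 1, "hello from $$" ) or die "snd failed: $!";
my $type = $queue->rcv( my $buffer, 1024 ) or die "rcv failed: $!";
print "type=$type msg=$buffer\n";
$queue->remove;    # SysV queues outlive the process unless removed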
     
    Ole, Nov 8, 2003
    #8
  9. Ole

    Ole Guest

    Hello,

    please criticize my Perl scripts posted in position ( 6 ).

    They concern SysV IPC and standard Perl stuff.

    Greetings Ole V.-M.
     
    Ole, Nov 12, 2003
    #9