perl efficiency -- fastest grepping?

Discussion in 'Perl Misc' started by Bryan Krone, Nov 16, 2004.

  1. Bryan Krone

    Bryan Krone Guest

    I have a stream of data coming off a serial port at 19200. I am wondering
    what is the most efficient way to grep through the data in realtime? I have
    20 or so different strings I need to find. All of which are ~15 characters
    or less. Currently I'm using code that looks like this:

    forever loop
    {
    sysread the serial buffer into $newdata

    if( defined $newdata )
    {
            $inString =~ s/^.*(.{32})$/$1/o;
            $inString .= $newdata;
    }



    if( $inString =~ /.*ResetPF.*/o || $inString =~ /.*[gG][oO].*/o || $inString
    =~ /.*reset.*/o || $inString =~ /.*sysinit.*/o )
    {
            set some flag;
    }
    }
    Is there a more efficient way to grep for the strings to set some flag? This
    works pretty well but this is only 4 strings. I would like to add a lot
    more but the program slows down after 10 or more strings. Any ideas would
    be greatly appreciated.

    Thanks
    Bryan Krone, Nov 16, 2004
    #1

  2. Peter Wyzl

    Peter Wyzl Guest

    "Bryan Krone" <> wrote in message
    news:...
    > I have a stream of data coming off a serial port at 19200. I am wondering
    > what is the most efficient way to grep through the data in realtime? I have
    > 20 or so different strings I need to find. All of which are ~15 characters
    > or less. Currently I'm using code that looks like this:


    No doubt others more qualified than I will comment as well, but a couple of
    things...

    > forever loop
    > {
    > sysread the serial buffer into $newdata
    >
    > if( defined $newdata )
    > {
    > $inString =~ s/^.*(.{32})$/$1/o;


    Why are you using the 'o' switch on the regex? It only matters when a
    variable is interpolated into the pattern, and you have no variable being
    interpolated.

    > $inString .= $newdata;



    Anyway, I believe you will find substr to be significantly faster for this
    operation, simply discarding everything except the last 32 characters in a
    string.

    $inString = substr( $inString, -32) . $newdata;

    Read about that in perlfunc
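
    A minimal sketch of the substr approach, wrapped in a hypothetical helper
    named append_keep_tail. The helper name and the length guard are
    assumptions added for illustration; they are not part of the original code.
    It trims the buffer to its last 32 characters and then appends the new data.

```perl
use strict;
use warnings;

# Hypothetical helper: keep at most the last $keep characters of the
# buffer, then append the newly read data.
sub append_keep_tail {
    my ($buffer, $newdata, $keep) = @_;
    # Only trim when the buffer is actually longer than the window.
    $buffer = substr($buffer, -$keep) if length($buffer) > $keep;
    return $buffer . $newdata;
}

my $inString = 'x' x 100;
$inString = append_keep_tail($inString, 'ResetPF', 32);
print length($inString), "\n";   # 32 kept + 7 new = 39
```

    Since the search strings are about 15 characters or less, a 32-character
    tail is enough to catch a string that straddles two reads.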


    > }
    >
    >
    >
    > if( $inString =~ /.*ResetPF.*/o || $inString =~ /.*[gG][oO].*/o ||
    > $inString
    > =~ /.*reset.*/o || $inString =~ /.*sysinit.*/o )


    Ooo!! Your regexen will be VERY inefficient because of the .* causing huge
    amounts of backtracking (especially at both ends). Since you are only
    looking to match the string, you can discard both sets of .* for a BIG
    performance boost (particularly across multiple regexen). Again, you have
    the unnecessary 'o' switches, and that second regex can be written using the
    'i' switch (case insensitive).

    Yielding:

    if( $inString =~ /ResetPF/ || $inString =~ /go/i || $inString =~ /reset/ ||
    $inString =~ /sysinit/ ){

    I think you need to read up a bit more on regexes, particularly switches and
    how the regex engine works.
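
    As a quick check that dropping the .* does not change what matches, here is
    a small sketch. The sample inputs and the two helper names are made up for
    illustration. Both helpers should agree on every input, because an
    unanchored pattern already matches anywhere in the string.

```perl
use strict;
use warnings;

# Original style: pattern padded with .* on both sides.
sub matches_padded {
    my ($s) = @_;
    return ($s =~ /.*ResetPF.*/ || $s =~ /.*[gG][oO].*/) ? 1 : 0;
}

# Simplified style: bare pattern, /i for the case-insensitive one.
sub matches_plain {
    my ($s) = @_;
    return ($s =~ /ResetPF/ || $s =~ /go/i) ? 1 : 0;
}

# Made-up sample inputs.
for my $s ('xxResetPFxx', 'please GO now', 'nothing here') {
    printf "%-14s padded=%d plain=%d\n",
        $s, matches_padded($s), matches_plain($s);
}
```

    How much faster the simplified form is depends on the data and the Perl
    version, so it is worth timing both with the Benchmark module.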

    HTH
    --
    Wyzelli
    #!/usr/bin/perl -w
    use strict;
    eval reverse ';"n\rekcaH lreP rehtona tsuJ" tnirp';
    Peter Wyzl, Nov 16, 2004
    #2

  3. On Tue, 16 Nov 2004 05:57:25 -0600, Bryan Krone wrote:

    <snip>
    > if( $inString =~ /.*ResetPF.*/o || $inString =~ /.*[gG][oO].*/o ||
    > $inString =~ /.*reset.*/o || $inString =~ /.*sysinit.*/o ) {
    >         set some flag;
    > }
    > }
    > Is there a more efficient way to grep for the strings to set some flag?
    > This works pretty well but this is only 4 strings. I would like to add a
    > lot more but the program slows down after 10 or more strings. Any ideas
    > would be greatly appreciated.


    First, if you can do without the regular expressions, do so. You can use
    either 'unpack' or 'split' and place the results into an array. Then you
    can use 'grep' to find what you need.
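
    A rough sketch of the split-and-grep idea. It assumes the data arrives as
    whitespace-separated words and that the wanted strings appear as whole
    words; neither is stated in the original post. The helper name find_wanted
    is also made up.

```perl
use strict;
use warnings;

# Assumed for illustration: wanted strings are whole words.
my %wanted = map { $_ => 1 } qw(ResetPF reset sysinit);

# Split the line on whitespace and keep only the wanted words.
sub find_wanted {
    my ($line) = @_;
    my @words = split ' ', $line;
    return grep { $wanted{$_} } @words;
}

my @hits = find_wanted('reset the switch now please');
print "matched: @hits\n";   # matched: reset
```

    If the strings can appear inside longer words, this will miss them and a
    substring search is needed instead.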

    Second, I'm going to throw this out here and see what happens.

    If you can't get away from using regular expressions ... and because there
    are *specific* matches to be performed ... and with each match there might
    be a specific flag to be set (or action to be performed based upon the
    match), I'd (maybe) use a lookup table. This method may or may not be any
    better than the way you're doing it now. I haven't benchmarked it and ...
    my benchmarks would be useless against what you're trying to do.

    For example:

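    #!/usr/bin/perl

    use strict;
    use warnings;

    my $inString = 'reset the switch now please';

    # Map each pattern to the handler to run when it matches.
    # Hash keys are always strings, so each qr// object is stringified
    # here and compiled again when used in the match below.
    my %lookup = (
        qr{ResetPF} => \&do_resetpf,
        qr{go}i     => \&do_go,
        qr{reset}   => \&do_reset,
        qr{sysinit} => \&do_sysinit,
    );

    # Try every pattern; hash order is not predictable.
    while ( my ($key, $value) = each %lookup ) {
        if ( $inString =~ $key ) {
            $value->();
        }
    }

    sub do_resetpf {
        print "ResetPF matched\n";
    }

    sub do_go {
        print "GgOo matched\n";
    }

    sub do_reset {
        print "reset matched\n";
    }

    sub do_sysinit {
        print "sysinit matched\n";
    }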

    HTH

    Jim
    James Willmore, Nov 16, 2004
    #3
  4. Peter Wyzl <> wrote:


    > I think you need to read up a bit more on regexes, particularly switches and
    > how the regex engine works.



    See also:

    "How Regexes Work"

    http://perl.plover.com/Regex/


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Nov 16, 2004
    #4
  5. Matija Papec

    Matija Papec Guest


    Bryan Krone <> wrote:
    >if( $inString =~ /.*ResetPF.*/o || $inString =~ /.*[gG][oO].*/o || $inString
    >=~ /.*reset.*/o || $inString =~ /.*sysinit.*/o )
    >{
    >        set some flag;
    >}
    >}
    >Is there a more efficient way to grep for the strings to set some flag? This


    If you're checking against plain strings (ResetPF, reset..) you can speed up
    things with perldoc -f index,

    if (1+index($inString, "ResetPF") or ..) {}
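
    A slightly fuller sketch of the index approach, looping over a list of
    plain strings. The helper name contains_any and the sample input are
    assumptions for illustration.

```perl
use strict;
use warnings;

my @needles = qw(ResetPF reset sysinit);

# Returns 1 if any needle occurs in $haystack, otherwise 0.
# index() returns -1 when the substring is not found.
sub contains_any {
    my ($haystack, @list) = @_;
    for my $needle (@list) {
        return 1 if index($haystack, $needle) >= 0;
    }
    return 0;
}

print contains_any('xxsysinitxx', @needles) ? "flag set\n" : "no match\n";
```

    index is case-sensitive, so a case-insensitive string such as the "go"
    pattern would need something like index(lc $haystack, 'go') instead.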



    --
    Matija
    Matija Papec, Nov 16, 2004
    #5
  6. Uri Guttman

    Uri Guttman Guest

    >>>>> "DD" == Darren Dunham <> writes:

    DD> Peter Wyzl <> wrote:

    >> if( $inString =~ /ResetPF/ || $inString =~ /go/i || $inString =~
    >> /reset/ || $inString =~ /sysinit/ ){


    DD> And while those may be replaceable with index, if you can't do so in the
    DD> general case, moving all the matches into a single regex can be
    DD> significantly faster...

    DD> if( $inString =~ /ResetPF|(?i:go)|reset|sysinit/ )

    and alternation of lots of strings in a regex can be very slow as well.
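
    for comparison, a sketch of building the single-regex form quoted above
    from a list of strings, so it can be timed against the other approaches.
    the variable and helper names are made up. the pattern is compiled once
    with qr// and quotemeta guards against regex metacharacters in the strings.

```perl
use strict;
use warnings;

# Case-sensitive strings to look for.
my @strings = qw(ResetPF reset sysinit);

# Escape each string, join with |, and add the case-insensitive "go".
my $alternation = join '|', map { quotemeta($_) } @strings;
my $any_re = qr/$alternation|(?i:go)/;

# Returns 1 if any of the strings occurs in $s, otherwise 0.
sub has_match {
    my ($s) = @_;
    return $s =~ $any_re ? 1 : 0;
}

print has_match('time to GO'), "\n";     # 1
print has_match('nothing here'), "\n";   # 0
```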

    the OP didn't give a proper spec for the problem IMO. if the string in
    question has a token in a known place, the fastest way to check for it is
    to grab it with a simple regex and then look it up in a hash. so the
    data read from the serial line needs to be properly specified with some
    way to define where this match string is located. then extraction should
    be easy and a hash can be made of the desired strings.
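
    a sketch of that idea under an assumed format, since the thread never
    specifies one: each line is taken to start with a command word. the hash
    name and the helper line_triggers are hypothetical.

```perl
use strict;
use warnings;

# Assumed format (not given in the thread): each line starts with a
# command word, e.g. "reset now".
my %is_trigger = map { $_ => 1 } qw(ResetPF reset sysinit go);

# Grab the leading word with a simple regex, then look it up in the hash.
sub line_triggers {
    my ($line) = @_;
    my ($token) = $line =~ /^(\w+)/ or return 0;
    return $is_trigger{$token} ? 1 : 0;
}

print line_triggers('sysinit port 1'), "\n";   # 1
print line_triggers('status report'), "\n";    # 0
```

    the hash lookup is case-sensitive and its cost does not grow with the
    number of strings, but it only works if the token position is known.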

    uri
    Uri Guttman, Nov 16, 2004
    #6