Pattern Matching and skipping

Discussion in 'Perl Misc' started by mattjones@hotmail.co.uk, Sep 6, 2006.

  1. Guest

    Hi,

    I've got a small problem... I'm searching for words in a sentence (from
    a log file), then pulling the sentence and putting it in a database. My
    problem is that if the word, and thus the sentence, is not in the log file,
    the script skips that log file altogether and does not record it in
    the database.
    Is there a way of telling the program to look for these words - and if
    they are not there - just skip that pattern (leave the field blank) and
    move to the next pattern?

    <SNIP>

    while (<LOG>) {
    if (/FASTSEARCH|conflicting/) {
    my @lines = split /\n/; {
    foreach my $fast (@lines) {

    while (<LOG>) {
    if (/elapsed/) {
    my @lines = split /\n/; {
    foreach my $elapsed2 (@lines) {

    So, the script looks for FASTSEARCH and conflicting......if these are
    not present i'd like the script to leave the field blank and move to
    the next pattern (elapsed).
    At the moment my script works but a few of the log files don't contain
    FASTSEARCH or conflicting and so are not getting read into the database
    (presumably because i am just doing loop after loop and if it doesn't
    pick up either of these words it breaks the chain!)

    Thanks

    Matt (PERL Newbie!)
     
    , Sep 6, 2006
    #1

  2. Mumia W. Guest

    On 09/06/2006 09:24 AM, wrote:
    > Hi,
    >
    > I've got a small problem....im searching for words in a sentence (from
    > a log file) then pulling the sentence and putting it in a database. My
    > problem is that if the word and thus the sentence is no in the log file
    > - it is missing the log file out altogether and now recording them in
    > the database.
    > Is there a way of telling the program to look for these words - and if
    > they are not there - just skip that pattern (leave the field blank) and
    > move to the next pattern?
    >
    > <SNIP>
    >
    > while (<LOG>) {
    > if (/FASTSEARCH|conflicting/) {
    > my @lines = split /\n/; {
    > foreach my $fast (@lines) {
    >
    > while (<LOG>) {
    > if (/elapsed/) {
    > my @lines = split /\n/; {
    > foreach my $elapsed2 (@lines) {
    >
    > So, the script looks for FASTSEARCH and conflicting......if these are
    > not present i'd like the script to leave the field blank and move to
    > the next pattern (elapsed).
    > At the moment my script works but a few of the log files don't contain
    > FASTSEARCH or conflicting and so are not getting read into the database
    > (presumably because i am just doing loop after loop and if it doesn't
    > pick up either of these words it breaks the chain!)
    >
    > Thanks
    >
    > Matt (PERL Newbie!)
    >


    Use the 'else' section of the if statement to tell perl what
    to do when FASTSEARCH and conflicting are not there.

    if (condition) {
    ... code ...
    } else {
    ... code ...
    }
     
    Mumia W., Sep 6, 2006
    #2

  3. Guest


    >
    > Use the 'else' section of the if statement to tell perl what
    > to do when FASTSEARCH and conflicting are not there.
    >
    > if (condition) {
    > ... code ...
    > } else {
    > ... code ...
    > }


    ok thanks - i thought i had tried that today but ill give it another
    shot tomorrow... perhaps i put the curly brackets in the wrong way!
     
    , Sep 6, 2006
    #3
  4. Guest

    Ok, this hasn't seemed to work - either i am putting the } in the wrong
    place (and i think i have tried most things)!

    So i've tried:

    while (<LOG>) {
    if (/elapsed/) {
    my @lines = split /\n/; {
    foreach my $elapsed1 (@lines) {
    {
    while (<LOG>) {
    if (/FASTSEARCH|conflicting/) {
    } else {next } # this is where i am having trouble
    my @lines = split /\n/; {
    foreach my $fast (@lines) {

    while (<LOG>) {
    if (/elapsed/) {
    my @lines = split /\n/; {
    foreach my $elapsed2 (@lines) {

    So, i want the code to look for FASTSEARCH or conflicting and if they
    aren't there - leave it blank but continue with the while loops.
     
    , Sep 7, 2006
    #4
  5. -berlin.de Guest

    <> wrote in comp.lang.perl.misc:
    > Ok, this hasn't seemed to work - either i am putting the } in the wrong
    > place (and i think i have tried most things)!
    >
    > So i've tried:
    >
    > while (<LOG>) {
    > if (/elapsed/) {
    > my @lines = split /\n/; {


    I haven't followed this thread, but the split() above is nonsense. If
    you have read a line (with standard $/) the text contains at most one
    line feed at the end. Splitting on linefeed doesn't do more than chomp.

    Since it appears multiple times in your code, here and in your OP, I
    thought I'd mention it.

    Anno
     
    -berlin.de, Sep 7, 2006
    #5
  6. MattJ83 Guest

    > > while (<LOG>) {
    > > if (/elapsed/) {
    > > my @lines = split /\n/; {

    >
    > I haven't followed this thread, but the split() above is nonsense. If
    > you have read a line (with standard $/) the text contains at most one
    > line feed at he end. Splitting on linefeed doesn't do more than chomp.
    >
    > Since it appears multiple times in your code, here and in your OP, I
    > thought I'd mention it.
    >
    > Anno


    When i remove split from my code however, the code doesn't work?!
     
    MattJ83, Sep 7, 2006
    #6
  7. MattJ83 Guest

    Right - i thought it might be better if i just paste all of the code i
    have done:

    #!/usr/central/bin/perl
    use strict ;
    #use warnings;
    use DBI;

    my @files= </home/USERNAME/logs/*.log>;
    foreach my $file (@files) {

    open (LOG,</home/USERNAME/logs/*.log>) or die "can't open LOG: $!\n";

    while (<LOG>) {
    if (/updates table/) {
    my @lines = split /\n/; {
    foreach my $info (@lines) {

    while (<LOG>) {
    if (/elapsed/) {
    my @lines = split /\n/; {
    foreach my $elapsed1 (@lines) {

    while (<LOG>) {
    if (/conflicting|FASTSEARCH/) {
    my @lines = split /\n/; {
    foreach my $fast (@lines) {

    while (<LOG>) {
    if (/elapsed/) {
    my @lines = split /\n/; {
    foreach my $elapsed2 (@lines) {

    while (<LOG>) {
    if (/inversions/) {
    my @lines = split /\n/; {
    foreach my $inversions (@lines) {

    while (<LOG>) {
    if (/elapsed/) {
    my @lines = split /\n/; {
    foreach my $elapsed3 (@lines) {

    close(LOG) ;

    my $dbh = DBI ->connect("dbi:Oracle:SERVER", "DATABASE", "PASSWORD")
    or die "couldn't connect to database: $DBI::errstr\n";

    $dbh->do("insert into LOGS values ('$file', '$info', '$elapsed1',
    '$fast', '$elapsed2', '$inversions', '$elapsed3')")
    or die ("inserting data failure: $!\n");

    $dbh->disconnect;
    }}}}}}}}}}}}}}}}}}}}}}}}}
    exit;

    And all i want it to do is skip the line if it can't match the words
    specified!!!
    I just can't get the else or elseif statements to work...........

    Sorry for the script - it isn't too brilliant but im new and it does
    what i want it to do!!!
     
    MattJ83, Sep 7, 2006
    #7
  8. -berlin.de Guest

    MattJ83 <> wrote in comp.lang.perl.misc:
    > > > while (<LOG>) {
    > > > if (/elapsed/) {
    > > > my @lines = split /\n/; {

    > >
    > > I haven't followed this thread, but the split() above is nonsense. If
    > > you have read a line (with standard $/) the text contains at most one
    > > line feed at he end. Splitting on linefeed doesn't do more than chomp.
    > >
    > > Since it appears multiple times in your code, here and in your OP, I
    > > thought I'd mention it.
    > >
    > > Anno

    >
    > When i remove split from my code however, the code doesn't work?!


    Sure, just deleting the line will break it because the assignment to
    @lines goes away too. The split() doesn't do anything useful, however.

    With a little more context, your code is:

    while (<LOG>) {
    if (/elapsed/) {
    my @lines = split /\n/; {
    foreach my $elapsed1 (@lines) {
    # ...

    Change this to (untested):

    while ( <LOG> ) {
    chomp;
    if ( /elapsed/ ) {
    # ...

    that is, drop both the line with split() and the next one "foreach ...".
    The content of $_, which is what matters, will be the same in both
    cases. The "foreach" loop would run only one time anyway.
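
    A minimal sketch of the point above, assuming the default $/ and a
    made-up sample line (the text of the line is hypothetical): splitting
    one line on \n yields a single element, identical to the chomped line.

```perl
use strict;
use warnings;

# Hypothetical sample line, as <LOG> would return it with the default $/.
my $line = "step finished, 12s elapsed\n";

# What the original code does: split the single line on newline.
my @lines = split /\n/, $line;

# What chomp does: strip the trailing newline from a copy of the line.
my $chomped = $line;
chomp $chomped;

print scalar(@lines), "\n";                                 # prints 1
print $lines[0] eq $chomped ? "same\n" : "different\n";     # prints same
```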

    Anno
     
    -berlin.de, Sep 7, 2006
    #8
  9. David Squire Guest

    MattJ83 wrote:
    > Right - i thought it might be btter if i just paste all of the code i
    > have done:
    >
    > #!/usr/central/bin/perl
    > use strict ;
    > #use warnings;
    > use DBI;
    >
    > my @files= </home/USERNAME/logs/*.log>;
    > foreach my $file (@files) {
    >
    > open (LOG,</home/USERNAME/logs/*.log>) or die "can't open LOG: $!\n";


    Surely you want to open each file in succession here. I don't know what
    happens when you pass the result of a glob as the filename to open, but
    I would bet that what you really wanted here was something like:

    open (my $LOG, '<', $file) or die "can't open $file: $!\n";

    (Note the use of a lexical file handle and the three argument form of
    open. They're good habits to get into.)
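
    For illustration, a self-contained sketch of the three-argument open
    with a lexical filehandle. The temporary file and its contents are
    stand-ins invented for the example, not the poster's real logs.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so the example can run anywhere.
my ($tmp, $file) = tempfile(UNLINK => 1);
print {$tmp} "updates table FOO\n", "elapsed 00:01\n";
close $tmp or die "can't close $file: $!\n";

# Three-argument open: the mode '<' is separate from the filename,
# and $LOG is a lexical handle that goes away with its scope.
open(my $LOG, '<', $file) or die "can't open $file: $!\n";

my $count = 0;
while (my $line = <$LOG>) {
    $count++;
}
close $LOG;

print "$count\n";    # prints 2
```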

    >
    > while (<LOG>) {


    This gets a *single line* from the filehandle LOG and stores it
    implicitly in $_

    > if (/updates table/) {
    > my @lines = split /\n/; {


    As has already been pointed out, you only have a single line to work
    with here. You can't split into multiple lines on \n, since there is
    only one \n, at the end of the line.

    You need to change your logic to loop over the lines one by one.

    > foreach my $info (@lines) {
    >
    > while (<LOG>) {
    > if (/elapsed/) {
    > my @lines = split /\n/; {
    > foreach my $elapsed1 (@lines) {
    >
    > while (<LOG>) {
    > if (/conflicting|FASTSEARCH/) {
    > my @lines = split /\n/; {
    > foreach my $fast (@lines) {
    >
    > while (<LOG>) {
    > if (/elapsed/) {
    > my @lines = split /\n/; {
    > foreach my $elapsed2 (@lines) {
    >
    > while (<LOG>) {
    > if (/inversions/) {
    > my @lines = split /\n/; {
    > foreach my $inversions (@lines) {
    >
    > while (<LOG>) {
    > if (/elapsed/) {
    > my @lines = split /\n/; {
    > foreach my $elapsed3 (@lines) {
    >
    > close(LOG) ;


    [snip]

    Indentation would help a lot too.


    DS
     
    David Squire, Sep 7, 2006
    #9
  10. -berlin.de Guest

    MattJ83 <> wrote in comp.lang.perl.misc:
    > > > while (<LOG>) {
    > > > if (/elapsed/) {
    > > > my @lines = split /\n/; {

    > >
    > > I haven't followed this thread, but the split() above is nonsense. If
    > > you have read a line (with standard $/) the text contains at most one
    > > line feed at he end. Splitting on linefeed doesn't do more than chomp.
    > >
    > > Since it appears multiple times in your code, here and in your OP, I
    > > thought I'd mention it.
    > >
    > > Anno

    >
    > When i remove split from my code however, the code doesn't work?!


    Sure, you threw out the baby with the bath water. The assignment to
    @lines also goes away if you just delete the line.

    The point is, @lines will only ever contain one element, the current
    line minus its line feed. The same would be achieved by (untested)

    while ( <LOG> ) {
    chomp;
    if ( /elapsed/ ) {
    my @lines = ( $_);

    I seem to remember that a loop over @lines follows. Both the variable
    @lines and the loop can presumably go away too.

    Anno
     
    -berlin.de, Sep 7, 2006
    #10
  11. Tad McClellan Guest

    MattJ83 <> wrote:

    > Right - i thought it might be btter if i just paste all of the code i
    > have done:



    You should write your code so that its structure can be seen.

    Each new block should increase the indent level.


    > while (<LOG>) {
    > if (/updates table/) {
    > my @lines = split /\n/; {
    > foreach my $info (@lines) {
    >
    > while (<LOG>) {
    > if (/elapsed/) {
    > my @lines = split /\n/; {
    > foreach my $elapsed1 (@lines) {



    while (<LOG>) {
        if (/updates table/) {
            my @lines = split /\n/; { <== what is that "{" there for?
                foreach my $info (@lines) {

                    while (<LOG>) {
                        if (/elapsed/) {
                            my @lines = split /\n/; {
                                foreach my $elapsed1 (@lines) {

    Now it is easy to see that control can never get to /elapsed/
    unless /updates table/ matches first.

    Read the line once, test it many times.

    while (<LOG>) {
        if (/updates table/) {
            # do stuff
        }
        elsif (/elapsed/) {
            # do other stuff
        }
    }


    > I just can't get the else or elseif statements to work...........



    That is because you are not writing them correctly.
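
    A fuller sketch of "read the line once, test it many times", for
    illustration only. Note that Perl spells the keyword "elsif", not
    "elseif". The helper name parse_log, the hash keys and the sample log
    are hypothetical, and the sketch assumes each "elapsed" line belongs
    to the most recent section marker above it, which the thread does not
    confirm. Every field starts out as an empty string, so a log without
    FASTSEARCH or conflicting still yields a complete record. The
    database insert is left out.

```perl
use strict;
use warnings;

# Hypothetical helper: read a log once and return a hash ref of fields.
# Fields that never match stay as empty strings.
sub parse_log {
    my ($fh) = @_;
    my %field = map { $_ => '' }
        qw(info elapsed1 fast elapsed2 inversions elapsed3);
    my $elapsed_slot;    # which field the next "elapsed" line fills (assumption)

    while (my $line = <$fh>) {
        chomp $line;
        if ($line =~ /updates table/) {
            $field{info}  = $line;
            $elapsed_slot = 'elapsed1';
        }
        elsif ($line =~ /conflicting|FASTSEARCH/) {
            $field{fast}  = $line;
            $elapsed_slot = 'elapsed2';
        }
        elsif ($line =~ /inversions/) {
            $field{inversions} = $line;
            $elapsed_slot      = 'elapsed3';
        }
        elsif ($line =~ /elapsed/ && defined $elapsed_slot) {
            $field{$elapsed_slot} = $line;
            undef $elapsed_slot;
        }
    }
    return \%field;
}

# Made-up sample log with no FASTSEARCH/conflicting section.
my $sample = join "\n",
    'updates table FOO',
    'elapsed 00:01',
    'inversions 12',
    'elapsed 00:03',
    '';

# Read the sample through an in-memory filehandle.
open my $fh, '<', \$sample or die "can't open in-memory log: $!\n";
my $field = parse_log($fh);
close $fh;

printf "%-10s => '%s'\n", $_, $field->{$_} for sort keys %$field;
```

    With this sample, the fast and elapsed2 fields come back empty while
    the other four are filled, so the record can still be inserted.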


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Sep 7, 2006
    #11
  12. MattJ83 Guest


    >
    > Sure, you threw out the baby with the bath water. The assignment to
    > @lines also goes away if you just delete the line.
    >
    > The point is, @lines will only ever contain one element, the current
    > line minus its line feed. The same would be achieved by (untested)
    >
    > while ( <LOG> ) {
    > chomp;
    > if ( /elapsed/ ) {
    > my @lines = ( $_);
    >
    > I seem to remember that a loop over @lines follows. Both the variable
    > @lines and the loop can presumably go away too.
    >
    > Anno


    By this do you mean i can replace that looping repetitious code by
    using chomp; and ($_); ? Is ($_) just a short hand way of writing the
    full variable or do u actually mean do this and then declare the $
    variables somewhere else?
     
    MattJ83, Sep 7, 2006
    #12
  13. MattJ83 Guest

    #!/usr/central/bin/perl
    use strict ;
    #use warnings;
    use DBI;

    my @files= </home/mmj19903/logs/*.log>;
    foreach my $file (@files) {

    open (LOG,</home/mmj19903/logs/*.log>) or die "can't open LOG: $!\n";

    while (<LOG>) {
    chomp;
    if (/updates table/){
    my @lines = ( $_); ;
    foreach my $info (@lines) {

    while (<LOG>) {
    chomp;
    if (/elapsed/) {
    my @lines = ($_);
    foreach my $elapsed1 (@lines) {

    ............
    I presume this is what you meant by the chomp - or did u mean
    something more simple still?
    So without the foreach line - i just get a global symbol $info (etc)
    requires explicit package name....
     
    MattJ83, Sep 7, 2006
    #13
  14. David Squire Guest

    MattJ83 wrote:
    > #!/usr/central/bin/perl
    > use strict ;
    > #use warnings;
    > use DBI;
    >
    > my @files= </home/mmj19903/logs/*.log>;
    > foreach my $file (@files) {
    >
    > open (LOG,</home/mmj19903/logs/*.log>) or die "can't open LOG: $!\n";


    Read my earlier response about this line. You are not opening $file.


    DS
     
    David Squire, Sep 7, 2006
    #14
  15. MattJ83 Guest

    David Squire wrote:
    > MattJ83 wrote:
    > > #!/usr/central/bin/perl
    > > use strict ;
    > > #use warnings;
    > > use DBI;
    > >
    > > my @files= </home/mmj19903/logs/*.log>;
    > > foreach my $file (@files) {
    > >
    > > open (LOG,</home/mmj19903/logs/*.log>) or die "can't open LOG: $!\n";

    >
    > Read my earlier response about this line. You are not opening $file.
    >
    >
    > DS


    I wasn't sure what you were getting at with this line David - it works
    for me - this code so far does what i want it to.....it opens the log
    file puts the path of the log into a database field, and pulls out the
    lines that have the specific words in them and places these lines in a
    database CLOB field. This is what i want - it repeats for each log in
    the directory.....

    Im a bit reluctant to change the code when this one works ok for me...
    ?
     
    MattJ83, Sep 7, 2006
    #15
  16. David Squire Guest

    MattJ83 wrote:
    > David Squire wrote:
    >> MattJ83 wrote:
    >>> #!/usr/central/bin/perl
    >>> use strict ;
    >>> #use warnings;
    >>> use DBI;
    >>>
    >>> my @files= </home/mmj19903/logs/*.log>;
    >>> foreach my $file (@files) {
    >>>
    >>> open (LOG,</home/mmj19903/logs/*.log>) or die "can't open LOG: $!\n";

    >> Read my earlier response about this line. You are not opening $file.
    >>
    >>
    >> DS

    >
    > I wasn't sure what you were gettign at with this line David - it works
    > for me - this code so far does what i want it to.....it opens the log
    > file puts the path of the log into a database field, and pulls out the
    > lines that have the soecific words in them and places these lines ina
    > database CLOB field. This is what i want - it repeats for each log in
    > the directory.....
    >
    > Im a bit reluctant to change the code when this one works ok for me...


    It will work so long as there is only one file that matches the pattern
    in the glob </home/mmj19903/logs/*.log>. The lines above suggest that
    you want to open each of the files in @files in turn. Your code does not
    do that. You don't use the variable $file from the loop 'foreach my
    $file (@files) {...}' anywhere. What is that loop for then?

    DS
     
    David Squire, Sep 7, 2006
    #16
  17. MattJ83 Guest

    David Squire wrote:
    > MattJ83 wrote:
    > > David Squire wrote:
    > >> MattJ83 wrote:
    > >>> #!/usr/central/bin/perl
    > >>> use strict ;
    > >>> #use warnings;
    > >>> use DBI;
    > >>>
    > >>> my @files= </home/mmj19903/logs/*.log>;
    > >>> foreach my $file (@files) {
    > >>>
    > >>> open (LOG,</home/mmj19903/logs/*.log>) or die "can't open LOG: $!\n";
    > >> Read my earlier response about this line. You are not opening $file.
    > >>
    > >>
    > >> DS

    > >
    > > I wasn't sure what you were gettign at with this line David - it works
    > > for me - this code so far does what i want it to.....it opens the log
    > > file puts the path of the log into a database field, and pulls out the
    > > lines that have the soecific words in them and places these lines ina
    > > database CLOB field. This is what i want - it repeats for each log in
    > > the directory.....
    > >
    > > Im a bit reluctant to change the code when this one works ok for me...

    >
    > It will work so long as there is only one file that matches the pattern
    > in the glob </home/mmj19903/logs/*.log>. The lines above suggest that
    > you want to open each of the files in @files in turn. Your code does not
    > do that. You don't use the variable $file from the loop 'foreach my
    > $file (@files) {...}' anywhere. What is that loop for then?
    >
    > DS


    As far as my limited PERL knowledge takes me (not far) i was under the
    impression that that first few lines declares the directory then tells
    it to loop the path name of the log file in the directory to the LOGID
    column then open the log file and pull those lines (with /elapsed/ etc)
    into the corresponding fields. This then repeats for the next log file
    and then next.......it is only when the /pattern/ isn't matched that
    the log file is not being recorded into the database......

    Thanks for your info...!!
     
    MattJ83, Sep 7, 2006
    #17
  18. David Squire Guest

    MattJ83 wrote:

    > As far as my limited PERL knowledge takes me (not far) i was under the
    > impression that that first few lines declares the directory then tells
    > it to loop the path name of the log file in the directory to the LOGID
    > column then open the log file and pull those lines (with /elapsed/ etc)
    > into the corresponding fields. ...


    [snip]

    Here is an annotated version of those first few lines:

    > my @files= </home/USERNAME/logs/*.log>;


    Create an array called @files and initialise it with a list of all the
    files or directories that match the pattern "/home/USERNAME/logs/*.log",
    i.e. the same things you would see if you typed "ls
    /home/USERNAME/logs/*.log" in your shell. This is called a glob.

    > foreach my $file (@files) {


    For each element of the array @files, create an alias for that element
    called $file, and execute the code in the block that follows. You seem
    not ever to use this - though if I were you I would.

    > open (LOG,</home/USERNAME/logs/*.log>) or die "can't open LOG: $!\n";


    Open a file handle LOG whose filename is given by the expression
    "</home/USERNAME/logs/*.log>". This glob, identical to the one above, is
    used here in scalar context, so what open gets is the next element of
    the list in the glob... which means that your code actually does what you
    want - to my surprise (I did not know about this behaviour of glob in a
    scalar context)!

    I still think it would be clearer to write:

    my @filenames= </home/USERNAME/logs/*.log>;
    foreach my $filename (@filenames) {
        open my $LOG, '<', $filename or die "can't open $filename: $!\n";
        # ... using $LOG wherever you had LOG
    }

    If you don't do that, your first two lines are redundant - there is
    certainly no need to do the glob twice.


    DS
     
    David Squire, Sep 7, 2006
    #18
  19. -berlin.de Guest

    MattJ83 <> wrote in comp.lang.perl.misc:
    >
    > >
    > > Sure, you threw out the baby with the bath water. The assignment to
    > > @lines also goes away if you just delete the line.
    > >
    > > The point is, @lines will only ever contain one element, the current
    > > line minus its line feed. The same would be achieved by (untested)
    > >
    > > while ( <LOG> ) {
    > > chomp;
    > > if ( /elapsed/ ) {
    > > my @lines = ( $_);
    > >
    > > I seem to remember that a loop over @lines follows. Both the variable
    > > @lines and the loop can presumably go away too.
    > >
    > > Anno

    >
    > By this do you mean i can replace that looping repetitious code by
    > using chomp; and ($_); ? Is ($_) just a short hand way of writing the
    > full variable or do u actually mean do this and then declare the $
    > variables somewhere else?


    Please use full pronouns.

    No, I don't mean you should literally replace your code with mine.
    All I'm saying is that my code does the same thing yours does but
    without the spurious split.

    You should reduce the code much more. I think another branch of
    this thread is quite on the mark.

    Anno
     
    -berlin.de, Sep 7, 2006
    #19
  20. David Squire Guest

    David Squire wrote:

    > If you don't do that, your first two lines are redundant - there is
    > certainly no need to do the glob twice.


    Ah. Not quite, since the loop means you call the second glob the right
    number of times. Still, I think it is hard to understand - at least I
    have shown that it was for me :)
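
    A small sketch of the behaviour described above, using a temporary
    directory and invented file names (a.log, b.log, c.log) rather than
    the real log directory. In scalar context, glob hands back the next
    match on each call, so calling it once per loop iteration walks
    through the same files as the list-context glob.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory with three hypothetical log files.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(a.log b.log c.log)) {
    open my $fh, '>', "$dir/$name" or die "can't create $dir/$name: $!\n";
    close $fh;
}

# List context: all matches at once.
my @files = glob("$dir/*.log");

# Scalar context: one match per call, advancing on each iteration.
my @seen;
foreach my $file (@files) {
    my $next = glob("$dir/*.log");
    push @seen, $next;
}

print "$_\n" for @seen;
```

    The two lists come out the same here, which is why the original script
    worked, but opening $file directly says the same thing more plainly.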


    DS
     
    David Squire, Sep 7, 2006
    #20