Another Regular expression question

Discussion in 'Perl Misc' started by Dave Roberts, Jan 3, 2004.

  1. Dave Roberts

    Dave Roberts Guest

    I wrote a program to sort my MySQL log files. It works fine, but I
    get about a hundred errors such as:

    Malformed UTF-8 character (unexpected continuation byte 0x88, with no
    preceding start byte) in pattern match (m//) at ./subtest3.pl line 27,
    <DB_Dump> line 1685.

    I get an identical error for each of the two lines where I am doing
    the regex matching. The part that puzzles me is that after converting
    the files to hex, neither the source nor the destination files contain
    any of the characters mentioned in the numerous errors.

    Here's the code...

    #! /usr/bin/perl -w
    use strict;
    my (@DB_List, @DB_Dump, $line, $DB, $DB_Start, $DB_End, $Save_To_File);
    unless (open(DB_Dump, "/test/log")) {
        print ("File Problem");
    }
    unless (open(outfile, ">>/test/Sorted_DB_Dump")) {
        print ("File Problem");
    }
    @DB_Dump=<DB_Dump>; # put file into a list

    # names of three databases (already declared above, so no second "my")
    @DB_List=("use sonia_db;","use maris_db;","use psa;");

    foreach $DB (@DB_List){
        print outfile ("$DB\n"); # Stores "use DB" command to file

        foreach $line (@DB_Dump){

            if ($line =~ m/^($DB)/){ # Check if current DB name is found
                $DB_Start=1;
                $Save_To_File=1; # save data to file
            }
            else {
                $DB_Start=0;
            }

            if ($line =~ m/^use/){ # Check if new (unwanted) DB name is found
                $DB_End=1;
            }
            else{
                $DB_End=0;
            }

            # if the "use" is found and I know it isn't the current DB
            if ($DB_End==1 && $DB_Start==0){
                $Save_To_File=0; # stop writing data to file
            }

            if ($Save_To_File && !$DB_Start && !$DB_End){
                print outfile ($line);
            }
        }
    }
    ______________End of Program

    Any ideas?

    Thanks, Dave
     
    Dave Roberts, Jan 3, 2004

  2. Paul Lalli

    Paul Lalli Guest

    More than likely, the lines of the file you're opening contain 'special'
    characters. Perl interpolates the variable before passing it to the
    RegExp engine. You need to escape any of the misbehaving characters:

    if ($line =~ /^(\Q$DB\E)/) { ... }
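
    A minimal sketch of what \Q...\E does when a variable is interpolated
    into a pattern. The helper name starts_with_literal and the prefix
    "use a.b;" are made up for illustration and are not from the original
    program. The three strings in the original @DB_List contain no regex
    metacharacters, so escaping them should not change what they match; the
    made-up prefix contains a dot so the effect is visible.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): true when $line begins
# with the literal text in $prefix. \Q...\E escapes every non-word character
# of the interpolated string, so none of it is treated as regex syntax.
sub starts_with_literal {
    my ($line, $prefix) = @_;
    return $line =~ /^\Q$prefix\E/ ? 1 : 0;
}

# Illustrative prefix with a metacharacter: unescaped, "." matches any character.
my $prefix = "use a.b;";

print starts_with_literal("use a.b; -- comment\n", $prefix), "\n";  # prints 1
print starts_with_literal("use aXb;\n", $prefix), "\n";             # prints 0
print quotemeta($prefix), "\n";                                     # prints use\ a\.b\;
```

    quotemeta() is the function form of \Q...\E, which is handy for checking
    what the regex engine actually receives after interpolation.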

    Paul Lalli


    On Sat, 3 Jan 2004, Dave Roberts wrote:

    > Date: 3 Jan 2004 08:01:33 -0800
    > From: Dave Roberts <>
    > Newsgroups: comp.lang.perl.misc
    > Subject: Another Regular expression question
     
    Paul Lalli, Jan 3, 2004
