check columns /tabs between two patterns

Discussion in 'Perl Misc' started by Marek, Sep 14, 2008.

  1. Marek

    Marek Guest

    Hello all!


    I have a tab-separated text file and want to check whether the
    contents are in the right fields. I have put together a small
    example here. The question is how to check for the right number of
    tabs between "pattern" and "number" ... My example only checks one
    kind of wrong input. To be clear, take this example:

    (pattern3)\t{4,}(number3)

    This checks for 4 or more tabs between pattern3 and number3, which is
    wrong. Only 3 tabs are right! So I would also have to check for 2 tabs
    or 1 tab, which would be wrong too ...


    Hope this was clear


    best greetings marek


    #! /usr/local/bin/perl

    use warnings;
    use strict;

    while (<DATA>) {

        s/\s+#.+//;
        next if /^\s*$/;

        if (/(pattern1)\t{2,}(number1)\s*/i) {
            print
                "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
        }
        elsif (/(pattern2)\t{3,}(number2)\s*/i) {
            print
                "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
        }
        elsif (/(pattern3)\t{4,}(number3)\s*/i) {
            print
                "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
        }
        else {
            print "\nno match!\n\n";
        }
    }

    __DATA__

    pattern1 number1
    pattern2 number2
    pattern3 number3
    pattern1 number1 # wrong number of tabs (2tabs)
    pattern2 number2 # wrong number of tabs (3tabs)
    pattern3 number3 # wrong number of tabs (4tabs)
    pattern2 number2 # wrong number of tabs (1tab)
    pattern3 number3 # wrong number of tabs (2tabs)
     
    Marek, Sep 14, 2008
    #1

  2. Dr.Ruud

    Dr.Ruud Guest

    Marek schreef:

    > I have a tab separated text file. I want to check, whether the
    > contents are in the right fields.


    Maybe you can use this approach:

    Read a line, split on /\t/ and store in an array, then do tests.


    while ( <> ) {
        /^\s*(?:#|$)/ and next; # comment- and blank lines

        my @data = split /\t/;
        ... # tests
    }
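
As a rough sketch of what such a test could look like: splitting on /\t/ turns every extra tab into an empty field, so the position of the first non-empty field after the pattern tells you which column the number landed in. The helper name number_column is made up for this illustration, and the input lines use explicit "\t" escapes so the tab counts are visible.

```perl
use strict;
use warnings;

# Hypothetical helper: return the index of the first non-empty field after
# field 0, i.e. the column in which the "number" actually sits.
sub number_column {
    my ($line) = @_;
    chomp $line;
    my @data = split /\t/, $line, -1;    # -1 keeps trailing empty fields
    for my $i ( 1 .. $#data ) {
        return $i if length $data[$i];
    }
    return;    # no second non-empty field found
}

for my $line ( "pattern3\t\t\tnumber3\n", "pattern3\t\tnumber3\n" ) {
    my $col = number_column($line);
    print defined $col ? "number is in column $col\n" : "no number field\n";
}
```

With three tabs the number is reported in column 3, with two tabs in column 2, so comparing the returned index with the expected one is enough to flag a misplaced field.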


    > "Wrong number of tabs between \"$1\" and the number \"$2\" in the
    > line: \n\t$_\n\n";
    > [...]
    > "Wrong number of tabs between \"$1\" and the number \"$2\" in the
    > line: \n\t$_\n\n";
    > [...]
    > "Wrong number of tabs between \"$1\" and the number \"$2\" in the
    > line: \n\t$_\n\n";


    See also `perldoc -f sprintf`; you want to create a single format string
    for that.
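
A minimal sketch of the single-format-string idea, assuming the message text from the original post; the names $WRONG_TABS_FMT and wrong_tabs_message are invented here for illustration.

```perl
use strict;
use warnings;

# One shared format string instead of three copies of the same message.
my $WRONG_TABS_FMT =
    qq{Wrong number of tabs between "%s" and the number "%s" in the line:\n\t%s\n\n};

# Hypothetical helper: fill the format with the pattern, number and line.
sub wrong_tabs_message {
    my ( $pattern, $number, $line ) = @_;
    return sprintf $WRONG_TABS_FMT, $pattern, $number, $line;
}

my $msg = wrong_tabs_message( 'pattern3', 'number3', "pattern3\t\t\t\tnumber3" );
print $msg;
```

Each branch of the if/elsif chain can then call the helper with its own captures, and a change to the wording only has to be made once.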

    --
    Affijn, Ruud

    "Gewoon is een tijger."
     
    Dr.Ruud, Sep 14, 2008
    #2

  3. Marek <> wrote:

    > Question is, how to check for right number of tabs between
    > "pattern" and "number" ... In my example, there is checked only one
    > wrong example. To be clear I take this example:
    >
    > (pattern3)\t{4,}(number3)
    >
    > this is checking 4 or more tabs between pattern3 and number3, which is
    > wrong. Only 3 tabs are right! So I have to check also for 2 tabs or 1
    > tab, which would be wrong too ...



    > elsif (/(pattern3)\t{4,}(number3)\s*/i) {



    elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) {


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Sep 14, 2008
    #3
  4. On Sun, 14 Sep 2008 02:20:45 -0700, Marek wrote:

    > Hello all!
    >
    >
    > I have a tab separated text file. I want to check, whether the contents
    > are in the right fields. I have constructed here a little example.
    > Question is, how to check for right number of tabs between "pattern" and
    > "number" ... In my example, there is checked only one wrong example. To
    > be clear I take this example:
    >
    > (pattern3)\t{4,}(number3)
    >
    > this is checking 4 or more tabs between pattern3 and number3, which is
    > wrong. Only 3 tabs are right! So I have to check also for 2 tabs or 1
    > tab, which would be wrong too ...


    Something like (untested) /(pattern3)(?:\t{0,2}|\t{4,})(number3)/ maybe?

    But I would do something like:

    elsif (/(pattern2)(\t+)(number2)\s*/i) {
        print
            "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n"
            if length($2) != 2;
    }

    And even that can be optimized further, but this should get you going.
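
One possible reading of "optimized further", offered only as a sketch: keep the expected tab count per pattern in a hash and run a single regex, instead of one elsif per pattern. The hash %expected_tabs and the helper check_line are assumptions made for this illustration; the counts (1, 2, 3) come from the example in the original post.

```perl
use strict;
use warnings;

# Expected tab count per pattern, taken from the thread's example.
my %expected_tabs = ( pattern1 => 1, pattern2 => 2, pattern3 => 3 );

# Hypothetical helper: returns a message for a bad line, an empty string
# for a good line, and undef when the line does not look like a record.
sub check_line {
    my ($line) = @_;
    return unless $line =~ /(pattern\d+)(\t+)(number\d+)/i;
    my ( $pattern, $tabs, $number ) = ( lc $1, $2, $3 );
    my $want = $expected_tabs{$pattern};
    return unless defined $want;
    return '' if length($tabs) == $want;
    return sprintf
        qq{Wrong number of tabs between "%s" and "%s": got %d, expected %d\n},
        $pattern, $number, length($tabs), $want;
}

my $report = check_line("pattern2\t\t\tnumber2");
print defined $report ? $report : "no match\n";
```

For the three-tab pattern2 line this prints: Wrong number of tabs between "pattern2" and "number2": got 3, expected 2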

    HTH,
    M4
     
    Martijn Lievaart, Sep 14, 2008
    #4
  5. Marek

    Marek Guest

    Thank you all for your answers! I will stick to

    elsif (/(pattern2)(\t+)(number2)\s*/i) {
        print
            "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n"
            if length($2) != 2;
    }

    Martijn's suggestion! Instead of searching for every possible wrong
    number of tabs, it is easier to test for "not the right number of
    tabs" ...

    I did not know it was possible to check this with "length". Fantastic :)


    Thank you all again


    marek
     
    Marek, Sep 15, 2008
    #5
  6. On Sun, 14 Sep 2008 22:28:10 -0700, Marek wrote:

    > Thank you all for your answers! I will stick to
    >
    > elsif (/(pattern2)(\t+)(number2)\s*/i) {
    > print
    > "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:
    > \n\t$_\n\n"
    > if length($2) != 2;
    >
    > }
    >
    > Martijns suggestion! Instead of searching all positive possibilities of
    > wrong numbers of \tabs, it is easier to say if not the right number of
    > tabs ...
    >
    > Did not know this possibility to check with "length". Fantastic :)


    I like Tad's suggestion better. I'm a bit ashamed I didn't think of that
    myself.

    M4
     
    Martijn Lievaart, Sep 15, 2008
    #6
  7. Marek

    Marek Guest

    But Tad's suggestion I don't understand!

    If something is *not* matching

    elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) { ... }

    it will always match. Consider first line of __DATA__

    pattern1 number1

    this would match already, because it *does not* match, because of > ! <

    But I am a simple beginner; probably I did not understand something
    here!



    greetings marek
     
    Marek, Sep 15, 2008
    #7
  8. Marek <> wrote:
    >
    >
    > But Tad's suggestion I don't understand!
    >
    > If something is *not* matching
    >
    > elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) { ... }



    You said "Only 3 tabs are right", so to test if the data is "right":

    elsif ( /(pattern3)\t{3}(number3)\s*/i ) { ... } # right

    but the body of the elsif deals with when it is NOT right, so you need:

    elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) { ... } # not right


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Sep 16, 2008
    #8
  9. Marek

    Marek Guest

    Sorry Tad,

    but I understood you as follows:

    #! /usr/local/bin/perl

    use warnings;
    use strict;

    while (<DATA>) {

        s/\s+#.+//;
        next if /^\s*$/;

        if ( ! /(pattern1)\t(number1)\s*/i) {
            print
                "[First if:] Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
        }
        elsif ( ! /(pattern2)\t{2}(number2)\s*/i) {
            print
                "[First elsif:] Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
        }
        elsif ( ! /(pattern3)\t{3}(number3)\s*/i) {
            print
                "[Second elsif:] Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$_\n\n";
        }
        else {
            print "\nno match!\n\n";
        }
    }

    __DATA__

    pattern1 number1
    pattern2 number2
    pattern3 number3
    pattern1 number1 # wrong number of tabs (2tabs)
    pattern2 number2 # wrong number of tabs (3tabs)
    pattern3 number3 # wrong number of tabs (4tabs)
    pattern2 number2 # wrong number of tabs (1tab)
    pattern3 number3 # wrong number of tabs (2tabs)

    And this is not working. Surely a misunderstanding on my part? And here
    is the version as I understood Petr and Martijn ...

    #! /usr/local/bin/perl

    use warnings;
    use strict;

    while (<DATA>) {

        s/\s+#.+//;
        next if /^\s*$/;

        if (/(pattern1)(\t+)(number1)\s*/i) {
            print
                "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
                if length($2) != 1;
        }
        elsif (/(pattern2)(\t+)(number2)\s*/i) {
            print
                "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
                if length($2) != 2;
        }
        elsif (/(pattern3)(\t+)(number3)\s*/i) {
            print
                "Wrong number of tabs between \"$1\" and the number \"$2\" in the line:\n\t$.: $_\n\n"
                if length($2) != 3;
        }
        else {
            print "\nno match!\n\n";
        }
    }

    __DATA__

    pattern1 number1
    pattern2 number2
    pattern3 number3
    pattern1 number1 # wrong number of tabs (2tabs)
    pattern2 number2 # wrong number of tabs (3tabs)
    pattern3 number3 # wrong number of tabs (4tabs)
    pattern2 number2 # wrong number of tabs (1tab)
    pattern3 number3 # wrong number of tabs (2tabs)
     
    Marek, Sep 16, 2008
    #9
  10. On Mon, 15 Sep 2008 11:46:13 -0700, Marek wrote:

    > But Tad's suggestion I don't understand!
    >
    > If something is *not* matching
    >
    > elsif ( ! /(pattern3)\t{3}(number3)\s*/i ) { ... }
    >
    > it will always match. Consider first line of __DATA__
    >
    > pattern1 number1
    >
    > this would match already, because it *does not* match, because of > ! <
    >
    > But I am a simple beginner; probably I did not understand something
    > here!


    No, you're quite right and I was quite wrong, as well as Tad. Sorry,
    brain misfiring I guess.
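
To make the problem concrete, here is a small sketch with the tabs written as "\t" escapes. The first part runs the negated test from the thread against a correct pattern1 line. The helper pattern3_is_wrong is an illustration only and was not posted by anyone in the thread: it first checks that the line mentions pattern3 at all before complaining about the tab count.

```perl
use strict;
use warnings;

# A correct pattern1 line: one tab between the two fields.
my $line = "pattern1\tnumber1";

# The negated test asks "is this NOT a correct pattern3 line?", which is
# true for every pattern1 or pattern2 line as well.
my $flagged = ( $line !~ /(pattern3)\t{3}(number3)\s*/i ) ? 1 : 0;
print "flagged as wrong: $flagged\n";    # prints "flagged as wrong: 1"

# Hypothetical helper: only complain when the line is a pattern3 line
# AND it does not have exactly three tabs.
sub pattern3_is_wrong {
    my ($l) = @_;
    return ( $l =~ /pattern3/i && $l !~ /pattern3\t{3}number3/i ) ? 1 : 0;
}

my @results = map { pattern3_is_wrong($_) }
    $line, "pattern3\t\tnumber3", "pattern3\t\t\tnumber3";
print "@results\n";    # prints "0 1 0"
```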

    M4
     
    Martijn Lievaart, Sep 18, 2008
    #10
  11. Tim Greer

    Tim Greer Guest

    Marek wrote:

    > Sorry Tad,
    >

    .... snip...
    >
    > And this is not working. Certainly a misunderstanding? And here the
    > version, how I understood Petr and Martjin ...
    >
    > #! /usr/local/bin/perl
    >
    > use warnings;
    > use strict;
    >
    > while (<DATA>) {
    >
    > s/\s+#.+//;
    > next if /^\s*$/;
    >
    > if (/(pattern1)(\t+)(number1)\s*/i) {
    > print
    > "Wrong number of tabs between \"$1\" and the number \"$2\" in the
    > line: \n\t$.: $_\n\n" if length($2) !=1 ;
    >
    > }
    > elsif (/(pattern2)(\t+)(number2)\s*/i) {
    > print
    > "Wrong number of tabs between \"$1\" and the number \"$2\" in the
    > line: \n\t$.: $_\n\n" if length($2) !=2 ;
    >
    > }
    > elsif (/(pattern3)(\t+)(number3)\s*/i) {
    > print
    > "Wrong number of tabs between \"$1\" and the number \"$2\" in the
    > line: \n\t$.: $_\n\n" if length($2) !=3 ;
    >
    > }
    > else {
    >
    > print "\nno match!\n\n";
    >
    > }
    > }
    >
    > __DATA__
    >
    > pattern1 number1
    > pattern2 number2
    > pattern3 number3
    > pattern1 number1 # wrong number of tabs (2tabs)
    > pattern2 number2 # wrong number of tabs (3tabs)
    > pattern3 number3 # wrong number of tabs (4tabs)
    > pattern2 number2 # wrong number of tabs (1tab)
    > pattern3 number3 # wrong number of tabs (2tabs)



    You are capturing $2 and printing the tab(s) as the number in your
    output when the number is wrong. You should use $3 for the actual
    number in the output.
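
A short illustration of which capture group holds what, using the pattern2 regex from the quoted code. The helper name capture_fields is made up for this sketch, and the tabs are written as "\t" so they can be counted.

```perl
use strict;
use warnings;

# Hypothetical helper: return the three captures ($1, $2, $3) as a list,
# or an empty list when the line does not match.
sub capture_fields {
    my ($line) = @_;
    return $line =~ /(pattern2)(\t+)(number2)/i;
}

# Three tabs where two are expected.
my ( $pattern, $tabs, $number ) = capture_fields("pattern2\t\t\tnumber2");

# $tabs corresponds to $2 (the run of tabs), $number to $3.
printf "pattern=%s tabs=%d number=%s\n", $pattern, length($tabs), $number;
```

This prints pattern=pattern2 tabs=3 number=number2, which shows why interpolating $2 into the message prints invisible tabs rather than the number.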

    Also, just for the sake of getting rid of all of the if/else's, I've
    modified the example to use one line and capture the number itself
    (from pattern) to work more dynamically:

    Note: Watch for word wrapping in the below code example:

    #!/usr/bin/perl
    use warnings;
    use strict;

    while (<DATA>) {
        s/\s+#.+//;
        next if /^\s*$/;
        if (/pattern(\d+)(\t+)number(\d+)\s*/i) {
            print "Wrong number of tabs between \"pattern$1\" ",
                  "and the number \"number$3\" in the line:\n\t$.: ",
                  " $_\n\n" if length($2) != $1;
        } else {
            print "\nno match!\n\n";
        }
    }

    __DATA__
    pattern1 number1
    pattern2 number2
    pattern3 number3
    pattern1 number1 # wrong number of tabs (2tabs)
    pattern2 number2 # wrong number of tabs (3tabs)
    pattern3 number3 # wrong number of tabs (4tabs)
    pattern2 number2 # wrong number of tabs (1tab)
    pattern3 number3 # wrong number of tabs (2tabs)



    --
    Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
    Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
    and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
    Industry's most experienced staff! -- Web Hosting With Muscle!
     
    Tim Greer, Sep 19, 2008
    #11
