Why won't this split file script work?

Discussion in 'Perl Misc' started by Max, Jun 17, 2004.

  1. Max

    Max Guest

    I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
    as I need to split a 15000 line file into chunks.

    This script is supposed to work by giving the line number you want to start
    at and the line number you want to stop at. It should copy all lines in
    between the start and stop number to a file called $file-split. But it
    doesn't seem to work. If someone has a few minutes, can you tell me what
    I'm doing wrong.

    Thanks in advance.

    #!/usr/bin/perl

    $file = $ARGV[0];
    $start = $ARGV[1];
    $stop = $ARGV[2];

    print "$start\n";
    print "$stop\n";

    $cnt = 0;

    open (IN, "$file");
    open (OUT, "> $file-split");

    while (<IN>) {
        chomp;
        $cnt++;
        next if ($cnt lt $start);
        last if ($cnt gt $stop);
        print OUT "$_\n";
    }

    close (IN);
    close (OUT);
     
    Max, Jun 17, 2004
    #1

  2. [posted & mailed]

    On Thu, 17 Jun 2004, Max wrote:

    >This script is supposed to work by giving the line number you want to start
    >at and the line number you want to stop at. It should copy all lines in
    >between the start and stop number to a file called $file-split. But it
    >doesn't seem to work. If someone has a few minutes, can you tell me what


    >$cnt = 0;
    >
    >open (IN, "$file");
    >open (OUT, "> $file-split");
    >
    >while (<IN>) {
    >
    > chomp;
    > $cnt++;


    You can just use the $. variable instead of $cnt.

    > next if ($cnt lt $start);
    > last if ($cnt gt $stop);


    You're using string-wise operators. You want numerical comparisons:

    next if $. < $start;
    last if $. > $stop;

    > print OUT "$_\n";


    Why did you chomp() $_ if you're just going to print it with a newline at
    the end again?

    >}
    >
    >close (IN);
    >close (OUT);


    I'd write this as:

    while (<IN>) {
        print OUT if ($. == $start) .. ($. == $stop);
        last if $. == $stop;
    }

    That's using the .. operator (perldoc perlop).
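    A minimal sketch of the range ("flip-flop") form of the loop, run against
    an in-memory filehandle so it needs no input file. The helper name
    lines_between and the sample text are assumptions made for illustration;
    they are not from the post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return lines $start..$stop
# of a string, using the scalar-context range ("flip-flop") operator.
sub lines_between {
    my ($text, $start, $stop) = @_;
    my @picked;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    while (my $line = <$fh>) {
        # False until $. == $start, then true up to and including
        # the line where $. == $stop.
        push @picked, $line if ($. == $start) .. ($. == $stop);
        last if $. == $stop;
    }
    close $fh;
    return @picked;
}

my $sample = join '', map { "line $_\n" } 1 .. 10;
print lines_between($sample, 3, 5);
```

    The early "last" is only there to stop reading once the range is done; the
    range test alone already selects the right lines.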

    --
    Jeff Pinyan RPI Acacia Brother #734 RPI Acacia Corp Secretary
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
     
    Jeff 'japhy' Pinyan, Jun 17, 2004
    #2

  3. Max wrote:
    >
    > I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
    > as I need to split a 15000 line file into chunks.
    >
    > This script is supposed to work by giving the line number you want to start
    > at and the line number you want to stop at. It should copy all lines in
    > between the start and stop number to a file called $file-split. But it
    > doesn't seem to work. If someone has a few minutes, can you tell me what
    > I'm doing wrong.
    >
    > Thanks in advance.
    >
    > #!/usr/bin/perl


    You should enable warnings and strictures to let perl help you find
    mistakes.

    use warnings;
    use strict;


    > $file = $ARGV[0];
    > $start = $ARGV[1];
    > $stop = $ARGV[2];
    >
    > print "$start\n";
    > print "$stop\n";
    >
    > $cnt = 0;
    >
    > open (IN, "$file");
    > open (OUT, "> $file-split");


    You should *ALWAYS* verify that the files were opened correctly.

    open IN, $file or die "Cannot open $file: $!";
    open OUT, "> $file-split" or die "Cannot open $file-split: $!";
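    A small sketch of what the check buys you when the open fails. It uses a
    lexical filehandle and the three-argument open, which is a choice made for
    this example rather than what the lines above show; the helper name
    open_or_die and the path are hypothetical, and the path is assumed not to
    exist.

```perl
use strict;
use warnings;

# Hypothetical helper: open a file for reading, or die with the OS error.
sub open_or_die {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    return $fh;
}

# A path assumed not to exist, used only to trigger the failure.
my $missing = '/no/such/dir/input.txt';
my $fh = eval { open_or_die($missing) };
print "open failed as expected: $@" unless $fh;
```

    Without the "or die", the script would carry on and silently read nothing
    from the unopened handle.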


    > while (<IN>) {
    >
    > chomp;
    > $cnt++;
    > next if ($cnt lt $start);
    > last if ($cnt gt $stop);


    You are using string comparison operators which are not doing what you
    seem to expect them to do.

    $ perl -le'
    for ( qw[ 1 2 3 4 10 11 12 13 14 20 21 22 23 24 100 200 300 ] ) {
        print if $_ lt "12"
    }
    '
    1
    10
    11
    100

    Note that '100' is less than '12' as a string. You should be using numerical
    comparison operators instead. Also you don't need the $cnt variable as
    perl provides the $. variable which does the same thing.

    next if $. < $start;
    last if $. > $stop;


    > print OUT "$_\n";
    > }
    >
    > close (IN);
    > close (OUT);



    John
    --
    use Perl;
    program
    fulfillment
     
    John W. Krahn, Jun 17, 2004
    #3
  4. "Max" <> wrote in message
    news:<mJlAc.2227$>...
    > If someone has a few minutes, can you tell me what I'm doing wrong.


    From what I can tell, you've made one or two logic errors (you're not telling
    the program to do what you want), and then a few stylistic "errors". The
    problem, I must assume, is that you use 'lt' and 'gt', rather than '<'
    and '>'. The former are for string comparison, the latter for numeric
    comparison. That is:

    '10' lt '5'    # true: compared as strings, "1" sorts before "5"
    10 > 5         # true: compared as numbers

    I assume you wanted the numeric comparison. Also, check the boundaries: if
    you input 5 and 8 for $start and $stop, lines 5 through 8 are copied, both
    ends included. You may want to futz with the boundaries if that's not what
    you're after.
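    The effect of the two operator families, and where the boundaries fall, can
    be checked without any input file. This is only a sketch: kept_lines is a
    hypothetical helper that replays the loop's two tests over the line numbers
    1 to $total and reports which ones would be printed.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the line numbers kept by the loop's tests.
# A true $numeric selects < and >; otherwise the original lt and gt are used.
sub kept_lines {
    my ($start, $stop, $total, $numeric) = @_;
    my @kept;
    for my $cnt (1 .. $total) {
        if ($numeric) {
            next if $cnt < $start;
            last if $cnt > $stop;
        }
        else {
            next if $cnt lt $start;
            last if $cnt gt $stop;
        }
        push @kept, $cnt;
    }
    return @kept;
}

print "numeric 5..8:  @{[ kept_lines(5, 8, 20, 1) ]}\n";
print "numeric 5..12: @{[ kept_lines(5, 12, 20, 1) ]}\n";
print "string  5..12: @{[ kept_lines(5, 12, 20, 0) ]}\n";
```

    With the string operators the 5..12 case keeps nothing, because "5" gt "12"
    is already true on line 5 and the loop exits before printing.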


    I also made some stylistic changes to the code. The most important is adding
    'use strict;' and 'use warnings;', which are the most helpful things in Perl
    since spliced bread. The others are less critical, but in the absence of any
    preformed habits otherwise, there's no reason not to just use the
    three-argument form of open all the time (for example).

    One other thought -- if you don't want to keep track of it yourself, the
    variable $. (dollar-period) contains the current line number of the last
    filehandle that you've read from. As long as you're not messing with $/
    (which redefines for perl what a line is), $. should always correspond
    to your $cnt.
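    To see how $. follows whatever $/ says a "line" is, here is a sketch that
    counts records in the same text twice: once with the default newline
    separator and once in paragraph mode ($/ set to the empty string). The
    helper count_records and the sample text are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: read $text record by record and return the
# final value of $. (that is, the number of records read).
sub count_records {
    my ($text, $separator) = @_;
    local $/ = $separator;          # what perl treats as the end of a "line"
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my $last = 0;
    while (defined(my $record = <$fh>)) {
        $last = $.;
    }
    close $fh;
    return $last;
}

my $text = "a\nb\n\nc\nd\n";
print "newline-separated: ", count_records($text, "\n"), "\n";
print "paragraph mode:    ", count_records($text, ""), "\n";
```

    The two counts differ for the same input, which is why $. only matches a
    hand-kept $cnt while $/ is left at its default.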

    Hope this helps. Code is tested but (of course) not guaranteed.


    #!/usr/bin/perl

    use strict;
    use warnings;

    my ($file, $start, $stop) = @ARGV;

    print "$start\n$stop\n";

    my $cnt = 0;

    open (IN, '<', $file) or die "Unable to open $file: $!";
    open (OUT, '>', $file . '-split') or die "Unable to open $file-split: $!";

    while (<IN>) {
        chomp;
        $cnt++;
        next if ($cnt < $start);
        last if ($cnt > $stop);
        print OUT "$_\n";
    }

    close (IN);
    close (OUT);
     
    Thomas Church, Jun 18, 2004
    #4
  5. "Max" <> wrote in message
    news:<mJlAc.2227$>...
    > I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
    > as I need to split a 15000 line file into chunks.


    One other thought: you don't need to chomp unless you actually care about
    eliminating the newline. Since you add it back in again anyway when you print,
    you can simplify the loop to: (untested)

    while (<IN>) {
        next if ($. < $start);
        last if ($. > $stop);
        print OUT $_;
    }
     
    Thomas Church, Jun 18, 2004
    #5
  6. Max

    Max Guest

    I really appreciate everybody's input. I got the script working and learned
    some very good programming tips.

    Thanks,
    Max

    "Thomas Church" <> wrote in message
    news:...
    > "Max" <> wrote in message
    > news:<mJlAc.2227$>...
    > > I can't seem to figure out what I'm doing wrong, or maybe I'm just
    > > rushing as I need to split a 15000 line file into chunks.
    >
    > One other thought: you don't need to chomp unless you actually care about
    > eliminating the newline. Since you add it back in again anyway when you
    > print, you can simplify the loop to: (untested)
    >
    > while (<IN>) {
    > next if ($. < $start);
    > last if ($. > $stop);
    > print OUT $_;
    > }
     
    Max, Jun 18, 2004
    #6
  7. On Thu, 17 Jun 2004 18:49:54 GMT, "Max" <> wrote:

    >I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
    >as I need to split a 15000 line file into chunks.
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    BTW: but this is *not* what your script below is supposed to do!

    >This script is supposed to work by giving the line number you want to start
    >at and the line number you want to stop at. It should copy all lines in
    >between the start and stop number to a file called $file-split. But it
    >doesn't seem to work. If someone has a few minutes, can you tell me what
    >I'm doing wrong.


    Others already told you. This is how I would do it (just printing to
    STDOUT and generalized to more -or no- files on the cmd line):


    #!/usr/bin/perl

    use strict;
    use warnings;

    die "Usage: $0 <start> <stop> [<file(s)>]\n" unless @ARGV>=2;

    my ($start,$stop)=(shift,shift);

    while (<>) {
        print if $. == $start .. ($. == $stop and close ARGV);
    }

    __END__
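
    The "close ARGV" matters because <> never closes a file explicitly, so $.
    keeps counting across the files named on the command line unless ARGV is
    closed by hand. The sketch below shows the difference using two temporary
    files made with the core File::Temp module; make_file and line_numbers are
    hypothetical helpers, and closing at every end of file follows the idiom
    shown in perldoc -f eof rather than the exact expression above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write lines to a temp file and return its name.
sub make_file {
    my @lines = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} map { "$_\n" } @lines;
    close $fh or die "Cannot close $name: $!";
    return $name;
}

# Read the given files through <> and record $. for every line.
# With a true $reset, ARGV is closed at each end of file, which resets $.
sub line_numbers {
    my ($reset, @files) = @_;
    local @ARGV = @files;
    my @seen;
    while (my $line = <>) {
        push @seen, $.;
        close ARGV if $reset and eof;
    }
    return @seen;
}

my @files = (make_file(qw(a b c)), make_file(qw(d e)));
my @reset = line_numbers(1, @files);
my @cont  = line_numbers(0, @files);
print "reset per file: @reset\n";
print "continuous:     @cont\n";
```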


    HTH,
    Michele
    --
    you'll see that it shouldn't be so. AND, the writting as usuall is
    fantastic incompetent. To illustrate, i quote:
    - Xah Lee trolling on clpmisc,
    "perl bug File::Basename and Perl's nature"
     
    Michele Dondi, Jun 18, 2004
    #7
  8. In article
    <>,
    Jeff 'japhy' Pinyan <> wrote:

    > I'd write this as:
    >
    > while (<IN>) {
    > print OUT if ($. == $start) .. ($. == $stop);
    > last if $. == $stop;
    > }


    According to the docs, you could actually write this as:

    while(<IN>) {
        print OUT if $start .. $stop;
        last if $. == $stop;
    }

    HTH,
    Ricky
     
    Richard Morse, Jun 21, 2004
    #8
  9. Jay Tilton

    Jay Tilton Guest

    Richard Morse <> wrote:

    : In article
    : <>,
    : Jeff 'japhy' Pinyan <> wrote:
    :
    : > I'd write this as:
    : >
    : > while (<IN>) {
    : > print OUT if ($. == $start) .. ($. == $stop);
    : > last if $. == $stop;
    : > }
    :
    : According to the docs, you could actually write this as:
    :
    : while(<IN>) {
    : print OUT if $start .. $stop;
    : last if $. == $stop;
    : }

    Not true. perlop says:

    If either operand of scalar ``..'' is a constant expression, that
    operand is considered true if it is equal (==) to the current
    input line number (the $. variable).

    Neither $start nor $stop is a constant expression.
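
    The difference is easy to see side by side. In this sketch (the sub names
    and the six-line sample are assumptions for illustration), the literal
    operands are compared with $. while the variable operands are only tested
    for truth.

```perl
use strict;
use warnings;

my $text = join '', map { "line $_\n" } 1 .. 6;

# Literal operands: each one is implicitly compared with $.
sub with_literals {
    my @picked;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    while (my $line = <$fh>) {
        push @picked, $. if 2 .. 4;
    }
    close $fh;
    return @picked;
}

# Variable operands: each one is just tested for truth, so $. plays no part.
sub with_variables {
    my ($start, $stop) = @_;
    my @picked;
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    while (my $line = <$fh>) {
        push @picked, $. if $start .. $stop;
    }
    close $fh;
    return @picked;
}

print "literals 2 .. 4:  @{[ with_literals() ]}\n";
print "variables (2, 4): @{[ with_variables(2, 4) ]}\n";
```

    With variables holding 2 and 4, both operands are always true, so the
    range starts and ends on every line and every line is selected.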
     
    Jay Tilton, Jun 21, 2004
    #9
  10. [posted & mailed]

    On Mon, 21 Jun 2004, Richard Morse wrote:

    > Jeff 'japhy' Pinyan <> wrote:
    >
    >> while (<IN>) {
    >> print OUT if ($. == $start) .. ($. == $stop);
    >> last if $. == $stop;
    >> }

    >
    >According to the docs, you could actually write this as:
    >
    > while(<IN>) {
    > print OUT if $start .. $stop;
    > last if $. == $stop;
    > }


    Not so. I once (ok, more than once) fell prey to that. The docs state
    (although not in the BOLD CAPITAL letters I'd like) that the implicit
    comparison to $. only takes place if the argument is a constant
    expression:

    If either operand of scalar ".." is a constant expression, that operand
    is considered true if it is equal ("==") to the current input line
    number (the $. variable).

    To be pedantic, the comparison is actually "int(EXPR) == int(EXPR)",
    but that is only an issue if you use a floating point expression; when
    implicitly using $. as described in the previous paragraph, the
    comparison is "int(EXPR) == int($.)" which is only an issue when $. is
    set to a floating point value and you are not reading from a file.
    Furthermore, "span" .. "spat" or "2.18 .. 3.14" will not do what you
    want in scalar context because each of the operands are evaluated using
    their integer representation.

    --
    Jeff Pinyan RPI Acacia Brother #734 RPI Acacia Corp Secretary
    "And I vos head of Gestapo for ten | Michael Palin (as Heinrich Bimmler)
    years. Ah! Five years! Nein! No! | in: The North Minehead Bye-Election
    Oh. Was NOT head of Gestapo AT ALL!" | (Monty Python's Flying Circus)
     
    Jeff 'japhy' Pinyan, Jun 21, 2004
    #10
  11. Anno Siegel

    Anno Siegel Guest

    Jeff 'japhy' Pinyan <> wrote in comp.lang.perl.misc:
    > [posted & mailed]
    > On Mon, 21 Jun 2004, Richard Morse wrote:


    [...]

    > >According to the docs, you could actually write this as:
    > >
    > > while(<IN>) {
    > > print OUT if $start .. $stop;
    > > last if $. == $stop;
    > > }

    >
    > Not so. I once (ok, more than once) fell prey to that. The docs state
    > (although not in the BOLD CAPITAL letters I'd like) that the implicit
    > comparison to $. only takes place if the argument is a constant
    > expression:
    >
    > If either operand of scalar ".." is a constant expression, that operand


    Another question is what exactly is a constant expression. Experimentally,
    literals and arithmetic expressions of literals are compared to $., as
    are constants defined through the "constant" pragma. However, "do{ 0}",
    and "do{ 1}" are not. So that's a cop-out if a constant boolean is
    needed at either end of scalar "..".

    The construct is one of Perl's less well-advised attempts at Doing
    What I Mean.

    Anno
     
    Anno Siegel, Jun 21, 2004
    #11
