Why won't this split file script work?

Max

I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
as I need to split a 15000 line file into chunks.

This script is supposed to work by giving the line number you want to start
at and the line number you want to stop at. It should copy all lines in
between the start and stop number to a file called $file-split. But it
doesn't seem to work. If someone has a few minutes, can you tell me what
I'm doing wrong.

Thanks in advance.

#!/usr/bin/perl

$file = $ARGV[0];
$start = $ARGV[1];
$stop = $ARGV[2];

print "$start\n";
print "$stop\n";

$cnt = 0;

open (IN, "$file");
open (OUT, "> $file-split");

while (<IN>) {

chomp;
$cnt++;
next if ($cnt lt $start);
last if ($cnt gt $stop);
print OUT "$_\n";
}

close (IN);
close (OUT);
 
Jeff 'japhy' Pinyan

[posted & mailed]

This script is supposed to work by giving the line number you want to start
at and the line number you want to stop at. It should copy all lines in
between the start and stop number to a file called $file-split. But it
doesn't seem to work. If someone has a few minutes, can you tell me what
$cnt = 0;

open (IN, "$file");
open (OUT, "> $file-split");

while (<IN>) {

chomp;
$cnt++;

You can just use the $. variable instead of $cnt.

next if ($cnt lt $start);
last if ($cnt gt $stop);

You're using string-wise operators. You want numerical comparisons:

next if $. < $start;
last if $. > $stop;

print OUT "$_\n";

Why did you chomp() $_ if you're just going to print it with a newline at
the end again?

}

close (IN);
close (OUT);

I'd write this as:

while (<IN>) {
print OUT if ($. == $start) .. ($. == $stop);
last if $. == $stop;
}

That's using the .. operator (perldoc perlop).
 
John W. Krahn

Max said:
I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
as I need to split a 15000 line file into chunks.

This script is supposed to work by giving the line number you want to start
at and the line number you want to stop at. It should copy all lines in
between the start and stop number to a file called $file-split. But it
doesn't seem to work. If someone has a few minutes, can you tell me what
I'm doing wrong.

Thanks in advance.

#!/usr/bin/perl

You should enable warnings and strictures to let perl help you find
mistakes.

use warnings;
use strict;

$file = $ARGV[0];
$start = $ARGV[1];
$stop = $ARGV[2];

print "$start\n";
print "$stop\n";

$cnt = 0;

open (IN, "$file");
open (OUT, "> $file-split");

You should *ALWAYS* verify that the files were opened correctly.

open IN, $file or die "Cannot open $file: $!";
open OUT, "> $file-split" or die "Cannot open $file-split: $!";

while (<IN>) {

chomp;
$cnt++;
next if ($cnt lt $start);
last if ($cnt gt $stop);

You are using string comparison operators which are not doing what you
seem to expect them to do.

$ perl -le'
for ( qw[ 1 2 3 4 10 11 12 13 14 20 21 22 23 24 100 200 300 ] ) {
print if $_ lt "12"
}
'
1
10
11
100

Note that '100' is less than '12'. You should be using numerical
comparison operators instead. Also you don't need the $cnt variable as
perl provides the $. variable which does the same thing.

next if $. < $start;
last if $. > $stop;

print OUT "$_\n";
}

close (IN);
close (OUT);


John
 
Thomas Church

Max said:
If someone has a few minutes, can you tell me what I'm doing wrong.

From what I can tell, you've made one or two logic errors (you're not telling
the program to do what you want), and then a few stylistic "errors". The
problem, I must assume, is that you use 'lt' and 'gt', rather than '<'
and '>'. The former are for string comparison, the latter for numeric
comparison. That is:

'10' lt '5'
10 > 5

I assume you wanted the numeric comparison. Also, note that the boundaries
are inclusive: if you input 5 and 8 for $start and $stop, lines 5, 6, 7,
and 8 are copied. You may want to futz with the boundaries if that's not
what you meant by "in between".
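
To see why 'lt' and 'gt' break the loop, here is a small sketch that runs the same next/last logic over a range of counters instead of a real file. The helper name selected_lines and the sample values (start 5, stop 12, 20 lines) are made up for illustration; they are not from the original script.

```perl
use strict;
use warnings;

# Return the line numbers the loop would copy, using either the
# string operators (lt/gt) or the numeric operators (</>).
# selected_lines is a hypothetical helper, not part of the original script.
sub selected_lines {
    my ($mode, $start, $stop, $total) = @_;
    my @copied;
    for my $cnt (1 .. $total) {
        if ($mode eq 'string') {
            next if $cnt lt $start;   # compares "5" with "12" as text
            last if $cnt gt $stop;
        }
        else {
            next if $cnt < $start;    # compares 5 with 12 as numbers
            last if $cnt > $stop;
        }
        push @copied, $cnt;
    }
    return @copied;
}

print "string:  @{[ selected_lines('string',  5, 12, 20) ]}\n";
print "numeric: @{[ selected_lines('numeric', 5, 12, 20) ]}\n";
```

With the string operators nothing is copied at all, because "5" sorts after "12" and the loop bails out on the first candidate line. With the numeric operators the counters 5 through 12 are selected.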


I also made some stylistic changes to the code. The most important is adding
'use strict;' and 'use warnings;', which are the most helpful things in Perl
since spliced bread. The others are less critical, but in the absence of any
preformed habits otherwise, there's no reason not to just use the
three-argument form of open all the time (for example).

One other thought -- if you don't want to keep track of it yourself, the
variable $. (dollar-period) contains the current line number of the last
filehandle that you've read from. As long as you're not messing with $/
(which redefines for perl what a line is), $. should always correspond
to your $cnt.
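
A quick way to convince yourself of the $/ caveat is to read the same text twice with different record separators. This is only a sketch: the count_records helper and the sample text are invented for the example, and it reads from an in-memory filehandle rather than a real file.

```perl
use strict;
use warnings;

my $text = "a\nb\n\nc\nd\n";

# Read $data with the given input record separator and return the
# final value of $. (count_records is a hypothetical helper).
sub count_records {
    my ($data, $separator) = @_;
    local $/ = $separator;
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
    my $last = 0;
    while (<$fh>) {
        $last = $.;   # $. belongs to the filehandle read most recently
    }
    close($fh);
    return $last;
}

print count_records($text, "\n"), "\n";   # newline-terminated records
print count_records($text, ""),   "\n";   # paragraph mode
```

With the default "\n" separator $. ends at 5, one per line (the blank line counts). In paragraph mode ($/ set to the empty string) it ends at 2, so it no longer matches a hand-kept line counter.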

Hope this helps. Code is tested but (of course) not guaranteed.


#!/usr/bin/perl

use strict;
use warnings;

my ($file, $start, $stop) = @ARGV;

print "$start\n$stop\n";

my $cnt = 0;

open (IN, '<', $file) or die "Unable to open $file: $!";
open (OUT, '>', $file . '-split') or die "Unable to open $file-split: $!";

while (<IN>) {
chomp;
$cnt++;
next if ($cnt < $start);
last if ($cnt > $stop);
print OUT "$_\n";
}

close (IN);
close (OUT);
 
Thomas Church

Max said:
I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
as I need to split a 15000 line file into chunks.

One other thought: you don't need to chomp unless you actually care about
eliminating the newline. Since you add it back in again anyway when you print,
you can simplify the loop to: (untested)

while (<IN>) {
next if ($. < $start);
last if ($. > $stop);
print OUT $_;
}
 
Max

I really appreciate everybody's input. I got the script working and learned
some very good programming tips.

Thanks,
Max
 
Michele Dondi

I can't seem to figure out what I'm doing wrong, or maybe I'm just rushing
as I need to split a 15000 line file into chunks.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

BTW: this is *not* what your script below is supposed to do!

This script is supposed to work by giving the line number you want to start
at and the line number you want to stop at. It should copy all lines in
between the start and stop number to a file called $file-split. But it
doesn't seem to work. If someone has a few minutes, can you tell me what
I'm doing wrong.

Others already told you. This is how I would do it (just printing to
STDOUT and generalized to more -or no- files on the cmd line):


#!/usr/bin/perl

use strict;
use warnings;

die "Usage: $0 <start> <stop> [<file(s)>]\n" unless @ARGV>=2;

my ($start,$stop)=(shift,shift);

while (<>) {
print if $. == $start .. ($. == $stop and close ARGV);
}

__END__


HTH,
Michele
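
For the "split into chunks" goal mentioned at the top (as opposed to extracting one range), something along these lines might do. It is a rough sketch, not anyone's posted solution: the chunk_lines helper, the chunk size of 3, and the in-memory sample data are all assumptions made for the example.

```perl
use strict;
use warnings;

# Group the lines read from a filehandle into chunks of at most $size
# lines each. chunk_lines is a hypothetical helper.
sub chunk_lines {
    my ($fh, $size) = @_;
    my @chunks;
    while (my $line = <$fh>) {
        # $. is 1-based, so a new chunk starts at lines 1, $size+1, ...
        push @chunks, [] if ($. - 1) % $size == 0;
        push @{ $chunks[-1] }, $line;
    }
    return @chunks;
}

# Sample input: seven lines held in memory instead of a real file.
my $data = join '', map { "line $_\n" } 1 .. 7;
open(my $in, '<', \$data) or die "Cannot open in-memory file: $!";
my @chunks = chunk_lines($in, 3);
close($in);

printf "chunk %d: %d lines\n", $_ + 1, scalar @{ $chunks[$_] }
    for 0 .. $#chunks;
```

Seven lines with a chunk size of 3 give chunks of 3, 3 and 1 lines. For a real 15000 line file you would open the file itself and write each chunk to its own output file; how to name those files is up to you.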
 
Richard Morse

Jeff 'japhy' Pinyan said:
I'd write this as:

while (<IN>) {
print OUT if ($. == $start) .. ($. == $stop);
last if $. == $stop;
}

According to the docs, you could actually write this as:

while(<IN>) {
print OUT if $start .. $stop;
last if $. == $stop;
}

HTH,
Ricky
 
Jay Tilton

: In article
: <Pine.SGI.3.96.1040617150701.326419A-100000@vcmr-64.server.rpi.edu>,
:
: > I'd write this as:
: >
: > while (<IN>) {
: > print OUT if ($. == $start) .. ($. == $stop);
: > last if $. == $stop;
: > }
:
: According to the docs, you could actually write this as:
:
: while(<IN>) {
: print OUT if $start .. $stop;
: last if $. == $stop;
: }

Not true. perlop says:

If either operand of scalar ``..'' is a constant expression, that
operand is considered true if it is equal (==) to the current
input line number (the $. variable).

Neither $start nor $stop is a constant expression.
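
A small experiment makes the difference visible. This is a sketch only: the picked helper, the six-line sample data, and the values 2 and 4 are assumptions chosen for the demonstration, and it reads from an in-memory filehandle.

```perl
use strict;
use warnings;

my $data = join '', map { "line $_\n" } 1 .. 6;
my ($start, $stop) = (2, 4);

# Return the line numbers selected by one of three spellings of the
# range (picked is a hypothetical helper).
sub picked {
    my ($mode) = @_;
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
    my @lines;
    while (<$fh>) {
        if ($mode eq 'literal') {
            # Constant operands: implicitly compared against $.
            push @lines, $. if 2 .. 4;
        }
        elsif ($mode eq 'variable') {
            # Variable operands: only tested for truth, not against $.
            push @lines, $. if $start .. $stop;
        }
        else {
            # Explicit comparisons: works with variables
            push @lines, $. if ($. == $start) .. ($. == $stop);
        }
    }
    close($fh);
    return @lines;
}

print "$_: @{[ picked($_) ]}\n" for qw(literal variable explicit);
```

The literal and explicit forms select lines 2, 3 and 4. The variable form selects every line, because both $start and $stop are simply true values, so the flip-flop turns on and off again on each evaluation and returns true every time.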
 
Jeff 'japhy' Pinyan

[posted & mailed]

According to the docs, you could actually write this as:

while(<IN>) {
print OUT if $start .. $stop;
last if $. == $stop;
}

Not so. I once (ok, more than once) fell prey to that. The docs state
(although not in the BOLD CAPITAL letters I'd like) that the implicit
comparison to $. only takes place if the argument is a constant
expression:

If either operand of scalar ".." is a constant expression, that operand
is considered true if it is equal ("==") to the current input line
number (the $. variable).

To be pedantic, the comparison is actually "int(EXPR) == int(EXPR)",
but that is only an issue if you use a floating point expression; when
implicitly using $. as described in the previous paragraph, the
comparison is "int(EXPR) == int($.)" which is only an issue when $. is
set to a floating point value and you are not reading from a file.
Furthermore, "span" .. "spat" or "2.18 .. 3.14" will not do what you
want in scalar context because each of the operands are evaluated using
their integer representation.
 
Anno Siegel

Jeff 'japhy' Pinyan said:
[posted & mailed]
According to the docs, you could actually write this as:

while(<IN>) {
print OUT if $start .. $stop;
last if $. == $stop;
}

Not so. I once (ok, more than once) fell prey to that. The docs state
(although not in the BOLD CAPITAL letters I'd like) that the implicit
comparison to $. only takes place if the argument is a constant
expression:

If either operand of scalar ".." is a constant expression, that operand

Another question is what exactly is a constant expression. Experimentally,
literals and arithmetic expressions of literals are compared to $., as
are constants defined through the "constant" pragma. However, "do{ 0}",
and "do{ 1}" are not. So that's a cop-out if a constant boolean is
needed at either end of scalar "..".

The construct is one of Perl's less well-advised attempts at Doing
What I Mean.

Anno
 
