Pulling out lines of text from a text file

poopdeville

Hi everybody,

I'm looking to write a Catalyst model to basically use a newline
delimited database, so that each line of the text file corresponds to a
datum. My question isn't about the module per se, but on smart
algorithms to pull a single line of text from an arbitrary text file.
I know the following would work:

#!/usr/bin/perl

use warnings;
use strict;

print "Enter a file\n";
my $file = <STDIN>;

print "Enter a number\n";
my $number = <STDIN>;

my @array;

open FILE, "$file";

while (<FILE>) { push @array; }
close FILE;

print $array[$number];
__END__

or something close to it should work. (I hope there aren't any errors
there, but if there are, I hope you get the idea of the naive
implementation I'm talking about). Can anyone point the way to a faster
algorithm?

Thanks,
'cid 'ooh
 
Gunnar Hjalmarsson

Use the FAQ answer provided in "perldoc -q middle", i.e. Tie::File.

I leave it to you to compare the speed. ;-)
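
For reference, a minimal sketch of the Tie::File approach. The sub name line_via_tie and the command-line handling are illustrative assumptions and do not come from the FAQ. Tie::File ships with Perl, indexes records from 0, and strips the record separator by default.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl 'O_RDONLY';
use Tie::File;

# Hypothetical helper: return line $n (1-based) of $file without its
# trailing newline, or undef if the file has fewer than $n lines.
sub line_via_tie {
    my ($file, $n) = @_;
    return if $n < 1;   # avoid negative indexes counting from the end

    # O_RDONLY keeps Tie::File from creating or modifying the file.
    tie my @lines, 'Tie::File', $file, mode => O_RDONLY
        or die "Can't tie $file: $!";

    my $line = $lines[$n - 1];   # records are indexed from 0
    untie @lines;
    return $line;
}

# Usage: perl script.pl FILE NUMBER
if (@ARGV == 2) {
    my $line = line_via_tie(@ARGV);
    print defined $line ? "$line\n" : "No such line\n";
}
```

Tie::File reads the file lazily, so fetching one record should not pull the whole file into memory, though it still has to scan from the start to find where that record begins.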
 
Sisyphus

If I understand you correctly (also untested) :

use warnings;
use strict;

print "Enter a file\n";
my $file = <STDIN>;

print "Enter a number\n";
my $number = <STDIN>;

# Always check that open succeeds
open FILE, "$file" or die "Can't open: $!";

while (<FILE>) {
if($. == $number) {print $_}
last; # no need to keep reading
}

# Always check that close succeeds
close FILE or die "Can't close: $!";

__END__

See the documentation for $. in 'perldoc perlvar'.

Cheers,
Rob
 
Anno Siegel

Hi everybody,

I'm looking to write a Catalyst model to basically use a newline
delimited database, so that each line of the text file corresponds to a
datum. My question isn't about the module per se, but on smart
algorithms to pull a single line of text from an arbitrary text file.
I know the following would work:

#!/usr/bin/perl

use warnings;
use strict;

print "Enter a file\n";
my $file = <STDIN>;

print "Enter a number\n";
my $number = <STDIN>;

my @array;

open FILE, "$file";

while (<FILE>) { push @array; }

You don't need an explicit loop here:

@array = <FILE>;
close FILE;

print $array[$number];
__END__

or something close to it should work. (I hope there aren't any errors
there, but if there are, I hope you get the idea of the naive
implementation I'm talking about). Can anyone point the way to a faster
algorithm?

Look into Tie::File, it will simplify things.

Anno
 
Michael Greb

Hi everybody,

I'm looking to write a Catalyst model to basically use a newline
delimited database, so that each line of the text file corresponds to a
datum. My question isn't about the module per se, but on smart
algorithms to pull a single line of text from an arbitrary text file.
I know the following would work:

#!/usr/bin/perl

use warnings;
use strict;
print "Enter a file\n";

# For more of a standard prompt you probably just want:
print "Enter a file: ";
my $file = <STDIN>;

# $file will have a \n on the end; use chomp to remove it:
chomp ($file);
print "Enter a number\n";

# Ditto:
print "Enter a number: ";
my $number = <STDIN>;

# Ditto:
chomp ($number);
my @array;

open FILE, "$file";

# You should check the return code of open rather than assuming it worked,
# you also don't need the quotes around $file since you aren't adding
# any extra text.

open FILE, $file or die "Can't open $file: $!\n";
while (<FILE>) { push @array; }

# Replace this with:

@array = <FILE>;

# @lines would be a better name for the array though, @array is
# redundant and generic, we already know it is an array from the @
close FILE;

# Though less of an issue when you are just reading a file, you should
# get in the habit of checking the return code of close as well.
print $array[$number];
__END__

or something close to it should work. (I hope there aren't any errors
there, but if there are, I hope you get the idea of the naive
implementation I'm talking about). Can anyone point the way to a faster
algorithm?

If you are just trying to read a whole file into an array by line, @var
= <FH> is better than a while loop and push. If all your program is
doing is printing out a specific line from a file (and especially if the
file is going to be large), it may be better to loop over the file until
you get to that line number (or EOF) and then print the line (or an
error). This way a 100k-line file isn't stored in RAM just to print
line 10.
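
A minimal sketch of that read-until-the-line approach, assuming a lexical filehandle and three-argument open. The sub name nth_line and the command-line handling are hypothetical and do not appear in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return line $n (1-based) of $file, newline
# included, or undef if EOF is reached first.
sub nth_line {
    my ($file, $n) = @_;

    open my $fh, '<', $file or die "Can't open $file: $!";

    my $found;
    while (my $line = <$fh>) {
        next unless $. == $n;   # $. is the current line number of $fh
        $found = $line;
        last;                   # stop reading once the line is found
    }

    close $fh or die "Can't close $file: $!";
    return $found;
}

# Usage: perl script.pl FILE NUMBER
if (@ARGV == 2) {
    my $line = nth_line(@ARGV);
    die "No line $ARGV[1] in $ARGV[0]\n" unless defined $line;
    print $line;
}
```

Only one line is held in memory at a time, and reading stops as soon as the requested line has been seen.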
 
Tad McClellan

Sisyphus said:
If I understand you correctly (also untested) :
print "Enter a file\n";
my $file = <STDIN>;


chomp $file; # ?

# Always check that open succeeds
open FILE, "$file" or die "Can't open: $!";
^^^^^^^
^^^^^^^

Never quote a lone variable.


perldoc -q vars

What's wrong with always quoting "$vars"?

while (<FILE>) {
if($. == $number) {print $_}
last; # no need to keep reading
}


A loop that must execute exactly zero or one time isn't much of a loop...
 
Sisyphus

Tad McClellan said:
A loop that must execute exactly zero or one time isn't much of a loop...

Heh ... indeed .... better make that (still untested):

while (<FILE>) {
if($. == $number) {
print $_;
last; # no need to keep reading
}
}

Cheers,
Rob
 
Uri Guttman

S> Heh ... indeed .... better make that (still untested):

S> while (<FILE>) {
S> if($. == $number) {
S> print $_;
S> last; # no need to keep reading
S> }
S> }

even better:

while (<FILE>) {
next if $. < $number ;
print $_ ;
last;
}

saves a whole block and those damned expensive {}. also it has a more
common style of indenting.

uri
 
poopdeville

Hi everybody,

I'm looking to write a Catalyst model to basically use a newline
delimited database, so that each line of the text file corresponds to a
datum. My question isn't about the module per se, but on smart
algorithms to pull a single line of text from an arbitrary text file.

Thanks everybody!
'cid 'ooh
 
Xicheng

Dr.Ruud said:
Uri Guttman schreef:


while (<FILE>) {
$. == $number or next;
print;
last;
}

why not just:

while(<FILE>) {
print and last if $. == $number;
}

Xicheng
 
John Bokma

Dr.Ruud said:
Uri Guttman schreef:


while (<FILE>) {
$. == $number or next;
print;
last;
}

Ok I just closed my "Reply" :-D.

The reason why I prefer this one is that I read the $. == $number as what
must be true for the rest to happen. Since I read left to right, I "see"
it faster compared to next if ...
 
Sisyphus

Xicheng said:
why not just:

while(<FILE>) {
print and last if $. == $number;
}

This is heading towards the providing of a good example of my reservations
about that construct that does away with the curly braces.

I mean - if:

while(<FILE>) {
print $_;
}

can be replaced with:

print $_ while <FILE>

then I expect that:

while(<FILE>) {
print and last if $. == $number;
}

can be replaced with:

print and last if $. == $number while <FILE>;

There's an inconsistency in the implementation of this feature that leaves
me feeling rather cold.

Cheers,
Rob
 
Uri Guttman

S> I mean - if:

S> while(<FILE>) {
S> print $_;
S> }

S> can be replaced with:

S> print $_ while <FILE>

S> then I expect that:

S> while(<FILE>) {
S> print and last if $. == $number;
S> }

S> can be replaced with:

S> print and last if $. == $number while <FILE>;

the problem with multiple statement modifiers is that they are not clear
in what they mean and are tricky to correctly parse out. larry has
stated that perl5 will never get them but i think it has been discussed
in the perl6 lists.

anyway, i would write that (if i wanted to write code like this which i
don't):

$. == $number and print and last while <FILE>;

or even (and the precedence is correct as , binds before 'and'):

$. == $number and print, last while <FILE>;

but i don't like compound statements like that (which is not the same as
a boolean expression which modifies a statement).

uri
 
Tad McClellan

Sisyphus said:
This is heading towards the providing of a good example of my reservations
about that construct that does away with the curly braces.


The docs (perlsyn) call them "modifiers".

I mean - if:

while(<FILE>) {
print $_;
}

can be replaced with:

print $_ while <FILE>

then I expect that:

while(<FILE>) {
print and last if $. == $number;
}

can be replaced with:

print and last if $. == $number while <FILE>;


You shouldn't expect that because it is documented to be disallowed. <g>

perlsyn says you can have only 1 modifier, you are using 2 modifiers.

There's an inconsistency in the implementation of this feature that leaves
me feeling rather cold.


Yes, I see that too.

I'm pretty sure that Larry saw that too too. :)

I remember hearing/reading somewhere that he purposely limited
it to 1 modifier because it would/could lead to some really
hard-to-understand code.

I thought the "somewhere" was in the std docs, and that

grep BASIC *.pod

would find it (because the feature was borrowed from BASIC-PLUS),
but I don't see it in there anymore (v5.8.7).



The objection to lack of curly brackets is the "dangling else"
problem in disguise, and so is probably a widely held objection.

Which may be why Larry limited modifiers to only one...?
 
Dr.Ruud

John Bokma:
Dr.Ruud:

Ok I just closed my "Reply" :-D.

The reason why I prefer this one is that I read the $. == $number as
what must be true for the rest to happen. Since I read left to right,
I "see" it faster compared to next if ...

It is much like what "perl -MO=Deparse,-x7" makes of this:

while (<FILE>) {
next unless $. == $number;
print;
last;
}

------

The "next if $. != $number" version becomes more like:

while (<FILE>) {
$. != $number and next;
print;
last;
}

======

The "$. < $number" version has a problem, if the same block can be used
in situations where $number can be any number.

while (<FILE>) {
die "huh?" if $. > $number;
next if $. < $number;
print;
last;
}

------

If you don't like to use 'next' but have no problem with 'last':

while (<FILE>) {
die "huh?" if $. > $number;
if ($. == $number) {
print;
last;
}
}

------

If you don't like to use 'next' nor 'last':

for (my $last = 0 ; not $last and defined($_ = <FILE>) ; ) {
die "huh?" if $. > $number;
if ($. == $number) {
print;
$last = 1;
}
}

(hey, just trying to scare you)
 
Sisyphus

Tad McClellan said:
I remember hearing/reading somewhere that he purposely limited
it to 1 modifier because it would/could lead to some really
hard-to-understand code.

When I look at the code that uri has just provided, I start to think that
the horse has already bolted on that score ... and that allowing one
modifier is one modifier too many :)

To save you looking it up, uri presented the following 2 alternatives:

$. == $number and print and last while <FILE>;
$. == $number and print, last while <FILE>;

That's plenty hard enough for *me* to understand :)

(And, yes - I've also pondered that "dangling else" as you called it ....
didn't realize it had a name.)

Cheers,
Rob
 
Tad McClellan

Sisyphus said:
(And, yes - I've also pondered that "dangling else" as you called it ....
didn't realize it had a name.)


It is a common topic in computer science, google it.
 
