regex newbie question

ZMAN

Hello all!

Reading in lines from a file.
I want to ignore all the text before it gets to the line
"<!--document_starts_here-->"
and write out the remainder of the text to a file.

This is the way I'm attempting this..
Thanks in advance!
BZ

open DATAOUT, ">$data_file" or die "can't open $data_file $!";

foreach $line (@lines)
{

if ($line =~ m/<!--document_starts_here-->/i)
{
print "This line contains the word : $line\n";

#### write remainder of file out
}

print DATAOUT "$line";

}


close (DATAOUT)
 
A. Sinan Unur

ZMAN said:
Reading in lines from a file.
I want to ignore all the text before it gets to the line
"<!--document_starts_here-->"
and write out the remainder of the text to a file.

use strict;
use warnings;

are missing from your code.
open DATAOUT, ">$data_file" or die "can't open $data_file $!";

open my $data_out, '>', $data_file or die "Can't open $data_file: $!";

See

perldoc -q always

The question, of course, is where are you reading the data in?
foreach $line (@lines)
{

if ($line =~ m/<!--document_starts_here-->/i)
{
print "This line contains the word : $line\n";

#### write remainder of file out
}

print DATAOUT "$line";

}


close (DATAOUT)

Please post real code. Please see the posting guidelines for this group
to find out how you can help others help you.

It *seems* to me like you are initially slurping the file. There is no
need for that.

#! /usr/bin/perl

use strict;
use warnings;

while(<DATA>) {
next unless /<!--document_starts_here-->/i;
while(<DATA>) {
print;
}
}

__END__
<html>
<head>
<title>Test</title>
</head>

<!--document_starts_here-->
<body>
<h1>Some document</h1>
<p>ya ba da ba doo</p>
</body>
</html>
 
John W. Krahn

ZMAN said:
Reading in lines from a file.

It looks like you are iterating over an array instead.

I want to ignore all the text before it gets to the line
"<!--document_starts_here-->"
and write out the remainder of the text to a file.

If you were actually reading in from a file it would be a lot easier, like:

while ( <DATAIN> ) {
if ( /<!--document_starts_here-->/i .. eof DATAIN ) {
print DATAOUT;
}
}
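
The range (flip-flop) operator above can be tried without any files on disk. This is only a minimal sketch: the in-memory filehandle and the helper name after_marker are assumptions for illustration, not part of the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns every line from the marker line onward.
sub after_marker {
    my ($text) = @_;

    # Read the string through an in-memory filehandle.
    open my $in, '<', \$text or die "Can't open in-memory handle: $!";

    my @kept;
    while ( my $line = <$in> ) {
        # The flip-flop becomes true on the marker line and stays true
        # until eof($in) is true, that is, through the last line.
        if ( $line =~ /<!--document_starts_here-->/i .. eof($in) ) {
            push @kept, $line;
        }
    }
    close $in;
    return @kept;
}

my @out = after_marker("junk\n<!--Document_Starts_Here-->\n<body>\n</body>\n");
print @out;
```

Note that the marker line itself is part of the output, just as in the loop above.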



John
 
Tad McClellan

ZMAN said:
Reading in lines from a file.


No you're not. Your code contains no input statements.

I want to ignore all the text before it gets to the line
"<!--document_starts_here-->"
and write out the remainder of the text to a file.


while ( <> ) {
last if $_ eq "<!--document_starts_here-->\n";
}

while ( <> ) {
print;
}
 
John W. Krahn

Tad said:
No you're not. Your code contains no input statements.


while ( <> ) {
last if $_ eq "<!--document_starts_here-->\n";

I think that the OP wanted a case-insensitive match.
}

while ( <> ) {
print;
}

$_ = <> until /<!--document_starts_here-->/i;

print while <>;


# :)
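
One caveat with the statement-modifier version: the until condition only checks the match, so nothing ties the loop to end of input if the marker never appears. A sketch of a variant that stops when input runs out; the helper name skip_to_marker and the in-memory test data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: consume lines up to and including the marker.
# Returns 1 if the marker was found, 0 if input ran out first.
sub skip_to_marker {
    my ($fh) = @_;
    while ( defined( my $line = <$fh> ) ) {
        return 1 if $line =~ /<!--document_starts_here-->/i;
    }
    return 0;
}

# Made-up sample input, read through an in-memory filehandle.
my $text = "header\n<!--document_starts_here-->\nbody\n";
open my $fh, '<', \$text or die "Can't open in-memory handle: $!";

if ( skip_to_marker($fh) ) {
    print while <$fh>;    # prints only what follows the marker line
}
close $fh;
```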

John
 
ZMAN

John
Thanks! That's just what I needed.
My goal was to pull over some html, remove the PHP, write them to a temp
dir, and then have SWISH index them.

In my haste I didn't post all of my code.
So, here it is, works great and thanks again!

#!/usr/local/bin/perl -w
use strict;

my $dest_dir = "../../temp/";
my $src_dir = "../../html/";

mkdir("$dest_dir",0777) || die "cannot mkdir $dest_dir: $!";

opendir(DIR, $src_dir);
my @files = readdir(DIR);
closedir(DIR);
my $file;

print "Indexing the following files\n";

foreach $file (@files)
{

next if($file eq '.');
next if($file eq '..');
next if($file =~ /\.php/);
next if($file =~ /\.inc/);
next if($file =~ /\.pl/);
next if($file =~ /\.cgi/);
next if($file =~ /\.txt/);
next if($file =~ /_R\.html$/);
next if(! ($file =~ /\.html$/));

print "$file\n";

open(FILE, "<$src_dir$file") or die "Can't open $file : $!";

open (DATAOUT, "> $dest_dir$file") or die "can't open $dest_dir$file $!";

while ( <FILE> ) {
if ( /<!--document_starts_here-->/i .. eof FILE ) {
print DATAOUT;
}
}

close DATAOUT;
close FILE;

}

system("./swish -c swish.conf");
system("rm -rf $dest_dir");
 
Tad McClellan

ZMAN said:
mkdir("$dest_dir",0777) || die "cannot mkdir $dest_dir: $!";


perldoc -q vars

What's wrong with always quoting "$vars"?

mkdir($dest_dir, 0777) || die "cannot mkdir $dest_dir: $!";

opendir(DIR, $src_dir);


You should test the return value from opendir() just like you did
with mkdir().
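
A sketch of what the checked call could look like, here with a lexical directory handle instead of the bareword DIR. The path is a placeholder for illustration.

```perl
use strict;
use warnings;

my $src_dir = '.';    # placeholder path, for illustration only

# Die with the OS error message if the directory cannot be opened.
opendir( my $dh, $src_dir ) or die "cannot opendir $src_dir: $!";
my @files = readdir $dh;
closedir $dh;

print scalar(@files), " entries\n";
```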

next if($file =~ /\.php/);
next if($file =~ /\.inc/);
next if($file =~ /\.pl/);
next if($file =~ /\.cgi/);
next if($file =~ /\.txt/);


next if $file =~ /\.(php|inc|pl|cgi|txt)$/; # anchor to end of string

next if($file =~ /_R\.html$/);


You anchored here, but not the earlier ones.

next if(! ($file =~ /\.html$/));


next unless $file =~ /\.html$/;

You don't need the earlier batch of tests if you have this
one anyway...
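
To see why the anchor matters: an unanchored /\.pl/ also rejects a name such as my.plans.html, while the anchored .html test accepts it. A small sketch with a made-up file list:

```perl
use strict;
use warnings;

# Made-up directory listing, for illustration only.
my @files = ( '.', '..', 'index.html', 'index_R.html', 'my.plans.html',
              'form.php', 'notes.txt' );

# Keep names ending in .html, except the *_R.html variants.
my @wanted = grep { /\.html$/ && !/_R\.html$/ } @files;

print "$_\n" for @wanted;
```

The two anchored tests cover everything the longer list of next statements was meant to do, including '.' and '..'.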



[ snip TOFU ]
 
