Question about range of Lines

mauro papandrea

Given this simple file:

cat t1

Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday

with

perl -ne 'print if /Tuesday/ .. /Friday/;' t1

I can get:

Tuesday
Wednesday
Thursday
Friday


Is there a simple way to modify the range selection in order to get all
the lines but the last in the range?
Namely, getting this output:

Tuesday
Wednesday
Thursday

An idea could be to push back last line but I do not know how to do that.

Thank you

Regards

Mauro

PS this is an oversimplified case for the sake of simplicity
 
lcof

On Thu, 15 Sep 2011 09:28:42 +0200, mauro papandrea wrote:
Given this simple file:

cat t1

Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday

with

perl -ne 'print if /Tuesday/ .. /Friday/;' t1

I can get:

Tuesday
Wednesday
Thursday
Friday


Is there a simple way to modify the range selection in order to get all
the lines but the last in the range?
Namely, getting this output:

Tuesday
Wednesday
Thursday

An idea could be to push back last line but I do not know how to do
that.

Thank you

Regards

Mauro

PS this is an oversimplified case for the sake of simplicity

You mean, something like this ?

perl -ne 'while(<>) { push @a, $_ if /Tuesday/ .. /Friday/; } pop @a;
print @a' < t1
 
Mauro

You mean, something like this ?

perl -ne 'while(<>) { push @a, $_ if /Tuesday/ .. /Friday/; } pop @a;
print @a'< t1

Thank you

However, it is not so clear how it works; I thought that the -n option was
supposed to add an implicit loop, so there was no need to add a further
while (<>), to say nothing of reading from a redirection instead
of reading the file directly.

Regards

Mauro
 
Mauro

perl -ne '$a = 1 if /Tuesday/; $a = 0 if /Friday/; print if $a' t1

Thank you

This appears much clearer, I hope to be able to adapt it to solve my
original problem, that was slightly different ( I am afraid I
oversimplified too much ).

Regards
 
Lucien Coffe

Mauro wrote :
Thank you

However, it is not so clear how it works; I thought that the -n option was
supposed to add an implicit loop, so there was no need to add a further
while (<>), to say nothing of reading from a redirection instead
of reading the file directly.

Regards

Mauro

Sure, I did not take out the -n option after copy-pasting your line, so I
used a redirection. Consider the following:

perl -e 'while(<>) { push @a, $_ if /Tuesday/../Friday/ } pop @a; print
@a' t1

Good luck with your algorithm. Btw Tad McClellan's solution is much more
efficient.
 
Mauro

Sure, I did not take out the -n option after copy-pasting your line, so I
used a redirection. Consider the following:

perl -e 'while(<>) { push @a, $_ if /Tuesday/../Friday/ } pop @a; print
@a' t1

Good luck with your algorithm. Btw Tad McClellan's solution is much more
efficient.

Well, now it is much clearer; just wondering why this does not work:

perl -ne '{ push @a, $_ if /Tuesday/../Friday/ } pop @a; print @a' t1

Regards

Mauro
 
Lucien Coffe

Mauro wrote :
Well, now it is much clearer; just wondering why this does not work:

perl -ne '{ push @a, $_ if /Tuesday/../Friday/ } pop @a; print @a' t1

With -n, that command is equivalent to:

while(<>) {
{
push @a, $_ if /Tuesday/../Friday/
}
pop @a;
print @a
}

That's not what you want. You have to use an END block to get the pop and
print instructions out of the while loop, like this :

perl -ne 'push @a, $_ if /Tuesday/../Friday/; END { pop @a; print @a }' t1
 
Mauro

With -n, that command is equivalent to:

while(<>) {
{
push @a, $_ if /Tuesday/../Friday/
}
pop @a;
print @a
}

Oh, I see, those two extra {} ... that is the difference, I missed that.

Thank you for pointing it out to me.

Regards

Mauro
 
John W. Krahn

mauro said:
Given this simple file:

cat t1

Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday

with

perl -ne 'print if /Tuesday/ .. /Friday/;' t1

I can get:

Tuesday
Wednesday
Thursday
Friday


Is there a simple way to modify the range selection in order to get all
the lines but the last in the range?
Namely, getting this output:

Tuesday
Wednesday
Thursday

$ echo "Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday" | perl -ne'print if (/Tuesday/ .. /Friday/) && !/Friday/'
Tuesday
Wednesday
Thursday




John
 
Peter J. Holzer

perl -ne '$a = 1 if /Tuesday/; $a = 0 if /Friday/; print if $a' t1

Another solution which avoids the range operator:

perl -e '$/ = undef; $s = <>; print $s =~ /^(Tuesday.*?)^Friday/sm;' t1

hp
 
Mauro

Another solution which avoids the range operator:
perl -e '$/ = undef; $s =<>; print $s =~ /^(Tuesday.*?)^Friday/sm;' t1

hp

Thank you, it is a nice idea ( but perhaps not for very long files ).

Regards

Mauro
 
sln

Given this simple file:

cat t1

Sunday
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday

with

perl -ne 'print if /Tuesday/ .. /Friday/;' t1

I can get:

Tuesday
Wednesday
Thursday
Friday


Is there a simple way to modify the range selection in order to get all
the lines but the last in the range?
Namely, getting this output:

Tuesday
Wednesday
Thursday

An idea could be to push back last line but I do not know how to do that.

Thank you

Regards

Mauro

PS this is an oversimplified case for the sake of simplicity

As others mentioned, I would go the simple way using a range operator
and an additional regex:

print if (/Tuesday/ .. /Friday/ and !/Tuesday/); # not Tuesday
print if (/Tuesday/ .. /Friday/ and !/Tuesday|Friday/); # not Tuesday nor Friday
print if (/Tuesday/ .. /Friday/ and !/Friday/); # not Friday

If you want to use just a range operator, it becomes difficult (untested):

# not Tuesday
perl -ne 'print if $st=~/Tuesday/ .. /Friday/; $st=$_' t1

# not Tuesday nor Friday
(left as an exercise)

# not Friday
perl -e 'do {print if /Tuesday/ .. $st=~/Friday/; $_=$st;} while ($st=<>)' t1


-sln
 
Ben Bacarisse

As others mentioned, I would go the simple way using a range operator
and an additional regex:

print if (/Tuesday/ .. /Friday/ and !/Tuesday/); # not Tuesday
print if (/Tuesday/ .. /Friday/ and !/Tuesday|Friday/); # not Tuesday nor Friday

I think the days of the week, all in order, were just an example. If
so, then these two are not quite what they seem because they exclude
extra Tuesdays. Of course, it's possible that the string that starts
the range never appears in it (in which case all is well) but the
example may have been oversimplified.

print if (/Tuesday/ .. /Friday/ and !/Friday/); # not Friday

There's no problem here because the end pattern can't appear in the
range -- by definition. And since this is what the OP asked for, a range
and regex does seem like the way to go.

<snip>
 
sln

I think the days of the week, all in order, were just an example. If
so, then these two are not quite what they seem because they exclude
extra Tuesdays. Of course, it's possible that the string that starts
the range never appears in it (in which case all is well) but the
example may have been oversimplified.

I believe this is a true statement.

There's no problem here because the end pattern can't appear in the
range -- by definition. And since this is what the OP asked for, a range
and regex does seem like the way to go.

And this is true too. But I believe using the range operator in this
context does not imply that a 'Friday' exists in the data, whereas a 'Tuesday'
must exist if the condition is ever satisfied. That's not saying anything is wrong
with using the range operator like this, just that some extra buffering
may or may not be needed. There is usually more to posters' questions
than an academic translation.

-sln
 
Mauro

"but perhaps not for very long files"

I would give this credibility if you can show some code that
guarantees there is a 'Friday'.

while (<>) {
if (/Tuesday/ .. /Friday/) {}
}

In this case, the operator .. flips true on $_=~/Tuesday/ and remains
so until after it finds $_=~/Friday/ or, until the while() is false.

I can see why you would be interested in how the range operator
works in this case, but don't promote it to something perfect.

-sln

I had expressed that doubt ( about long files ) since, if I understand
correctly, that solution reads the whole file into memory.
The example I posted was an artificial, oversimplified case, just to have
something concrete to apply code to, not the real case ( where there is no
Friday at all ).

Thank you

Regards

Mauro
 
Mauro

And this is true too. But I believe using the range operator in this
context does not imply that a 'Friday' exists in the data, whereas a 'Tuesday'
must exist if the condition is ever satisfied. That's not saying anything is wrong
with using the range operator like this, just that some extra buffering
may or may not be needed. There is usually more to posters' questions
than an academic translation.

-sln

As a matter of fact, as I said in my other reply, the original
problem was quite different, although it was still about a range of lines.
Perhaps I simplified a bit too much ...

Regards

Mauro
 
Peter J. Holzer

"but perhaps not for very long files"

I would give this credibility if you can show some code that
guarantees there is a 'Friday'.

True. My code behaves differently than the original if the Friday is
missing. But maybe a range which doesn't end with Friday is malformed
and should be omitted. Mauro didn't specify, but this is definitely
something he would have to consider.
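
The difference can be seen with a short sketch on an invented input that has no Friday. The range-operator version prints from the start marker to end of file, while the slurp-and-match version prints nothing because the regex needs both markers:

```shell
input='Sunday
Tuesday
Wednesday
Thursday'

# Range operator: stays true until end of input when the end marker never appears.
range_out=$(printf '%s\n' "$input" |
  perl -ne 'print if (/Tuesday/ .. /Friday/) && !/Friday/')

# Slurp and match: no Friday means no match, so the list passed to print is empty.
slurp_out=$(printf '%s\n' "$input" |
  perl -e '$/ = undef; $s = <>; print $s =~ /^(Tuesday.*?)^Friday/sm;')

printf 'range: [%s]\nslurp: [%s]\n' "$range_out" "$slurp_out"
```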

I was assuming that "Tuesday" and "Friday" are just arbitrary begin and
end markers and that lines don't have a natural order. But what if they
have? Given the input:

Monday
Wednesday
Thursday
Friday
Saturday
Sunday

should it print

Wednesday
Thursday

?

Then neither the range operator nor the regexp will work (both rely on
"Tuesday" being there) and you would need a different approach (like
converting weekdays to numbers and then checking whether the number is
in [2, 5)).

hp
 
Peter J. Holzer

I had expressed that doubt ( about long files ) since, if I understand
correctly, that solution reads the whole file into memory.

Yes, it does. Whether that's a problem depends on how long the file is,
how much memory you have and what you are doing with the file. Caveat
emptor!

hp
 
Mauro

True. My code behaves differently than the original if the Friday is
missing. But maybe a range which doesn't end with Friday is malformed
and should be omitted. Mauro didn't specify, but this is definitely
something he would have to consider.

I was assuming that "Tuesday" and "Friday" are just arbitrary begin and
end markers and that lines don't have a natural order. But what if they
have? Given the input:

Monday
Wednesday
Thursday
Friday
Saturday
Sunday

should it print

Wednesday
Thursday

?

Then neither the range operator nor the regexp will work (both rely on
"Tuesday" being there) and you would need a different approach (like
converting weekdays to numbers and then checking whether the number is
in [2, 5)).

hp

Your assumption is quite correct, those day names were merely markers.
Perhaps I oversimplified too much.
In my original problem, my file is divided into sections and each of them
begins with a special line ( easily identifiable because no other line
can look like it; the file is produced by a tool ).
I need to read each section and apply a kind of filter whose action
depends upon the type of section ( the marker line has 2 parts: one
that is the same for every section, which can be used to find sections, and a
second with a kind of label, which can be used to tell which
section it is ); some sections are to be discarded and others changed.
So I thought that a line range was a good way to start if I could omit the last
line in the range ( so that it could be used for the next section ), but I am
afraid I did not choose the right example and I apologize for that.

Regards

Mauro
 
