[OFF] sed equivalent of something easy in python


Daniel Fetchinson

This question is really about sed, not Python, so it's totally off-topic.
But since lots of Unix heads frequent this list, I thought I'd
try my luck nevertheless.

If I have a file with content

1
2
3
4
5
6
7
8
........

i.e. each line contains simply its line number, then it's quite easy
to convert it into

2
3
7
8
12
13
............

using python. The pattern is that the first line is deleted, then 2
lines are kept, 3 lines are deleted, 2 lines are kept, 3 lines are
deleted, etc, etc.
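
For reference, the Python side of this can be quite small. The sketch below assumes the pattern just described (delete 1 line, then keep 2 and delete 3, repeating), so a line is kept when its 1-based number n satisfies (n - 2) % 5 < 2. The name keep_lines is made up for illustration and is not from the original post.

```python
def keep_lines(lines):
    """Yield lines 2, 3, 7, 8, 12, 13, ... (1-based) from an iterable of lines."""
    for n, line in enumerate(lines, start=1):
        # n - 2 is 0 at the first kept line; the pattern repeats every 5 lines.
        if (n - 2) % 5 < 2:
            yield line

# Sample input mimicking the file above: each line holds its own line number.
kept = list(keep_lines(str(i) for i in range(1, 16)))
print(kept)  # ['2', '3', '7', '8', '12', '13']
```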

But I couldn't find a way to do this with sed and since the whole
operation is currently done with a bash script I'd hate to move to
python just to do this simple task.

What would be the sed equivalent?

Cheers,
Daniel
 

Jussi Piitulainen

The following appears to work here. Both parts of the address are
documented as GNU extensions in the man page: 2~5 matches line 2 and
then every 5th line, and ,+1 tells sed to match also the 1 line after
each match. With -n, do not print by default, and p is the command to
print when an address matches.

sed -n '2~5,+1 p'

Tried with GNU sed version 4.1.2, never used sed this way before.
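
To make the address arithmetic concrete, here is a rough Python model of which line numbers 'first~step,+extra' selects. It is only a sketch: the function name and parameters are invented for illustration, and it assumes extra < step so that consecutive ranges never overlap.

```python
def gnu_address_matches(lineno, first=2, step=5, extra=1):
    """Rough model of the GNU sed address 'first~step,+extra' (1-based lineno)."""
    if lineno < first:
        return False
    # Offset 0 is a line hit by first~step; offsets 1..extra are the ',+extra' lines.
    return (lineno - first) % step <= extra

selected = [n for n in range(1, 21) if gnu_address_matches(n)]
print(selected)  # [2, 3, 7, 8, 12, 13, 17, 18]
```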

So, is there some simple expression in Python for this? Just asking
out of curiosity when nothing comes to mind, not implying that there
should be or that Python should be changed in any way.
 
Jussi Piitulainen

To expand, below is the best I can think of in Python 3 and I'm
curious if there is something much more concise built in that I am
missing.

    def sed(source, skip, keep, drop):
        '''First skip some elements from source,
        then keep yielding some and dropping
        some: sed(source, 1, 2, 3) to skip 1,
        yield 2, drop 3, yield 2, drop 3, ...'''
        try:
            for _ in range(skip):
                next(source)
            while True:
                for _ in range(keep):
                    yield next(source)
                for _ in range(drop):
                    next(source)
        except StopIteration:
            # Source exhausted. Since Python 3.7 (PEP 479) a StopIteration
            # escaping a generator body becomes RuntimeError, so end cleanly.
            return
 
Tim Chase

Could be done as: (py2.x in this case, adjust accordingly for 3.x)

    def sed(source, skip, keep, drop):
        for _ in range(skip): source.next()
        tot = keep + drop
        for i, item in enumerate(source):
            if i % tot < keep:
                yield item
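
Adjusted for 3.x, the same idea might look like the sketch below. Swapping the source.next() loop for itertools.islice is a substitution made here, not part of the post above; it also avoids an error when the source has fewer than skip items. The sketch assumes keep + drop > 0.

```python
from itertools import islice

def sed(source, skip, keep, drop):
    """Skip `skip` items, then repeatedly keep `keep` items and drop `drop`."""
    period = keep + drop  # assumed to be > 0
    # islice(source, skip, None) discards the first `skip` items and does not
    # raise if the source is shorter than that.
    for i, item in enumerate(islice(source, skip, None)):
        if i % period < keep:
            yield item

result = list(sed(iter(range(1, 21)), 1, 2, 3))
print(result)  # [2, 3, 7, 8, 12, 13, 17, 18]
```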

-tkc
 
Arnaud Delobelle

With Python 2.7+ you can use itertools.compress:

    from itertools import compress, chain, cycle

    def sed(source, skip, keep, drop):
        return compress(source, chain([0]*skip, cycle([1]*keep + [0]*drop)))

    ....
    [1, 2, 6, 7, 11, 12, 16, 17]
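
To see why compress works here, it may help to look at the selector stream on its own. The sketch below is illustrative only: the helper name selectors and the 13-line sample input are assumptions, not taken from the post.

```python
from itertools import chain, compress, cycle, islice

def selectors(skip, keep, drop):
    """Endless 0/1 stream: `skip` zeros, then `keep` ones and `drop` zeros, repeated."""
    return chain([0] * skip, cycle([1] * keep + [0] * drop))

# First 11 selectors for skip=1, keep=2, drop=3.
mask_prefix = list(islice(selectors(1, 2, 3), 11))
print(mask_prefix)  # [0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0]

# compress keeps each item whose selector is truthy and stops when the
# shorter input (here the list of lines) runs out.
lines = [str(n) for n in range(1, 14)]  # assumed sample input: "1" .. "13"
picked = list(compress(lines, selectors(1, 2, 3)))
print(picked)  # ['2', '3', '7', '8', '12', '13']
```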
 