Python re expr from Perl to Python


Michael M.

In Perl, it was:


## Example: "Abc | def | ghi | jkl"
## -> "Abc ghi jkl"
## Take only the text between the 2nd pipe (=cut the text in the 1st pipe).
$na =~ s/\ \|(.*?)\ \|(.*?)\ \|/$2/g;

## -- remove [ and ] in text
$na =~ s/\[//g;
$na =~ s/\]//g;
# print "DEB: \"$na\"\n";


# input string
na="Abc | def | ghi | jkl [gugu]"
# output
na="Abc ghi jkl gugu"


How is it done in Python?
 

Jorge Godoy

Michael M. said:
In Perl, it was:


## Example: "Abc | def | ghi | jkl"
## -> "Abc ghi jkl"
## Take only the text between the 2nd pipe (=cut the text in the 1st pipe).
$na =~ s/\ \|(.*?)\ \|(.*?)\ \|/$2/g;

## -- remove [ and ] in text
$na =~ s/\[//g;
$na =~ s/\]//g;
# print "DEB: \"$na\"\n";


# input string
na="Abc | def | ghi | jkl [gugu]"
# output
na="Abc ghi jkl gugu"


How is it done in Python?

The simplest form:
>>> na = "Abc | def | ghi | jkl [gugu]"
>>> na_out = na.replace('def', '').replace(' | ', ' ').replace('  ', ' ').replace('[', '').replace(']', '').strip()
>>> na_out
'Abc ghi jkl gugu'


Another form:
>>> na_out = ' '.join(na.split(' | ')).replace('[', '').replace(']', '').replace(' def', '')
>>> na_out
'Abc ghi jkl gugu'


There is the regular expression approach as well as several other
alternatives. I could list other (simpler, more advanced, etc.) alternatives,
but you can also play with Python by yourself. If you have a more concrete
specification, send it to the group.
 

Carsten Haese

In Perl, it was:


## Example: "Abc | def | ghi | jkl"
## -> "Abc ghi jkl"
## Take only the text between the 2nd pipe (=cut the text in the 1st pipe).
$na =~ s/\ \|(.*?)\ \|(.*?)\ \|/$2/g;

## -- remove [ and ] in text
$na =~ s/\[//g;
$na =~ s/\]//g;
# print "DEB: \"$na\"\n";


# input string
na="Abc | def | ghi | jkl [gugu]"
# output
na="Abc ghi jkl gugu"


How is it done in Python?

Here's an almost literal translation:

##################################################
import re
na = re.sub(r"\ \|(.*?)\ \|(.*?)\ \|", r"\2", na)
na = na.replace("[", "")
na = na.replace("]", "")
##################################################
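
For anyone who wants to try the translation above end to end, here is a minimal sketch that wraps it in a function and runs it on the sample string from the question. The function name clean_name is only an illustration and does not appear in the original posts. Note that re.sub replaces every match by default, which corresponds to the /g flag in the Perl version.

```python
import re

def clean_name(na):
    """Hypothetical helper: apply the regex translation, then strip brackets."""
    # Keep the text after the 2nd pipe and drop the text after the 1st pipe.
    na = re.sub(r"\ \|(.*?)\ \|(.*?)\ \|", r"\2", na)
    # Remove the literal square brackets.
    return na.replace("[", "").replace("]", "")

print(clean_name("Abc | def | ghi | jkl [gugu]"))  # Abc ghi jkl gugu
```

A string without three pipes is left unchanged by the substitution, so only the bracket removal applies in that case.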

Background information on regular expressions in Python can be found
here:

http://www.amk.ca/python/howto/regex/
http://docs.python.org/lib/module-re.html

Hope this helps,

Carsten.
 

Paddy

Michael said:
In Perl, it was:


## Example: "Abc | def | ghi | jkl"
## -> "Abc ghi jkl"
## Take only the text between the 2nd pipe (=cut the text in the 1st pipe).
$na =~ s/\ \|(.*?)\ \|(.*?)\ \|/$2/g;

## -- remove [ and ] in text
$na =~ s/\[//g;
$na =~ s/\]//g;
# print "DEB: \"$na\"\n";


# input string
na="Abc | def | ghi | jkl [gugu]"
# output
na="Abc ghi jkl gugu"


How is it done in Python?

Here is how to do it without regexps in python.
The first and last line below are all that are needed. The others show
intermediate expressions that lead to the result.
>>> from itertools import groupby
>>> na="Abc | def | ghi | jkl [gugu]"
>>> [(g[0], ''.join(g[1])) for g in groupby(na, lambda c: c not in ' \t|[]')]
[(True, 'Abc'), (False, ' | '), (True, 'def'), (False, ' | '), (True,
'ghi'), (False, ' | '), (True, 'jkl'), (False, ' ['), (True, 'gugu'),
(False, ']')]
>>> [''.join(g[1]) for g in groupby(na, lambda c: c not in ' \t|[]') if g[0]]
['Abc', 'def', 'ghi', 'jkl', 'gugu']
>>> ' '.join(''.join(g[1]) for g in groupby(na, lambda c: c not in ' \t|[]') if g[0])
'Abc def ghi jkl gugu'


- Paddy.
 

Paddy

Paddy said:
Michael said:
In Perl, it was:


## Example: "Abc | def | ghi | jkl"
## -> "Abc ghi jkl"
## Take only the text between the 2nd pipe (=cut the text in the 1st pipe).
$na =~ s/\ \|(.*?)\ \|(.*?)\ \|/$2/g;

## -- remove [ and ] in text
$na =~ s/\[//g;
$na =~ s/\]//g;
# print "DEB: \"$na\"\n";


# input string
na="Abc | def | ghi | jkl [gugu]"
# output
na="Abc ghi jkl gugu"


How is it done in Python?

Here is how to do it without regexps in python.
The first and last line below are all that are needed. The others show
intermediate expressions that lead to the result.
>>> from itertools import groupby
>>> na="Abc | def | ghi | jkl [gugu]"
>>> [(g[0], ''.join(g[1])) for g in groupby(na, lambda c: c not in ' \t|[]')]
[(True, 'Abc'), (False, ' | '), (True, 'def'), (False, ' | '), (True,
'ghi'), (False, ' | '), (True, 'jkl'), (False, ' ['), (True, 'gugu'),
(False, ']')]
>>> [''.join(g[1]) for g in groupby(na, lambda c: c not in ' \t|[]') if g[0]]
['Abc', 'def', 'ghi', 'jkl', 'gugu']
>>> ' '.join(''.join(g[1]) for g in groupby(na, lambda c: c not in ' \t|[]') if g[0])
'Abc def ghi jkl gugu'


- Paddy.

And I leave the deletion of def to the reader :)

(i.e: I missed that bit and adding it in would make a long
comprehension too long to comprehend).
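
One possible way to do that deletion without stretching the comprehension is to build the word list first and then skip the second entry. This is only a sketch: the function name words_without_second and its separators parameter are made up for illustration. It also assumes every field is a single word, because the groupby key splits on spaces as well as on pipes and brackets.

```python
from itertools import groupby

def words_without_second(na, separators=" \t|[]"):
    """Hypothetical helper: split into words with groupby, drop the 2nd word."""
    # Group consecutive characters by whether they are word characters.
    words = ["".join(chars)
             for is_word, chars in groupby(na, lambda c: c not in separators)
             if is_word]
    # Keep the first word, skip the second, keep the rest.
    return " ".join(words[:1] + words[2:])

print(words_without_second("Abc | def | ghi | jkl [gugu]"))  # Abc ghi jkl gugu
```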
 

Lloyd Zusman

I have a python (2.5) program with a number of worker threads, and I want
to make sure that each of these does a context switch at appropriate
times, to avoid starvation. I know that I can do a time.sleep(0.001) to
force such a switch, but I'm wondering if this is the recommended
method.

Thanks in advance.
 

Duncan Booth

Lloyd Zusman said:
I have a python (2.5) program with a number of worker threads, and I want
to make sure that each of these does a context switch at appropriate
times, to avoid starvation. I know that I can do a time.sleep(0.001) to
force such a switch, but I'm wondering if this is the recommended
method.

The recommended method is to start a new thread rather than following up on
an existing thread with an unrelated question.

Why do you think that just letting the threads run won't have the effect
you desire? Leave it to the system to schedule the threads.
 

Lloyd Zusman

Duncan Booth said:
The recommended method is to start a new thread rather than following up on
an existing thread with an unrelated question.

I accidentally hit "a" in my mailer instead of "w" ("reply" instead of
"compose"). Geez. It was an accident. I'm sorry.

Why do you think that just letting the threads run won't have the effect
you desire? Leave it to the system to schedule the threads.

I can already see that they don't have the effect I desire. They are
long numerical calculations in tight loops. I have to periodically put
explicit time.sleep(0.001) calls in place to force the context
switching, and I was wondering if that's the recommended method.
 

Duncan Booth

Lloyd Zusman said:
I can already see that they don't have the effect I desire. They are
long numerical calculations in tight loops. I have to periodically
put explicit time.sleep(0.001) calls in place to force the context
switching, and I was wondering if that's the recommended method.

Not really.

If the context isn't switching enough for you then try calling
sys.setcheckinterval(n) with varying values of n until you find one which
is suitable. Calling it with a lower value of n will increase the frequency
that you switch thread contexts, although of course it will also increase
the overall runtime for your program.
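
As a rough sketch of that tuning step: sys.setcheckinterval(n) belongs to Python 2 (the version discussed in this thread), where n is a number of bytecode instructions. It was deprecated in Python 3.2 and removed in 3.9, and on Python 3 the corresponding knob is sys.setswitchinterval, which takes a duration in seconds. The helper name and both default values below are assumptions chosen for illustration, not recommendations.

```python
import sys

def request_frequent_switches(bytecodes=10, seconds=0.001):
    """Hypothetical helper: ask the interpreter to switch threads more often.

    Returns the value the interpreter reports after the change.
    """
    if hasattr(sys, "setswitchinterval"):
        # Python 3.2 and later: the interval is a time in seconds.
        sys.setswitchinterval(seconds)
        return sys.getswitchinterval()
    # Python 2.x: the interval is a count of bytecode instructions.
    sys.setcheckinterval(bytecodes)
    return sys.getcheckinterval()

print(request_frequent_switches())
```

A smaller value means more frequent switching and more overhead, so it is worth measuring a few values rather than picking one blindly.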

Alternatively you could try splitting your processing into smaller chunks
and ensure each thread does a small chunk at a time instead of a large one.

Why does it matter whether individual threads are being 'starved'? Surely
you want them all to complete in any case, so does it matter if they run
sequentially or in parallel?
 

Lloyd Zusman

Duncan Booth said:
[ ... ]

If the context isn't switching enough for you then try calling
sys.setcheckinterval(n) with varying values of n until you find one which
is suitable. Calling it with a lower value of n will increase the frequency
that you switch thread contexts, although of course it will also increase
the overall runtime for your program.

Thank you very much. The sys.setcheckinterval function is what I need.
It seems that the original writer of the app had set this interval to a
high value in a part of the code that I overlooked until you mentioned
this right now.

[ ... ]

Why does it matter whether individual threads are being 'starved'? Surely
you want them all to complete in any case, so does it matter if they run
sequentially or in parallel?

Because some of the threads perform monitoring and notification that
need to occur in a timely fashion. Since these threads are doing IO,
they switch context appropriately, but once one of the big
number-crunching threads gets control, it starves out the monitoring
threads, which is not a good thing for my app ... or at least it did
so with the original large checkinterval.
 

Gabriel Genellina

It seems that the original writer of the app had set this interval to a
high value in a part of the code that I overlooked until you mentioned
this right now.

[...] once one of the big
number-crunching threads gets control, it starves out the monitoring
threads, which is not a good thing for my app ... or at least it did
so with the original large checkinterval.

This is why such settings should be in a configuration file or in a
prominent place in the application...
I had a program where, deep in an unknown function, the original coder
changed the process priority - with no valid reason, and in any case,
that should be an application-level setting. It was hard to find out why,
after doing such and such things, the system's responsiveness was so
poor.
 

Thomas Ploch

Florian said:
Michael M. said:
In Perl, it was:


## Example: "Abc | def | ghi | jkl"
## -> "Abc ghi jkl"
## Take only the text between the 2nd pipe (=cut the text in the 1st pipe).
$na =~ s/\ \|(.*?)\ \|(.*?)\ \|/$2/g;

## -- remove [ and ] in text
$na =~ s/\[//g;
$na =~ s/\]//g;
# print "DEB: \"$na\"\n";


# input string
na="Abc | def | ghi | jkl [gugu]"
# output
na="Abc ghi jkl gugu"


How is it done in Python?

>>> import re
>>> na="Abc | def | ghi | jkl [gugu]"
>>> m=re.match(r'(\w+ )\| (\w+ )\| (\w+ )\| (\w+ )\[(\w+)\]', na)
>>> na=m.expand(r'\1\2\3\5')
>>> na
'Abc def ghi gugu'

I'd rather have the groups grouped without the whitespaces
>>> import re
>>> na="Abc | def | ghi | jkl [gugu]"
>>> m=re.match(r'(\w+) \| (\w+) \| (\w+) \| (\w+) \[(\w+)\]', na)
>>> na=m.expand(r'\1 \3 \4 \5')
>>> na
'Abc ghi jkl gugu'
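
One caveat with this approach: re.match returns None when the input does not have exactly this shape, and m.expand would then raise an AttributeError. A small sketch of a guarded version follows. The names PATTERN and reformat are invented here, and returning the input unchanged on a failed match is just one possible choice.

```python
import re

# Same pattern as above, compiled once for reuse.
PATTERN = re.compile(r'(\w+) \| (\w+) \| (\w+) \| (\w+) \[(\w+)\]')

def reformat(na):
    """Hypothetical helper: drop the 2nd field, or return na as-is on no match."""
    m = PATTERN.match(na)
    if m is None:
        # Assumption: leave unexpected input untouched instead of failing.
        return na
    return m.expand(r'\1 \3 \4 \5')

print(reformat("Abc | def | ghi | jkl [gugu]"))  # Abc ghi jkl gugu
```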

Thomas
 

Florian Diesch

Michael M. said:
In Perl, it was:


## Example: "Abc | def | ghi | jkl"
## -> "Abc ghi jkl"
## Take only the text between the 2nd pipe (=cut the text in the 1st pipe).
$na =~ s/\ \|(.*?)\ \|(.*?)\ \|/$2/g;

## -- remove [ and ] in text
$na =~ s/\[//g;
$na =~ s/\]//g;
# print "DEB: \"$na\"\n";


# input string
na="Abc | def | ghi | jkl [gugu]"
# output
na="Abc ghi jkl gugu"


How is it done in Python?

>>> import re
>>> na="Abc | def | ghi | jkl [gugu]"
>>> m=re.match(r'(\w+ )\| (\w+ )\| (\w+ )\| (\w+ )\[(\w+)\]', na)
>>> na=m.expand(r'\1\2\3\5')
>>> na
'Abc def ghi gugu'



Florian
 

Larry Bates

In Perl, it was:


## Example: "Abc | def | ghi | jkl"
## -> "Abc ghi jkl"
## Take only the text between the 2nd pipe (=cut the text in the 1st pipe).
$na =~ s/\ \|(.*?)\ \|(.*?)\ \|/$2/g;

## -- remove [ and ] in text
$na =~ s/\[//g;
$na =~ s/\]//g;
# print "DEB: \"$na\"\n";


# input string
na="Abc | def | ghi | jkl [gugu]"
# output
na="Abc ghi jkl gugu"


How is it done in Python?

You don't really need regular expressions for this simple transformation:

na="Abc | def | ghi | jkl [gugu]"

na=" ".join([x.strip() for x in na.replace("[","|").replace("]","").split("|")])
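
As written, the line above yields "Abc def ghi jkl gugu", so the "def" field is still present. A sketch of the same split-based idea that also drops that field is shown here. The function name join_fields and its drop_index parameter are hypothetical and only serve the illustration.

```python
def join_fields(na, drop_index=1):
    """Hypothetical helper: split on pipes, drop one field, join with spaces."""
    # Treat "[" as one more field separator and discard "]".
    parts = [x.strip() for x in na.replace("[", "|").replace("]", "").split("|")]
    # Skip the unwanted field (index 1 is the text after the 1st pipe)
    # and any empty fields.
    kept = [p for i, p in enumerate(parts) if i != drop_index and p]
    return " ".join(kept)

print(join_fields("Abc | def | ghi | jkl [gugu]"))  # Abc ghi jkl gugu
```

Because it splits on the pipe rather than on whitespace, this version keeps multi-word fields intact.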

-Larry
 
