sed to python: replace Q

Raymond

For some reason I'm unable to grok Python's string.replace() function.
Just trying to parse a simple IP address, wrapped in square brackets,
from Postfix logs. In sed this is straightforward given:

line = "date process text [ip] more text"

sed -e 's/^.*\[//' -e 's/].*$//'

yet the following Python code does nothing:

line = line.replace('^.*\[', '', 1)
line = line.replace('].*$', '')

Is there a decent description of string.replace() somewhere?

Raymond
 
happyriding

For some reason I'm unable to grok Python's string.replace() function.

line = "abc"
line = line.replace("a", "x")
print line

--output:--
xbc

line = "abc"
line = line.replace("[apq]", "x")
print line

--output:--
abc


Does the 5 character substring "[apq]" exist anywhere in the original
string?
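
To make the point concrete, here is a minimal sketch contrasting the two behaviours. It is written to run on current Python, and the sample log line is made up for illustration rather than taken from a real Postfix log.

```python
import re

# Hypothetical log line standing in for a real Postfix entry.
line = "date process text [192.0.2.1] more text"

# str.replace() searches for the literal characters ^.*\[ and finds none,
# so the string comes back unchanged.
unchanged = line.replace(r'^.*\[', '', 1)

# A literal substring that really is present does get replaced.
literal = line.replace('[192.0.2.1]', '<ip>')

# re.sub() treats its first argument as a regular expression.
stripped = re.sub(r'^.*\[', '', line)

print(unchanged == line)  # True
print(literal)            # date process text <ip> more text
print(stripped)           # 192.0.2.1] more text
```

So str.replace() is the tool for fixed text, and re.sub() is the one that understands sed-style patterns.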
 
Robert Bossy

Raymond said:
For some reason I'm unable to grok Python's string.replace() function.
Just trying to parse a simple IP address, wrapped in square brackets,
from Postfix logs. In sed this is straightforward given:

line = "date process text [ip] more text"

sed -e 's/^.*\[//' -e 's/].*$//'

Alternatively:

sed -e 's/.*\[\(.*\)].*/\1/'

yet the following Python code does nothing:

line = line.replace('^.*\[', '', 1)
line = line.replace('].*$', '')

Is there a decent description of string.replace() somewhere?

In the Python shell:
help(str.replace)

Online:
http://docs.python.org/lib/string-methods.html#l2h-255

But what you are probably looking for is re.sub():
http://docs.python.org/lib/node46.html#l2h-405
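
As a rough illustration, the single-expression sed command translates to re.sub() almost directly. This is only a sketch: the sample line is invented, and the closing bracket is escaped here for readability even though sed did not need it.

```python
import re

# Made-up sample line; a real Postfix entry would look different.
line = "date process text [192.0.2.1] more text"

# Match the whole line, capture what sits between the brackets,
# and replace the line with just the captured group.
ip = re.sub(r'.*\[(.*)\].*', r'\1', line)

print(ip)  # 192.0.2.1
```

Note that a line with no brackets does not match at all, so re.sub() returns it unchanged rather than signalling an error.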


RB
 
Kam-Hung Soh

For some reason I'm unable to grok Python's string.replace() function.
Just trying to parse a simple IP address, wrapped in square brackets,
from Postfix logs. In sed this is straightforward given:

line = "date process text [ip] more text"

sed -e 's/^.*\[//' -e 's/].*$//'

yet the following Python code does nothing:

line = line.replace('^.*\[', '', 1)
line = line.replace('].*$', '')

str.replace() doesn't support regular expressions.

Try:

import re
p = re.compile(r"^.*\[")
q = re.compile(r"].*$")
line = q.sub('', p.sub('', line))

Is there a decent description of string.replace() somewhere?

Raymond

Section 3.6.1, String Methods, of the Library Reference.
 
Kam-Hung Soh

For some reason I'm unable to grok Python's string.replace() function.
Just trying to parse a simple IP address, wrapped in square brackets,
from Postfix logs. In sed this is straightforward given:

line = "date process text [ip] more text"

sed -e 's/^.*\[//' -e 's/].*$//'

yet the following Python code does nothing:

line = line.replace('^.*\[', '', 1)
line = line.replace('].*$', '')

str.replace() doesn't support regular expressions.

Try:

import re
p = re.compile(r"^.*\[")
q = re.compile(r"].*$")
line = q.sub('', p.sub('', line))

Another approach is to use the split() function in "re" module.

import re
ip = re.split(r"[\[\]]", line)[1]

See http://docs.python.org/lib/node46.html
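
A short sketch of the split approach, next to a re.search() variant that copes with lines containing no brackets. The sample line and the variable names are assumptions made for the example.

```python
import re

# Invented sample line for illustration.
line = "date process text [192.0.2.1] more text"

# Splitting on either bracket leaves the address as the second field.
parts = re.split(r'[\[\]]', line)
ip_from_split = parts[1]

# re.search() with a group returns None instead of raising IndexError
# when the line has no bracketed field.
match = re.search(r'\[([^\]]*)\]', line)
ip_from_search = match.group(1) if match else None

print(parts)           # ['date process text ', '192.0.2.1', ' more text']
print(ip_from_search)  # 192.0.2.1
```

Indexing the split result with [1] fails on a line without brackets, so the search form may be the safer choice when the log format is not guaranteed.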
 
Raymond

Another approach is to use the split() function in "re" module.

Ah ha, thar's the disconnect. Thanks for all the pointers, my def is
now working. Still don't understand the logic behind this design though.
I mean why would any programming language have separate search or find
functions, one for regex and another for non-regex based pattern
matching?

Aren't sed, awk, grep, and perl the reference implementations of search
and replace? They don't have non-regex functions, why does Python?
Wouldn't it be a lot simpler to use a flag, like grep's '-F', to change
the meaning of a search string to be literal?

My other gripe is with the kludgy object-oriented regex functions.
Couldn't these be better implemented in-line? Why should I, as a coder,
have to 're.compile()' when all the reference languages do this at compile
time, from a much more straightforward and easy to read in-line function...

Raymond
 
Marco Mariani

Raymond said:
Aren't sed, awk, grep, and perl the reference implementations of search
and replace?

I don't know about "reference implementations", but I daresay they are a
mess w.r.t. usability.
 
Mel

Raymond said:
My other gripe is with the kludgy object-oriented regex functions.
Couldn't these be better implemented in-line? Why should I, as a coder,
have to 're.compile()' when all the reference languages do this at compile
time, from a much more straightforward and easy to read in-line
function...

Because a pattern that only exists at run time cannot be compiled at compile time:

pattern = raw_input ("Pattern, please: ")
saved_pattern = re.compile (pattern)
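
The same idea as a small self-contained sketch for current Python. The function name count_matches and the sample lines are hypothetical; in a real script the pattern text might come from user input or a configuration file.

```python
import re

def count_matches(pattern_text, lines):
    """Count the lines that contain a match for a pattern known only at run time."""
    pattern = re.compile(pattern_text)  # compiled once, reused for every line
    return sum(1 for text in lines if pattern.search(text))

# Made-up log lines for illustration.
sample = [
    "connect from [192.0.2.1]",
    "disconnect",
    "connect from [198.51.100.7]",
]

print(count_matches(r'\[[0-9.]+\]', sample))  # 2
```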

Mel.
 
Diez B. Roggisch

Ah ha, thar's the disconnect. Thanks for all the pointers, my def is
now working. Still don't understand the logic behind this design though.
I mean why would any programming language have separate search or find
functions, one for regex and another for non-regex based pattern
matching?

Aren't sed, awk, grep, and perl the reference implementations of search
and replace? They don't have non-regex functions, why does Python?
Wouldn't it be a lot simpler to use a flag, like grep's '-F', to change
the meaning of a search string to be literal?

And thereby possibly break code in other modules that relies on its
strings being exactly that, plain strings and not patterns.

My other gripe is with the kludgy object-oriented regex functions.
Couldn't these be better implemented in-line? Why should I, as a coder,
have to 're.compile()' when all the reference languages do this at compile
time, from a much more straightforward and easy to read in-line
function...

You can already do that. There is no need to compile explicitly, because
the patterns are cached, although the cache might be limited in size. Code like

m = re.match(pattern, s)

is not considerably slower than

rex = re.compile(pattern)
m = rex.match(s)
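
A small sketch showing that the two styles give the same result. It uses re.search() rather than re.match() because the sample pattern is not anchored at the start of the line; the pattern and the sample line are assumptions made for the example.

```python
import re

pattern = r'\[([0-9.]+)\]'
line = "date process text [192.0.2.1] more text"  # invented sample

# Module-level function: the re module compiles the pattern internally
# and keeps the compiled form in its cache.
m1 = re.search(pattern, line)

# Explicitly compiled pattern object: same matching, compiled up front.
rex = re.compile(pattern)
m2 = rex.search(line)

print(m1.group(1) == m2.group(1))  # True
```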

Diez
 
Dan Stromberg

Ah ha, thar's the disconnect. Thanks for all the pointers, my def is
now working. Still don't understand the logic behind this design
though. I mean why would any programming language have separate search
or find functions, one for regex and another for non-regex based
pattern matching?

Aren't sed, awk, grep, and perl the reference implementations of search
and replace? They don't have non-regex functions, why does Python?
Wouldn't it be a lot simpler to use a flag, like grep's '-F', to change
the meaning of a search string to be literal?

My other gripe is with the kludgy object-oriented regex functions.
Couldn't these be better implemented in-line? Why should I, as a coder,
have to 're.compile()' when all the reference languages do this at
compile time, from a much more straightforward and easy to read in-line
function...

Raymond

Hm. Are regexes first-class citizens in these languages, like they are
in Python?

And from a language design perspective, isn't it much cleaner to put
regexes into just another portion of the runtime rather than dumping them
into the language definition proper?

It does actually make sense - to have a string method do a string thing,
and to have a regex method do a regex thing. And while command line
options are pretty nice when done well, there's nothing in particular
stopping one from using arguments with defaults in python.
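
For instance, a wrapper with a defaulted argument could play the role of a literal-match flag. This is only a sketch of the idea; the function name replace and its literal parameter are hypothetical and not part of the standard library.

```python
import re

def replace(text, pattern, repl, literal=False):
    """Replace pattern in text; treat pattern as plain text when literal is True."""
    if literal:
        # Plain string method: no character in pattern is special.
        return text.replace(pattern, repl)
    # Regex substitution: '.' and friends keep their pattern meaning.
    return re.sub(pattern, repl, text)

print(replace("a.c abc", "a.c", "X"))                # X X
print(replace("a.c abc", "a.c", "X", literal=True))  # X abc
```

The string method and the regex function stay separate underneath, and the flag only chooses between them.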

I'm good with sed and grep, though I never got into awk much - perhaps a
small mistake. When it came to perl, I skipped it and went directly to
python, and have never regretted the decision. Python's got a much more
coherent design than perl, most certainly, and more than sed as well.
awk's not that bad though. And grep's nice and focused - I quite like
grep's design.
 
