String Replace Problem...

andrea.gavana

Hello NG,

probably this is a basic question, but I'm going crazy... I am unable
to find an answer. Suppose that I have a file (that I called "Errors.txt")
which contains these lines:

MULTIPLY
'PERMX' @PERMX1 1 34 1 20 1 6 /
'PERMX' @PERMX2 1 34 21 41 1 6 /
'PERMX' @PERMX3 1 34 1 20 7 14 /
'PERMX' @PERMX4 1 34 21 41 7 14 /
'PERMX' @PERMX5 1 34 1 20 15 26 /
'PERMX' @PERMX6 1 34 21 41 15 26 /
'PERMX' @PERMX7 1 34 1 20 27 28 /
'PERMX' @PERMX8 1 34 21 41 27 28 /
'PERMX' @PERMX9 1 34 1 20 29 34 /
'PERMX' @PERMX10 1 34 21 41 29 34 /
'PERMX' @PERMX11 1 34 1 20 35 42 /
'PERMX' @PERMX12 1 34 21 41 35 42 /
'PERMX' @PERMX13 1 34 1 20 43 53 /
'PERMX' @PERMX14 1 34 21 41 43 53 /
'PERMX' @PERMX15 1 34 1 20 54 61 /
'PERMX' @PERMX16 1 34 21 41 54 61 /
/

I would like to replace all the occurrences of the "keywords" (beginning
with the @ (AT) symbol) with some floating point value. As an example, this
is what I do:

# --- CODE BEGIN

import re
import string

# Set Some Dummy Parameter Values
parametervalues = range(1, 17)

# Open And Read The File With Keywords
fid = open("Errors.txt","rt")
onread = fid.read()
fid.close()

# Find All Keywords Starting with @ (AT)
regex = re.compile("[\@]\w+", re.IGNORECASE)
keywords = regex.findall(onread)

counter = 0

# Try To Replace Them With Floats
for keys in keywords:
    pars = parametervalues[counter]
    onread = string.replace(onread, keys, str(float(pars)))
    counter = counter + 1

# Write A New File With Replaced Values
fid = open("Errors_2.txt","wt")
fid.write(onread)
fid.close()

# --- CODE END


Now, if you try to run this little script, you will see that for keywords
starting from "@PERMX10", the replaced values are WRONG. I don't know why,
but Python replaces only the "@PERMX1" part, leaving out the last char of the
keyword (that is, 0, 1, 2, 3, 4, 5, 6). These chars are left in the file and I
don't get the expected result.

Does anyone have an explanation? What am I doing wrong?

Thanks to you all for your help.

Andrea.
 
Peter Hansen

Andrea said:
'PERMX' @PERMX1 1 34 1 20 1 6 / ....
'PERMX' @PERMX10 1 34 21 41 29 34 / ....

I would like to replace all the occurrences of the "keywords" (beginning
with the @ (AT) symbol) with some floating point value. As an example, this
is what I do:

# Find All Keywords Starting with @ (AT)
regex = re.compile("[\@]\w+", re.IGNORECASE)

You don't need the [, \ or ] around the "@" as it is
not a special character... (but that's not your problem here).

Andrea said:
keywords = regex.findall(onread)

Here you get a list, in order, of all the matches, including
these: "@PERMX1" and "@PERMX10"

Andrea said:
# Try To Replace Them With Floats
for keys in keywords:
    onread = string.replace(onread, keys, str(float(pars)))

Here you iterate through the list, replacing *all*
occurrences of each of them, one at a time, in the
full string.

Now imagine what happens to things like "@PERMX10" in the
full string when you are replacing "@PERMX1"....

Andrea said:
Now, if you try to run this little script, you will see that for keywords
starting from "@PERMX10", the replaced values are WRONG. I don't know why,
but Python replaces only the "@PERMX1" part, leaving out the last char of the
keyword (that is, 0, 1, 2, 3, 4, 5, 6). These chars are left in the file and I
don't get the expected result.

Does anyone have an explanation? What am I doing wrong?

Basically, you are replacing things in the string one by one without
taking into account the fact that some of those things contain
others of those things.
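
To make the overlap concrete, here is a minimal sketch using a single line from the sample file. It uses the `str.replace` method rather than the `string.replace` function from the original script; the variable names `line` and `broken` are just for illustration.

```python
# One line from the sample file that contains the longer keyword.
line = "'PERMX' @PERMX10 1 34 21 41 29 34 /"

# "@PERMX1" is a prefix of "@PERMX10", so the shorter keyword matches
# inside the longer one and the trailing "0" is left behind.
broken = line.replace("@PERMX1", "1.0")
print(broken)  # 'PERMX' 1.00 1 34 21 41 29 34 /

# By the time the loop reaches "@PERMX10", there is nothing left to match.
print("@PERMX10" in broken)  # False
```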

Looking into "re.sub" might help, although in this case you
could solve the problem by doing one of several other things.
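
One way the "re.sub" route could look is sketched below. This is only an illustration: the `values` dictionary, the `substitute` helper and the shortened sample text are assumptions, not part of the original script. Because `\w+` is greedy, each match is a complete keyword, so "@PERMX1" is never matched inside "@PERMX10".

```python
import re

# Shortened stand-in for the contents of Errors.txt.
text = """MULTIPLY
'PERMX' @PERMX1 1 34 1 20 1 6 /
'PERMX' @PERMX10 1 34 21 41 29 34 /
/
"""

# Hypothetical keyword -> value mapping, for illustration only.
values = {"@PERMX1": 1.0, "@PERMX10": 10.0}

def substitute(match):
    # match.group(0) is the whole keyword, e.g. "@PERMX10".
    return str(values[match.group(0)])

# re.sub calls substitute() once per match, in a single pass over the text.
result = re.sub(r"@\w+", substitute, text)
print(result)
```

Since the whole text is scanned once, the order in which keywords appear no longer matters.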

The simplest one that comes to mind is to sort the list of
keywords in reverse order by length of string, so that you
replace the longest items first.
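
A rough sketch of the longest-first idea follows. The generated `text` is a simplified stand-in for Errors.txt, and the `values` dictionary is an assumption added so that each keyword keeps its own value once the list is no longer in file order.

```python
import re

# Simplified stand-in for Errors.txt: sixteen lines, one keyword each.
text = "".join("'PERMX' @PERMX%d 1 34 /\n" % n for n in range(1, 17))

keywords = re.findall(r"@\w+", text)

# Pair each keyword with its value while the list is still in file order.
values = dict(zip(keywords, range(1, 17)))

# Longest keywords first, so "@PERMX10" is already gone by the time
# "@PERMX1" is replaced.
for key in sorted(keywords, key=len, reverse=True):
    text = text.replace(key, str(float(values[key])))

print(text)
```

Note that sorting breaks the original script's `counter` bookkeeping, which is why the sketch looks values up by keyword instead of by position.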

-Peter
 
wes weston

andrea,
If you put in "keywords.reverse()" after getting
keywords, things may work better. Though not a good
fix, it illustrates part of the problem. If you replace
"@PERMX1" you also replace part of "@PERMX10".
Wouldn't it be better to read the file as lines
instead of strings?
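
A possible shape for the line-by-line approach is sketched here. The `lines` list stands in for the file contents and the `values` iterator is a hypothetical source of replacement numbers, consumed in file order. Splitting each line into tokens means a keyword is only ever replaced as a whole.

```python
# Stand-in for the lines of Errors.txt (shortened).
lines = [
    "MULTIPLY",
    "'PERMX' @PERMX1 1 34 1 20 1 6 /",
    "'PERMX' @PERMX10 1 34 21 41 29 34 /",
    "/",
]

# Hypothetical replacement values, one per keyword, in file order.
values = iter([1.0, 10.0])

output = []
for line in lines:
    tokens = line.split()
    # Replace whole tokens only, so "@PERMX1" cannot match inside "@PERMX10".
    tokens = [str(next(values)) if tok.startswith("@") else tok
              for tok in tokens]
    output.append(" ".join(tokens))

print("\n".join(output))
```

Joining with a single space normalizes the whitespace within each line, which may or may not matter for the program that reads the file.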
wes
 
Sean McIlroy

I can't claim to have studied your problem in detail, but I get
reasonable results from the following:

filename = 'Errors.txt'
S = open(filename,'r').read().split()
f = lambda x: (x[0]=='@' and x[6:] + '.0') or (x=='/' and x + '\n') or x
open(filename,'w').write(' '.join(map(f,S)))

HTH

 
Steven Bethard

Sean said:
f = lambda x: (x[0]=='@' and x[6:] + '.0') or (x=='/' and x + '\n') or x

See "Inappropriate use of Lambda" in
http://www.python.org/moin/DubiousPython.

You're creating a named function, so there's no reason to use the
anonymous function syntax. Try:

def f(x):
    return (x[0]=='@' and x[6:] + '.0') or (x=='/' and x + '\n') or x

or if it must be on one line:

def f(x): return (x[0]=='@' and x[6:] + '.0') or (x=='/' and x + '\n') or x

Personally, I would probably write it like this, which is IMHO more readable:

# Assumes every line looks like: 'PERMX' @KEYWORD rest-of-line
infile, outfile = open("Errors.txt"), open("Errors_2.txt", "w")
for i, line in enumerate(infile):
    permx, atpermx, rest = line.split(None, 2)
    outfile.write(' '.join([permx, str(parameter_values[i]), rest]))


In action:

py> s = """\
.... 'PERMX' @PERMX1 1 34 1 20 1 6
.... 'PERMX' @PERMX2 1 34 21 41 1 6
.... 'PERMX' @PERMX3 1 34 1 20 7 14
.... 'PERMX' @PERMX4 1 34 21 41 7 14
.... 'PERMX' @PERMX5 1 34 1 20 15 26
.... 'PERMX' @PERMX6 1 34 21 41 15 26
.... 'PERMX' @PERMX7 1 34 1 20 27 28
.... 'PERMX' @PERMX8 1 34 21 41 27 28
.... 'PERMX' @PERMX9 1 34 1 20 29 34
.... 'PERMX' @PERMX10 1 34 21 41 29 34
.... 'PERMX' @PERMX11 1 34 1 20 35 42
.... 'PERMX' @PERMX12 1 34 21 41 35 42
.... 'PERMX' @PERMX13 1 34 1 20 43 53
.... 'PERMX' @PERMX14 1 34 21 41 43 53
.... 'PERMX' @PERMX15 1 34 1 20 54 61
.... 'PERMX' @PERMX16 1 34 21 41 54 61
.... """
py> parameter_values = range(1, 17)
py> for i, line in enumerate(s.splitlines()):
....     permx, atpermx, rest = line.split(None, 2)
....     print ' '.join([permx, str(parameter_values[i]), rest])
....
'PERMX' 1 1 34 1 20 1 6
'PERMX' 2 1 34 21 41 1 6
'PERMX' 3 1 34 1 20 7 14
'PERMX' 4 1 34 21 41 7 14
'PERMX' 5 1 34 1 20 15 26
'PERMX' 6 1 34 21 41 15 26
'PERMX' 7 1 34 1 20 27 28
'PERMX' 8 1 34 21 41 27 28
'PERMX' 9 1 34 1 20 29 34
'PERMX' 10 1 34 21 41 29 34
'PERMX' 11 1 34 1 20 35 42
'PERMX' 12 1 34 21 41 35 42
'PERMX' 13 1 34 1 20 43 53
'PERMX' 14 1 34 21 41 43 53
'PERMX' 15 1 34 1 20 54 61
'PERMX' 16 1 34 21 41 54 61

STeVe
 
Sean McIlroy

Alright, now it's too much. It's not enough that you're eliminating it
from the language, you have to stigmatize the lambda as well. You
should take some time to reflect that not everybody thinks the same
way. Those of us who are mathematically inclined like the lambda
because it fits in well with the way we already think. And besides, it
amounts to an explicit declaration that the function in question has
no side effects. And besides, it adds flexibility to the language. Go
ahead and throw it away, but you're making python less accessible for
those of us whose central concern is something other than programming.
("Single line" indeed!)
 
Steven Bethard

Sean said:
Alright, now it's too much. It's not enough that you're eliminating it
from the language, you have to stigmatize the lambda as well.

You misunderstand me. I don't have a problem with lambda when it's
appropriate, e.g. when used as an expression, where a statement is
forbidden. See my post examining some of the uses in the standard
library[1]. But when you're defining a named function, you should use
the named function construct. That's what it's there for. =)

Sean said:
And besides, it amounts to an explicit declaration that the function
in question has no side effects.

Not at all. Some examples of lambdas with side effects:

py> a
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
NameError: name 'a' is not defined
py> (lambda s: globals().__setitem__(s, 'side-effect'))('a')
py> a
'side-effect'

py> import sys
py> x = (lambda s: sys.stdout.write(s) or s.split())('side effect')
side effect
py> x
['side', 'effect']

py> s = set()
py> x = (lambda x: s.add(x) or x**2)(3)
py> s
set([3])
py> x
9

It is certainly possible to use lambda in such a way that it produces no
side effects, but it's definitely not guaranteed by the language.

Sean said:
And besides, it adds flexibility to the language.

I'm not sure I'd agree with flexibility... True, in some cases it can
allow more concise code, and occasionally it can read more clearly than
the other available options, but since you can use a function created
with def anywhere you can use a function created with lambda, I wouldn't
say that lambdas make Python any more flexible.
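
As a small illustration of that interchangeability, here is a sketch in which the `square` function and the sample list are made up for the example. Both forms are passed to `map` and behave the same way.

```python
# A named function created with def ...
def square(x):
    return x * x

numbers = [1, 2, 3]

# ... can be used wherever a lambda expression would be accepted.
squares_def = list(map(square, numbers))
squares_lambda = list(map(lambda x: x * x, numbers))

print(squares_def)     # [1, 4, 9]
print(squares_lambda)  # [1, 4, 9]
```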

STeVe

[1]http://mail.python.org/pipermail/python-list/2004-December/257990.html
 
