Find the ID, but how to select/copy the whole string by ID?

Leon

Hi everybody,

I am a beginner in Python and hope to get some help from you.
What I want to do is:

Input an ID -> find the ID in the file -> copy the whole string
<str id='xxx'>yyyyy</str>

stringID = raw_input('Enter the string ID : ')
f = open('strings.txt')
sourcefile = f.read()
f.close()
sourcefile.find(stringID)

But how can I select and copy the specific string, from <str> to </str>,
that has the ID I entered?

Thanks!!!
 
Francesco Guerrieri

If the file you are parsing is XML, there are many parsers out there
that can help you (this is discussed very often on this list, even
today).

If the format is very simple and you really want to do it by hand, you
could do something like:

stringID = raw_input('Enter the string ID:')
for line in open('strings.txt'):
    if line.find(stringID) > -1:
        print 'Found!'

Note that find returns the index where the substring you look for is
found, and it returns -1 if the substring is not found. The problem is
that -1 is a valid index to refer to a character in a string (actually
it refers to the last character of the string), so be careful with
interpreting the results of the find method.
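
To make the pitfall concrete, here is a small sketch. It is written for Python 3 rather than the Python 2 used elsewhere in this thread, and the sample line and IDs are made up for illustration. It shows what find returns on a hit and on a miss, and why the result should be compared against -1 rather than used directly as an index or a truth value.

```python
# Hypothetical sample line, not taken from the original strings.txt.
line = "<str id='42'>hello</str>"

hit = line.find("42")    # index where "42" starts
miss = line.find("99")   # -1, because "99" does not occur

# Using the "not found" result as an index silently picks the last character.
last_char = line[miss]

# A match at the very start returns 0, which is falsy, while -1 is truthy,
# so `if line.find(x):` tests the wrong thing.
at_start = line.find("<str")

# For a plain yes/no answer, the `in` operator avoids the issue entirely.
found = "42" in line
```

If only a yes/no answer is needed, `in` is probably the simplest choice; find is useful when the position itself matters.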

francesco
 
Zentrader

sourcefile.find(stringID) returns the start location. You can use
print to see this. You can then slice from start+len(stringID) and
print it out. That should give you enough info to figure out how to
find and extract to the end of the string tag as well. There are
other ways to do this, but string.find is the easiest to understand.
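
As a rough sketch of the find-and-slice idea (Python 3; the helper name copy_element and the sample text are assumptions, not part of the original post), one could locate the ID, search backwards for the opening tag and forwards for the closing tag, and slice between them:

```python
def copy_element(source, string_id):
    """Return the whole <str ...>...</str> around the first match, or None."""
    pos = source.find(string_id)
    if pos == -1:
        return None
    # Search backwards from the match for the start of the opening tag.
    begin = source.rfind("<str", 0, pos)
    # Search forwards from the match for the closing tag.
    end = source.find("</str>", pos)
    if begin == -1 or end == -1:
        return None
    return source[begin:end + len("</str>")]


# Hypothetical file contents, used here instead of reading strings.txt.
sample = "<str id='7'>seven</str>\n<str id='8'>eight</str>\n"
copied = copy_element(sample, "8")
missing = copy_element(sample, "99")
```

This only looks at the first place the ID text occurs anywhere in the file, so it can pick the wrong element if the same characters appear in another ID or in the text content.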
 
Carsten Haese

Using find() for XML parsing is like using a machine gun to pound a nail
into a wall. It is absolutely the wrong tool for the job. If the input
file looks something like this:

<str id="123">blah</str>
<str id="12">spam</str>

, then looking for id 12 is going to match on the wrong ID. Besides,
that code only tells you where something that looks like the ID you're
looking for is in the file. There is no guarantee that the match
actually occurs inside an ID attribute. It also doesn't help in
retrieving the text contents of the <str> tag that has this ID.
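
The false match is easy to reproduce. The following sketch (Python 3, with the two sample lines above held in a string instead of a file) collects every line on which a plain find for "12" succeeds:

```python
# The two sample lines from above, kept in a string for illustration.
data = '<str id="123">blah</str>\n<str id="12">spam</str>\n'

matched = []
for line in data.splitlines():
    # Same test as the loop under discussion: a bare substring search.
    if line.find("12") > -1:
        matched.append(line)

first_match = matched[0]
```

Both lines end up in matched, and the first hit is the element with id 123, not the one that was asked for.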

If your input is an XML file, using an actual XML parser is the only
correct solution.
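
As one possible illustration, here is a minimal sketch using xml.etree.ElementTree from the standard library (Python 3). It assumes the <str> elements are wrapped in a single root element; the root name "strings" and the helper name find_str are hypothetical, since the real layout of strings.txt was not shown.

```python
import xml.etree.ElementTree as ET

# Assumed layout: all <str> elements live inside one root element.
# The root name "strings" is made up for this example.
SAMPLE = '<strings><str id="123">blah</str><str id="12">spam</str></strings>'


def find_str(xml_text, string_id):
    """Return the <str> element whose id attribute equals string_id, or None."""
    root = ET.fromstring(xml_text)
    for elem in root.iter("str"):
        # Exact comparison on the attribute value, so "12" never matches "123".
        if elem.get("id") == string_id:
            return elem
    return None


elem = find_str(SAMPLE, "12")
text = elem.text                               # the text content of the element
whole = ET.tostring(elem, encoding="unicode")  # the element serialized back to a string
```

For a file on disk, ET.parse('strings.txt').getroot() would take the place of ET.fromstring, provided the file really is well-formed XML with a single root.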
 
Francesco Guerrieri

You're perfectly right.
The code example was purposely incomplete for this very reason.
It could be made less sensitive to false matches by constructing a
correct substring, something like

pattern = '<str id="' + stringID + '">'

and then in the loop:

line.find(pattern)

but I agree that it would be wrong to proceed this way.
My reply was meant more to suggest a better way to iterate over a
file than anything else... but since it caused confusion, it probably
would have been better left out.
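
For completeness, here is a sketch of what that stricter by-hand match might look like while iterating line by line (Python 3). It assumes each element sits on its own line and that the attribute is written exactly as id="..." with double quotes and no extra spaces; the helper name find_line is made up, and io.StringIO stands in for the real file.

```python
import io


def find_line(lines, string_id):
    """Return the first line containing the exact opening tag, or None."""
    # Assumed exact spelling of the opening tag; any variation will not match.
    pattern = '<str id="' + string_id + '">'
    for line in lines:
        if pattern in line:
            return line.rstrip("\n")
    return None


# Stand-in for open('strings.txt'); a file object can be iterated the same way.
fake_file = io.StringIO('<str id="123">blah</str>\n<str id="12">spam</str>\n')
result = find_line(fake_file, "12")
```

With a real file this would be called inside `with open('strings.txt') as f:`. It still breaks as soon as the attribute uses single quotes, extra whitespace, or additional attributes, which is the reason a parser is the safer choice.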

francesco
 
