problem with re.MULTILINE

Necronymouse

Hello, I've got a little problem. I have this text:
http://openpaste.org/en/secret/17343/pass-python and I need to parse
it, so I wrote this:

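import re

# No re.VERBOSE here, so the line breaks and the "#" inside the
# triple-quoted string are literal parts of the pattern.
patternNode = re.compile("""
# Node (\w*).*
(.*)""", re.MULTILINE)

with open("test.msg", "r") as file:
    testData = file.read()

# findall returns one (name, text) tuple per match.
for Node in re.findall(patternNode, testData):
    print "Node:", Node[0]
    print Node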

But it prints only one line of the text. If I use re.DOTALL, it
doesn't print anything. Do you know where the problem is?

Sorry for my English - it's not my native language...
 
MRAB

I assume you mean that it's giving you only the first line of text of
each node.

"(.*)" will capture a single (and possibly empty) line of text, because
"." does not match a newline unless re.DOTALL is set.

"(.+\n)" will capture a single non-empty line of text ending with a
newline.
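
To make the difference concrete, here is a minimal sketch comparing the two groups. The sample strings are made up for illustration (the original paste is not available), and the variable names are hypothetical.

```python
import re

# Hypothetical sample text; not the data from the thread.
text = "first line\nsecond line\n"
starts_empty = "\nsecond line\n"

# "." stops at a newline by default, so (.*) captures at most one line.
single = re.match(r"(.*)", text).group(1)

# (.+\n) needs at least one character and also captures the newline.
with_newline = re.match(r"(.+\n)", text).group(1)

# On an empty first line, (.*) still matches (with an empty string),
# while (.+\n) does not match at that position at all.
empty = re.match(r"(.*)", starts_empty).group(1)
no_match = re.match(r"(.+\n)", starts_empty)

print(repr(single))
print(repr(with_newline))
print(repr(empty))
print(no_match)
```

With re.DOTALL the dot would also match newlines, so a greedy ".*" would run to the end of the text instead of stopping at the end of the line.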

I think you want to capture multiple non-empty lines, each line ending
with a newline:

patternNode = re.compile("""
# Node (\w*).*
((?:.+\n)*)""", re.MULTILINE)
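
A self-contained sketch of how this pattern behaves, under some assumptions: the input below is invented (the real test.msg is not available), each node is assumed to be followed by an empty line, and the first "# Node" line is assumed not to be the very first line of the file, since the pattern starts with a newline. The pattern is written on one line with explicit "\n", which should be equivalent to the triple-quoted layout above. re.MULTILINE is left out because it only changes the meaning of "^" and "$", which this pattern does not use.

```python
import re

# Hypothetical input; assumed layout, not the data from the thread.
test_data = (
    "header\n"
    "# Node alpha (first)\n"
    "line a1\n"
    "line a2\n"
    "\n"
    "# Node beta\n"
    "line b1\n"
)

# Same pattern as above, with the line breaks spelled out as \n.
pattern_node = re.compile(r"\n# Node (\w*).*\n((?:.+\n)*)")

# Each match yields (node name, all following non-empty lines).
nodes = pattern_node.findall(test_data)

for name, body in nodes:
    print("Node: " + name)
    print(body)
```

Note that the repeated group stops only at an empty line. If two nodes follow each other without one, the second "# Node" header would be swallowed into the first node's text.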

Necronymouse said:
Sorry for my English - it's not my native language...

It's better than my Czech/Slovak (depending on what Google says)! :)
 
Necronymouse

Yeah, this works ( ((?:.+\r\n)*) ), thanks. It's Czech.
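
A likely reason the "\r\n" variant was needed, assuming the file uses Windows (CRLF) line endings: "." matches "\r", so an empty line "\r\n" still satisfies ".+\n" and the group never stops. The sketch below uses invented data and hypothetical names, and shows one possible pattern that accepts both line-ending styles by excluding "\r" and "\n" from the line content.

```python
import re

# Hypothetical CRLF input; not the data from the thread.
crlf_data = (
    "header\r\n"
    "# Node alpha\r\n"
    "line a1\r\n"
    "\r\n"
    "# Node beta\r\n"
    "line b1\r\n"
)

# LF-only version: "\r" counts as line content, so empty lines do not stop it.
lf_only = re.compile(r"\n# Node (\w*).*\n((?:.+\n)*)")

# Version that treats a line as non-empty only if it has a character
# other than "\r" or "\n", and accepts "\n" or "\r\n" as the line ending.
portable = re.compile(r"\n# Node (\w*).*\n((?:[^\r\n]+\r?\n)*)")

greedy = lf_only.findall(crlf_data)   # one match that swallows both nodes
nodes = portable.findall(crlf_data)   # one match per node

print(len(greedy))
for name, body in nodes:
    print("Node: " + name)
    print(repr(body))
```

Another option is to normalize the text first, for example with testData.replace("\r\n", "\n"), and keep the simpler pattern.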
 
