re question

Daniel Schüle

Hello re gurus,

I wrote this pattern to extract the "name" and the "content" of a VHDL
package. I know that the file is valid VHDL, so there is actually no need
to validate anything after the 'end' token is found, but since it works
fine I don't want to touch it.

this is the pattern

pattern = re.compile(
    r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
    re.DOTALL | re.MULTILINE | re.IGNORECASE)

and the problem is that
package TEST is xyz end;
works but
package TEST123 is xyz end;
fails

\w is supposed to match [a-zA-Z0-9_], so I don't understand why digits and
the underscore make the pattern fail.
(I have a slight suspicion that it may be a re bug.)

I also tried this pattern with the same results

pattern = re.compile(
    r'^\s*package\s+(?P<name>.+?)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
    re.DOTALL | re.MULTILINE | re.IGNORECASE)

something must be wrong with (?P<name>\w+) inside the main pattern

thanks in advance
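As a quick sanity check of the claim about \w above, here is a minimal sketch that tests \w+ on its own, outside the larger pattern. The helper name matched_prefix is made up for illustration and is not part of the original post.

```python
import re

def matched_prefix(text):
    # Return the part of `text` that \w+ consumes from the start, or None.
    m = re.match(r"\w+", text)
    if m is None:
        return None
    return m.group(0)

# Letters, digits and the underscore are all word characters.
for candidate in ("TEST", "TEST123", "TEST_123"):
    print("%s -> %s" % (candidate, matched_prefix(candidate)))
```

If each identifier comes back whole, the \w+ group itself is unlikely to be the culprit, and the rest of the pattern or the input text deserves a closer look.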
 
Marc 'BlackJack' Rintsch

this is the pattern

pattern = re.compile(
    r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
    re.DOTALL | re.MULTILINE | re.IGNORECASE)

and the problem is that
package TEST is xyz end;
works but
package TEST123 is xyz end;
fails

For me both work:

In [11]:pattern = (
   .11.:re.compile(r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
   .11.:re.DOTALL | re.MULTILINE | re.IGNORECASE))

In [12]:pattern.match('package TEST is xyz end;')
Out[12]:<_sre.SRE_Match object at 0x405b1650>

In [13]:pattern.match('package TEST123 is xyz end;')
Out[13]:<_sre.SRE_Match object at 0x405b15f8>

I copied and pasted your code.

For debugging re's in Python you might take a
look at http://kodos.sourceforge.net/

Ciao,
Marc 'BlackJack' Rintsch
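The session above only shows that match objects come back. A small sketch like the following, using the same pattern, would also show what the named groups capture for the two one-line inputs. The helper name name_and_content is an assumption added for illustration.

```python
import re

pattern = re.compile(
    r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
    re.DOTALL | re.MULTILINE | re.IGNORECASE)

def name_and_content(text):
    # Return (name, content) for a match at the start of `text`, else None.
    m = pattern.match(text)
    if m is None:
        return None
    return m.group("name"), m.group("content")

for line in ("package TEST is xyz end;", "package TEST123 is xyz end;"):
    print("%r -> %r" % (line, name_and_content(line)))
```

Both inputs should yield the package name and the content xyz, which agrees with the observation that the pattern handles digits in the name.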
 
Daniel Schüle

Hi

[...]

Hm, that's weird.
I just tried it in the Python shell and it works, but the same code fails
when run as a script file.

For anyone who wants to see for themselves:

# package.vhd file
bash % cat package.vhd
library ieee;
use ieee.std_logic_1164.all;

package TEST123 is
constant BASE
End Package Test;


# parser.py
bash % cat parser.py
#!/usr/bin/env python

import sys, re

reflags = re.DOTALL | re.MULTILINE | re.IGNORECASE

pattern = re.compile(
    r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
    reflags)

class PackageParser(object):
    def __init__(self, filename):
        # Read the whole VHDL file into memory.
        self.package = open(filename).read()

    def parse(self):
        # Return the first match object, or None if nothing matches.
        return pattern.search(self.package)

if __name__ == "__main__":
    p = PackageParser("package.vhd")
    m = p.parse()
    if m is None:
        print("nothing")
        sys.exit(1)
    print(m.group("content"))
    print(m.group("name"))


# testing
bash % ./parser.py
nothing

PS: my Python version is '2.4.2 (#2, Mar 3 2006, 13:32:59) \n[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)]'

Regards, Daniel
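One possible explanation, which the thread itself does not confirm: in package.vhd as posted, the package opens as TEST123 but closes with "End Package Test;". The backreference (?P=name) can only match text equal to what the name group captured, and when that optional group is skipped, \s*; still has to match immediately, which it cannot do with " Test" in the way. The sketch below compares the posted package text with a hypothetical variant whose closing name repeats the opening one. The names MISMATCHED, MATCHING and find_package are made up for illustration.

```python
import re

pattern = re.compile(
    r'^\s*package\s+(?P<name>\w+)\s+is\s+(?P<content>.*?)\s+end(\s+package)?(\s+(?P=name))?\s*;',
    re.DOTALL | re.MULTILINE | re.IGNORECASE)

# The package part of package.vhd as posted: opens as TEST123, closes as Test.
MISMATCHED = "package TEST123 is\nconstant BASE\nEnd Package Test;\n"
# Hypothetical variant in which the closing name repeats the opening name.
MATCHING = "package TEST123 is\nconstant BASE\nEnd Package TEST123;\n"

def find_package(text):
    # Return (name, content) for the first package found, else None.
    m = pattern.search(text)
    if m is None:
        return None
    return m.group("name"), m.group("content")

print(find_package(MISMATCHED))
print(find_package(MATCHING))
```

The first call returns None, which corresponds to the script printing "nothing", while the second returns the name TEST123 and the content "constant BASE". If that reading is right, the digits in the name are not the problem; the mismatch between the opening and closing names in the test file is.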
 
