Looking for a regular expression for this...

malahal

Hi,
My string is a multi-line string that contains "filename <filename>\n"
and "host <host>\n" entries, among other things.

For example:

s = """ filename X
host hostname1
blah...
host hostname2
blah...
filename Y
host hostname3
"""

Given a host name, I would like to get its filename (the closest
filename reading backwards from the host line). I could read each line
until I hit the host name, but I am looking for an RE that will do the job.
The answer should be "Y" for host hostname3 and "X" for either
hostname1 or hostname2.

Thanks in advance.

--Malahal.
 
John Machin

Looking for? REs don't lurk in the undergrowth waiting to be found. You
will need to write one (unless some misguided person spoon-feeds you).
What have you tried so far?
 
faulkner

idk, most regexes look surprisingly like undergrowth.

malahal, why don't you parse s into a dict? Read each pair of lines
into a key-value pair.
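
A minimal sketch of the dict idea, assuming the simplified "filename <name>" / "host <name>" line format from the example rather than real dhcpd.conf syntax. The function name map_hosts_to_filenames is made up for illustration. Since several hosts can share one filename, it remembers the most recent filename instead of pairing lines strictly two by two:

```python
def map_hosts_to_filenames(text):
    """Return {host: filename}, pairing each host with the nearest preceding filename."""
    mapping = {}
    current = None  # most recent filename seen so far (None if none yet)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "filename":
            current = parts[1]
        elif len(parts) >= 2 and parts[0] == "host":
            mapping[parts[1]] = current
    return mapping


s = """ filename X
host hostname1
blah...
host hostname2
blah...
filename Y
host hostname3
"""

hosts = map_hosts_to_filenames(s)
print(hosts["hostname3"])  # Y
```

One pass builds the whole table, so looking up any further host is a plain dict access.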
 
malahal

OK, I tried this one. I am actually trying to parse a dhcpd.conf file.

    def get_filename(self):
        p = r"^[ \t]*filename[ \t]+(\S+).*?host[ \t]+%s\s*$" % self.host
        pat = re.compile(p, re.MULTILINE | re.DOTALL)
        mo = pat.search(self.confdata)
        if mo:
            return mo.group(1)
        else:
            return ""

self.host is the hostname and self.confdata is the string. It actually
matches the first filename that appears before the host entry, but I want
the last one that appears before the host entry. I tried '.*?' assuming
it would work, but now I know why it doesn't!
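
A small sketch of the behaviour described above, using the example string from the first post with hostname3 hard-coded for brevity. A regex search returns the leftmost position at which a match is possible; the lazy '.*?' only keeps the end of the match as short as it can, so it does nothing to stop the match from starting at the first filename line:

```python
import re

s = """ filename X
host hostname1
blah...
host hostname2
blah...
filename Y
host hostname3
"""

lazy = re.compile(
    r"^[ \t]*filename[ \t]+(\S+).*?host[ \t]+hostname3\s*$",
    re.MULTILINE | re.DOTALL,
)
mo = lazy.search(s)
print(mo.group(1))  # X  (not the Y that was wanted)
print(mo.start())   # 0  (the match begins at the very first filename line)
```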

Since I am only interested in a particular host's filename, I could
easily parse line by line. That is how it is done now, but I would like to
know if there is any RE that does the trick!

Thanks, Malahal.
 
John Machin

Instead of
    .*?
try
    (?:.(?!filename))*?
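
Dropped into the get_filename function from earlier in the thread, the suggestion might look like the sketch below. It is written as a plain function taking confdata and host as arguments (an assumption; the original is a method), and re.escape is added so that dots in a real host name are matched literally. The (?!filename) lookahead stops the gap from ever crossing another filename entry, so the search has to begin at the nearest preceding one:

```python
import re


def get_filename(confdata, host):
    """Return the closest filename before `host`, or "" if there is none."""
    pattern = (
        r"^[ \t]*filename[ \t]+(\S+)"  # capture a filename entry
        r"(?:.(?!filename))*?"         # gap that never steps up to another "filename"
        r"host[ \t]+%s\s*$" % re.escape(host)
    )
    mo = re.search(pattern, confdata, re.MULTILINE | re.DOTALL)
    return mo.group(1) if mo else ""


s = """ filename X
host hostname1
blah...
host hostname2
blah...
filename Y
host hostname3
"""

print(get_filename(s, "hostname3"))  # Y
print(get_filename(s, "hostname2"))  # X
```

Note that the lookahead treats the word filename anywhere in the gap, for instance in a comment, as a barrier, so such input may produce no match at all.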

Now forget about it and go back to the presumably legible code that you
already have :)

Cheers,
John
 
