Problem creating a regular expression to parse open-iscsi, iscsiadm output (help?)


rice.cruft

I am parsing the output of an open-iscsi command that contains several blocks of data for each data set. Each block has the format:

Target: iqn.1992-04.com.emc:vplex-000000008460319f-0000000000000007
Current Portal: 221.128.52.224:3260,7
Persistent Portal: 221.128.52.224:3260,7
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1996-04.de.suse:01:7c9741b545b5
Iface IPaddress: 221.128.52.214
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 154
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE


I have worked out the regex to grab the values I am interested in, with the exception of the 'iSCSI Connection State' and 'iSCSI Session State'. My regex is

regex = re.compile( r'''
# Target name, iqn
Target:\s+(?P<iqn>\S+)\s*
# Target portal
\s+Current\sPortal:\s*
(?P<ipaddr>\w+\.\w+\.\w+\.\w+):(?P<port>\d+),(?P<tag>\d+)
# skip lines...
[\s\S]*?
# Initiator name, iqn
Iface\s+Initiatorname:\s+(?P<initiatorName>\S+)\s*
# Initiator port, IP address
Iface\s+IPaddress:\s+(?P<initiatorIP>\S+)
# skip lines...
[\s\S]*?
# Session ID
SID:\s+(?P<SID>\d+)\s*
# Connection state
iSCSI\ +Connection\ +State:\s+(?P<connState>\w+\s*\w*)
[\s\S]*?   # <<<<<< without this the regex fails
# Session state
iSCSI\ +Session\ +State:\s+(?P<sessionState>\w+)
''', re.VERBOSE|re.MULTILINE)


I tried using \s* to swallow the whitespace between the two iSCSI lines. No joy... However, [\s\S]*? allows the regex to succeed. But that seems to me to be overkill (I am not trying to skip lines of text here). Also note that I am using \ + to catch spaces between the words. On the two problem lines, using \s+ between the label words fails.

The regex is compiled and fed to a finditer() call... With debug prints:

for m in regex.finditer(inp):
    print 'SSSSSS %d' % len(m.groups())
    for i in range(len(m.groups())):
        print ' SSS--> %s' % (m.group(i+1))

myDetails = [ m.groupdict() for m in regex.finditer(inp)]
print 'ZZZZ myDetails %s' % myDetails


Any help would be appreciated. Lastly, a version of this regex as a non-VERBOSE expression works as expected.. Something about re.VERBOSE... ????

Thanks.

--Eric
 
Andreas Perstinger

I am parsing the output of an open-iscsi command that contains
several blocks of data for each data set. Each block has the format: [SNIP]
I tried using \s* to swallow the whitespace between the two iSCSI
lines. No joy... However, [\s\S]*? allows the regex to succeed. But that
seems to me to be overkill (I am not trying to skip lines of text here).
Also note that I am using \ + to catch spaces between the words. On the
two problem lines, using \s+ between the label words fails.
Changing
# Connection state
iSCSI\ +Connection\ +State:\s+(?P<connState>\w+\s*\w*)
[\s\S]*?   # <<<<<< without this the regex fails
# Session state
iSCSI\ +Session\ +State:\s+(?P<sessionState>\w+)

to
# Connection state
iSCSI\s+Connection\s+State:\s+(?P<connState>\w+\s*\w*)\s*
# Session state
iSCSI\s+Session\s+State:\s+(?P<sessionState>\w+)

gives me
# 'test' is the example string
myDetails = [ m.groupdict() for m in regex.finditer(test)]
print myDetails
[{'initiatorIP': '221.128.52.214', 'connState': 'LOGGED IN', 'SID':
'154', 'ipaddr': '221.128.52.224', 'initiatorName':
'iqn.1996-04.de.suse:01:7c9741b545b5', 'sessionState': 'LOGGED_IN',
'iqn': 'iqn.1992-04.com.emc:vplex-000000008460319f-0000000000000007',
'tag': '7', 'port': '3260'}]

for your example (same for the original regex).
It looks like it works (Python 2.7.3) and there is something else
breaking the regex.
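
The check above can be reproduced with a self-contained script. This is only a sketch: it pastes the sample block from the original post into a string and applies the pattern with the `\s+` change. The names `SAMPLE` and `PATTERN` are introduced here for illustration, and `print()` is called with a single argument so the script should run under both Python 2.7 and Python 3.

```python
import re

# Sample block copied from the original post.
SAMPLE = """\
Target: iqn.1992-04.com.emc:vplex-000000008460319f-0000000000000007
Current Portal: 221.128.52.224:3260,7
Persistent Portal: 221.128.52.224:3260,7
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1996-04.de.suse:01:7c9741b545b5
Iface IPaddress: 221.128.52.214
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 154
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
"""

# The pattern from the thread, with \s+ between the label words and a
# plain \s* between the two iSCSI state lines.
PATTERN = re.compile(r'''
    Target:\s+(?P<iqn>\S+)\s*
    \s+Current\sPortal:\s*
    (?P<ipaddr>\w+\.\w+\.\w+\.\w+):(?P<port>\d+),(?P<tag>\d+)
    [\s\S]*?                                   # skip lines
    Iface\s+Initiatorname:\s+(?P<initiatorName>\S+)\s*
    Iface\s+IPaddress:\s+(?P<initiatorIP>\S+)
    [\s\S]*?                                   # skip lines
    SID:\s+(?P<SID>\d+)\s*
    iSCSI\s+Connection\s+State:\s+(?P<connState>\w+\s*\w*)\s*
    iSCSI\s+Session\s+State:\s+(?P<sessionState>\w+)
''', re.VERBOSE)

details = [m.groupdict() for m in PATTERN.finditer(SAMPLE)]
print(details[0]['connState'])      # LOGGED IN
print(details[0]['sessionState'])   # LOGGED_IN
```

If this prints both states for the pasted sample but the real command output still fails, the difference is presumably in the input rather than in the pattern.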

Bye, Andreas
 
Kevin LaTona

I am parsing the output of an open-iscsi command that contains several blocks of data for each data set. Each block has the format:
Lastly, a version of this regex as a non-VERBOSE expression works as expected.. Something about re.VERBOSE... ????
Snip


With the following code tweaks in Python 2.7.2, I find it works with VERBOSE for me, but not without.

I would say the regex could still use some more adjustments.

-Kevin





import re

inp ="""
Target: iqn.1992-04.com.emc:vplex-000000008460319f-0000000000000007
Current Portal: 221.128.52.224:3260,7
Persistent Portal: 221.128.52.224:3260,7
**********
Interface:
**********
Iface Name: default
Iface Transport: tcp
Iface Initiatorname: iqn.1996-04.de.suse:01:7c9741b545b5
Iface IPaddress: 221.128.52.214
Iface HWaddress: <empty>
Iface Netdev: <empty>
SID: 154
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
Internal iscsid Session State: NO CHANGE
"""

regex = re.compile( r'''
# Target name, iqn
Target:\s+(?P<iqn>\S+)\s*
# Target portal
\s+Current\sPortal:\s*
(?P<ipaddr>\w+\.\w+\.\w+\.\w+):(?P<port>\d+),(?P<tag>\d+)
# skip lines...
[\s\S]*?
# Initiator name, iqn
Iface\s+Initiatorname:\s+(?P<initiatorName>\S+)\s*
# Initiator port, IP address
Iface\s+IPaddress:\s+(?P<initiatorIP>\S+)
# skip lines...
[\s\S]*?
# Session ID
SID:\s+(?P<SID>\d+)\s*
# Connection state
iSCSI\ +Connection\ +State:\s+(?P<connState>\w+\s*\w*)
[\s\S]*?
# Session state iSCSI
iSCSI\s+Session\s+State:\s+(?P<sessionState>\w+)\s*
# Session state Internal
Internal\s+iscsid\s+Session\s+State:.*\s+(?P<ss2>\w+\s\w+)
''', re.VERBOSE|re.MULTILINE)

myDetails = [ m.groupdict() for m in regex.finditer(inp)][0]
for k,v in myDetails.iteritems():
    print k,v




#*************

If you want just the values back in the order parsed this will work for now.


for match in regex.findall(inp):
    for item in range(len(match)):
        print match[item]
 
Kevin LaTona

With the following code tweaks in Python 2.7.2, I find it works with VERBOSE for me, but not without.


Sorry had a small bleep while writing that last line this AM.

Of course the regex pattern would work in VERBOSE mode as that was how it was presented.

Without VERBOSE, each line of the pattern would have needed to be enclosed in single or double quote marks.


http://docs.python.org/2/library/re.html#re.VERBOSE
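
To make the quoting difference concrete, here is a small sketch comparing the two styles on a single line of the output. The variable names (`line`, `verbose`, `plain`, `broken`) are invented for this example. It relies only on documented `re.VERBOSE` behaviour: unescaped whitespace and `#` comments in the pattern are ignored, so a literal space has to be written as `\ ` or matched with `\s`.

```python
import re

line = "iSCSI Connection State: LOGGED IN"

# VERBOSE style: one triple-quoted string, spaces escaped, comments allowed.
verbose = re.compile(r'''
    iSCSI\ +Connection\ +State:   # label, spaces escaped
    \s+(?P<connState>\w+\s*\w*)   # value
''', re.VERBOSE)

# Non-VERBOSE style: each piece in its own quotes; Python joins adjacent
# string literals, and a plain space in the pattern matches a space.
plain = re.compile(
    r'iSCSI +Connection +State:'
    r'\s+(?P<connState>\w+\s*\w*)'
)

# Pitfall: unescaped spaces vanish in VERBOSE mode, so this pattern really
# means "iSCSIConnectionState:" and cannot match the line.
broken = re.compile(r'iSCSI Connection State:', re.VERBOSE)

print(verbose.search(line).group('connState'))  # LOGGED IN
print(plain.search(line).group('connState'))    # LOGGED IN
print(broken.search(line))                      # None
```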


-Kevin
 
rice.stew

I am parsing the output of an open-iscsi command that contains
several blocks of data for each data set. Each block has the format:

[SNIP]
for your example (same for the original regex).
It looks like it works (Python 2.7.3) and there is something else
breaking the regex.

Bye, Andreas

Indeed, "there is something else breaking the regex." A note if you are trying this regex: you need more than one block of Target data to see the issues related to scanning multiple instances of the data. My regex works as expected if I leave out those two lines related to the iSCSI Connection and Session states. For now I am scratching my head...
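
For anyone experimenting with multiple blocks, the sketch below shows one general hazard of a lazy `[\s\S]*?` skip and one way around it. It is not claimed to be the cause of the problem in this thread. The two-block input is made up for illustration (the first block deliberately lacks its session-state line), it assumes every block starts with `Target:` at the beginning of a line, and the names `TEXT`, `BLOCK_RE`, `FIELD_RES` and `parse_block` are hypothetical.

```python
import re

# Made-up two-block input; the first block has no "iSCSI Session State" line.
TEXT = """\
Target: iqn.2000-01.com.example:target-a
SID: 1
iSCSI Connection State: LOGGED IN
Target: iqn.2000-01.com.example:target-b
SID: 2
iSCSI Connection State: LOGGED IN
iSCSI Session State: LOGGED_IN
"""

# A single pattern with a lazy skip may run past the next "Target:" line,
# pairing target-a with the session state that belongs to target-b.
spanning = re.search(
    r'Target:\s+(\S+)[\s\S]*?iSCSI\s+Session\s+State:\s+(\w+)', TEXT)
print(spanning.groups())

# Alternative: cut the text into one chunk per "Target:" line first...
BLOCK_RE = re.compile(r'^Target:.*?(?=^Target:|\Z)', re.MULTILINE | re.DOTALL)

# ...then look for each field inside its own chunk only.
FIELD_RES = {
    'iqn': re.compile(r'^[ \t]*Target:[ \t]+(\S+)', re.MULTILINE),
    'SID': re.compile(r'^[ \t]*SID:[ \t]+(\d+)', re.MULTILINE),
    'connState': re.compile(
        r'^[ \t]*iSCSI[ \t]+Connection[ \t]+State:[ \t]+(.+?)[ \t]*$',
        re.MULTILINE),
    'sessionState': re.compile(
        r'^[ \t]*iSCSI[ \t]+Session[ \t]+State:[ \t]+(.+?)[ \t]*$',
        re.MULTILINE),
}

def parse_block(block):
    """Return a dict of field values for one block; missing fields are None."""
    result = {}
    for name, field_re in FIELD_RES.items():
        m = field_re.search(block)
        result[name] = m.group(1) if m else None
    return result

parsed = [parse_block(b) for b in BLOCK_RE.findall(TEXT)]
for entry in parsed:
    print(entry)
```

With the per-block approach, a missing line shows up as `None` for that block instead of silently borrowing a value from the next one.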
 
rice.stew

Sorry had a small bleep while writing that last line this AM.

Of course the regex pattern would work in VERBOSE mode as that was how it was presented.

Without VERBOSE, each line of the pattern would have needed to be enclosed in single or double quote marks.

http://docs.python.org/2/library/re.html#re.VERBOSE

-Kevin

Yes. I tested with and without re.VERBOSE along with the required quoting changes. For both cases the oddities persist. Why [\s\S]+ is necessary between the two iSCSI Connection/Session lines is a mystery -- \s+? or similar should be sufficient to swallow the whitespace.
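
One way to investigate, offered as a diagnostic sketch rather than an answer: capture whatever sits between the connection-state value and the next `iSCSI Session State:` label, and print its `repr()`. Anything that `\s` would not match (extra text, control characters) then becomes visible. The names `GAP_RE` and `show_gaps` are invented for this example, and the three-line `sample` is just the relevant part of the block posted earlier.

```python
import re

# Capture the rest of the connection-state line as "value", then lazily
# capture everything up to the session-state label as "gap".
GAP_RE = re.compile(
    r'iSCSI[ \t]+Connection[ \t]+State:[ \t]*(?P<value>[^\r\n]*)'
    r'(?P<gap>[\s\S]*?)'
    r'iSCSI[ \t]+Session[ \t]+State:'
)

def show_gaps(text):
    """Return (value, gap) pairs, one per connection/session state pair."""
    return [(m.group('value'), m.group('gap')) for m in GAP_RE.finditer(text)]

sample = (
    "SID: 154\n"
    "iSCSI Connection State: LOGGED IN\n"
    "iSCSI Session State: LOGGED_IN\n"
)

for value, gap in show_gaps(sample):
    print('%r %r' % (value, gap))
```

For the pasted sample the gap is a single newline, which `\s*` would match. Running `show_gaps()` on the real captured command output instead of `sample` would show whether something else is hiding between the two lines.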

--Eric
 