split parameter line with quotes

teddyber

Hello,

First, I'm a newbie to Python (but I searched the Internet, I swear).
I'm looking for some way to split up a string into a list of
'key=value' pairs. The code should be able to handle this particular
example string:

qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess

I know I can do that with some regexp (I'm currently trying to learn
that), but if there's some other way...

thanks
 
Nanjundi

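This is unconventional, and using eval is not SAFE either.

>>> s = 'qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset="utf-8",algorithm="md5-sess"'
[('algorithm', 'md5-sess'), ('maxbuf', 1024), ('charset', 'utf-8'), ('cipher', 'rc4-40,rc4-56,rc4,des,3des'), ('qop', 'auth,auth-int,auth-conf')]
>>> for k,v in d.iteritems(): print k, '=', v
...
algorithm = md5-sess
maxbuf = 1024
charset = utf-8
cipher = rc4-40,rc4-56,rc4,des,3des
qop = auth,auth-int,auth-conf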

For safe eval, take a look at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/364469

-N
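
When every non-numeric value is quoted, the same keyword-argument trick can be done without eval. The following is only a sketch: it assumes the eval approach above wrapped the string in a dict(...) call (the call itself is not shown in the post), and parse_keyword_string is a made-up name.

```python
import ast

def parse_keyword_string(s):
    """Parse 'a="x",b=1' as if it were the argument list of dict(...)."""
    # Build the syntax tree for dict(<s>) without executing anything.
    call = ast.parse('dict(%s)' % s, mode='eval').body
    # literal_eval accepts only literals, so names and calls are rejected.
    return {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}

s = 'qop="auth,auth-int,auth-conf",maxbuf=1024,charset="utf-8"'
print(parse_keyword_string(s))
```

An unquoted value such as charset=utf-8 parses as the expression utf - 8, which literal_eval rejects with ValueError, so this does not help with strings that mix quoted and unquoted values.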
 
Joshua Kugler

Take a look at the shlex module. You might be able to fiddle with the shlex
object and convince it to split on the commas. But, to be honest, the string
above would be a lot easier to parse if the dividing commas were spaces
instead.

j
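
One way to do that fiddling, as a sketch rather than the definitive approach: in POSIX mode a shlex object strips the quotes, and setting whitespace to a comma with whitespace_split turned on makes it split only on commas outside quotes. The function name split_params is made up for illustration.

```python
import shlex

def split_params(line):
    """Split 'k="a,b",n=1' into [('k', 'a,b'), ('n', '1')]."""
    lexer = shlex.shlex(line, posix=True)
    lexer.whitespace = ','          # treat only commas as separators
    lexer.whitespace_split = True   # split on those separators alone
    lexer.commenters = ''           # do not treat '#' as a comment start
    pairs = []
    for token in lexer:
        # Split on the first '=' only, in case a value contains one.
        key, _, value = token.partition('=')
        pairs.append((key, value))
    return pairs

line = ('qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",'
        'maxbuf=1024,charset=utf-8,algorithm=md5-sess')
for key, value in split_params(line):
    print(key, '=', value)
```

All values come back as strings, whether or not they were quoted in the input, so maxbuf would still need an int() call if a number is wanted.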
 
teddyber

Nanjundi said:
>>> s = 'qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset="utf-8",algorithm="md5-sess"'

Thanks for that. The problem is that I don't have charset="utf-8" but
charset=utf-8. Sometimes the value is quoted, sometimes not!
 
Paul McGuire

Those quoted strings sure are pesky when you try to split along
commas. Here is a solution using pyparsing - note the argument field
access methods at the bottom. Also, the parse action attached to
integer will do conversion of the string to an int at parse time.

More info on pyparsing at http://pyparsing.wikispaces.com.

-- Paul

from pyparsing import Word, nums, alphas, quotedString, \
delimitedList, Literal, CharsNotIn, Dict, Group, \
removeQuotes

arg = '''qop="auth,auth-int,auth-conf",
cipher="rc4-40,rc4-56,rc4,des,3des",
maxbuf=1024,charset=utf-8,algorithm=md5-sess'''

# format is: delimited list of key=value groups, where value is
# a quoted string, an integer, or a non-quoted string up to the
# next ',' character
key = Word(alphas)
EQ = Literal("=").suppress()
integer = Word(nums).setParseAction(lambda t:int(t[0]))
quotedString.setParseAction(removeQuotes)
other = CharsNotIn(",")
val = quotedString | integer | other

# parse each key=val arg into its own group
argList = delimitedList( Group(key + EQ + val) )
args = argList.parseString(arg)

# print the parsed results
print args.asList()
print

# add dict-like retrieval capabilities, by wrapping in a Dict expression
argList = Dict(delimitedList( Group(key + EQ + val) ))
args = argList.parseString(arg)

# print the modified results, using dump() (shows dict entries too)
print args.dump()

# access the values by key name
print "Keys =", args.keys()
print "cipher =", args["cipher"]

# or can access them like attributes of an object
print "maxbuf =", args.maxbuf


Prints:

[['qop', 'auth,auth-int,auth-conf'], ['cipher', 'rc4-40,rc4-56,rc4,des,3des'], ['maxbuf', 1024], ['charset', 'utf-8'], ['algorithm', 'md5-sess']]

[['qop', 'auth,auth-int,auth-conf'], ['cipher', 'rc4-40,rc4-56,rc4,des,3des'], ['maxbuf', 1024], ['charset', 'utf-8'], ['algorithm', 'md5-sess']]
- algorithm: md5-sess
- charset: utf-8
- cipher: rc4-40,rc4-56,rc4,des,3des
- maxbuf: 1024
- qop: auth,auth-int,auth-conf
Keys = ['maxbuf', 'cipher', 'charset', 'algorithm', 'qop']
maxbuf = 1024
cipher = rc4-40,rc4-56,rc4,des,3des
 
Russ P.

The problem is that you are using commas for delimiters at two
different levels.

I would start by replacing the commas between quotation marks with
some other delimiter, such as spaces or semicolons. To do that, step
through each character and keep a count of quotation marks. While the
count is odd, replace each comma with the selected alternative
delimiter. While the count is even, leave the comma. [An alternative
would be to replace the commas outside the quotation marks.]

Once that is done, the problem is straightforward. Split the string on
commas (using string.split(",")). Then split each item in the list by
"=". Use the [0] element for the key, and use the [1] element for the
value (first stripping off the quotation marks if necessary). If you
need to further split each of the values, just split on whatever
delimiter you chose to replace the commas.
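
The steps above can be sketched roughly as follows. The function name parse_params and the choice of ';' as the replacement delimiter are assumptions made for illustration, and the sketch assumes that no value contains ';' or an escaped quotation mark.

```python
def parse_params(line, inner_sep=';'):
    """Parse 'k="a,b",n=1' into {'k': ['a', 'b'], 'n': ['1']}."""
    quote_count = 0
    chars = []
    for ch in line:
        if ch == '"':
            quote_count += 1
        elif ch == ',' and quote_count % 2 == 1:
            # Odd count: we are inside quotes, so swap the comma out.
            ch = inner_sep
        chars.append(ch)

    result = {}
    # Only the top-level commas are left, so a plain split is safe now.
    for item in ''.join(chars).split(','):
        key, value = item.split('=', 1)
        # Strip the quotes, then split the value on the inner delimiter.
        result[key.strip()] = value.strip().strip('"').split(inner_sep)
    return result

line = ('qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",'
        'maxbuf=1024,charset=utf-8,algorithm=md5-sess')
for key, values in parse_params(line).items():
    print(key, '=', values)
```

Every value comes back as a list of strings, so single values such as maxbuf end up as one-element lists.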
 
Russ P.

One more point. Whoever chose the structure of the string you are
parsing didn't do a very good job. If you know that person, you should
tell him or her to use different delimiters at the different levels.
Use commas for one level, and spaces or semicolons for the other
level. Then you won't have to "correct" the string before you parse
it.
 
Reedick, Andrew

In reply to teddyber's message of Friday, January 11, 2008 1:51 PM:

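import re

s = 'qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess'
print(s)

# each match is a (key, value, separator) tuple; the value is either a
# quoted string or a run of non-quote characters up to the next comma
pairs = re.findall(r'(.*?)=(".*?"|[^"]*?)(,|$)', s)
print(pairs)

for i in pairs:
    print("%s = %s" % (i[0], i[1].strip('"')))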


Output:
qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess

[
('qop', '"auth,auth-int,auth-conf"', ','),
('cipher', '"rc4-40,rc4-56,rc4,des,3des"', ','),
('maxbuf', '1024', ','),
('charset', 'utf-8', ','),
('algorithm', 'md5-sess', '')
]

qop = auth,auth-int,auth-conf
cipher = rc4-40,rc4-56,rc4,des,3des
maxbuf = 1024
charset = utf-8
algorithm = md5-sess
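
A variant of the same regex idea, offered only as a sketch: named groups capture the quoted value without its quotes, so no strip('"') is needed afterwards, and the result is collected into a dict. The names PAIR and parse_pairs are made up for illustration, and the pattern assumes that quoted values contain no escaped quotes.

```python
import re

PAIR = re.compile(r'''
    (?P<key>[^=,\s]+) =         # key, then an equals sign
    (?: "(?P<quoted>[^"]*)"     # quoted value, commas allowed inside
      | (?P<bare>[^,]*) )       # or bare value up to the next comma
    (?: , | $ )                 # separator or end of string
''', re.VERBOSE)

def parse_pairs(s):
    """Return a dict of key -> value, with surrounding quotes removed."""
    result = {}
    for m in PAIR.finditer(s):
        value = m.group('quoted')
        if value is None:       # the bare alternative matched instead
            value = m.group('bare')
        result[m.group('key')] = value
    return result

s = ('qop="auth,auth-int,auth-conf",cipher="rc4-40,rc4-56,rc4,des,3des",'
     'maxbuf=1024,charset=utf-8,algorithm=md5-sess')
for key, value in parse_pairs(s).items():
    print(key, '=', value)
```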




 
teddyber

I know this is some kind of bad design, but the problem is that I
receive this string from a Jabber server and I cannot do anything to
change it. I should still check whether that's a correct implementation
of the Jabber protocol...

 
teddyber

Here's the solution I have for the moment:

import shlex

t = shlex.shlex(data)
t.wordchars = t.wordchars + "/+.-"
r = ''
while True:
    token = t.get_token()
    if not token:
        break
    # turn the commas between pairs into spaces; keep everything else
    if token != ',':
        r = r + token
    else:
        r = r + ' '
self.DEBUG(r, 'ok')
for pair in r.split(' '):
    key, value = pair.split('=', 1)
    print(key + ':' + value)

I know this is still not perfect, but I've come a long way from very
bad PHP habits! :o)
And thanks for your help!
 
Ricardo Aráoz

Maybe you'd like this:
3des",maxbuf=1024,charset=utf-8,algorithm=md5-sess'
dict(zip([k[-1].strip() for k in (j.split(',') for j in ''.join(i
for i in x if i != '"').split('='))][:-1], [k[:-1] or k for k in
(j.split(',') for j in ''.join(i for i in x if i != '"').split('='))][1:]))

{'maxbuf': ['1024'], 'cipher': ['rc4-40', 'rc4-56', 'rc4', 'des', '
3des'], 'charset': ['utf-8'], 'algorithm': ['md5-sess'], 'qop': ['
auth', 'auth-int', 'auth-conf']}
 
