Best way to parse a function-call-like string?

mh

I have some strings that look like function calls, e.g.

"junkpkg.f1"
"junkpkg.f1()"
"junkpkg.f1('aaa')"
"junkpkg.f1('aaa','bbb')"
"junkpkg.f1('aaa','bbb','ccc')"
"junkpkg.f1('aaa','with,comma')"

and I need to split them into the function name and list of parms, e.g.

"junkpkg.f1", []
"junkpkg.f1", []
"junkpkg.f1", ['aaa']
"junkpkg.f1", ['aaa','bbb']
"junkpkg.f1", ['aaa','bbb','ccc']
"junkpkg.f1", ['aaa','with,comma']

What's the best way to do this? I would be interested in either
of two approaches:

- a "real" way which comprehensively

- a quick-and-dirty way which handles most cases, so that I
can get my coding partner running quickly while I do the
"real" way. Will the csv module do the right thing for
the parm list?

Many TIA!!
Mark
 
Tim Wintle

quick and dirty

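for s in string_list:
    if "(" in s and s[-1] == ")":
        parts = s.split("(", 1)
        # parts[0] is the name; drop the trailing ")" before splitting the args.
        # Note: a plain split(",") keeps the quotes and breaks on quoted commas.
        fn, args = parts[0], parts[1][:-1].split(",")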


Tim Wintle
 
mh

Robert Kern said:
On 2009-02-26 15:29, (e-mail address removed) wrote:
Use the compiler module to generate an AST and walk it. That's the real
way. For me, that would also be the quick way because I am familiar with
the API, but it may take you a little bit of time to get used to it.

ah, that's just what I was hoping for...
in the meantime, here's my quick and dirty:


import csv

def fakeparse(s):

    print "======="
    argl=[]
    ix=s.find('(')
    if ix == -1:
        func=s
        argl=[]
    else:
        func=s[0:ix]
        argstr=s[ix+1:]
        argstr=argstr[:argstr.rfind(')')]
        print argstr
        for argl in csv.reader([argstr], quotechar="'"):
            pass

    print s
    print func
    print argl

    return func,argl


print fakeparse("junkpkg.f1")
print fakeparse("junkpkg.f1()")
print fakeparse("junkpkg.f1('aaa')")
print fakeparse("junkpkg.f1('aaa','bbb')")
print fakeparse("junkpkg.f1('aaa','bbb','ccc')")
print fakeparse("junkpkg.f1('aaa','xx,yy')")
 
Robert Kern

Use the compiler module to generate an AST and walk it. That's the real
way. For me, that would also be the quick way because I am familiar with
the API, but it may take you a little bit of time to get used to it.
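
The compiler module mentioned here exists only in Python 2. As a rough sketch of the same "build an AST and walk it" idea, assuming Python 3 and the standard ast module instead, something like the following might work. The names dotted_name and parse_call are made up for illustration and do not come from the thread, and the sketch only handles positional literal arguments.

```python
import ast

def dotted_name(node):
    """Rebuild 'a.b.c' from nested ast.Attribute / ast.Name nodes."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return dotted_name(node.value) + "." + node.attr
    raise ValueError("not a dotted name: %s" % ast.dump(node))

def parse_call(s):
    """Return (function_name, [args]) for strings like "pkg.f('a','b')"."""
    # mode="eval" parses a single expression; .body is that expression node.
    expr = ast.parse(s.strip(), mode="eval").body
    if isinstance(expr, ast.Call):
        if expr.keywords:
            raise ValueError("keyword arguments are not handled in this sketch")
        # literal_eval accepts an AST node and evaluates literals only,
        # so nothing from the input string is executed.
        args = [ast.literal_eval(a) for a in expr.args]
        return dotted_name(expr.func), args
    # No parentheses at all: the whole string is just the dotted name.
    return dotted_name(expr), []

if __name__ == "__main__":
    for text in ["junkpkg.f1", "junkpkg.f1()", "junkpkg.f1('aaa','with,comma')"]:
        print(parse_call(text))
```

Because the strings are parsed with Python's own grammar, a quoted comma or parenthesis inside an argument is handled by the tokenizer. The flip side is that Python rules apply: two adjacent string literals such as 'with''ONEapostrophe' are concatenated rather than treated as an escaped quote.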

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
John Machin

I have some strings that look like function calls, e.g.

        "junkpkg.f1"
        "junkpkg.f1()"
        "junkpkg.f1('aaa')"
        "junkpkg.f1('aaa','bbb')"
        "junkpkg.f1('aaa','bbb','ccc')"
        "junkpkg.f1('aaa','with,comma')"

Examples are better than no examples, but a grammar would be a great
help. What about these?

"junkpkg.f1('aaa','with)parenthesis')"
"junkpkg.f1('aaa','with''ONEapostrophe')"

and I need to split them into the function name and list of parms, e.g.

        "junkpkg.f1", []
        "junkpkg.f1", []
        "junkpkg.f1", ['aaa']
        "junkpkg.f1", ['aaa','bbb']
        "junkpkg.f1", ['aaa','bbb','ccc']
        "junkpkg.f1", ['aaa','with,comma']

What's the best way to do this?  I would be interested in either
of two approaches:

   - a "real" way which comprehensively

   - a quick-and-dirty way which handles most cases, so that I
     can get my coding partner running quickly while I do the
     "real" way.  will the csv module do the right thing for
     the parm list?

It should, for "most cases". Any reason you can't try it out for
yourself?

It appears to "work" in the sense that if you have isolated a string
containing the parameters, the csv module can be used to "parse" it:

| Python 2.6.1 (r261:67517, Dec 4 2008, 16:51:00) [MSC v.1500 32 bit (Intel)] on win32
| Type "help", "copyright", "credits" or "license" for more information.
| >>> import csv, cStringIO
| >>> argstring = "'aaa','with,comma'"
| >>> csvargs = lambda s: list(csv.reader(cStringIO.StringIO(s), quotechar="'"))
| >>> csvargs(argstring)
| [['aaa', 'with,comma']]
| >>> csvargs("'aaa','with''ONEapostrophe'")
| [['aaa', "with'ONEapostrophe"]]
| >>>

HTH
John
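
For readers on Python 3, where cStringIO no longer exists, the csv approach from this post can be combined with the name/argument split in one small function. This is only a quick-and-dirty sketch under those assumptions: split_call is a hypothetical name, and skipinitialspace=True is an addition not used in the thread, included so that a space after a comma does not defeat the quoting.

```python
import csv

def split_call(s):
    """Quick-and-dirty split of "pkg.f('a','b')" into ("pkg.f", ['a', 'b'])."""
    s = s.strip()
    ix = s.find("(")
    if ix == -1:
        return s, []                    # bare name, no parentheses
    func = s[:ix]
    # Take the text between the first "(" and the last ")", so a ")" inside
    # a quoted argument does not end the argument list early.
    argstr = s[ix + 1:s.rfind(")")]
    if not argstr.strip():
        return func, []                 # "f()" has no arguments
    # csv.reader accepts any iterable of lines; a one-element list is enough.
    row = next(csv.reader([argstr], quotechar="'", skipinitialspace=True))
    return func, row

if __name__ == "__main__":
    print(split_call("junkpkg.f1('aaa','with)parenthesis')"))
    print(split_call("junkpkg.f1('aaa','with''ONEapostrophe')"))
```

With csv's default doublequote handling, a doubled apostrophe inside a quoted field is read as one literal apostrophe, which matches the session above. The function does no validation: a string with an opening parenthesis but no closing one is silently mangled, which is part of why this counts as the quick way and not the real one.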
 
Paul McGuire

Pyparsing will easily carve up these function declarations, and will
give you some room for easy extension once your parsing job starts to
grow to include other variants (like arguments other than quoted
strings, for instance). With the results names ("name" and "args"),
the parsed fields are easy to get at after the parsing is done.

-- Paul

from pyparsing import Word, alphas, alphanums, delimitedList, \
    Optional, Literal, Suppress, quotedString

LPAR,RPAR = map(Suppress,"()")
ident = Word(alphas+"_", alphanums+"_")
fnName = delimitedList(ident,".",combine=True)
arg = quotedString
fnCall = fnName("name") + Optional(LPAR +
    Optional(delimitedList(arg)) + RPAR, default=[])("args")

tests = """junkpkg.f1
junkpkg.f1()
junkpkg.f1('aaa')
junkpkg.f1('aaa','bbb')
junkpkg.f1('aaa','bbb','ccc')
junkpkg.f1('aaa','with,comma')""".splitlines()

for t in tests:
    fn = fnCall.parseString(t)
    print fn.name
    for a in fn.args:
        print "-",a
    print

Prints:

junkpkg.f1

junkpkg.f1

junkpkg.f1
- 'aaa'

junkpkg.f1
- 'aaa'
- 'bbb'

junkpkg.f1
- 'aaa'
- 'bbb'
- 'ccc'

junkpkg.f1
- 'aaa'
- 'with,comma'
 
mh

Paul McGuire said:
Pyparsing will easily carve up these function declarations, and will

I didn't know about this module, it looks like what I was looking
for... Thanks!!!
 
