How can we get to the end of a quote inside a string

rajmohan.h

Hi all,
Suppose I have a string which contains quotes inside quotes -
single and double quotes interchangeably -
s = "a1' b1 " c1' d1 ' c2" b2 'a2"
I need to start at b1 and end at b2 - i.e. I have to parse the
single quote strings from inside s.

Is there an existing string quote parser which I can use or
should I write a parser myself?

If somebody could help me on this I would be much obliged.

Regards
kR/\/
 
Antoon Pardon

You could use a combination of split and join in this case.

#use a single quote as a separator to split the string into a list of substrings
ls = s.split("'")

#remove what comes before the first and after the last single quote
ls = ls[1:-1]

#reassemble the string between the outermost single quotes.
s = "'".join(ls)

#strip spaces in front and after if you wish
s = s.strip()
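
The same "between the outermost single quotes" result can also be had by slicing, without building the intermediate list. This is only a sketch: the function name between_outer_quotes is made up for illustration, and it assumes the string contains at least two occurrences of the quote character.

```python
def between_outer_quotes(s, quote="'"):
    """Return the text between the first and last occurrence of `quote`.

    Hypothetical helper, not part of any library.
    """
    first = s.find(quote)
    last = s.rfind(quote)
    if first == -1 or first == last:
        raise ValueError("need at least two %r characters" % quote)
    return s[first + 1:last]


# the example string from the question, written as a valid Python literal
s = "a1' b1 \" c1' d1 ' c2\" b2 'a2"
inner = between_outer_quotes(s)
print(inner.strip())
```

Note that neither version checks that the quotes in between are properly paired; both simply take everything between the outermost single quotes.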
 
Paul McGuire

Pyparsing defines a helper method called nestedExpr - typically it is
used to find nesting of ()'s, or []'s, etc., but I was interested to
see if I could use nestedExpr to match nested ()'s, []'s, AND {}'s all
in the same string (like we used to do in our algebra class to show
nesting of higher levels than parens - something like "{[a + 3*(b-c)]
+ 7}" - that is, ()'s nest within []'s, and []'s nest within {}'s).
This IS possible, but it uses some advanced pyparsing methods. I
adapted this example to map to your case - this was much simpler, as
""s nest within ''s, and ''s nest within ""s. I still keep a stack of
previous nesting, but I'm not sure this was absolutely necessary.
Here is the working code with your example:

from pyparsing import (Forward, oneOf, NoMatch, Literal, CharsNotIn,
                       nestedExpr)

# define special subclass of Forward, that saves previous contained
# expressions in a stack
class ForwardStack(Forward):
    def __init__(self):
        super(ForwardStack, self).__init__()
        self.exprStack = []
        self << NoMatch()

    def __lshift__(self, expr):
        # remember the expression being replaced so pop() can restore it
        self.exprStack.append(self.expr)
        super(ForwardStack, self).__lshift__(expr)
        return self

    def pop(self):
        self.expr = self.exprStack.pop()

# define the grammar
opening = ForwardStack()
closing = ForwardStack()
opening << oneOf(["'", '"'])
closing << NoMatch()
matchedNesting = nestedExpr(opening, closing, CharsNotIn('\'"'),
                            ignoreExpr=None)

# define parse-time callbacks
alternate = {'"': "'", "'": '"'}

def pushAlternate(t):
    # closing expression should match the current opening quote char
    closing << Literal(t[0])
    # if we find the other opening quote char, it is the beginning of
    # a nested quote
    opening << Literal(alternate[t[0]])

def popClosing():
    closing.pop()
    opening.pop()

# when these expressions match, the parse action will be called
opening.setParseAction(pushAlternate)
closing.setParseAction(popClosing)

# parse the test string
s = """ "a1' b1 " c1' d1 ' c2" b2 'a2" """

print(matchedNesting.parseString(s)[0])


Prints:

['a1', [' b1 ', [' c1', [' d1 '], ' c2'], ' b2 '], 'a2']
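
If pulling in pyparsing is more than you want, the same idea (a closing quote must match the most recent opening quote, and the other quote character opens a new level) can be sketched with a plain list used as a stack. This is a minimal illustration, not how nestedExpr works internally; the name parse_nested_quotes is made up, and it assumes there are no escaped quotes in the input.

```python
def parse_nested_quotes(text):
    """Split text into nested lists wherever ' and " quotes alternate.

    Hypothetical helper: a quote char equal to the innermost open quote
    closes that level, any other quote char opens a new nested level.
    """
    root = []
    stack = [(None, root)]   # (quote char that opened the level, its items)
    buf = []                 # characters collected since the last quote

    def flush():
        # move any pending text into the innermost open level
        if buf:
            stack[-1][1].append("".join(buf))
            del buf[:]

    for ch in text:
        if ch in "'\"":
            flush()
            if ch == stack[-1][0]:
                # closes the innermost level; attach it to its parent
                finished = stack.pop()[1]
                stack[-1][1].append(finished)
            else:
                # the other quote char (or first quote): open a new level
                stack.append((ch, []))
        else:
            buf.append(ch)
    flush()
    if len(stack) != 1:
        raise ValueError("unbalanced quotes")
    return root


s = "\"a1' b1 \" c1' d1 ' c2\" b2 'a2\""
result = parse_nested_quotes(s)
print(result[0])
```

Here result[0] is the outer double-quoted group and result[0][1] is the single-quoted group that runs from b1 to b2.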


-- Paul
 
