split a line, respecting double quotes

Jim

Is there some easy way to split a line, keeping together double-quoted
strings?

I'm thinking of
'a b c "d e"' --> ['a','b','c','d e']
I'd also like
'a b c "d \" e"' --> ['a','b','c','d " e']
which omits any s.split('"')-based construct that I could come up with.

Thank you,
Jim
 
vbgunz

Jim said:
Is there some easy way to split a line, keeping together double-quoted
strings?

Using the re module is probably the easiest way I can find, though this
is by no means gospel :)

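import re
# \S+ rather than \S, so multi-character words stay in one piece
rex = re.compile(r'(".*?"|\S+)')
sub = 'a b c "d e"'
# the capturing group makes re.split return the matched tokens as well
res = [x for x in re.split(rex, sub) if not x.isspace()][1:-1]
print res # -> ['a', 'b', 'c', '"d e"']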

Basically: import the re module, compile a pattern, define the string,
build a list comprehension that filters out whitespace, slice off the
empty strings at both ends and print the result. Note that the quoted
token keeps its double quotes. I hope this helps.
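
For comparison, here is a minimal sketch of a regex variant that also strips the surrounding quotes and copes with \" inside a quoted string. It is not code from any poster in this thread; the function name split_quoted and the pattern are illustrative assumptions, and a quote glued to the middle of a word (as in ab"c d") is not handled.

```python
import re

# Hypothetical helper, not from the thread: either a double-quoted string
# (which may contain backslash-escaped characters) or a run of non-spaces.
TOKEN = re.compile(r'"((?:\\.|[^"\\])*)"|(\S+)')

def split_quoted(line):
    tokens = []
    for match in TOKEN.finditer(line):
        quoted, bare = match.group(1), match.group(2)
        if quoted is not None:
            # Drop the backslash from each escaped character, e.g. \" -> "
            tokens.append(re.sub(r'\\(.)', r'\1', quoted))
        else:
            tokens.append(bare)
    return tokens

print(split_quoted('a b c "d e"'))      # ['a', 'b', 'c', 'd e']
print(split_quoted(r'a b c "d \" e"'))  # ['a', 'b', 'c', 'd " e']
```

Using findall/finditer on the tokens themselves avoids the empty strings that re.split leaves at the ends of the list.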
 
faulkner

sorry, i didn't read all your post.
def test(s):
    res = ['']
    in_dbl = False
    escaped = False
    for c in s:
        if in_dbl:
            if escaped:
                res[-1] += c
                if c != '\\':
                    escaped = False
            else:
                res[-1] += c
                if c == '\\':
                    escaped = True
                elif c == '"':
                    res.append('')
                    in_dbl = False
        elif c == ' ':
            res.append('')
        elif c == '"':
            res.append('')
            res[-1] += c
            in_dbl = True
        else:
            res[-1] += c
    while '' in res:
        res.remove('')
    return res
import re
re.findall('\".*\"|\S+', raw_input())
 
Steven Bethard

Jim said:
Is there some easy way to split a line, keeping together double-quoted
strings?

I'm thinking of
'a b c "d e"' --> ['a','b','c','d e']
I'd also like
'a b c "d \" e"' --> ['a','b','c','d " e']
which omits any s.split('"')-based construct that I could come up with.
>>> import shlex
>>> shlex.split('a b c "d e"')
['a', 'b', 'c', 'd e']
>>> shlex.split(r'a b c "d \" e"')
['a', 'b', 'c', 'd " e']

Note that I had to use a raw string in the latter case because otherwise
there's no real backslash in the string::
'a b c "d \\" e"'
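
To make the raw-string point concrete, here is a small sketch (an illustration, not part of the original session; the variable names are made up). Without the r prefix Python consumes the backslash, so shlex sees an unbalanced quote and should raise ValueError instead of returning a list.

```python
import shlex

raw = r'a b c "d \" e"'   # the backslash survives: 14 characters
plain = 'a b c "d \" e"'  # Python turns \" into ": 13 characters

tokens = shlex.split(raw)
print(tokens)  # ['a', 'b', 'c', 'd " e']

# The non-raw string ends inside an unterminated quote.
try:
    shlex.split(plain)
    plain_error = None
except ValueError as exc:
    plain_error = str(exc)
print(plain_error)
```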

STeVe
 
vbgunz

import re
rex = re.compile(r'(".*?"|\S+)')
sub = 'a b c "d e"'
res = [x for x in re.split(rex, sub) if not x.isspace()][1:-1]
print res # -> ['a', 'b', 'c', '"d e"']

Instead of slicing the result out, you can use this too, which drops
both the empty and the whitespace-only strings:
res = [x for x in re.split(rex, sub) if x.strip()]
 
Sion Arrowsmith

Jim said:
Is there some easy way to split a line, keeping together double-quoted
strings?

I'm thinking of
'a b c "d e"' --> ['a','b','c','d e']
I'd also like
'a b c "d \" e"' --> ['a','b','c','d " e']
which omits any s.split('"')-based construct that I could come up with.
['a', 'b', 'c', 'd e']

It can't quite do the second one, but:
['a', 'b', 'c', 'd " e']
isn't far off.

On the other hand, it's kind of a stupid solution. I'd really go with
shlex as someone suggested up thread.
 
Raymond Hettinger

Sion said:
['a', 'b', 'c', 'd " e']
isn't far off.

On the other hand, it's kind of a stupid solution.

IMO, this solution is on the right track.
FWIW, the StringIO wrapper is unnecessary.
Any iterable will do:
reader(['a b c "d e"'], delimiter=' ')
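
A minimal sketch of that idea, assuming reader here is csv.reader from the standard library; the wrapper name split_with_csv is hypothetical. By default csv expects an embedded quote to be doubled (""); passing escapechar='\\' is one way to accept the backslash form from the original question.

```python
import csv

def split_with_csv(line, **fmt):
    """Split one line on single spaces with csv.reader.

    Extra keyword arguments are passed through as csv format parameters.
    """
    # csv.reader takes any iterable of strings, so a one-item list is enough.
    return next(csv.reader([line], delimiter=' ', **fmt))

print(split_with_csv('a b c "d e"'))     # ['a', 'b', 'c', 'd e']
print(split_with_csv('a b c "d "" e"'))  # ['a', 'b', 'c', 'd " e']
print(split_with_csv(r'a b c "d \" e"', escapechar='\\'))
# ['a', 'b', 'c', 'd " e']
```

One caveat: with a single-space delimiter, two spaces in a row produce an empty string in the result, which str.split() and shlex.split() would not.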


Raymond
 
