incorrect(?) shlex behaviour

bill

Consider:

['$(which', 'sh)']

Is this behavior correct? It seems that I should
either get one token, or the list
['$','(','which','sh',')'],
but certainly breaking it the way it does is
erroneous.

Can anyone explain why the string is being split
that way?
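For readers who want to reproduce the split described above, here is a minimal sketch in Python 3 (the thread itself uses Python 2.3). The input string `$(which sh)` is an assumption inferred from the output shown, since the post does not state it. The sketch also runs the plain `shlex.shlex` class on the same string for comparison.

```python
import shlex

# Assumed input, inferred from the output quoted in the post.
command = "$(which sh)"

# shlex.split() separates tokens on whitespace only, so "$(" stays
# attached to "which" and ")" stays attached to "sh".
split_tokens = shlex.split(command)
print(split_tokens)  # ['$(which', 'sh)']

# The plain shlex class (non-POSIX, whitespace_split off) emits each
# non-word character as a token of its own.
lexer_tokens = list(shlex.shlex(command))
print(lexer_tokens)  # ['$', '(', 'which', 'sh', ')']
```

So both of the results the post expected are available, depending on which entry point and settings are used.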
 
bill

It gets worse:

.... b = t.read_token()
.... if not b: break
.... print b
....
2
&
['2>&1']

It strikes me that split should be behaving exactly the same way as
read_token, but that may be a misunderstanding on my part of what split
is doing.

However, it is totally bizarre that read_token discards the '>' symbol
in the string! I don't know much about lexical analysis, but it
strikes me that discarding characters is a bad thing.
 
M.E.Farmer

bill said:
It gets worse:

... b = t.read_token()
... if not b: break
... print b
...
2
&
['2>&1']

It strikes me that split should be behaving exactly the same way as
read_token, but that may be a misunderstanding on my part of what split
is doing.

However, it is totally bizarre that read_token discards the '>' symbol
in the string! I don't know much about lexical analysis, but it
strikes me that discarding characters is a bad thing.

From the docs:
split(s[, comments])
Split the string s using shell-like syntax. If comments is False
(the default), the parsing of comments in the given string will be
disabled (setting the commenters member of the shlex instance to the
empty string). This function operates in POSIX mode. New in version
2.3.

Maybe looking at the string method split might help.

['($(which', 'sh)']

From the docs:
read_token()
Read a raw token. Ignore the pushback stack, and do not interpret
source requests. (This is not ordinarily a useful entry point, and is
documented here only for the sake of completeness.)
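The difference between `read_token()` and the ordinary token interface can be sketched as follows, in Python 3. The input `2>&1` is taken from the output quoted earlier in the thread; the helper name `raw_tokens` is hypothetical. In current CPython, a punctuation character that ends a word appears to be saved on the pushback stack, which `read_token()` ignores, so calling it directly makes the `>` look as if it had been discarded.

```python
import shlex

def raw_tokens(text):
    """Collect tokens by calling read_token() directly (hypothetical helper)."""
    lexer = shlex.shlex(text)
    tokens = []
    while True:
        token = lexer.read_token()
        if not token:  # the non-POSIX lexer returns '' at end of input
            break
        tokens.append(token)
    return tokens

# Iterating over the lexer goes through get_token(), which consults the
# pushback stack before reading more input.
cooked = list(shlex.shlex("2>&1"))
print(cooked)  # ['2', '>', '&', '1']

# Calling read_token() directly skips the pushback stack, so the '>'
# that ended the word '2' never comes back.
raw = raw_tokens("2>&1")
print(raw)
```

Using `get_token()` (or simply iterating over the lexer) instead of `read_token()` should therefore return every character of the input.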

# Just like in my first post
''
# Your way
Hth,
M.E.Farmer
 
Donn Cave

bill said:
Consider:

['$(which', 'sh)']

Is this behavior correct? It seems that I should
either get one token, or the list
['$','(','which','sh',')'],
but certainly breaking it the way it does is
erroneous.

Can anyone explain why the string is being split
that way?

Python 2.3.5 (#1, Mar 20 2005, 20:38:20)
[GCC 3.3 20030304 (Apple Computer, Inc. build 1809)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

A lexical analyzer class for simple shell-like syntaxes.


This has a little potential to mislead. Bourne shell
syntax is naturally "shell-like", but it is not "simple" -
as grammars go, it's a notorious mess. In theory, someone
could certainly write Python code to accurately parse Bourne
shell statements, but that doesn't appear to have been the
intention here. The "Parsing Rules" section of the documentation
describes what you can expect, and right off hand I don't see
how the result you got was erroneous.
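To illustrate why the result follows from the parsing rules, here is a rough Python 3 sketch of what `shlex.split()` does, built from the `shlex` class. The helper name `split_like` is hypothetical and the input `$(which sh)` is assumed from the output in the original post. With `whitespace_split` enabled, only whitespace separates tokens and `$`, `(` and `)` have no special meaning, so the only way to keep the whole expression together is to quote it.

```python
import shlex

def split_like(text):
    """Approximate shlex.split(text) using the shlex class (hypothetical helper)."""
    lexer = shlex.shlex(text, posix=True)
    lexer.whitespace_split = True  # only whitespace separates tokens
    lexer.commenters = ""          # comment parsing off, as split() does by default
    return list(lexer)

unquoted = split_like("$(which sh)")
print(unquoted)  # ['$(which', 'sh)']

# Quoting protects the embedded space; POSIX mode strips the quotes.
quoted = split_like('"$(which sh)"')
print(quoted)  # ['$(which sh)']
```

The lexer never tries to match the parentheses, so it cannot know that `$(which sh)` is a single command substitution in Bourne shell terms.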

Donn Cave, (e-mail address removed)
 
