shlex.split != shlex.shlex get_token til eof

p.lavarre

How can I instantiate shlex.shlex to behave like shlex.split does?

I see shlex.split gives me what I want:

import shlex
print shlex.split("1.2e+3")[0] # 1.2e+3

But every doc'ed instantiation of shlex.shlex surprisingly gives me
something else:

s1 = shlex.shlex("1.2e+3", None, False)
print s1.get_token() # 1
s2 = shlex.shlex("1.2e+3", None, True)
print s2.get_token() # 1

I can get closer to the shlex.split state by hacking the shlex.shlex
state before the first get_token invocation:

s3 = shlex.shlex("1.2e+3", None, True)
s3.wordchars += ".+-"
print s3.get_token() # 1.2e+3

But how can I know if I've discovered the last hack that I need?
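One rough way to probe that question is to run the hacked lexer and shlex.split over a handful of inputs and list the disagreements. This is only a sketch: the helper name hacked_first_token and the sample strings are made up for illustration, and a clean result on a few samples would prove nothing about inputs not tried.

```python
import shlex

def hacked_first_token(s):
    # Hypothetical helper: the posix-mode lexer from above with the wordchars hack.
    lex = shlex.shlex(s, None, True)
    lex.wordchars += ".+-"
    return lex.get_token()

# Illustrative inputs only; "/" , ":" and "=" are not covered by the hack.
samples = ["1.2e+3", "//./A:", "a=b"]

# Collect every sample whose first token differs from what shlex.split gives.
mismatches = [s for s in samples if hacked_first_token(s) != shlex.split(s)[0]]
print(mismatches)
```

Every character that can appear inside a token but is missing from wordchars shows up as another mismatch, so extending wordchars one hack at a time never clearly ends.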

I ask because I'm patching a cmd.Cmd app to tweak the interpretation of
tokens. It was calling shlex.split. My class instead subclasses
shlex.shlex to catch the context of instream.tell(). That context lets
me see the significant differences between such sources as r'0xAB' and
r'0x"AB"'.

Can I somehow override just what I mean to change in shlex.split, and
not more?
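One possible shape for such a subclass, sketched under assumptions: the class name TellingLexer, the raw_text attribute, and the tokens_with_source method are all hypothetical, and the settings are chosen to mimic shlex.split. It also leans on shlex reading its input one character at a time from a StringIO, which is an implementation detail rather than a documented promise.

```python
import shlex

class TellingLexer(shlex.shlex):
    """Hypothetical lexer: split-like settings plus the raw text behind each token."""

    def __init__(self, s):
        shlex.shlex.__init__(self, s, posix=True)
        self.whitespace_split = True
        self.commenters = ''
        # Keep our own copy of the input; shlex already uses the name "source".
        self.raw_text = s

    def tokens_with_source(self):
        pairs = []
        while True:
            start = self.instream.tell()
            token = self.get_token()
            if token == self.eof:      # posix mode signals end of input with None
                break
            # Slice out what was consumed, dropping the delimiting whitespace.
            raw = self.raw_text[start:self.instream.tell()].strip()
            pairs.append((token, raw))
        return pairs

pairs = TellingLexer('0xAB 0x"AB"').tokens_with_source()
print(pairs)
```

Both tokens come back as the same string, while the raw slices still show which one was written with quotes.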

Curiously yours, thanks in advance, Pat LaVarre
http://docs.python.org/lib/module-shlex.html
2.4.2#67 Python, 2.3.5#1 Python
 
p.lavarre

> shlex.split gives me what I want ...
> every doc'ed instantiation of shlex.shlex ... gives me something else ...

Aye, the discrepancies are gross & legion - presumably astonishing only
newbies like me.

Here's a more dramatic example:
>>> import shlex
>>> shlex.split("//./PhysicalDrive9 //./Cdrom9 //./Tape9 //./A:")[0]
'//./PhysicalDrive9'
>>> shlex.shlex("//./PhysicalDrive9 //./Cdrom9 //./Tape9 //./A:", "", True).get_token()
'/'
 
p.lavarre

> I see shlex.split gives me what I want ...
> shlex.shlex surprisingly gives me something else ...
> I can get closer ... by hacking ...
> .wordchars += ".+-"

Kindly offline I was told,

Try patching .whitespace_split = True instead. Compare:

shlex.split("//./PhysicalDrive9 //./Cdrom9 //./Tape9 //./A:")

lex = shlex.shlex("//./PhysicalDrive9 //./Cdrom9 //./Tape9 //./A:")
lex.whitespace_split = True
list(lex)

Why and how often this patch makes shlex.shlex and shlex.split agree, I
still don't know - but for these specific examples, it works.
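A small check of that claim, as a sketch only: the helper name split_like is invented here, and the quoted input is an extra example in the spirit of the 0x"AB" case from the first post.

```python
import shlex

def split_like(s):
    # Hypothetical helper: default (non-posix) shlex plus whitespace_split.
    lex = shlex.shlex(s)
    lex.whitespace_split = True
    return list(lex)

# The examples from this thread agree with shlex.split.
for s in ["1.2e+3", "//./PhysicalDrive9 //./Cdrom9 //./Tape9 //./A:"]:
    assert split_like(s) == shlex.split(s)

# Quoted input is one place where the two still part ways:
# the non-posix lexer keeps the quote characters inside the token.
print(split_like('0x"AB"'))
print(shlex.split('0x"AB"'))
```

So whitespace_split alone covers the unquoted examples, but not quoting.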
 
p.lavarre

Kindly offline the answer is:

(a) a Python installation usually includes the library source, and thus
(b) UTSL (use the source, Luke):

$ pwd
C:\Python25\Lib\shlex.py
$ ...
$ pwd
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5
$ grep -A 5 "def split" shlex.py
def split(s, comments=False):
    lex = shlex(s, posix=True)
    lex.whitespace_split = True
    if not comments:
        lex.commenters = ''
    return list(lex)
$

Voila the THREE ways that shlex.split has been overriding the
shlex.shlex defaults: nothing in wordchars, but posix=True,
whitespace_split = True, and (unless comments is true) an empty
commenters. Of these, only the posix override is documented.
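Those three overrides can be folded into a subclass, which gives a starting point that agrees with shlex.split and can then be changed one piece at a time. A minimal sketch, assuming the source shown above; the names SplitLikeLexer and split_via_subclass are hypothetical.

```python
import shlex

class SplitLikeLexer(shlex.shlex):
    """Hypothetical subclass that starts from the same settings as shlex.split."""

    def __init__(self, s, comments=False):
        shlex.shlex.__init__(self, s, posix=True)   # override 1: posix mode
        self.whitespace_split = True                # override 2: split on whitespace only
        if not comments:
            self.commenters = ''                    # override 3: no comment characters

def split_via_subclass(s, comments=False):
    return list(SplitLikeLexer(s, comments))

# Spot-check a few inputs against shlex.split itself.
for s in ['1.2e+3', '//./PhysicalDrive9 //./Cdrom9', '0x"AB" 0xAB', 'a #b', r'a\ b "c d"']:
    assert split_via_subclass(s) == shlex.split(s)
```

From here, overriding get_token or read_token in the subclass changes only what is meant to change.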
 
p.lavarre

Correspondingly, to the Faq Wiki I added:

/// http://pyfaq.infogami.com/installed-index

Q: Where is Python installed on my machine?

A: Binaries in bin/, source in lib/, doc somewhere else. Read the
module source when you find its doc incomplete.

On Linux, ...

On the Mac, straight from Apple:
/System/Library/Frameworks/Python.framework/Versions/*

On the Mac, downloaded yourself:
/Library/Frameworks/Python.framework/Versions/*

On Windows, downloaded yourself:
C:\Python*

///
 
