escaping characters in filenames

J Kenneth King

I wrote a script to process some files using another program. One thing
I noticed was that both os.listdir() and os.path.walk() return
unescaped file names (i.e. "My File With Spaces & Stuff" instead of
"My\ File\ With\ Spaces\ \&\ Stuff"). I haven't had much success finding
a module or recipe that escapes file names and was wondering if anyone
could point me in the right direction.

As an aside, the script is using subprocess.call() with the "shell=True"
parameter. There isn't really a reason for doing it this way (was just
the fastest way to write it and get a prototype working). I was
wondering if Popen objects were sensitive to unescaped names like the
shell. I intend to refactor the function to use Popen objects at some
point and thought perhaps escaping file names may not be entirely
necessary.

Cheers
 
MRAB

J said:
I wrote a script to process some files using another program. One thing
I noticed was that both os.listdir() and os.path.walk() will return
unescaped file names (ie: "My File With Spaces & Stuff" instead of "My\
File\ With\ Spaces\ \&\ Stuff"). I haven't had much success finding a
module or recipe that escapes file names and was wondering if anyone
could point me in the right direction.
That's only necessary if you're building a command line and passing it
as a string.
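If a command string really is unavoidable, the standard library can do the quoting for a POSIX shell. The following is only a sketch: it assumes Python 3.3 or later, where the function is shlex.quote() (on Python 2 the same function lives at pipes.quote()), and "ls -l" is just a placeholder command.

```python
import shlex

filename = "My File With Spaces & Stuff"

# Quote the name so a POSIX shell (/bin/sh) treats it as one literal word.
quoted = shlex.quote(filename)

# "ls -l" is a placeholder; substitute the program actually being run.
command = "ls -l " + quoted
print(command)
```

This quoting follows /bin/sh rules only; it is not suitable for cmd.exe on Windows.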
As an aside, the script is using subprocess.call() with the "shell=True"
parameter. There isn't really a reason for doing it this way (was just
the fastest way to write it and get a prototype working). I was
wondering if Popen objects were sensitive to unescaped names like the
shell. I intend to refactor the function to use Popen objects at some
point and thought perhaps escaping file names may not be entirely
necessary.
Pass the command line to Popen as a list of strings.
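A minimal sketch of the list form follows. The child process here is a stand-in (a second Python interpreter that echoes its first argument), used only so the example runs anywhere; the real program name would take its place.

```python
import subprocess
import sys

filename = "My File With Spaces & Stuff"

# Each list element reaches the child as exactly one argument, so the
# spaces and the ampersand need no escaping.
cmd = [sys.executable, "-c", "import sys; print(sys.argv[1])", filename]

output = subprocess.check_output(cmd).decode().strip()
print(output)
```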
 
Dave Angel

There are dozens of meanings for escaping characters in strings.
Without some context, we're wasting our time.

For example, if the filename is to be interpreted as part of a URL, then
spaces are escaped by using %20. Exactly who is going to be using this
string you think you have to modify? I don't know of any environment
which expects spaces to be escaped with backslashes.

Be very specific. For example, if a Windows application is parsing its
own command line, you need to know what that particular application is
expecting -- Windows passes the entire command line as a single string.
But of course you may be invoking that application using
subprocess.Popen(), in which case some transformations happen to your
arguments before the single string is built. Then some more
transformations may happen in the shell. Then some more in the C
runtime library of the new process (if it happens to be in C, and if it
happens to use those libraries).

I'm probably not the one with the answer. But until you narrow down
your case, you probably won't attract the attention of whichever person
has the particular combination of experience that you're hoping for.

DaveA
 
Nobody

Note that subprocess.call() is nothing more than:

    def call(*popenargs, **kwargs):
        return Popen(*popenargs, **kwargs).wait()

plus a docstring. It accepts exactly the same arguments as Popen(), with
the same semantics.

If you want to run a command given a program and arguments, you
should pass the command and arguments as a list, rather than trying to
construct a string.

On Windows the value of shell= is unrelated to whether the command is
a list or a string; a list is always converted to string using the
list2cmdline() function. Using shell=True simply prepends "cmd.exe /c " to
the string (this allows you to omit the .exe/.bat/etc extension for
extensions which are in %PATHEXT%).
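To see what that conversion produces, list2cmdline() can be called directly. Treat this as an illustration only: the function is an internal helper of the subprocess module rather than documented API, and the file names are made up.

```python
import subprocess

# Hypothetical arguments; only the one containing spaces needs quoting.
args = ["pdftotext", "My File With Spaces & Stuff.pdf", "out.txt"]

# Build the single command-line string that Windows would receive.
cmdline = subprocess.list2cmdline(args)
print(cmdline)  # pdftotext "My File With Spaces & Stuff.pdf" out.txt
```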

On Unix, a string is first converted to a single-element list, so if you
use a string with shell=False, it will be treated as the name of an
executable to be run without arguments, even if it contains spaces, shell
metacharacters etc.

The most portable approach seems to be to always pass the command as a
list, and to set shell=True on Windows and shell=False on Unix.
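One way to sketch that rule is below. The helper name run() is made up for illustration, and the child process is again a stand-in Python interpreter rather than a real external tool.

```python
import os
import subprocess
import sys


def run(args):
    """Run a command given as a list; use the shell only on Windows."""
    use_shell = (os.name == "nt")
    return subprocess.call(args, shell=use_shell)


# Stand-in child process that exits with status 3.
status = run([sys.executable, "-c", "import sys; sys.exit(3)"])
print(status)
```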

The only reason to pass a command as a string is if you're getting a
string from the user and you want it to be interpreted using the
platform's standard shell (i.e. cmd.exe or /bin/sh). If you want it to be
interpreted the same way regardless of platform, parse it into a
list using shlex.split().
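For example, shlex.split() breaks a string into arguments using POSIX shell quoting rules, whatever the platform. The command string here is invented for illustration.

```python
import shlex

# A command line as a user might type it (hypothetical).
user_input = 'pdftotext "My File With Spaces & Stuff.pdf" out.txt'

# Split it into an argument list suitable for Popen() with shell=False.
args = shlex.split(user_input)
print(args)  # ['pdftotext', 'My File With Spaces & Stuff.pdf', 'out.txt']
```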
 
J Kenneth King

I understand; I think I was headed towards subprocess.Popen() either
way. It seems to handle the problem I posted about. And I got to learn
a little something on the way. Thanks!

Only now there's a new problem in that the output of the program is
different if I run it from Popen than if I run it from the command line.
The program in question is 'pdftotext'. More investigation to ensue.

Thanks again for the helpful post.
 
