tempfile Question

  • Thread starter Gregory Piñero

Gregory Piñero

Hey group,

I have a command line tool that I want to be able to call from a
Python script. The problem is that this tool only writes to a file.

So my solution is to give the tool a temporary file to write to and
then have Python read that file. I figure that's the safest way to
deal with this sort of thing. (But I'm open to better methods).

Here's my code so far. Could anyone tell me the proper way to use
tempfile? This code won't let the tool write to the file because
Python has it locked. But I'm worried that if I close the file then
Windows might take it away? I have no idea.

<code>
import os
import tempfile

PDFTOTEXTPATH = r'C:\xpdf-3.01pl2-win32\xpdf-3.01pl2-win32\pdftotext.exe'

def read_pdf_text(filepath):
    # Temp file for the tool to write into; Python still holds it open here.
    outfile = tempfile.NamedTemporaryFile()
    outfilename = outfile.name
    command = r'"%s" "%s" "%s"' % (PDFTOTEXTPATH, filepath, outfilename)
    # Run pdftotext and wait for it to finish.
    result = os.popen(command).read()
    pdftext = outfile.read()
    outfile.close()
    return pdftext
</code>

Much thanks!
 

John Machin

Hey group,

I have a command line tool that I want to be able to call from a
Python script. The problem is that this tool only writes to a file.

So my solution is to give the tool a temporary file to write to and
then have Python read that file. I figure that's the safest way to
deal with this sort of thing. (But I'm open to better methods).

Here's my code so far. Could anyone tell me the proper way to use
tempfile? This code won't let the tool write to the file because
Python has it locked. But I'm worried that if I close the file then
Windows might take it away? I have no idea.

Me neither, not having faced this situation before. To acquire an idea,
I'd Read The Fantastic Manual:
(my comments enclosed in [])
"""
TemporaryFile( [mode='w+b'[, bufsize=-1[, suffix[, prefix[, dir]]]]])

Return a file (or file-like) object that can be used as a temporary
storage area. The file is created using mkstemp. It will be destroyed as
soon as it is closed (including an implicit close when the object is
garbage collected).[That seems to answer one question] Under Unix, the
directory entry for the file is removed immediately after the file is
created. Other platforms do not support this; your code should not rely
on a temporary file created using this function having or not having a
visible name in the file system.
The mode parameter defaults to 'w+b' so that the file created can be
read and written without being closed. Binary mode is used so that it
behaves consistently on all platforms without regard for the data that
is stored. bufsize defaults to -1, meaning that the operating system
default is used.

The dir, prefix and suffix parameters are passed to mkstemp().


NamedTemporaryFile( [mode='w+b'[, bufsize=-1[, suffix[, prefix[, dir]]]]])

This function operates exactly as TemporaryFile() does, except that the
file is guaranteed to have a visible name in the file system (on Unix,
the directory entry is not unlinked). That name can be retrieved from
the name member of the file object. Whether the name can be used to open
the file a second time, while the named temporary file is still open,
varies across platforms (it can be so used on Unix; it cannot on Windows
NT or later [That seems to answer the other question]). New in version 2.3.
"""

So I'd be thinking about using the (deprecated) mktemp() instead,
perhaps trying to cut down the chance of conflicts by (a) using a prefix
e.g. "pdf2txttmp" and/or (b) using a dir of "." -- then asking the
cognoscenti what are the drawbacks of this approach.

HTH,
John
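
The manual excerpt quoted above says a named temporary file is destroyed as soon as it is closed. A minimal sketch to check that at the interactive prompt follows; the function name `named_tempfile_lifecycle` is made up for illustration, and the code assumes the default delete-on-close behaviour of `NamedTemporaryFile`.

```python
import os
import tempfile


def named_tempfile_lifecycle():
    """Return (exists_while_open, exists_after_close) for a NamedTemporaryFile."""
    tmp = tempfile.NamedTemporaryFile()  # default behaviour: removed on close
    name = tmp.name
    exists_while_open = os.path.exists(name)
    tmp.close()  # this is the point at which the file is destroyed
    exists_after_close = os.path.exists(name)
    return exists_while_open, exists_after_close
```

This suggests that closing the file so that the tool can write to it would also remove it, which is why the rest of the thread looks at other functions.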
 

John Machin

Hey group,

I have a command line tool that I want to be able to call from a
Python script. The problem is that this tool only writes to a file.

Another Fantastic Manual gives another idea:
"""
Pdftotext reads the PDF file, PDF-file, and writes a text
file, text-file. If text-file is not specified, pdftotext
converts file.pdf to file.txt. If text-file is '-', the
text is sent to stdout.
"""

Why not try reading back the child process's stdout? I would have
thought there would be no shortage of examples and help for this idea ...

Cheers,
John
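
A rough sketch of the stdout idea, for what it is worth. It assumes a Python new enough to have `subprocess.run`, and that `pdftotext` is on the PATH (otherwise pass the full path). The helper names `run_and_capture` and `read_pdf_text_stdout` are invented here; only the `-` argument comes from the pdftotext manual quoted above.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run a command and return whatever it wrote to stdout, as bytes."""
    completed = subprocess.run(argv, stdout=subprocess.PIPE, check=True)
    return completed.stdout


def read_pdf_text_stdout(filepath, pdftotext="pdftotext"):
    """Ask pdftotext to write to stdout ('-') instead of a text file."""
    return run_and_capture([pdftotext, filepath, "-"])


def demo():
    # Stand-in child process so the helper can be tried without pdftotext.
    return run_and_capture([sys.executable, "-c", "print('hello')"])
```

With this approach no temporary file is involved at all, so the locking and cleanup questions go away.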
 

Dennis Lee Bieber

John Machin said:
The dir, prefix and suffix parameters are passed to mkstemp().
So I'd be thinking about using the (deprecated) mktemp() instead,

I think you passed over the mkstemp() variation. Granted, it, too,
returns an opened file, along with the full pathname of the file, but it
requires the caller to handle eventual disposal of the file.

Merely close the opened file; pass the pathname to the subprocess,
await completion of subprocess, reopen the file for use in Python...
Then at the end, close the file and use the pathname to delete the file
from the system.
--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
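
The mkstemp() steps described above (close the handle, hand the pathname to the subprocess, reopen, delete) might look roughly like this. It is a sketch only: `run_tool_via_tempfile` and its `make_argv` callback are hypothetical names, and `subprocess.run` assumes a reasonably recent Python.

```python
import os
import subprocess
import sys
import tempfile


def run_tool_via_tempfile(make_argv, suffix=".txt"):
    """Let an external tool write to a temp file, then return the file's bytes.

    make_argv is a callable that takes the temp file's path and returns
    the argument list for the tool.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # release our handle so the tool can open the file
    try:
        subprocess.run(make_argv(path), check=True)  # wait for completion
        with open(path, "rb") as result_file:  # reopen for use in Python
            return result_file.read()
    finally:
        os.remove(path)  # the caller is responsible for disposal


def demo():
    # Stand-in tool: a child Python process that writes to the given path.
    script = "import sys; open(sys.argv[1], 'w').write('hi')"
    return run_tool_via_tempfile(lambda p: [sys.executable, "-c", script, p])
```

For the original question, the callback would be something like `lambda p: [PDFTOTEXTPATH, filepath, p]`.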
 

John Machin

I think you passed over the mkstemp() variation. Granted, it, too,
returns an opened file, along with the full pathname of the file, but it
requires the caller to handle eventual disposal of the file.

Merely close the opened file; pass the pathname to the subprocess,
await completion of subprocess, reopen the file for use in Python...
Then at the end, close the file and use the pathname to delete the file
from the system.

I passed over mkstemp() because (according to my reading of the manual),
mkstemp() requires an *extra* step (close the file), leaving the
situation then *exactly* the same as with mktemp() i.e. some pirate
process may molest the file before the caller's child process can open
the file.

Cheers,
John
 

Steve Holden

John said:
I passed over mkstemp() because (according to my reading of the manual),
mkstemp() requires an *extra* step (close the file), leaving the
situation then *exactly* the same as with mktemp() i.e. some pirate
process may molest the file before the caller's child process can open
the file.
Surely if you set permissions correctly on /tmp (sticky bit, to require
ownership for deletion) and you create your temporary file with sensible
ownership and permissions, then rogue processes without root privileges
can't do anything bad to your files. Or am I wrong?

Of course if a rogue process has root privileges then all security bets
are off anyway.

regards
Steve
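
For anyone who wants to check the two conditions mentioned above on their own system, here is a small sketch. The function names are made up, and the results are platform dependent: the sticky bit and owner-only permission bits are Unix concepts and will not mean much on Windows.

```python
import os
import stat
import tempfile


def tmp_dir_is_sticky():
    """True if the default temp directory has the sticky bit set."""
    mode = os.stat(tempfile.gettempdir()).st_mode
    return bool(mode & stat.S_ISVTX)


def mkstemp_permissions():
    """Return the permission bits of a freshly created mkstemp() file."""
    fd, path = tempfile.mkstemp()
    try:
        return stat.S_IMODE(os.fstat(fd).st_mode)
    finally:
        os.close(fd)
        os.remove(path)
```

On a typical Unix system one would expect `oct(mkstemp_permissions())` to show access for the owner only, but that is something to verify rather than assume.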
 

Dennis Lee Bieber

I passed over mkstemp() because (according to my reading of the manual),
mkstemp() requires an *extra* step (close the file), leaving the
situation then *exactly* the same as with mktemp() i.e. some pirate
process may molest the file before the caller's child process can open
the file.
mktemp() creates ONLY the file name, but not the file itself. This
means another process calling mktemp() has the possibility of generating
the SAME file name before the first opens/creates the named file.

mkstemp() not only creates the file name, but creates the physical
file. A second process calling mkstemp(), therefore, will NOT result in
the same file name being used.


-=-=-=-=-
mkstemp( [suffix[, prefix[, dir[, text]]]])

Creates a temporary file in the most secure manner possible. There are
no race conditions in the file's creation, assuming that the platform
properly implements the O_EXCL flag for os.open(). The file is readable
<snip>
mkstemp() returns a tuple containing an OS-level handle to an open file
(as would be returned by os.open()) and the absolute pathname of that
file, in that order. New in version 2.3.

<snip>

mktemp( [suffix[, prefix[, dir]]])
Return an absolute pathname of a file that did not exist at the time the
call is made. The prefix, suffix, and dir arguments are the same as for
mkstemp().
Warning: Use of this function may introduce a security hole in your
program. By the time you get around to doing anything with the file name
it returns, someone else may have beaten you to the punch.
-=-=-=-=-
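
The difference is easy to see at the interactive prompt. A minimal sketch, with an invented function name, assuming nothing else creates a file of the same name in between the calls:

```python
import os
import tempfile


def compare_mktemp_and_mkstemp():
    """Return (file_exists_after_mktemp, file_exists_after_mkstemp)."""
    # mktemp() only picks a name; nothing is created on disk.
    name_only = tempfile.mktemp()
    exists_after_mktemp = os.path.exists(name_only)

    # mkstemp() creates the file and returns an open OS-level handle.
    fd, path = tempfile.mkstemp()
    try:
        exists_after_mkstemp = os.path.exists(path)
    finally:
        os.close(fd)
        os.remove(path)
    return exists_after_mktemp, exists_after_mkstemp
```

The gap between getting the name from mktemp() and creating the file is where another process can get in first.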
 

John Machin

mktemp() creates ONLY the file name, but not the file itself. This
means another process calling mktemp() has the possibility of generating
the SAME file name before the first opens/creates the named file.

Thanks, Dennis. You are quite correct. I'm a dill: I read """Return an
absolute pathname of a file that did not exist at the time the call is
made""" as implying that the file existed after the call, brushed aside
my brief wonderment about what was deprecable about all that, and didn't
even try it at the interactive prompt.

Cheers,
John
 
