subprocess woes

kj

I'm trying to write a function, sort_data, that takes as argument
the path to a file, and sorts it in place, leaving the last "sentinel"
line in its original position (i.e. at the end). Here's what I
have (omitting most error-checking code):

def sort_data(path, sentinel='.\n'):
    tmp_fd, tmp = tempfile.mkstemp()
    out = os.fdopen(tmp_fd, 'wb')
    cmd = ['/usr/local/bin/sort', '-t', '\t', '-k1,1', '-k2,2']
    p = Popen(cmd, stdin=PIPE, stdout=out)
    in_ = file(path, 'r')
    while True:
        line = in_.next()
        if line != sentinel:
            p.stdin.write(line)
        else:
            break
    in_.close()
    p.stdin.close()
    retcode = p.wait()
    if retcode != 0:
        raise CalledProcessError(retcode, cmd)
    out.write(sentinel)
    out.close()
    shutil.move(tmp, path)


This works OK, except that it does not catch the stderr from the
called sort process. The problem is how to do this. I want to
avoid having to create a new file just to capture this stderr
output. I would like instead to capture it to an in-memory buffer.
Therefore I tried using a StringIO object as the stderr parameter
to Popen, but this resulted in the error "StringIO instance has no
attribute 'fileno'".

How can I capture stderr in the scenario depicted above?

TIA!

kynn
 
Mike Driscoll

According to the docs for subprocess module (which you don't appear to
be using even though that's what you used for your subject line), you
can set stderr to stdout:

http://docs.python.org/library/subprocess.html

You can use cStringIO to create a "file-like" object in memory:

http://docs.python.org/library/stringio.html
 
kj

> According to the docs for subprocess module (which you don't appear to
> be using even though that's what you used for your subject line),

Sorry, I should have been clearer. I *am* using subprocess; that's
where Popen and PIPE come from. I omitted the import lines along
with much else to keep the code concise. Maybe I overdid it.

> you can set stderr to stdout:

This won't do: I'm already using stdout to collect the output of
sort.

> You can use cStringIO to create a "file-like" object in memory:

Nope. I get the same error I get when I try this idea using
StringIO (i.e. "no attribute 'fileno'").

kj
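
A note on why the StringIO idea fails: Popen hands the child process an OS-level file descriptor, and an in-memory buffer has none. The following is a minimal sketch in Python 3 (the thread itself uses Python 2); the helper name has_real_fileno and the toy child command are made up for illustration.

```python
import io
import subprocess
import sys


def has_real_fileno(stream):
    """Return True if `stream` is backed by an OS-level file descriptor."""
    try:
        stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        # Python 2's StringIO lacks the attribute entirely;
        # Python 3's io.StringIO has it but raises UnsupportedOperation.
        return False
    return True


# An in-memory buffer cannot be passed to Popen as stdin/stdout/stderr.
in_memory_ok = has_real_fileno(io.StringIO())

# A pipe, by contrast, is a real descriptor that the parent can read from.
# The child here is a stand-in that only writes to its stderr.
child = [sys.executable, "-c", "import sys; sys.stderr.write('oops')"]
p = subprocess.Popen(child, stderr=subprocess.PIPE)
_, err = p.communicate()
```

After this runs, in_memory_ok is False and err holds the bytes the child wrote to its stderr, with no file created on disk.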
 
kj

Upon re-reading my post I realize that I left out some important
details.


> I'm trying to write a function, sort_data, that takes as argument
> the path to a file, and sorts it in place, leaving the last "sentinel"
> line in its original position (i.e. at the end).

I neglected to mention that the files I intend to use this with
are huge (on the order of 1GB); this is why I want to farm the work
out to GNU's sort.

> Here's what I
> have (omitting most error-checking code):

I should have included the following in the quoted code:

from subprocess import Popen, PIPE, CalledProcessError
 
Chris Rebert

Use a pipe by setting stderr=PIPE?:

p = Popen(cmd, stdin=PIPE, stdout=out, stderr=PIPE)
#...
error_output = p.stderr.read()

Or am I missing something?

Cheers,
Chris
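
For completeness, here is one way the whole function might look with stderr=PIPE, written as a Python 3 sketch rather than the original poster's final code. One assumption goes beyond the thread: because the parent is busy writing to the child's stdin, the sketch drains stderr in a background thread, so that a child that writes a lot to stderr cannot fill the pipe and stall. The cmd parameter and the returned error output are additions for illustration; the default command is the one from the question.

```python
import os
import shutil
import subprocess
import tempfile
import threading


def sort_data(path, sentinel=b".\n",
              cmd=("/usr/local/bin/sort", "-t", "\t", "-k1,1", "-k2,2")):
    """Sort the file at `path` in place, keeping the sentinel line last.

    Returns whatever the child wrote to stderr (bytes). `cmd` is a
    hypothetical parameter added so the sorter can be swapped out.
    """
    tmp_fd, tmp = tempfile.mkstemp()
    try:
        with os.fdopen(tmp_fd, "wb") as out:
            p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=out,
                                 stderr=subprocess.PIPE)
            # Drain stderr concurrently so the child never blocks on a
            # full pipe while we are still feeding its stdin.
            chunks = []
            reader = threading.Thread(
                target=lambda: chunks.append(p.stderr.read()))
            reader.start()
            try:
                with open(path, "rb") as in_:
                    for line in in_:
                        if line == sentinel:
                            break
                        p.stdin.write(line)
                p.stdin.close()
            except BrokenPipeError:
                pass  # child exited early; its exit status is checked below
            retcode = p.wait()
            reader.join()
            p.stderr.close()
            error_output = b"".join(chunks)
            if retcode != 0:
                raise subprocess.CalledProcessError(
                    retcode, cmd, stderr=error_output)
            # The child has finished writing; append the sentinel last.
            out.write(sentinel)
        shutil.move(tmp, path)
    finally:
        # On failure the temporary file is still there; clean it up.
        if os.path.exists(tmp):
            os.remove(tmp)
    return error_output
```

On failure the captured text is available as the stderr attribute of the raised CalledProcessError, and the original file is left untouched because the move only happens after a zero exit status.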
 
kj

> Use a pipe by setting stderr=PIPE?:
> p = Popen(cmd, stdin=PIPE, stdout=out, stderr=PIPE)
> #...
> error_output = p.stderr.read()

Thanks, that did the trick. Thanks!

kj
 
