Python open a named pipe == hanging?

Rochester

Hi,

I just found out that the general open file mechanism doesn't work
for named pipes (fifo). Say I wrote something like this and it
simply hangs python:

#!/usr/bin/python

import os

os.mkfifo('my_fifo')

open('my_fifo', 'r+').write('some strings.')
x = os.popen('cat my_fifo').read()

print x


I know I could use a tempfile instead of a fifo in this very
simple case; I just want to know whether there is a standard way of
handling fifos within Python, especially in the non-trivial case
where I want to call a diff-like system program which takes two
files as input. Say I have two Python string objects A and B; I
want to call diff to see what the difference between those two
strings is, and return the finding as a string object C back to Python.
This certainly can be done in this way:

open('tmpfile1', 'w').write(A)
open('tmpfile2', 'w').write(B)
C = os.popen('diff tmpfile1 tmpfile2').read()

But that's kinda awkward isn't it? :) The Bash way of doing this
would be (suppose A is the stdout of prog2, B is the stdout of
prog3):

diff <(prog2) <(prog3) > C

What is the best way of doing this in Python?

Thank you!
 
Rob Williscroft

Rochester wrote in comp.lang.python:
I just found out that the general open file mechanism doesn't work
for named pipes (fifo). Say I wrote something like this and it
simply hangs python:

#!/usr/bin/python

import os

os.mkfifo('my_fifo')

open('my_fifo', 'r+').write('some strings.')
x = os.popen('cat my_fifo').read()

You will probably need to use os.open()

w = os.open( 'my_fifo', os.O_WRONLY )
r = os.open( 'my_fifo', os.O_RDONLY )

though I'm using Windows now, so I can't test it.

print x


I know I could use a tempfile instead of a fifo in this very
simple case; I just want to know whether there is a standard way of
handling fifos within Python.

See also os.pipe()

Rob.
 
Alex Martelli

Rochester said:
Hi,

I just found out that the general open file mechanism doesn't work
for named pipes (fifo). Say I wrote something like this and it
simply hangs python:

....just as it would "hang" any other language...!-). Looks like you may
not be fully cognizant of the semantics of fifos -- understandably,
because the man pages on most systems are not too limpid on the subject.
But there's a good one, e.g., at
<http://opengroup.org/onlinepubs/007908799/xsh/open.html> , and I
selectively quote...:

"""
O_RDWR
Open for reading and writing. The result is undefined if this flag is
applied to a FIFO.
"""

(which means that your open with r+ _might_ cause the system to make
demons fly out of your nose, according to e.g.
<http://everything2.com/index.pl?node_id=922462> ...:); so, _don't_ use
'r+' to open your fifo. But, moreover...:

"""
O_NONBLOCK
When opening a FIFO with O_RDONLY or O_WRONLY set: If O_NONBLOCK is set:

An open() for reading only will return without delay. An open() for
writing only will return an error if no process currently has the file
open for reading.
If O_NONBLOCK is clear:

An open() for reading only will block the calling thread until a thread
opens the file for writing. An open() for writing only will block the
calling thread until a thread opens the file for reading.
"""

This last paragraph is the crucial one: the fundamental semantics of
FIFOs in Unix is for the writing and reading processes (or more modernly
"threads":) to "rendezvous" around their respective calls to open (2),
blocking until both of them reach the open point. The semantics in the
nonblocking case(s) are somewhat weird (to my way of thinking, but then,
my Unix background _is_ somewhat dated!-), but they wouldn't help you
anyway, it seems to me -- looks like you'd like drastically different
semantics (with neither open blocking, or just the reading one), but
Unix just doesn't really offer them...
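
To make the quoted open() rules concrete, here is a minimal sketch in Python 3 (the thread itself uses Python 2, so the syntax differs). The helper names nonblocking_write_open_fails and rendezvous and the temporary directory are made up for illustration. It assumes a POSIX system where a non-blocking write-only open of a FIFO with no reader fails with ENXIO, as it does on Linux.

```python
import errno
import os
import tempfile
import threading


def nonblocking_write_open_fails(path):
    """Return True if a non-blocking write-only open reports 'no reader' (ENXIO)."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        return exc.errno == errno.ENXIO
    os.close(fd)
    return False


def rendezvous(path, payload):
    """Open both ends of the FIFO from two threads and pass payload through it."""
    received = []

    def reader():
        # Blocks in open() until the writer below opens its end.
        with open(path, 'rb') as f:
            # read() returns at end of file, i.e. once the writer has closed.
            received.append(f.read())

    t = threading.Thread(target=reader)
    t.start()
    # Blocks in open() until the reader thread has opened its end.
    with open(path, 'wb') as f:
        f.write(payload)
    t.join()
    return received[0]


workdir = tempfile.mkdtemp()
fifo_path = os.path.join(workdir, 'my_fifo')
os.mkfifo(fifo_path)
no_reader = nonblocking_write_open_fails(fifo_path)
echoed = rendezvous(fifo_path, b'some strings.\n')
os.unlink(fifo_path)
os.rmdir(workdir)
print(no_reader, echoed)
```

Neither blocking open() returns until the other side has arrived, which is the rendezvous described above; the original one-liner hangs because nobody ever opens the other end.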


Alex
 
Rochester

Thank you for your advice. So, it turns out that fifos are quite useless
in Python programming then, which is quite disappointing to me :-(

I am not saying that I _have to_ use a fifo; after all, it is a rather odd
thingy not in fashion since the last ice age... I am just disappointed by
the fact that plain old Bash seems to surpass Python in this special
aspect.

I am new to Python and much more comfortable in Bash programming. A
simple Bash script like this would take advantage of a fifo, and hence
reduce the overhead of creating unnecessary temporary files:

#!/bin/bash

mkfifo my_fifo
echo "this is a string in my fifo!" > my_fifo &
cat my_fifo
rm my_fifo

Isn't it neat?

Anyway, I think every scripting language has its pros and cons. Bash is
probably more flexible in dealing with fifos and multiway pipes (through
the magic mechanism of process substitution).

Thank you!


On Thu, 03 Aug 2006 22:13:56 -0400, Alex Martelli (e-mail address removed) wrote:
 
Donn Cave

Rochester said:
I just found out that the general open file mechanism doesn't work
for named pipes (fifo). Say I wrote something like this and it
simply hangs python:

#!/usr/bin/python

import os

os.mkfifo('my_fifo')

open('my_fifo', 'r+').write('some strings.')
x = os.popen('cat my_fifo').read()

print x

I believe your problem is that, by the time you open the
pipe for read, it has already been closed by its writer.
If you contrive to keep the file pointer around until after
the reader has opened it, then you can read some data from
it. (You can't read "all" the data, though - since you still
have the file open, it has no end of file - so you can't
solve the problem exactly as stated above.)

And the odds are fair that when you get this working, you
will run into some other inconvenient behavior. Named pipes
are a little tricky.

I know I could use a tempfile instead of a fifo in this very
simple case; I just want to know whether there is a standard way of
handling fifos within Python, especially in the non-trivial case
where I want to call a diff-like system program which takes two
files as input. Say I have two Python string objects A and B; I
want to call diff to see what the difference between those two
strings is, and return the finding as a string object C back to Python.
This certainly can be done in this way:

open('tmpfile1', 'w').write(A)
open('tmpfile2', 'w').write(B)
C = os.popen('diff tmpfile1 tmpfile2').read()

But that's kinda awkward isn't it? :) The Bash way of doing this
would be (suppose A is the stdout of prog2, B is the stdout of
prog3):

diff <(prog2) <(prog3) > C

What is the best way of doing this in Python?

Version 1. That's also how shell programmers do it, as far
as I know. That bash thing is a neat gimmick, borrowed from
Plan 9's "rc", but not a standard shell feature and not needed
for conventional UNIX programming. That's my opinion.

Donn Cave, (e-mail address removed)
 
Donn Cave

Rochester said:
Thank you for your advice. So, it turns out that fifos are quite useless
in Python programming then, which is quite disappointing to me :-(

I am not saying that I _have to_ use a fifo; after all, it is a rather odd
thingy not in fashion since the last ice age... I am just disappointed by
the fact that plain old Bash seems to surpass Python in this special
aspect.

Not by a very great margin, but it is indeed very convenient
for process creation and redirection, so when that's the
nature of the task, it's likely the right choice.

I am new to Python and much more comfortable in Bash programming. A
simple Bash script like this would take advantage of a fifo, and hence
reduce the overhead of creating unnecessary temporary files:

#!/bin/bash

mkfifo my_fifo
echo "this is a string in my fifo!" > my_fifo &
cat my_fifo
rm my_fifo

Isn't it neat?

If you like it, good for you. Do you understand why it
works, when your Python one didn't? You put the output
in a background process; did it occur to you to try that
in Python?

Anyway, I think every scripting language has its pros and cons. Bash is
probably more flexible in dealing with fifos and multiway pipes (through
the magic mechanism of process substitution).

Multiway pipes?

Donn Cave, (e-mail address removed)
 
Antoon Pardon

Thank you for your advice. So, it turns out that fifos are quite useless
in Python programming then, which is quite disappointing to me :-(

I am not saying that I _have to_ use a fifo; after all, it is a rather odd
thingy not in fashion since the last ice age... I am just disappointed by
the fact that plain old Bash seems to surpass Python in this special
aspect.

It doesn't.

I am new to Python and much more comfortable in Bash programming. A
simple Bash script like this would take advantage of a fifo, and hence
reduce the overhead of creating unnecessary temporary files:

#!/bin/bash

mkfifo my_fifo
echo "this is a string in my fifo!" > my_fifo &
cat my_fifo
rm my_fifo

Isn't it neat?

Look, you put the echo in the background, because otherwise your script
would block. Nothing stops you from starting a thread in python to open
the fifo in write mode, the thread would block but the main program
could still continue.
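
The thread suggestion above might look roughly like this in Python 3 (the rest of this thread is Python 2). It is only a sketch: write_in_background and the temporary directory are names invented here, and it assumes a POSIX system with cat on the PATH.

```python
import os
import subprocess
import tempfile
import threading


def write_in_background(path, text):
    """Start a thread that opens the FIFO for writing and writes text to it."""
    def writer():
        # open() blocks here until some process opens the FIFO for reading.
        with open(path, 'w') as f:
            f.write(text)

    t = threading.Thread(target=writer)
    t.start()
    return t


workdir = tempfile.mkdtemp()
fifo_path = os.path.join(workdir, 'my_fifo')
os.mkfifo(fifo_path)

writer_thread = write_in_background(fifo_path, 'this is a string in my fifo!\n')
# Only the writer thread is blocked; the main program can start the reader.
output = subprocess.run(['cat', fifo_path], capture_output=True,
                        text=True, check=True).stdout
writer_thread.join()

os.unlink(fifo_path)
os.rmdir(workdir)
print(output, end='')
```

The thread plays the role of the backgrounded echo in the Bash script: it sits in open() until cat shows up as the reader.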

There are also the os.pipe and os.popen calls, which may be more useful
in specific cases.
 
Alex Martelli

Rochester said:
Thank you for your advice. So, it turns out that fifos are quite useless
in Python programming then, which is quite disappointing to me :-(

Uh? How so? They're exactly as useful (or useless) as in any other
language: if you want the open-for-writing to NOT block, you must start
the separate process that open-for-reads _before_ you open-for-write.

I am not saying that I _have to_ use a fifo; after all, it is a rather odd
thingy not in fashion since the last ice age... I am just disappointed by
the fact that plain old Bash seems to surpass Python in this special
aspect.

If your purpose is to launch a large number of separate processes,
that's what bash does for every step, while in Python you have to ask
for separate processes _explicitly_ (in any of many ways, such as
os.system, os.spawn, os.popen, subprocess, and so forth). If you mostly
want to do processing in your "main process", then the "fork and exec a
new process for every small simple thing" approach of Bash is a minus.

But as long as you're happy with just starting new processes, go ahead,
you can easily do it in Python -- just start them in the right order!

I am new to Python and much more comfortable in Bash programming. A
simple Bash script like this would take advantage of a fifo, and hence
reduce the overhead of creating unnecessary temporary files:

#!/bin/bash

mkfifo my_fifo
echo "this is a string in my fifo!" > my_fifo &
cat my_fifo
rm my_fifo

Isn't it neat?

Not at all -- in fact, I find it rather silly. Still, if you want the
same (silly) functionality in Python, it's trivial to do, without all
that many superfluous external processes and backgrounding -- just start
the processes in the right order, e.g., assuming 'cat' is something
potentially important that MUST run in an external process:


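import os

fifoname = 'my_fifo'

os.mkfifo(fifoname)
# start the reader first: popen returns without waiting for cat
op = os.popen('cat ' + fifoname)

# this open returns once cat has opened the fifo for reading
of = open(fifoname, 'w')
print >>of, "some string"
of.close()  # cat sees end-of-file only after the writer closes
os.unlink(fifoname)

print op.read(),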


As long as you start cat (which does the open-for-reading) before trying
to do the open-for-writing, everything will be hunky-dory -- just as the
manpage I selectively quoted clearly implied. This is not an exact
equivalent to your bash script (it only runs cat in an external process,
nothing else) but it would be easier to make it closer (forking more
useless extra processes) if that was somehow a requirement.
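
For readers on Python 3, the same "start the reader first" ordering can be sketched with the subprocess module. This is an illustrative port, not the code posted above: the temporary directory and variable names are assumptions, and it expects a POSIX system with cat available.

```python
import os
import subprocess
import tempfile

workdir = tempfile.mkdtemp()
fifo_name = os.path.join(workdir, 'my_fifo')
os.mkfifo(fifo_name)

# Start the reader first; Popen returns without waiting for cat to finish.
reader = subprocess.Popen(['cat', fifo_name], stdout=subprocess.PIPE, text=True)

# This open() blocks only until cat has opened the FIFO for reading.
with open(fifo_name, 'w') as fifo:
    print('some string', file=fifo)
# Leaving the with block closes the write end, so cat sees end of file.

os.unlink(fifo_name)
result, _ = reader.communicate()
os.rmdir(workdir)
print(result, end='')
```

Unlinking the name right after the write is safe here because both ends are already open by then; the name is only needed for the two open() calls.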

Anyway, I think every scripting language has its pros and cons. Bash is
probably more flexible in dealing with fifos and multiway pipes (through
the magic mechanism of process substitution).

What's "process substitution" -- fork and exec? Note the '&' you need
in your echo command to tell bash "and don't wait() for this process",
otherwise it would also "hang" (since the default is fork-exec-wait). In
Python, I have chosen to phrase "start an external process but don't
wait for it" as the popen line, and finish things off with the last
print statement; but of course there are abundant alternatives.

Basically, if just about all you want is to run external processes,
shell languages such as bash offer neater syntax, because fork, exec and
wait is what they do by default, so you save some "syntax overhead" made
necessary by languages which require this to be made explicit. But in
practice "proper" shell scripts (where, say, over half the lines are
usefully running external processes) far too often degenerate to the
point where most of the logic is in fact internal to the script, pushing
the fraction of lines usefully running external processes to 20% or
less, as you add logic, error-handling, and other glue functionality to
the script. In such situations, languages such as Perl, Python or Ruby
offer far better facilities (and the fact that they don't _by default_
fork, exec and wait for external processes helps them out here).

Thank you!

You're welcome, even though I feel the post you're responding to may not
have deserved thanks since it appears to not have been as useful and
clear as I had meant it to be...


Alex
 
Alex Martelli

Donn Cave said:
I believe your problem is that, by the time you open the
pipe for read, it has already been closed by its writer.

Hmmm, no: the problem is, he never opens the pipe for write, because the
open blocks (will not proceed until somebody opens the fifo for reading,
which in turn won't happen here because the open blocks).

Try:

a = open('my_fifo', 'w')
b = os.popen('cat my_fifo')
a.write ...
a.close()
c = b.read()

this STILL doesn't work, since the very first statement blocks. (I've
also removed the 'r+' mode in favor of 'w', since opening a FIFO for
reading AND writing produces undefined behavior, at least in all Unix
versions and variants I'm familiar with).

If you contrive to keep the file pointer around until after
the reader has opened it, then you can read some data from

The problem is that the writer can never finish opening it; the
workaround is to have a separate process open it for reading first.

it. (You can't read "all" the data, though - since you still
have the file open, it has no end of file - so you can't
solve the problem exactly as stated above.)

Actually, in CPython (1.5.2 to 2.5 included, at least), _IF_ open worked
normally then the file WOULD be closed by the statement

open('my_fifo', 'r+').write('some strings.')

as the file object's reference counts drops to 0 at this statement's
end. (This would NOT necessarily happen in other Python
implementations, such as Jython or IronPython, but I suspect THOSE
implementations wouldn't have os.mkfifo...).

And the odds are fair that when you get this working, you
will run into some other inconvenient behavior. Named pipes
are a little tricky.

Very -- particularly their blocking behavior at open (which appears to
have perhaps tricked you in this case).


Alex
 
Rochester

Thanks Alex, now I think I understand much better the fifo/pipe mechanism
and how Python treats them.

For those who are interested, I would like to restate the problem I was
trying to solve and a working solution (inspired by Alex Martelli's code);
feel free to criticize it:

The problem:

I have an external program which takes two matrices (two text files) as
input and writes its output to stdout; you can take diff as an example.
I want to call this program from within Python, without writing any
temporary files (for performance reasons).

In Bash, this task would be best written as:

#!/bin/bash

diff <(step1) <(step2) | step3

Where step1, step2 and step3 have to be independent external programs.

Now in Python, since there is no exact equivalent of the <() magic, a.k.a.
process substitution, I figured that the best solution would be to create
a pair of fifos and do something like this:

#!/bin/bash

mkfifo fifo1 fifo2
step1 > fifo1 &
step2 > fifo2 &
diff fifo1 fifo2 | step3

And a working Python equivalent code is:

#!/usr/bin/python

import os

# do step1 and step2 in Python, say we end up with something like these:

s1 = "some string\n second line." # results from step1
s2 = "some string\n a different line." # results from step2

os.mkfifo('fifo1')
os.mkfifo('fifo2')

op = os.popen(' '.join(['diff', 'fifo1', 'fifo2'])) # this step is crucial!
print >> open('fifo1', 'w'), s1
print >> open('fifo2', 'w'), s2

os.unlink('fifo1')
os.unlink('fifo2')

x = op.read()

# Now do something about x

print x

P.S.: process substitution is a Bash hack which uses /dev/fd/<n> to send
the output of (usually more than one) processes to a program which takes
(more than one) filename as input. Heuristically speaking, this is like a
multiway pipe. You can find more about this mechanism here:

http://www.tldp.org/LDP/abs/html/process-sub.html
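
The /dev/fd/<n> idea can also be tried from Python without any named fifo, using anonymous pipes from os.pipe. The following Python 3 sketch is an assumption-laden illustration rather than an established recipe: the names _feed and diff_via_dev_fd are invented here, and it relies on the system exposing /dev/fd (as Linux does) and on diff being installed.

```python
import os
import subprocess
import threading


def _feed(fd, data):
    """Write data to a pipe's write end, then close it so the reader sees EOF."""
    with os.fdopen(fd, 'wb') as f:
        f.write(data)


def diff_via_dev_fd(a, b):
    """Run diff on two byte strings, handing it anonymous pipes as /dev/fd/<n>."""
    r1, w1 = os.pipe()
    r2, w2 = os.pipe()
    # One writer thread per pipe, so a full pipe buffer cannot stall the other.
    writers = [threading.Thread(target=_feed, args=(w, d), daemon=True)
               for w, d in ((w1, a), (w2, b))]
    for t in writers:
        t.start()
    try:
        proc = subprocess.run(
            ['diff', '/dev/fd/%d' % r1, '/dev/fd/%d' % r2],
            pass_fds=(r1, r2),  # keep both read ends open in the child
            capture_output=True)
    finally:
        # Close our copies of the read ends once diff has finished.
        os.close(r1)
        os.close(r2)
    for t in writers:
        t.join()
    return proc.stdout


if __name__ == '__main__':
    print(diff_via_dev_fd(b'some string\n second line.\n',
                          b'some string\n a different line.\n').decode())
```

Because subprocess closes every descriptor not listed in pass_fds, the child never holds the write ends, so diff sees end of file as soon as the writer threads finish.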
 
Antoon Pardon

Thanks Alex, now I think I understand much better the fifo/pipe mechanism
and how Python treats them.

For those who are interested, I would like to restate the problem I was
trying to solve and a working solution (inspired by Alex Martelli's code);
feel free to criticize it:

The problem:

I have an external program which takes two matrices (two text files) as
input and writes its output to stdout; you can take diff as an example.
I want to call this program from within Python, without writing any
temporary files (for performance reasons).

In Bash, this task would be best written as:

#!/bin/bash

diff <(step1) <(step2) | step3

Where step1, step2 and step3 have to be independent external programs.

Now in Python, since there is no exact equivalent of the <() magic, a.k.a.
process substitution, I figured that the best solution would be to create
a pair of fifos and do something like this:

#!/bin/bash

mkfifo fifo1 fifo2
step1 > fifo1 &
step2 > fifo2 &
diff fifo1 fifo2 | step3

And a working Python equivalent code is:

I think your code only works because of an artefact that may not hold
in general.

#!/usr/bin/python

import os

# do step1 and step2 in Python, say we end up with something like these:

s1 = "some string\n second line." # results from step1
s2 = "some string\n a different line." # results from step2

os.mkfifo('fifo1')
os.mkfifo('fifo2')

op = os.popen(' '.join(['diff', 'fifo1', 'fifo2'])) # this step is crucial!
print >> open('fifo1', 'w'), s1
print >> open('fifo2', 'w'), s2

This will not work in general. Suppose diff opened the two files
simultaneously and read them in parallel. Since you feed
the whole first file before you start the second, a deadlock could
occur if s1 were sufficiently large.

Something like the following instead of the two print statements
would be better IMO (not tested):

import threading

def cat(fn, st):
    fl = file(fn, 'w')
    fl.write(st)
    fl.close()

threading.Thread(target = cat, args = ('fifo1', s1)).start()
threading.Thread(target = cat, args = ('fifo2', s2)).start()
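
Putting the pieces together, a complete version of the two-thread idea might look like the following in Python 3 (the snippet above is Python 2 and untested). Treat it as a sketch under assumptions: feed and diff_strings are hypothetical names, the fifos live in a temporary directory chosen here, and diff must be on the PATH of a POSIX system.

```python
import os
import subprocess
import tempfile
import threading


def feed(path, text):
    """Open a FIFO for writing, write text, and close it so the reader sees EOF."""
    with open(path, 'w') as f:
        f.write(text)


def diff_strings(a, b):
    """Return the output of diff run on two strings passed through FIFOs."""
    workdir = tempfile.mkdtemp()
    paths = [os.path.join(workdir, name) for name in ('fifo1', 'fifo2')]
    for p in paths:
        os.mkfifo(p)
    try:
        # One writer thread per FIFO, so neither write can stall the other.
        writers = [threading.Thread(target=feed, args=(p, s), daemon=True)
                   for p, s in zip(paths, (a, b))]
        for t in writers:
            t.start()
        # diff exits with status 1 when the inputs differ, so no check=True.
        proc = subprocess.run(['diff'] + paths, capture_output=True, text=True)
        for t in writers:
            t.join()
        return proc.stdout
    finally:
        for p in paths:
            os.unlink(p)
        os.rmdir(workdir)


if __name__ == '__main__':
    print(diff_strings("some string\n second line.\n",
                       "some string\n a different line.\n"), end='')
```

Since each fifo has its own writer, it no longer matters in which order diff opens or reads the two files, or how large the strings are relative to the pipe buffer.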
 
Donn Cave

Hmmm, no: the problem is, he never opens the pipe for write, because the
open blocks (will not proceed until somebody opens the fifo for reading,
which in turn won't happen here because the open blocks).

Try:

a = open('my_fifo', 'w')
b = os.popen('cat my_fifo')
a.write ...
a.close()
c = b.read()

this STILL doesn't work, since the very first statement blocks. (I've
also removed the 'r+' mode in favor of 'w', since opening a FIFO for
reading AND writing produces undefined behavior, at least in all Unix
versions and variants I'm familiar with).

But it does work. I edited that excerpt only to complete
missing parts, and ran it on MacOS X and GNU Linux.

import os
f = '/tmp/r'
try:
    os.unlink(f)
except:
    pass
a = open(f, 'w')
b = os.popen('cat %s' % f)
a.write('chunks\n')
a.close()
c = b.read()
print repr(c)

Actually, in CPython (1.5.2 to 2.5 included, at least), _IF_ open worked
normally then the file WOULD be closed by the statement

open('my_fifo', 'r+').write('some strings.')

Sure, but now we're back to closing the pipe before the reader
gets to it. That doesn't work.


Donn Cave, (e-mail address removed)
 
Alex Martelli

Donn Cave said:
But it does work. I edited that excerpt only to complete
missing parts, and ran it on MacOS X and GNU Linux.

import os
f = '/tmp/r'
try:
    os.unlink(f)
except:
    pass

You forgot to add os.mkfifo(f) here -- so you're writing and reading a
perfectly ordinary file... of course *that* gives no problems!-)


Alex
 
Donn Cave

You forgot to add os.mkfifo(f) here -- so you're writing and reading a
perfectly ordinary file... of course *that* gives no problems!-)

Of course you're right about that, and with that fixed we
see that you're right, the open blocks. In order to avoid
that, you have to open "r+", which after all is what the
original post proposed to do.

os.mkfifo(f)
a = open(f, 'r+')
a.write('chunks\n')
b = os.popen('cat %s' % f)
a.close()
c = b.readline()
print repr(c)

And again, if the close() moves up before the "cat", there's
no data - the read end has to open before the write end closes.

But I cheated when I replaced read() with readline(). The read end
("cat") doesn't detect the end of file, when there are two processes
involved. On NetBSD, when the child process closes the write
descriptor, that operation doesn't entirely succeed and the file
descriptor is left in a `no more information' state. On Linux,
one doesn't see that, but the result is the same. In any case, a
stream that can't have an end is not going to be very satisfactory.
[I don't know why I get tangled up in these named pipe problems,
when I know better myself than to use them!]

Donn Cave, (e-mail address removed)
 
Alex Martelli

Donn Cave said:
Of course you're right about that, and with that fixed we
see that you're right, the open blocks. In order to avoid
that, you have to open "r+", which after all is what the
original post proposed to do.

....and produces "undefined behavior" according to the manpage.

The read end ("cat") doesn't detect the end of file, when there are two processes
involved. On NetBSD, when the child process closes the write
descriptor, that operation doesn't entirely succeed and the file
descriptor is left in a `no more information' state. On Linux,
one doesn't see that, but the result is the same. In any case, a
stream that can't have an end is not going to be very satisfactory.
[I don't know why I get tangled up in these named pipe problems,
when I know better myself than to use them!]

I have no problems using named pipes _according to their documentation_
(so, in particular, no O_RDWR opening, and acceptance of their
thread-blocking behavior on O_RDONLY and O_WRONLY opening; I have no
experience with opening named pipes in non-blocking mode).


Alex
 
