piping with subprocess

Rick Dooling

I spent half a day trying to convert this bash script (on Mac)

textutil -convert html $1 -stdout | pandoc -f html -t markdown -o $2

into Python using subprocess pipes.

It works if I save the above into a shell script called convert.sh and then do

subprocess.check_call(["convert.sh", file, markdown_file])

where file and markdown_file are variables.

But otherwise my piping attempts fail.

Could someone show me how to pipe in subprocess? Yes, I've read the docs, especially

http://docs.python.org/2/library/subprocess.html#replacing-shell-pipeline

But I'm a feeble hobbyist, not a computer scientist.

Thanks

RD
 
Daniel da Silva

Try this:

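from subprocess import check_output
import sys

# sys.argv[1:3] is a list; the % operator needs a tuple to fill both %s slots.
infile, outfile = sys.argv[1:3]
check_output("textutil -convert html %s -stdout | pandoc -f html -t markdown -o %s"
             % (infile, outfile), shell=True)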
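
With shell=True the filenames are pasted into a shell command line, so a name containing spaces or shell metacharacters would be split or interpreted by the shell. A minimal sketch of one way to guard against that, assuming Python 3.3 or later for shlex.quote; the helper name build_command is made up for illustration:

```python
import shlex


def build_command(infile, outfile):
    """Return the shell pipeline with both filenames quoted for the shell."""
    template = "textutil -convert html {} -stdout | pandoc -f html -t markdown -o {}"
    return template.format(shlex.quote(infile), shlex.quote(outfile))


# A filename with a space stays a single argument once it is quoted.
print(build_command("my report.docx", "out.md"))
```

The resulting string can then be passed to check_output(..., shell=True) in the same way as the snippet above.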
 
Peter Otten

Rick said:
But otherwise my piping attempts fail.

It is always a good idea to post your "best effort" failed attempt, if only
to give us an idea of your level of expertise.

Rick said:
Could someone show me how to pipe in subprocess. Yes, I've read the doc,
especially

http://docs.python.org/2/library/subprocess.html#replacing-shell-pipeline

But I'm a feeble hobbyist, not a computer scientist.

Try to convert the example from the above page

"""
output=`dmesg | grep hda`
# becomes
p1 = Popen(["dmesg"], stdout=PIPE)
p2 = Popen(["grep", "hda"], stdin=p1.stdout, stdout=PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]
"""

to your use case. Namely, replace

["dmesg"] --> ["textutil", "-convert", "html", infile, "-stdout"]
["grep", "hda"] --> ["pandoc", "-f", "html", "-t", "markdown", "-o", outfile]

Don't forget to set

infile = ...
outfile = ...

to filenames (with absolute paths, to avoid one source of error).
If that doesn't work, post the code you wrote along with the error messages.
 
Rick Dooling

p1 = subprocess.Popen(["textutil", "-convert", "html", file], stdout=subprocess.PIPE)
p2 = subprocess.check_call(["pandoc", "-f", "html", "-t", "markdown", "-o", markdown_file], stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]

Errors

Traceback (most recent call last):
  File "/Users/me/Python/any2pandoc.py", line 70, in <module>
    convert_word_file(file, markdown_file)
  File "/Users/me/Python/any2pandoc.py", line 59, in convert_word_file
    output = p2.communicate()[0]
AttributeError: 'int' object has no attribute 'communicate'

I get a markdown_file created but it's empty.

Thanks,

RD

ps - Daniel's works fine but I still don't learn to pipe :)
 
Rick Dooling

Okay, sorry. I fixed that obvious goof

p1 = subprocess.Popen(["textutil", "-convert", "html", file], stdout=subprocess.PIPE)
p2 = subprocess.Popen(["pandoc", "-f", "html", "-t", "markdown", "-o", markdown_file], stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]

Now I get no errors, but I still get a blank markdown file.
 
Mark Lawrence

Would you please read and action this
https://wiki.python.org/moin/GoogleGroupsPython to prevent us seeing the
double line spacing in your quoted replies, thanks.
 
Rick Dooling

Okay, blank lines removed. Apologies. I didn't know Google inserted them.

RD
 
Mark Lawrence

No problem, the whole snag is people don't know about this flaw in this
tool until they're told about it.
 
Peter Otten

Rick said:
p1 = subprocess.Popen(["textutil", "-convert", "html", file], stdout=subprocess.PIPE)
p2 = subprocess.check_call(["pandoc", "-f", "html", "-t", "markdown", "-o", markdown_file], stdin=p1.stdout, stdout=subprocess.PIPE)
p1.stdout.close() # Allow p1 to receive a SIGPIPE if p2 exits.
output = p2.communicate()[0]

Errors

Traceback (most recent call last):
  File "/Users/me/Python/any2pandoc.py", line 70, in <module>
    convert_word_file(file, markdown_file)
  File "/Users/me/Python/any2pandoc.py", line 59, in convert_word_file
    output = p2.communicate()[0]
AttributeError: 'int' object has no attribute 'communicate'

I get a markdown_file created but it's empty.

Well, you replaced the Popen() from the example with a check_call() which
uses a Popen instance internally, but does not expose it.
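
To see the difference in isolation, here is a small sketch that runs the Python interpreter itself as the child process. sys.executable is used only so that the example does not depend on textutil or pandoc being installed; it is not part of the original problem.

```python
import subprocess
import sys

# A child command that does nothing and exits successfully.
cmd = [sys.executable, "-c", "pass"]

# check_call() waits for the child and returns its exit status, a plain int.
status = subprocess.check_call(cmd)
print(type(status).__name__, hasattr(status, "communicate"))

# Popen() returns the process object itself, which is what has communicate().
proc = subprocess.Popen(cmd)
proc.communicate()
print(type(proc).__name__, hasattr(proc, "communicate"))
```

This is why p2.communicate() raised AttributeError in the quoted attempt: p2 was the integer exit status, not a process object.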

I recommend that you stick as closely to the example as possible until you
have a working baseline version. I'd try

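import subprocess

# -stdout makes textutil write the HTML to its standard output (as in the
# original shell command) so that pandoc has something to read from the pipe.
textutil = subprocess.Popen(
    ["textutil", "-convert", "html", file, "-stdout"],
    stdout=subprocess.PIPE)
pandoc = subprocess.Popen(
    ["pandoc", "-f", "html", "-t", "markdown", "-o", markdown_file],
    stdin=textutil.stdout)

textutil.stdout.close()  # let textutil receive SIGPIPE if pandoc exits early
pandoc.communicate()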
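
Putting the pieces together, here is a minimal sketch of the two-process pipeline wrapped in functions. The names run_pipeline and convert are made up for illustration. Note that the textutil argument list includes -stdout, as in the original shell command; the attempts posted earlier in the thread left it out, which is a likely reason the Markdown file came out empty, since pandoc would then have had nothing to read from the pipe.

```python
import subprocess


def run_pipeline(first_cmd, second_cmd):
    """Run the equivalent of `first_cmd | second_cmd`; return both exit statuses."""
    first = subprocess.Popen(first_cmd, stdout=subprocess.PIPE)
    second = subprocess.Popen(second_cmd, stdin=first.stdout)
    # Close our copy of the pipe so first_cmd gets SIGPIPE if second_cmd exits early.
    first.stdout.close()
    second.communicate()
    return first.wait(), second.returncode


def convert(infile, outfile):
    """Hypothetical wrapper for the textutil | pandoc pipeline from this thread."""
    return run_pipeline(
        ["textutil", "-convert", "html", infile, "-stdout"],
        ["pandoc", "-f", "html", "-t", "markdown", "-o", outfile],
    )
```

convert() takes the input and output filenames (absolute paths, as suggested earlier) and returns the two exit statuses, so a nonzero value points at the step that failed.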
 
