Calling a C program from a Python Script

Brad Tilley

Is it possible to write a file open, then read program in C and then
call the C program from a Python script like this:

for root, dirs, files in os.walk(path):
    for f in files:
        try:
            EXECUTE_C_PROGRAM

If possible, how much faster would this be over a pure Python solution?

Thank you,

Brad
 
Grant Edwards

Brad said:
Is it possible to write a file open, then read program in C and then
call the C program from a Python script like this:

Huh? What do you mean "write a file open"? You want to read a
C source file and execute the C source? If you have access to
a C interpreter, I guess you could invoke the interpreter from
python using popen, and feed the C source to it. Alternatively
you could invoke a compiler and linker from C to generate an
executable and then execute the resulting file.

Brad said:
for root, dirs, files in os.walk(path):
    for f in files:
        try:
            EXECUTE_C_PROGRAM

You're going to have to explain clearly what you mean by
"EXECUTE_C_PROGRAM". If you want to, you can certainly run a
binary executable that was generated from C source, (e.g. an
ELF file under Linux or whatever a .exe file is under Windows).

Brad said:
If possible, how much faster would this be over a pure Python
solution?

Solution to what?
 
Istvan Albert

Brad said:
If possible, how much faster would this be over a pure Python solution?

It is like the difference between Batman and Ever.

batman is faster than ever
 
Brad Tilley

Grant said:
Huh? What do you mean "write a file open"? You want to read a
C source file and execute the C source? If you have access to
a C interpreter, I guess you could invoke the interpreter from
python using popen, and feed the C source to it. Alternatively
you could invoke a compiler and linker from C to generate an
executable and then execute the resulting file.




You're going to have to explain clearly what you mean by
"EXECUTE_C_PROGRAM". If you want to, you can certainly run a
binary executable that was generated from C source, (e.g. an
ELF file under Linux or whatever a .exe file is under Windows).

Appears I was finger-tied. I meant "a C program that opens and reads
files" while Python does everything else. How does one integrate C into
a Python script like that?

So, instead of this:

for root, dirs, files in os.walk(path):
    for f in files:
        try:
            x = open(os.path.join(root, f), 'rb')
            data = x.read()
            x.close()

this:

for root, dirs, files in os.walk(path):
    for f in files:
        try:
            EXECUTE_C_PROGRAM

From the Simpsons:
Frink: "Here we have an ordinary square."
Wiggum: "Whoa! Slow down egghead!"
 
It's me

I would expect C to run circles around the same operation under Python. As
a general rule of thumb, you should use C for time-critical operations
(computer time, that is), and use Python for human-time-critical situations
(you can get a program developed much faster).

I just discovered a magical package called SWIG (http://www.swig.org) that
makes writing C wrappers for Python almost child's play. It's incredible!
Where were these guys years ago when I had to pay somebody moocho money to
develop a script language wrapper for my application!!!
 
Steven Bethard

It's me said:
I would expect C to run circles around the same operation under Python.

You should probably only expect C to run circles around the same
operations when those operations are implemented entirely in Python. In the
specific (trivial) example given, I wouldn't expect Python to be much
slower.

Remember that CPython is implemented in C, and so all the builtin types
(including file) basically execute C code directly. My experience with
Python file objects is that they are quite fast when you're doing simple
things like the example above. (In fact, I usually find that Python is
faster than Java for things like this.)

Of course, the example above is almost certainly omitting some code that
really gets executed, and without knowing what that code does, it would
be difficult to predict exactly what performance gain you would get from
reimplementing it in C. Profile the app first, find out where the tight
spots are, and then reimplement in C if necessary (often, it isn't).
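
A minimal sketch of the "profile first" step, using the standard library's cProfile and pstats modules. The helper name profile_call is made up for this illustration; it is one possible way to get a report, not the only one.

```python
import cProfile
import io
import pstats

def profile_call(func, *args, **kwargs):
    """Run func under the profiler; return (result, text report).

    profile_call is a hypothetical helper name used only in this sketch.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time and show the ten most expensive entries
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()
```

Calling profile_call on the function that walks and reads the files shows whether the time goes into Python-level code or into built-in calls that are already running C.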

STeve
 
Brad Tilley

Steven said:
Remember that CPython is implemented in C, and so all the builtin types
(including file) basically execute C code directly. My experience with
Python file objects is that they are quite fast when you're doing simple
things like the example above.

I'm dealing with a terabyte of files. Perhaps I should have mentioned that.
 
Grant Edwards

Brad said:
Appears I was finger-tied. I meant "a C program that opens and
reads files"

That's still too vague to be meaningful. Just reading a file
seems pointless:

cat foo >/dev/null

Here, "cat" is a C program that "opens and reads" the file
named foo.

Brad said:
while Python does everything else. How does one
integrate C into a Python script like that?

So, instead of this:

for root, dirs, files in os.walk(path):
    for f in files:
        try:
            x = open(os.path.join(root, f), 'rb')
            data = x.read()
            x.close()

this:

for root, dirs, files in os.walk(path):
    for f in files:
        try:
            EXECUTE_C_PROGRAM

So you want the data returned to your Python program? If so,
you can't just execute a C program. If you want to use the
data in the file, you have to read the data from _somewhere_.

You can read it directly from the file, or you can read it from
a pipe, where it was put by the program that read it from the
file. The former is going to be far, far faster.
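
The two routes can be sketched roughly as follows. This is an illustration under assumptions, not a benchmark: the names read_direct, read_via_pipe and CHILD_CMD are invented for the example, and the child process is simply the Python interpreter standing in for a compiled C program that copies a file to stdout.

```python
import subprocess
import sys

def read_direct(path):
    """Read the file's bytes straight from the file."""
    with open(path, "rb") as fh:
        return fh.read()

# Stand-in for a compiled C program that copies a file to stdout.
# With a real executable this would be something like ["./cprog"] (hypothetical).
CHILD_CMD = [
    sys.executable, "-c",
    "import sys, shutil; "
    "shutil.copyfileobj(open(sys.argv[1], 'rb'), sys.stdout.buffer)",
]

def read_via_pipe(path, cmd=CHILD_CMD):
    """Run a child process on the file and collect what it writes to its stdout pipe."""
    completed = subprocess.run(cmd + [path], stdout=subprocess.PIPE, check=True)
    return completed.stdout
```

Both functions return the same bytes, but the second one pays for a process start plus an extra copy through the pipe on every file.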
 
Grant Edwards

Brad said:
I'm dealing with a terabyte of files. Perhaps I should have mentioned that.

And you think you're going to read the entire file consisting
of terabytes of data into memory using either C or Python?
[That's the example you gave.]

Sounds like maybe you need to mmap() the files?

Or at least tell us what you're trying to do so we can make
more intelligent suggestions.
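
For reference, a small sketch of what the mmap() route might look like from Python, using the standard mmap module. The function name count_byte and the task (counting occurrences of a byte string) are assumptions chosen only to have something to do with the mapped data.

```python
import mmap

def count_byte(path, needle):
    """Count occurrences of the bytes object `needle` in a file via a read-only mapping.

    count_byte is a hypothetical helper for this sketch. The file is mapped
    rather than read into one big bytes object; pages are loaded on demand.
    """
    with open(path, "rb") as fh:
        # Length 0 maps the whole file; an empty file cannot be mapped (ValueError)
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            count = 0
            pos = mm.find(needle)
            while pos != -1:
                count += 1
                pos = mm.find(needle, pos + 1)
            return count
```

On a 32-bit process the address space limits how large a single mapping can be, so very large files may still need to be handled in pieces.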
 
Brad Tilley

Grant said:
I'm dealing with a terabyte of files. Perhaps I should have mentioned that.


And you think you're going to read the entire file consisting
of terabytes of data into memory using either C or Python?
[That's the example you gave.]

Sounds like maybe you need to mmap() the files?

Or at least tell us what you're trying to do so we can make
more intelligent suggestions.

It was an overly-simplistic example. I realize that I can't read all of
the data into memory at once. I think that's obvious to most anyone.

I just want to know the basics of using C and Python together when the
need arises, that's all, I don't want to write a book about what exactly
it is that I'm involved in.

I'm going to take It's Me's advice and have a look at SWIG.

Thank you,

Brad
 
Grant Edwards

Brad said:
I just want to know the basics of using C and Python together
when the need arises, that's all, I don't want to write a book
about what exactly it is that I'm involved in.

I'm going to take It's Me's advice and have a look at SWIG.

There's also the ctypes module that lets you load and call
library functions written in C (or anything else with C calling
conventions). IMO, it's a bit easier to use than SWIG, since
you don't have to actually generate/install any python modules
to use it.
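
A minimal sketch of the ctypes approach, assuming a POSIX system where the C runtime can be located. It calls strlen from the C library; the wrapper name c_strlen is made up for the example. (ctypes has been part of the standard library since Python 2.5.)

```python
import ctypes
import ctypes.util

# Locate the C runtime. Assumption: a POSIX system. If find_library returns
# None, CDLL(None) falls back to the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def c_strlen(data):
    """Return the length of a bytes object as computed by C's strlen."""
    return libc.strlen(data)
```

The same pattern applies to a shared library of your own: load it with ctypes.CDLL, declare argtypes and restype for each function, then call it like a Python function.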

But, neither SWIG nor ctypes has anything to do with executing a
_program_ written in C, which is what I thought you were asking
about...
 
Jeff Shannon

Brad said:
I just want to know the basics of using C and Python together when the
need arises, that's all, I don't want to write a book about what
exactly it is that I'm involved in.


Well, there's several different ways of using C and Python together, so
the only meaningful answer we can give you is "It depends on what you're
trying to do."

What your pseudocode seems to show would be appropriate for something
along the lines of os.system(), os.startfile(), or some variant of
popen(). Any of these will run a free-standing executable program. But
they may not be ideal for making data available to Python.

There's ctypes, which will allow you to call functions in a C shared
library (i.e. a DLL, not sure if it works for *nix .so files).

There's also the possibility of writing Python extensions in C. This is
a bit more up-front work, but may make usage easier. The extension
(provided it's not buggy) works just like any other Python module.

SWIG is a tool that automates the wrapping of existing C code into a
Python extension. Whether this is a suitable tool for you depends on
what code you already have, and what responsibilities you're hoping to
pass from Python code into C code.

And do note that, as others have said, calling C code won't
*necessarily* make your program work faster. It's only going to help if
you're replacing slow Python code with faster C code -- not all Python
code is slow, and not all C code is fast, and if you're writing C from
scratch then you want to be sure where the hotspots are and focus on
converting only those areas to C.

Jeff Shannon
Technician/Programmer
Credit International
 
Armin Steinhoff

Brad said:
Is it possible to write a file open, then read program in C and then
call the C program from a Python script like this:

for root, dirs, files in os.walk(path):
    for f in files:
        try:
            EXECUTE_C_PROGRAM

If possible, how much faster would this be over a pure Python solution?

I would compile that C program into a shared library (*.so or *.dll ) in
order to use that shared library with ctypes ... that's the easiest way,
IMHO :) ( http://starship.python.net/crew/theller/ctypes )

Regards

Armin
 
Matt Gerrans

I wouldn't automatically assume that recursing the directories with a Python
script that calls a C program for each file is faster than doing the
processing in Python. For example, I found that using zlib.crc32()
directly in Python was no slower than calling a C program that calculates
CRCs of files. (for huge files, it was important to find the right size
buffer to use and not try to read the whole thing at once, of course -- but
the C program had to do the same thing). However, if all the processing
is done in Python code (instead of a C extension), there probably would be a
big performance difference. It is just a question of whether the overhead
of starting a separate process for each file is more time consuming than the
difference between the Python and C implementations.
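
The buffered CRC approach described above could look roughly like this. It is a sketch: the function name file_crc32 and the 1 MiB default buffer are assumptions, and the best buffer size would have to be found by measurement.

```python
import zlib

def file_crc32(path, chunk_size=1024 * 1024):
    """Compute the CRC-32 of a file by feeding it to zlib.crc32 in chunks.

    file_crc32 and the default chunk_size are illustrative choices.
    """
    crc = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            # The running checksum is passed back in as the starting value
            crc = zlib.crc32(chunk, crc)
    # Mask to an unsigned 32-bit value for consistency across Python versions
    return crc & 0xFFFFFFFF
```

The loop itself is Python, but the checksum work happens inside zlib's C code, which is why it can keep up with a standalone C program.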

The pure Python implementation is probably easier to write, so you can do it
that way and you'll have something that works. *Then* if the performance is
not acceptable, try the other route.

Additionally, depending on how much directory crawling you are doing, you
can just do the whole darned thing in C and save another minute or so.

Anyway, I didn't see the simple answer to your question in this thread (that
doesn't mean it wasn't there). I think you could do something like this:

for root, dirs, files in os.walk(path):
    for f in files:
        try:
            os.system("cprog %s" % os.path.join(root, f))

I prefer naming like this, though:

for directory, subdirs, filenames in os.walk(startpath):
    for filename in filenames:
        ...

(particularly, since "root" will not be the root directory, except maybe
once).
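
A slightly fuller sketch of the same loop, with two assumptions worth flagging: os.walk yields (dirpath, dirnames, filenames) in that order, and subprocess.run is used with an argument list instead of os.system so that filenames containing spaces or shell metacharacters need no quoting. The function name run_on_tree is invented for the example, and the command to run is supplied by the caller.

```python
import os
import subprocess

def run_on_tree(startpath, command):
    """Run command + [filepath] for each file under startpath.

    run_on_tree is a hypothetical helper; `command` is a list such as
    ["cprog"]. Returns a dict mapping each file path to the exit code.
    """
    exit_codes = {}
    for directory, subdirs, filenames in os.walk(startpath):
        for filename in filenames:
            filepath = os.path.join(directory, filename)
            # A list of arguments bypasses the shell entirely
            completed = subprocess.run(command + [filepath])
            exit_codes[filepath] = completed.returncode
    return exit_codes
```

Each call still starts one process per file, so the overhead concern raised above applies unchanged.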

- Matt
 
Caleb Hattingh

Hi Brad

Not that I'm an expert but note:

1. If you already know C, fair enough. You should know what you are
getting into then. I sure as heck don't know it very well at all and I'm
not gonna make that time investment now. MAYBE if I really really needed
the extra speed (but this seems to arise more infrequently than one would
imagine) for something where I couldn't interface with some existing
binary library.

2. The PythonForDelphi crowd makes the creation of native binary
extensions with Delphi pathetically easy (which is about equivalent to the
degree of complexity I can handle). As I said in a previous post, C is
not the only game in town for binary extensions. Of course, I happen to
already know ObjectPascal (Delphi) quite well, so for me it is a good fit
- maybe not so much for you if pascal would be new for you. If both
pascal AND C are new for you, I suspect you will find Delphi a fair bit
easier (& faster) to learn. btw, Works with Kylix also. I don't know
about FPC.

3. As someone said previously, some of the 'builtin' python functionality
is compiled C anyway. This point alone often makes it very difficult to
qualify statements like 'python is slow'. You could even start with the
Cpython source for something like file access and see how you go with
optimization, if *that* performance was not enough for you.

4. Nobody mentioned Pyrex yet, I think it kinda allows you to write C
within your python scripts, and then handles that all intelligently,
compiles the necessary bits, and so on - try a google search for the facts
rather than my broken memory of features.

5. If all you are is curious about interfacing a C extension with Python -
that's cool too. I would be interested in hearing what to look out for in
the learning stage of developing C-extensions, for when I am overcome by
curiosity and feel the need to try it out.

Keep well
Caleb
 
Francis Lavoie

Do we need a special account or something to use the newsgroup instead
of the mailing list?
 
Steve Holden

Francis said:
Do we need a special account or something to use the newsgroup instead
of the mailing list?

Yes, you have to find an NNTP server that carries comp.lang.python. It's
possible your Internet service provider runs such a news server and will
let you access it as a part of your subscription (this is the case for
me). If not then you'll have to contract with a third party news service
(some of these are available for free).

Alternatively, take a look at http://www.gmane.com/, a mailing
list/newsgroup interface service which I believe offers an NNTP service.

regards
Steve
 
