igor.kulkin
I have a small utility program written in Python that runs pretty
slowly, so I've decided to reimplement it in C.
I did some benchmarking of the Python code's performance. One part
of the program uses Python's standard re (regular expressions)
module to parse the input file. As Python's routines for reading from a
file and for regular expressions are most likely implemented via native
libraries, I would expect C code that reads and parses the
file using exactly the same scheme to show approximately the same
performance (or maybe better).
I was surprised to find out that the C code runs way slower
(actually about 300 times slower!!!) than the same code in Python.
I am running it all under Unix (I guess the version should not really
matter) and am using gcc with -O2 to compile the C code.
The code does exactly the same thing in both languages:
1. includes the regexp library (module re in Python, regex.h in C)
2. compiles the expression (the same expression is used, with small
differences caused by the slightly different syntax accepted by the
libraries)
3. reads the input file line by line
4. parses each line using the compiled regexp (in Python it's the call
of .match(..), in C it's the call of regexec(...)).
NOTHING MORE!
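A minimal sketch of the Python side of these four steps. The pattern and the helper name count_matches are placeholders assumed for illustration; the real expression and file are not shown here.

```python
import re

# Step 2: compile the expression once, outside the loop.
# Hypothetical pattern (a "key = number" line); the real one is not shown.
LINE_RE = re.compile(r"(\w+)\s*=\s*(\d+)")


def count_matches(path):
    """Read the file line by line and count the lines the regexp matches."""
    matched = 0
    with open(path) as f:
        # Step 3: iterate over the input file one line at a time.
        for line in f:
            # Step 4: .match() anchors at the start of the line.
            if LINE_RE.match(line):
                matched += 1
    return matched
```

Timing a call such as count_matches("input.txt") (the file name is a placeholder) against the C program on the same input gives the comparison described above.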
Does anyone know what the problem is?