jason willows
There have been many, many discussions about obfuscating
Python. To my dismay, most of those who answer are the frequent posters,
and they say things such as:
1) what's the point, in theory anything could eventually be decompiled
2) Python is used for mostly internal stuff anyway, cuz it's a "glue"
language, so why bother
3) use licensing and a good lawyer, it's the ONLY way
4) many programmers seem comfortable releasing their Java and .NET and
other interpreted code products into the market, so why not you?
I found most of these comments dismissive, and sometimes quite arrogant.
Frankly, the reason why anyone would want to protect their code is
simple, and it should be respected because we are all programmers: we
want to protect our hard work.
Addressing the above points:
1) Anything could eventually be decompiled.... yes that's true. In a
perfect world. Have you ever tried to decompile C code and make sense
of it? Try a large C program. Good luck, you philosophers.
2) I don't see Python as merely a glue language. I see it as a serious
language for serious applications. Indeed, there are many commercial
examples of this, and Python works very well and is cost-efficient to
use. Incidentally, IBM and Microsoft have adopted Python for various
applications... not that this in itself necessarily means anything.
3) Using licensing and a good lawyer. I'm all for that! Now your code
has been stolen... and you are going to hire a lawyer to fight it out in
court. Months go by, maybe into years. The law offers no guarantees,
except to law makers. You've mortgaged your house to protect your
investment. If you win.
4) Others release their Java and .NET programs. Many obfuscate their
code before doing so, for the very same reasons a Python programmer
would want to do so.
I'm sick and tired of intelligent people acting like idiots.
Programmers should offer solutions, rather than anecdotal discussions
based on obvious points.
Here's my solution, it's not perfect, but it works well:
Use Pyrex, which translates your Python sources (virtually unchanged) to
.c and then compiles and links them. You get natively compiled .pyd
files (i.e. DLLs),
just as though you had written a C program and compiled & linked it
yourself. I used this on all my source files except the one that starts
my program. I used py2exe (latest version) on the source file that
starts my program to create an EXE, and it also puts all my .pyd files
into the library.zip. The result is a program that is as difficult to
understand after decompile as a natively compiled C program, except for
the beginning source file (which should contain only a very small
fraction of your program logic anyway). I have done this on a
client-side Python program that is composed of over 40 .py files of
between 200 and 500 lines each. It uses the wxPython widgets
for the GUI, Twisted for client/server communication, Pyro for
peer-to-peer communication, and the Crypto package for RSA public key
encryption. It runs without problems of any kind, including any that
might have been related to the GUI or Twisted or Pyro or Crypto, and the
increase in speed of execution is very obvious.
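For anyone who wants to try this, here is a rough sketch of what the build script could look like. It is not the exact script from the project above. The names plan_build and run_build are made up for illustration, and it assumes each module has been copied to a .pyx file next to the original (Pyrex compiles .pyx sources) and that an old toolchain with distutils, Pyrex and py2exe is installed. Treat the option names passed to setup() as assumptions to check against your installed versions.

```python
from pathlib import PurePath


def plan_build(source_files, entry_script):
    """Split a project into Pyrex extension specs and the entry script.

    Returns (specs, entry_script), where specs is a list of
    (module_name, [pyx_source]) pairs for every file except the launcher.
    Assumption: each module has a .pyx copy with the same base name.
    """
    entry_name = PurePath(entry_script).name
    specs = []
    for path in sorted(source_files):
        if PurePath(path).name == entry_name:
            continue  # the launcher stays plain Python for py2exe
        pyx_source = str(PurePath(path).with_suffix(".pyx"))
        specs.append((PurePath(path).stem, [pyx_source]))
    return specs, entry_script


def run_build(source_files, entry_script):
    """Compile the modules with Pyrex, then bundle with py2exe.

    Sketch only: needs distutils, Pyrex and py2exe, so the imports are
    done here rather than at module level.
    """
    from distutils.core import setup
    from distutils.extension import Extension
    from Pyrex.Distutils import build_ext
    import py2exe  # importing it registers the "py2exe" command

    specs, entry = plan_build(source_files, entry_script)
    setup(
        script_args=["build_ext", "--inplace", "py2exe"],
        ext_modules=[Extension(name, sources) for name, sources in specs],
        cmdclass={"build_ext": build_ext},
        windows=[entry],  # use console=[entry] for a non-GUI program
    )
```

The point of keeping plan_build separate is that the launcher is the only file left as readable bytecode, so it is worth checking that list before running the real build.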
Note on Pyrex: it can't handle "import *" or the augmented assignment
construct "x += 1". So you may have to do a little bit of recoding, but
that is all the recoding I found that I had to do.
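To illustrate the kind of recoding meant here, a small made-up example (the function count_perfect_squares is hypothetical, not from my program): the star import becomes an explicit import list, and the augmented assignment is spelled out.

```python
# Instead of: from math import *
from math import floor, sqrt


def count_perfect_squares(values):
    """Count the perfect squares in a list of non-negative integers."""
    count = 0
    for value in values:
        root = floor(sqrt(value))
        if root * root == value:
            # Instead of: count += 1
            count = count + 1
    return count
```

The rewritten form is still ordinary Python, so the same file keeps working under the normal interpreter while you test.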
If you would like to discuss this constructively, email me at
(e-mail address removed) . I welcome a good programmer's discussion.