python obfuscate

Sturla Molden

alister said:
Concentrate on making the product (even) better rather than trying to
hide the unhideable.

I think the number one reason for code obfuscation is an ignorant boss.

Another reason might be to avoid the shame of showing crappy code to the
customer.


Sturla
 
Mark H Harris

Does Python have any good obfuscator?

Others have answered this well, but I thought I would give you
another opinion, perhaps more direct.

Obfuscation (hiding) of your source is *bad*, usually done for one
of the following reasons:
1) Boss is paranoid and fears loss of revenues due to intellectual
property theft.
2) Boss is ignorant of reverse engineering strategies available to
folks who want to get to the heart of the matter.
3) Boss and|or coders are embarrassed for clients (or other coders)
to see their art, or lack thereof. Sometimes this is also wanting to
hide the fact that the product really isn't "worth" the price being
charged for it?!?

There really is no good reason to obfuscate your code.

Currently our company wants to release a product developed in
Python to our customer, but doesn't want others to see the .py code.

This is the age of open source in computer science.

It is far better to develop a strategy and culture of openness.
Everyone benefits; especially your customers. I recommend the GPLv3
license. I also advocate for copyleft. How to leverage openness for
capital gain, you might ask? Answer: provide a value add. It's not just
about your code, or your "product". It should also be about your
service, maintenance, support, packaging, manuals, newsletters, &c.

Deliberately obfuscating your code is a negative; please consider an
alternative strategy.

marcus
 
Michael Torrie

Hi all, does Python have any good obfuscator?

Currently our company wants to release a product developed in Python
to our customer, but doesn't want others to see the .py code.

I googled for a while, but most results just say to use .pyc files.
Is there anything better?

Our product is deployed on Linux bed.

I guess it all depends on what you are really trying to do.

If you're trying to prevent people from making and using unauthorized
copies of your software then even obfuscating the code certainly won't
help that at all.

If you're trying to prevent people from learning trade secrets, then
simply don't put that part of your product in the hands of customers.
And on this point the language doesn't matter. Could be a binary
compiled from C++. Someone could, in theory, reverse-engineer and trace
the code and uncover your secret algorithm. The question is, is it
worth it for the mythical, theoretical, bad guy to do this? Is it worth
it for you to go to lengths to prevent this theoretical possibility?

If you have IP you truly need to keep secret, separate it out from your
application and stick it on a server and talk to it over some form of RPC.
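
A minimal sketch of that split, assuming Python's standard-library XML-RPC modules are an acceptable transport. The function secret_score, the exposed name "score", and the helpers start_server and remote_score are hypothetical names made up for illustration; a real deployment would also need authentication and TLS, which are left out here.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def secret_score(values):
    """Hypothetical stand-in for the proprietary algorithm (server side only)."""
    return sum(v * v for v in values) % 97


def start_server(host="127.0.0.1", port=0):
    """Start the RPC server in a background thread; return (server, url).

    Port 0 asks the OS for any free port, which is convenient for a demo.
    """
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.register_function(secret_score, "score")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    actual_host, actual_port = server.server_address
    return server, f"http://{actual_host}:{actual_port}/"


def remote_score(url, values):
    """What would ship to the customer: a thin client with no algorithm in it."""
    with ServerProxy(url) as proxy:
        return proxy.score(values)


if __name__ == "__main__":
    server, url = start_server()
    try:
        print(remote_score(url, [1, 2, 3]))
    finally:
        server.shutdown()
        server.server_close()
```

Only remote_score and the URL need to reach the customer; the body of secret_score stays on a machine you control.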

If you're simply trying to keep the boss happy, simply wrapping up your
python scripts into a self-contained executable format (say py2exe or a
similar tool) is probably good enough.

Most end users will never know or care what you build the app with, even
if you have a directory full of open .py files. 99% of the users of a
popular ebook app called Calibre never know or care that it's made of
python and that you could go in and see the code. All they care about
is they can click an icon and the program launches and runs.
 
Chris Angelico

This is the age of open source in computer science.

It is far better to develop a strategy and culture of openness. Everyone
benefits; especially your customers. I recommend the GPLv3 license.

While I wholeheartedly agree with the ideal of open source, I don't
like the GPL (any version), because of the annoying restrictions that
end up running through projects. All sorts of projects can't go GPL,
ergo can't use readline. Why? Because readline went for a policy of
"force it to be GPL or nothing". Thank you so much, now I have to faff
around with PostgreSQL to get decent editing keys (and the legality of
that is apparently dubious, but IANAL and it's not my problem anyway).
Postgres is open source, but not GPL, and it's linked to some other
library (I disremember which) that's under a license incompatible with
the GPL.

For my code, I use the MIT license. Do what you like, only don't sue
me. Okay, that's not something everyone will want to use, but it does
make things easier on anyone who wants to distribute it. You want to
release a third-party build of my program? Or even just package up my
code into an installer? No problem; you aren't responsible for hosting the
code. With GPL software, you *are*, as I found out when I tried to
make a simple GTK updater; I'm legally required to make it clear that
the source code is available from the same web site as the binaries
are (even though I didn't build it, all I did was download the
binaries from their site and download the corresponding source
archives), and I'm also obliged from the perspective of practicality
to make it clear that the source code is not necessary, lest my users
be thoroughly confused. Completely unnecessary hassle; it's red tape
applied to those who're keeping everything open, in order to have a
weapon to wield against those who close things up.

I'm aware that the GPL has its place. I'm fully aware that GPL
violations, being pursued legally, help to ensure openness; and the
borderline cases of "we could go proprietary or we could go open
source" are sometimes tipped in favour of open source by an argument
of "we could use this if we go open"; but for most people, please,
pick a simpler license that puts fewer restrictions on usage.

ChrisA
 
Chris Angelico

Most end users will never know or care what you build the app with, even
if you have a directory full of open .py files. 99% of the users of a
popular ebook app called Calibre never know or care that it's made of
python and that you could go in and see the code. All they care about
is they can click an icon and the program launches and runs.

Absolutely. When you run "hg something_or_other", you would expect
that it's all written in Python, but some of it might not be, for all
you know. Certainly with git there are several languages used (some
are compiled binaries, some are shell scripts, some are Perl, gitk is
Tcl...), and it doesn't matter at all. Who cares? I type a command and
it runs. If upstream decides to rewrite bash in Lua, I won't much
care, and probably wouldn't even know (although somehow I suspect
performance would drop... slightly...).

Adding to your list, though:

If you're trying to hide your source code for security, absolutely DO
NOT! This is one of the most common reasons I've heard of; either
because the "cryptographic" algorithms are hand-rolled and easy to
reverse-engineer if you have the source, or because the keys are
hard-coded in the program. Either way, you can't. It just won't work.
People can get at your crypto, and if it's broken as soon as someone
sees the source code, it's weak crypto to start with.
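
To illustrate the hard-coded key case, here is a small sketch showing that shipping only bytecode does not hide string constants. The source snippet, the file name product.py, and the key value are all invented for the example; the marshal round trip stands in for what a .pyc file holds after its header.

```python
import marshal

# Hypothetical product source with a secret baked into it.
SOURCE = '''
def sign(message):
    key = "s3cr3t-hardcoded-key"
    return hash((key, message))
'''


def string_constants(code):
    """Recursively collect every string constant stored in a code object."""
    found = []
    for const in code.co_consts:
        if isinstance(const, str):
            found.append(const)
        elif hasattr(const, "co_consts"):  # nested function or class body
            found.extend(string_constants(const))
    return found


# "Vendor" side: compile to bytecode and serialize it.
blob = marshal.dumps(compile(SOURCE, "product.py", "exec"))

# "Attacker" side: load the bytecode back and list its string constants.
recovered = string_constants(marshal.loads(blob))
print(recovered)
```

The key shows up among the recovered strings without ever running or decompiling the function, which is why a key embedded in the program should be treated as public.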

ChrisA
 
Sturla Molden

Mark H Harris said:
Obfuscation (hiding) of your source is *bad*, usually done for one
of the following reasons:
1) Boss is paranoid and fears loss of revenues due to intellectual
property theft.
2) Boss is ignorant of reverse engineering strategies available to
folks who want to get to the heart of the matter.
3) Boss and|or coders are embarrassed for clients (or other coders)
to see their art, or lack thereof. Sometimes this is also wanting to
hide the fact that the product really isn't "worth" the price being
charged for it?!?

You can also add fear of patent trolls to this list. Particularly if you
are in a startup and cannot afford a long battle in court. You can quickly
go bankrupt on attorney fees.

Sturla
 
Grant Edwards

The only reliable way to prevent a customer from reverse-engineering
your software is to not give them the software. For example, instead
of giving them software containing the critical code that you want to
protect, give them access to a web service running that code, which
you host and control.

If you do that the odds of them obtaining your code are reduced, but
don't assume they go to 0. ;)
 
Grant Edwards

It's worth noting, as an aside, that this does NOT mean you don't
produce or sell anything. You can keep your code secure by running it
on a server and permitting users to access it; that's perfectly safe.

You think a server that can be accessed by untrusted people can be
perfectly safe?

Oh dear.
 
Grant Edwards

I think the number one reason for code obfuscation is an ignorant
boss.

Another reason might be to avoid the shame of showing crappy code to
the customer.

Another reason I've heard of is to try to reduce support efforts.

If you distribute something that's easy to modify, then people will.

And when it doesn't work, they'll call tech support and waste
everybody's time trying to track down bugs that aren't actually _in_
the product you're shipping.
 
Joshua Landau

Cython retains all the code as text, e.g. to generate readable exceptions.
Users can also still steal the extension modules and use them in their own
code. In general, Cython is not useful as an obfuscation tool.

Ah, thanks for the info. I imagine it's perfectly easy to get around
that, though, through basic removal at the C phase. I doubt it's
worthwhile doing so, but deobfuscation will still be harder than a
.pyc.
 
Steven D'Aprano

Another reason I've heard of is to try to reduce support efforts.

If you distribute something that's easy to modify, then people will.

The majority of people will treat your app as a black box. Of course, a
small minority (either out of actual competence, or sheer incompetence)
will try to modify anything supplied as source code. (And who is to say
that they shouldn't be permitted to, if they've bought your product?)

And when it doesn't work, they'll call tech support and waste
everybody's time trying to track down bugs that aren't actually _in_ the
product you're shipping.

I wonder whether Red Hat and Ubuntu have this problem? Somehow I think
that the magnitude of it is grossly exaggerated.

But in any case, this at least is trivially solved: take the md5 of your
application, then before doing any support check whether the md5 of their
copy has changed. A tiny Python script (small enough to visually inspect)
can do this on systems without an md5sum utility.
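
A minimal sketch of such a script, assuming the application ships as a single file whose reference digest was recorded at release time. The helper names file_md5 and is_unmodified are made up for this example. MD5 is enough to catch casual edits but is not collision-resistant; hashlib.sha256 can be swapped in the same way.

```python
import hashlib
import sys


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path, expected_hex):
    """Compare the file's digest with the one recorded at release time."""
    return file_md5(path) == expected_hex.strip().lower()


# Command-line use: python check.py <file> <expected-md5>
if __name__ == "__main__" and len(sys.argv) == 3:
    path, expected = sys.argv[1:]
    print("OK" if is_unmodified(path, expected) else "MODIFIED")
```

Support staff would ask the customer to run it against the installed file and read back the result before investigating any bug report.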
 
CM

You can also add fear of patent trolls to this list. Particularly if you
are in a startup and cannot afford a long battle in court. You can quickly
go bankrupt on attorney fees.

Sturla

You're saying that fear of patent trolls is yet another bad reason to
obfuscate your code? But then it almost sounds like you think it is a
justifiable reason. So I don't think I understand your point. Whether a
patent troll has your original code or not has no bearing on the patent
infringement.
 
Denis McMahon

Currently our company wants to release a product developed in Python to
our customer, but doesn't want others to see the .py code.

Your business model is fucked.
 
Sturla Molden

CM said:
You're saying that fear of patent trolls is yet another bad reason to
obfuscate your code? But then it almost sounds like you think it is a
justifiable reason. So I don't think I understand your point. Whether a
patent troll has your original code or not has no bearing on the patent
infringement.

There might be no infringement. Patent trolls usually possess invalid
patents, as they constitute no real invention. These are usually not
engineers who have invented something, but lawyers who have been granted
patents on vague thoughts for the purpose of "selling protection". The US
patent office has allowed this to happen, by believing that any invalid
patent can be challenged in court, so their review process is close to
non-existent. If patent trolls have your code they are in a better position
to blackmail. They can use your code to generate bogus "legal documents" in
the thousands, and thereby drive up your legal expenses.

Sturla
 
CM

There might be no infringement. Patent trolls usually possess invalid
patents, as they constitute no real invention. These are usually not
engineers who have invented something, but lawyers who have been granted
patents on vague thoughts for the purpose of "selling protection". The US
patent office has allowed this to happen, by believing that any invalid
patent can be challenged in court, so their review process is close to
non-existent. If patent trolls have your code they are in a better position
to blackmail. They can use your code to generate bogus "legal documents" in
the thousands, and thereby drive up your legal expenses.

Sturla

Ahh, I see. I suppose such an entity might try that. But I would hope it would not result in additional legal expenses, in that anyone with the smallest amount of legal understanding of patents knows that it doesn't matter in what way the invention is brought about in specific code, just that the *resulting invention* is similar enough to the claims of the patent. That is, the invention could be written in Python, or C, or COMAL, in whatever spaghetti the author wants, and none of that is pertinent to the issue of patent infringement (whereas it might very well be to the issue of copyright infringement). I would hope the defense lawyer(s) and judge struck that from the proceedings, but I am probably hoping for too rational an outcome.
 
Stefan Behnel

Sturla Molden, 11.04.2014 11:17:
Cython retains all the code as text, e.g. to generate readable exceptions.

No, it actually doesn't. It only keeps the code in C comments, to make
reading the generated code easier. Those comments get stripped during
compilation, obviously.

The only thing it keeps for its exception tracebacks is the line numbers,
both for the C code (which you can disable) and for the original Python
code. That shouldn't be very telling if you don't have the original source
code.

Stefan


PS: disclaimer: I never needed to obfuscate Python code with Cython, and
this use case is definitely not a design goal of the compiler. No
warranties, see the license.
 
