Is Python suitable for a huge, enterprise size app?

Paul Rubin

Send e-mail to (e-mail address removed) -- actually, you can have commit
privs if you want.

I think I'm going to enter an SF bug on the issue if there isn't
already one. It's not obvious to me whether a reasonable fix is
possible, but at least it should be tracked. The current behavior is
weird and confusing.
 
Maurice LING

Peter said:
So you're saying that reverse engineering Java bytecode is illegal,
while doing the same with Python bytecode is not? Or something like
that? (And you're a lawyer, right? Because if you're not, and you're
not citing your sources, why is it we should put any value in these
comments about what is (legally) true?)

-Peter

I am not saying reverse engineering Python bytecode is legal while
reverse engineering Java bytecode is illegal. Please do not put words
into my mouth. I am just saying that for code locked in some packaged or
zipped form, reverse engineering is illegal unless specifically approved,
be it Python bytecode, Java bytecode, Smalltalk, compiled executables,
etc. AND THIS HAS BEEN DISCUSSED IN OTHER THREADS, like "bytecode
non-backcompatibility" on python-list in April 2005.

I am not a lawyer. Based on what I've read and seen, I won't try it.
Below is an example of such an article by a law firm, citing case law.
Please google the term "reverse engineering computer sources" and you
will find heaps of material. If you would like to test the depth of your
pockets in a courtroom, be my guest.


From http://www.jenkins-ip.com/serv/serv_6.htm

COMPUTER PROGRAMS

In the case of computer programs, the EU directive states (11) that the
ideas and principles underlying a program are not protected by
copyright, and that (12) logic, algorithms and programming languages may
to some extent comprise ideas and principles.

Analysis of the function of a program (but not decompilation (13)) is
permitted under Article 5.3, if it is carried out by a licensed user in
the normal use of the program.

Reverse engineering is allowed under Article 6, but only for the single
purpose of producing an interoperable program (rather than a competing
program).

For this purpose, in addition to reverse engineering itself (i.e.
producing a high level version of the code) subsequent forward
engineering to produce the interoperable program is permitted.

However, the reverse engineer has to cross a host of formidable barriers
before he can make use of this right;

1. It must be indispensable to reverse engineer to obtain the
necessary information.
2. The reverse engineering has to be by a licensee or authorised user.
3. The necessary information must not already have been readily
available to those people.
4. Only the parts of the program necessary for interoperability
(i.e. the interfaces) can be reproduced.
5. The information generated by the reverse engineering cannot be
used for anything other than achieving interoperability of an
independently created program.
6. The information cannot be passed on to others except where
necessary for this purpose.
7. The information obtained cannot be used to make a competing
program (rather than just an interoperable one).
8. The "legitimate interests" of the copyright owner or "normal
exploitation" of the program must not be prejudiced.

Thus, far from creating a general right to reverse engineer, these
provisions create only the smallest of openings for the reverse
engineer; they are intended for use only to defeat locked, confidential,
proprietary interfaces.


(11) Council Directive 96/9/EC 11 March 1996; enacted in the UK as S.I.
1997 No. 3032 13th recital
(12) 14th recital
(13) See, for example, Copyright, Designs and Patent Act 1988, Section
29(4)

SEGA ENTERPRISES LTD v. ACCOLADE INC 977 F.2D 1510 (9TH CIR. 1992)

This US software copyright case concerned Sega's video game console and
cartridges. The cartridges had a 20-25 byte code segment which was
interrogated by the console, as a security measure.

Accolade disassembled the code which was common to three different Sega
games cartridges, to find the security segment, and included it in
competing games cartridges.

The Ninth Circuit held this disassembly to be a permitted "fair use" of
the copyright in the games programs.

ATARI v. NINTENDO 975 F.2D 872 (FED. CIR. 1992)

This US software copyright case concerned Nintendo’s NES video game
console and cartridges. The cartridges contained a microprocessor and
program code, and were interrogated by the console microprocessor, as a
security measure, like the Sega system. The security was potentially a
two-way process, with the console checking for a valid cartridge and the
potential for the cartridge to check for a valid console (which Nintendo
did not actually do).

Atari disassembled the program code which performed the security
signalling exchange (the interface code). However, they also had access
to a copy of the source code from the US Copyright Registry, to obtain
which they stated (untruthfully) that it was for the purposes of litigation.

They implemented the signalling exchange to validate the cartridge, thus
achieving compatibility of their cartridges with Nintendo consoles.
However, they went further and implemented the rest of the interface, to
validate the consoles, apparently in case Nintendo changed their product
in future. In each case, they copied some actual code, allegedly only to
the extent necessary.

The Court held that the intermediate copying during reverse engineering
was legitimate, as "fair use". However, Atari infringed copyright
nonetheless, in going too far in copying beyond what was strictly
necessary. The programmer apparently also had sight of the source code
from the US Copyright Registry, casting some doubt on whether the
copying was solely due to the reverse engineering operation.

Finally, Nintendo had a patent on the interface, and Atari were found to
infringe that too.

AUTODESK INC v. DYASON [1992] RPC 575 & (NO. 2) [1993] 12 RPC 259

This Australian software copyright case concerned a CAD package, which
was supplied with a hardware device containing an EPROM, called the
AutoCAD lock, which operated with part of the package called the
"Widget-C" program. The program sent a challenge signal to the lock,
which replied with a return signal. The program checked the return
signal against a lookup table. The lookup table comprised 16 bytes of a
30kByte program. An encrypted form of the lookup table was held in the
lock EPROM.

The Defendant studied the signals with an oscilloscope, and read them.
Apparently, the correct contents of the EPROM were deduced from this
functional analysis, without reading of the EPROM. They then produced an
alternative lock device. The Plaintiff alleged that the table was a
substantial part of the program, and that the program had thus been copied.

The Court held that the table was a substantial part of the program (an
issue of importance rather than size) and that it had been copied, and
that this was an infringement. A move to re-hear the case, on the basis
that the Court had misunderstood, was dismissed (with dissenting
judgements).

POWERFLEX SERVICES PTY LTD v. DATA ACCESS CORP [1997] 12 EIPR 732

This Australian software copyright case concerned a database management
system (Dataflex) which was "cloned" by the Defendant. Apparently,
although the Court referred to reverse engineering, the Defendant did
not actually analyse the code by decompilation or disassembly, but read
the manual to produce a lookalike using the same commands and file
structures. All this was held not to infringe.

However, Dataflex used compression of records, using a stored table of
Huffman codes to compress and decompress. The table thus constituted an
interface between the program and the files. There is no indication that
the files were to be shared with other programs, so the table was not an
interface to non-competing interoperable programs. The Defendant wanted
his program to be able to read files created by Dataflex and so he
calculated the required compression table, apparently from inspection of
stored records, which therefore reproduced that of Dataflex.

It was held that although the table was not a substantial part of the
program (or even part of the program at all), it attracted literary
copyright as a compilation in its own right, and that copyright was held
infringed.

CREATIVE TECHNOLOGY LTD v. AZTECH SYSTEMS PTE LTD [1997] FSR 491

This Singapore software copyright case concerned the Sound Blaster PC
Sound Card. It contained firmware, and was supplied with an ancillary PC
program to operate with it. The Defendants wanted to produce a sound
card which could interoperate with PC programs which would operate with
the Sound Blaster – in other words, to create a competing substitute.
They bought and studied a Sound Blaster and ancillary program. They were
held (on the balance of probabilities) to have disassembled and reverse
engineered the firmware, and admitted running the ancillary program
(which they pleaded as a "fair dealing"). In their product, only about
4% of the code was identical to that of the Sound Blaster firmware, but
that 4% included redundancies and errors present in the original,
suggesting copying (going beyond that which is necessary).

The Court held (reversing the first instance) that although the ultimate
products did not involve a substantial reproduction of the original
program, there was clear evidence that reverse engineering had been used
and hence that an intermediate copy had been made. This and the admitted
copying (in unlicensed use) of the ancillary program were both not "fair
dealing", and hence infringed.
 
Maurice LING

Peter said:
So you're saying that reverse engineering Java bytecode is illegal,
while doing the same with Python bytecode is not? Or something like
that? (And you're a lawyer, right? Because if you're not, and you're
not citing your sources, why is it we should put any value in these
comments about what is (legally) true?)

-Peter

What I'm saying is reverse engineering anything is illegal unless
allowed by the laws of the state, be it <your language> bytecodes or
compiled executables, but if the original source code is there, you
can look at it.

To put it sexually and crudely (to get the idea across), if a female
strips and parades in front of me, I'm not violating any law by opening my
eyes and looking (whether it is morally or religiously right is a totally
different matter), but it is criminal for me to grab any passing female,
strip her and look at her naked. See the point?

maurice
 
Kay Schluehr

Dave said:
Overall it's been such a positive experience for us that nobody in the company -
from grunt testers up to the CTO - has any reservations about using Python in
production anymore (even though initially they all did). All of the developers
have previous experience with using Java in production systems, and none
seriously consider it for new projects at all.

-Dave

I think there is still a lack of communication about successes on that
scale. If I think about serious IT magazines published in Germany (e.g.
"Object Spectrum"), there are tons of experience reports about all kinds
of enterprise systems that were created and managed in Java/dotNET.

There might be slight progress if you review some of the talks at the
next EuroPython conference:

http://www.python-in-business.org/ep2005/alisttrack.chtml?track=692

But this is still internal: fans of the language talking to other fans.

Ciao,
Kay
 
Fredrik Lundh

Paul said:
I think I'm going to enter an SF bug on the issue if there isn't
already one. It's not obvious to me whether a reasonable fix is
possible, but at least it should be tracked. The current behavior is
weird and confusing.

this has been reported before, and it won't get fixed (unless you're volunteering
to add Python-compatible garbage collection to Tk, that is).

</F>
 
elbertlev

Sure it does not. Neither does C, unless you use low-level OS-dependent
APIs instead of malloc.
 
Dieter Maurer

Fredrik Lundh said:
...
and unless your operating system is totally braindead, and thus completely unfit
to run huge enterprise size applications, that doesn't really matter much. leaks
are problematic, large peak memory use isn't.

Could you elaborate a bit?

Large peak memory use means that the application got a large
address space. What guarantees that the residual memory use
(after the peak) is compact and not evenly spread across
the address space? While the OS is probably able to
reuse complete pages which are either unused for a longer
time or at least rarely accessed, it may become nasty when
almost every page contains a small amount of heavily used
memory.


Dieter
 
Paul Rubin

Dave Brueck said:
One thing from your experience that did resonate with me is that,
except for ftplib and occasionally urllib (for basic, one-shot GETs),
we don't use any of the standard library's "protocol" modules - partly
because we had to implement our own HTTP libraries for performance and
scalability reasons anyway, and partly because we had trouble figuring
out e.g. all the ins and outs of urllib/urllib2/httplib.

What do you use for HTTPS? And did you use the Cookie module in your
HTTP servers? You may have had problems without even being aware of
them (until recently if you used Cookie with its default settings, any
attacker could completely take over your server by sending you
carefully concocted cookies). I'm not trying to be contentious here,
just mentioning a couple further cases of where problems aren't
visible from far away but are there when you look close.
 
Paul Rubin

Fredrik Lundh said:
this has been reported before, and it won't get fixed (unless you're
volunteering to add Python-compatible garbage collection to Tk, that is).

Yeah, I think I understand what the issue is. I can think of some
kludgy possible fixes but I assume they've been thought about already
and rejected. The workaround of making the application save an extra
reference isn't too bad, but all relevant docs that say anything about
these images should mention the requirement emphatically.
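
The workaround mentioned here, keeping an extra reference so the image object is not garbage-collected, can be illustrated without a display. This is only a sketch of the mechanism: `FakeImage` and `FakeLabel` are hypothetical stand-ins, not Tkinter classes, and the immediate collection it shows relies on CPython's reference counting.

```python
import weakref


class FakeImage:
    """Hypothetical stand-in for an image wrapper that owns the image data."""

    def __init__(self, name):
        self.name = name


class FakeLabel:
    """Hypothetical widget that, like the Tk side, remembers only the image name."""

    def __init__(self, image):
        self.image_name = image.name  # no reference to the Python object is kept


def build_label(keep_reference):
    image = FakeImage("logo")
    label = FakeLabel(image)
    if keep_reference:
        # The workaround: hang the image on the widget so it lives as long as the widget.
        label.image = image
    return label, weakref.ref(image)


label_a, ref_a = build_label(keep_reference=False)
label_b, ref_b = build_label(keep_reference=True)

print(ref_a() is None)      # True on CPython: the image was collected on return
print(ref_b() is not None)  # True: the extra reference keeps the image alive
```

In real Tkinter code the same idea is usually written as `label.image = photo` right after the label is created.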
 
Dave Brueck

Paul said:
What do you use for HTTPS?

Hi Paul,

m2crypto (plus some patches to make asynchronous SSL do what we needed).

Paul said:
And did you use the Cookie module in your
HTTP servers? You may have had problems without even being aware of
them (until recently if you used Cookie with its default settings, any
attacker could completely take over your server by sending you
carefully concocted cookies).

Are you referring to the use of pickle for cookie serialization? In any case, we
didn't use Cookie.py from the stdlib (on the servers, nearly everything related
to URLs & HTTP was custom-built, with the exception of urlparse, for the
aforementioned reasons).

-Dave
 
elbertlev

C programs can also be disassembled. Serious people do not consider
breaking machine code harder than breaking byte-code.
 
Paul Rubin

Dave Brueck said:
m2crypto (plus some patches to make asynchronous SSL do what we needed).

That seems to be a nice piece of code, but it's still at version 0.13;
if something goes wrong, are you sure you want to explain that you
were using beta-test software to protect your customers' production
financial transactions? There's also been some traffic on the
python-crypto list about Zope encountering memory leaks with it. I
haven't read the messages carefully though, so I'm not sure what the
situation is.

Dave Brueck said:
Are you referring to the use of pickle for cookie serialization?

Yes.
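
To make the pickle concern concrete, here is a minimal sketch of why unpickling data received from a client is risky, next to a data-only alternative. `Payload` and `load_cookie_safely` are hypothetical names made up for this illustration, and the payload calls the harmless built-in `len` where a real attack would pick something destructive.

```python
import json
import pickle


class Payload:
    """Hypothetical attacker-crafted object: unpickling it calls a chosen callable."""

    def __reduce__(self):
        # pickle.loads() will call len("attacker"); the sender picks both
        # the callable and its arguments.
        return (len, ("attacker",))


malicious_cookie = pickle.dumps(Payload())
result = pickle.loads(malicious_cookie)  # runs the callable chosen by the sender
print(result)


def load_cookie_safely(raw):
    """Parse a cookie value as JSON and accept only a flat dict of strings."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("cookie payload must be a JSON object")
    for value in data.values():
        if not isinstance(value, str):
            raise ValueError("cookie values must be strings")
    return data


print(load_cookie_safely('{"user": "alice"}'))
```

JSON parsing only ever builds plain data (dicts, lists, strings, numbers), so the sender cannot cause a function call just by choosing the cookie's contents.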
 
Dave Brueck

Paul said:
That seems to be a nice piece of code, but it's still at version 0.13;

Version numbers are fairly relative, though. In another project we're using some
proprietary, closed source libraries (unrelated to crypto) that are version 3
and they seem buggier and less stable than m2crypto.

And don't get me started on Microsoft products (we've been using DirectShow *9*
in some stuff, and due to bugs in DirectShow we were completely and utterly
screwed despite what the documentation said; things just didn't work as they
should, and Microsoft has confirmed that it's a bug that will be fixed in some
future release - so we had to backtrack, ripping out code that should work just
fine, and take another stab at getting DirectShow to cooperate). Version NINE
(supposedly).

Paul said:
if something goes wrong, are you sure you want to explain that you
were using beta-test software to protect your customers' production
financial transactions?

lol - what? We're not doing any financial transactions with it (IOW - we
evaluated m2crypto for what we needed to do, and it's a good fit - that we
didn't evaluate it in terms of what we don't need it to do doesn't bother me).

Having said that - I think we probably *would* use it for production financial
transactions - but that's more a matter of closed vs. open source than Python vs
not.

Besides, do you really think that, if something went wrong, you'd in the end
have some meeting where you explain to your customer that you were using beta
software? Of course not - it just doesn't work that way. Either they won't care
and will drop you because of the problem (regardless of the source) or they want
some broad details.

Paul said:
There's also been some traffic on the
python-crypto list about Zope encountering memory leaks with it.

Ok... so? I mean, if there's a memory leak, and it's hurting us, we have
options: we can go look in the source code, we can make Zope reboot itself
often, we can hire somebody to fix it, we can see if the author wants to give us
a support contract, etc.

Memory leaks aren't exactly unique to Python - according to bugs.sun.com, there
are currently 382 *open* bugs related to memory leaks in the JDK alone. If
you're using Java in your "huge, enterprise size app" and get bit by one of
those bugs, and if you're not a big enough company to get some
ridiculously-priced support contract from Sun, what are your options? Again,
this seems more like an open-vs-closed source issue, but to me it's another
reason why I'd feel uncomfortable using Java in mission critical work.

-Dave
 
Paul Rubin

Dave Brueck said:
Version numbers are fairly relative, though. In another project we're
using some proprietary, closed source libraries (unrelated to crypto)
that are version 3 and they seem buggier and less stable than m2crypto.

Yeah, OpenSSL itself is something like 0.97, which however does sound
closer to a final release than 0.13 of anything.

Dave Brueck said:
Having said that - I think we probably *would* use it for production
financial transactions - but that's more a matter of closed vs. open
source than Python vs not.

That makes some sense; once you have the code in-house and the
expertise to evaluate it and maintain it, it's at least no worse than
something you wrote yourself. If you just downloaded it and plopped
it into your application with your eyes closed, that might be asking
for trouble.

Dave Brueck said:
Ok... so? I mean, if there's a memory leak, and it's hurting us, we
have options: ...

I just mean the memory leak is a symptom that the software isn't yet
completely debugged.

Dave Brueck said:
Memory leaks aren't exactly unique to Python - according to
bugs.sun.com, there are currently 382 *open* bugs related to memory
leaks in the JDK alone.

Are any of those memory leaks in JSSE? It's one thing to have a
problem in some noncritical GUI widget, another to have it in the
crypto code.
 
Joal Heagney

Peter said:
So you're saying that reverse engineering Java bytecode is illegal,
while doing the same with Python bytecode is not? Or something like
that? (And you're a lawyer, right? Because if you're not, and you're
not citing your sources, why is it we should put any value in these
comments about what is (legally) true?)

-Peter

I think he's saying that if you distributed your python code as
byte-compiled module.pyc or module.pyo form, rather than ascii-text
module.py form, it would be harder for a reverse-engineer to say "But I
was JUST looking at it!". Especially when the lawyers are involved.

Joal Heagney
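
For the technical side of the .pyc point, a small sketch of how much a compiled code object still reveals. `check_license` and its constant are made up for the example; the round trip through `marshal` stands in for reading a .pyc file, which is essentially a short header followed by a marshalled code object.

```python
import dis
import marshal


def check_license(key):
    """Hypothetical function whose 'hidden' constant we want to recover."""
    secret = "s3cret-key"
    return key == secret


# Serialise and reload the code object, roughly what a .pyc file stores.
code = marshal.loads(marshal.dumps(check_license.__code__))

print(code.co_varnames)                 # local variable names survive compilation
print("s3cret-key" in code.co_consts)   # string constants survive too
dis.dis(code)                           # readable listing of the instructions
```

None of this says anything about what is legal; it only shows that byte-compiled Python is easy to inspect with the standard library.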
 
Antoon Pardon

On 2005-05-20, Dieter Maurer said:
Could you elaborate a bit?

Large peak memory use means that the application got a large
address space. What guarantees that the residual memory use
(after the peak) is compact and not evenly spread across
the address space?

Well, nothing. But how do you want to return memory back
to the O.S. when the residual memory use isn't compact
and is spread evenly across the address space?
 
Paul Rubin

Antoon Pardon said:
Well, nothing. But how do you want to return memory back
to the O.S. when the residual memory use isn't compact
and is spread evenly across the address space?

All large-scale language implementations with automatic storage
management that I know of use compacting storage schemes, e.g. copying
garbage collectors (maybe generational, multi-threaded/realtime, or
whatever). I think Python will have to do the same, sooner or later.
 
Fredrik Lundh

Dieter said:
Could you elaborate a bit?

Large peak memory use means that the application got a large
address space. What guarantees that the residual memory use
(after the peak) is compact and not evenly spread across
the address space?

nothing guarantees that, of course. but I've never seen that
happen. and I'm basing my comments on observed behaviour in
real systems, not on theoretical worst-case scenarios. every
time I've seen serious fragmentation, it's been related to leaks,
not peak memory usage.

</F>
 
Dieter Maurer

Fredrik Lundh said:
...
nothing guarantees that, of course. but I've never seen that
happen. and I'm basing my comments on observed behaviour in
real systems, not on theoretical worst-case scenarios.

I observed in real systems (Zope) that the system got slower
and slower as the amount of allocated memory increased -- although
the OS was far from its memory resource limits (and virtual memory size
was not much larger than resident memory size). Flushing caches
(and thereby releasing most memory) did not speed up things
but restarting did.

I do not understand this observed behaviour.

Fredrik Lundh said:
every
time I've seen serious fragmentation, it's been related to leaks,
not peak memory usage.

An analysis did not reveal serious leaks, in the cases mentioned above.


Dieter
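
One way to separate "leak" from "large peak" when analysing a case like this is to compare current and peak allocation, for example with the standard `tracemalloc` module. This is a sketch only: `measure_peak_and_residual` is a hypothetical helper, and `tracemalloc` counts memory requested through Python's allocators, so it says nothing directly about how the surviving objects are spread over OS pages.

```python
import tracemalloc


def measure_peak_and_residual(n_items=200_000, keep_every=1000):
    """Return (current_bytes, peak_bytes, survivors) for a spike-then-release workload."""
    tracemalloc.start()
    try:
        big = [str(i) for i in range(n_items)]  # the peak
        survivors = big[::keep_every]           # a few objects scattered through it
        del big                                 # release everything else
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current, peak, survivors


current, peak, survivors = measure_peak_and_residual()
print(f"residual: {current} bytes, peak: {peak} bytes, survivors: {len(survivors)}")
```

A residual figure that keeps growing from run to run would point at a leak. A small residual after a large peak is the situation argued about above, where the open question is how those few survivors are laid out in the address space.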
 
