rotor replacement

Martin v. Löwis

Paul said:
Oops, sorry, it's in the os module:

http://docs.python.org/lib/os-miscfunc.html

The difference is simply a matter of the packaging.

No, it's not. It also is a matter of code size, and impact. Small
additions can be reviewed and studied more easily, and need to be
tested on fewer users. A new module is on a larger scale than
a mere new function.
Unless you're saying that if I
wanted to add AES to the string module (so you could say
'spam and sausage'.aes_encrypt('swordfish banana')) instead of writing a
separate module, then we wouldn't need this discussion.

Indeed, if it was a single new function to an existing module, I would
not require that this be delivered to users first. It is entirely new
libraries that I worry about.

If you would propose a change to the string module to add an aes_encrypt
function, I would immediately reject that patch, of course, because that
function does not belong to the string module.
What matters is the code complexity, not whether
something is in a separate module or not.

A module *is* typically more complex than a single function. If
your new module has only a single new function, we should discuss
whether it really needs to be a separate module.
Well, if he indicates that it's not a policy and that the question is
still open, then I could see getting interested again in writing an
AES module. At the moment I continue to see his python-dev post as
quite discouraging.

And, again, I consider this perfectly fine. This would be a volunteer
effort, and volunteers are free to work on whatever they please.

Furthermore, people who want an AES module for Python could get one
from

http://sourceforge.net/projects/cryptkit/

Maybe Bryan Mongeau will contribute this code to Python some day.
Not true. For example, you once invited me to work on an ancillary
message feature for the socket module (SF bug 814689), and so it's
been on my want-to-do-one-of-these-days list since then. I think it's
reasonable for me to have taken your message there as an expression of
interest, sufficient to get me to want to work on it. So it's bogus
to say the Python developers should avoid expressing interest in
something that hasn't already been written.

I did not say that. I said we don't normally invite people to work
on anything - I did not say that we *should* not invite them. Now that
you mention it, I find that there is an important exception from my
factual statement: I do regularly ask people reporting bugs
or requesting features to work on fixing the bugs or implementing the
features. It is perfectly fine if they say "no" then. If they say
yes, there is an implied promise that I'll review their code when
they are done.

As it appears to be clear that you are not going to implement
an AES module in the foreseeable future, and as it also seems
to be clear that you cannot talk me into changing my views on
how Python should be developed, I think further discussing this
entire thing is pointless.

Regards,
Martin
 
Paul Rubin

Martin v. Löwis said:
Indeed, if it was a single new function to an existing module, I would
not require that this be delivered to users first. It is entirely new
libraries that I worry about.

Why is it different if a single new function is added to an existing
module, or if the single new function has the boilerplate of a new
module wrapped around it?

Look at the sha and md5 modules. They are very similar in both
interface and implementation. The only internal function that's
really different is the update operation; they actually might have
been combined into one module that did the other operations with the
same code. But, it's also reasonable to have them as separate
modules. If users start needing sha256, it could be done the same
way, one new update operation and the rest boilerplate, but in
practice it would probably be a separate module.
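
The shared interface described above can be seen in today's hashlib module, which postdates this thread. The sketch below assumes a modern Python 3 rather than the separate sha and md5 modules under discussion, and the helper name digest_with is hypothetical. It only shows that the same update/hexdigest calls work regardless of which hash is selected.

```python
import hashlib

def digest_with(name, chunks):
    """Feed chunks to the named hash using the common update() interface."""
    h = hashlib.new(name)        # same constructor shape for every algorithm
    for chunk in chunks:
        h.update(chunk)          # the only operation that differs internally
    return h.hexdigest()

# Identical calling code for md5, sha1 and sha256.
for name in ("md5", "sha1", "sha256"):
    print(name, digest_with(name, [b"spam and ", b"sausage"]))
```

Feeding the data in pieces gives the same digest as hashing it in one call, which is what makes the update operation the natural place for the algorithms to differ.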

Are you saying if there was user demand for an sha256 module and
someone wrote one, you'd still require a year of separate distribution?
A module *is* typically more complex than a single function. If
your new module has only a single new function, we should discuss
whether it really needs to be a separate module.

I previously had the mistaken belief that urandom was a new module
rather than a function inserted into an existing module. Note that
the urandom's implementation is not ultra-trivial.
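
For readers who have not used it, a minimal example of the function in question, assuming a Python 3 interpreter (the variable name key is just for illustration):

```python
import os

# Ask the operating system for 16 random bytes, e.g. for a 128-bit key.
key = os.urandom(16)
print(len(key), key.hex())
```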

An AES or DES addition to an existing module that implements just one
call:
ECB(key, data, direction)
would be a huge improvement over what we have now. A more complete
crypto module would have some additional operations, but ECB is the
only one that's really essential. I already have a pure-Python module
that does all the other operations using ECB as a subroutine.
Its speed isn't great but it's usable in some real applications.
It's only the ECB operation that's intolerably slow in Python.
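
As a rough illustration of building another mode on top of a single ECB call, here is a minimal sketch of CBC layered over an ECB(key, data, direction)-style primitive. The toy_ecb function is a hypothetical stand-in (an XOR plus a byte rotation, neither AES nor secure), and the names cbc_encrypt and cbc_decrypt and the string direction values are assumptions made for this example, not the pure-Python module mentioned above.

```python
BLOCK = 16  # block size in bytes, matching AES

def toy_ecb(key, block, direction):
    """Hypothetical stand-in for a real block cipher. NOT secure."""
    if len(key) != BLOCK or len(block) != BLOCK:
        raise ValueError("key and block must be %d bytes" % BLOCK)
    if direction == "encrypt":
        mixed = bytes(b ^ k for b, k in zip(block, key))
        return mixed[1:] + mixed[:1]            # rotate left by one byte
    if direction == "decrypt":
        unrotated = block[-1:] + block[:-1]     # rotate right by one byte
        return bytes(b ^ k for b, k in zip(unrotated, key))
    raise ValueError("direction must be 'encrypt' or 'decrypt'")

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(key, iv, data):
    """CBC: XOR each plaintext block with the previous ciphertext block."""
    if len(data) % BLOCK:
        raise ValueError("data must be a multiple of the block size")
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        prev = toy_ecb(key, xor(data[i:i + BLOCK], prev), "encrypt")
        out.append(prev)
    return b"".join(out)

def cbc_decrypt(key, iv, data):
    """Inverse of cbc_encrypt, again using only the ECB primitive."""
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        out.append(xor(toy_ecb(key, block, "decrypt"), prev))
        prev = block
    return b"".join(out)

key = b"swordfish banana"            # 16 bytes
iv = bytes(BLOCK)                    # all-zero IV, for the demo only
message = b"spam and sausage" * 2    # two 16-byte blocks
assert cbc_decrypt(key, iv, cbc_encrypt(key, iv, message)) == message
```

Only toy_ecb would need to be fast native code; the chaining logic is a few lines of Python around it.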

If you think a function like that could be added to some existing
module with less hassle than adding a new module, then I can write one
and submit it.
And, again, I consider this perfectly fine. This would be a volunteer
effort, and volunteers are free to work on whatever they please.

Well, volunteers are more likely to work on modules that are mentioned
as being welcome by the developers, than modules affected by explicit
prior developers' public decisions that cast a chill over the hope of
ever getting such a module accepted.
Furthermore, people who want an AES module for Python could get one from

Come on, you're being deliberately obtuse, we've discussed this over
and over. There are plenty of AES modules that people can get from
somewhere. The topic is what it takes to have an AES module that
people don't NEED to get from anywhere, because they already have it
from having Python installed. Do I have to keep repeating "batteries
included" until you understand what it means?
http://sourceforge.net/projects/cryptkit/

Maybe Bryan Mongeau will contribute this code to Python some day.

Well, that code has been around for over a year, people are using it,
etc. Are you saying you'll support its inclusion if Bryan offers to
contribute it? I've examined that module, I wouldn't consider it
ideal for the core (besides AES, it has some complicated additional
functions that aren't useful to most people), but it would certainly
take care of my AES needs (it's apparently missing DES though).
I did not say that. I said we don't normally invite people to work
on anything - I did not say that we *should* not invite them.

I would say that inviting people to work on a module for the stdlib
means the developers have thought about whether such a module would be
useful and worth including, and are indicating that they're favorable
to the idea. However, you wrote:

In Message-ID: <[email protected]>
So if the module was primarily written to be included in the core, I
would initially reject it for that very reason. After one year or so
in its life, and a recognizable user base, inclusion can be considered.

The context was new modules in general, not specifically an AES
module. Since "considered" means "thought about", you said
inclusion shouldn't even be thought about until the module is already
done. That's completely in conflict with the idea of inviting anyone
to work on a new module, since inviting means that there's been some
thought.
Now that you mention it, I find that there is an important exception
from my factual statement: I do regularly ask people reporting bugs
or requesting features to work on fixing the bugs or implementing the
features. It is perfectly fine if they say "no" then. If they say
yes, there is an implied promise that I'll review their code when
they are done.

I would say there's an implied promise of something more than a code
review. There's an implied statement that you agree that the proposed
new functionality is useful, which means the patch has a good chance
of being accepted to the stdlib if it's not too messy or cumbersome.
That's a heck of a lot different from saying "why don't you write that
patch and distribute it independently for a year on a purely
speculative basis, and then I'll think about whether it's worthwhile
to include it or not".
As it appears to be clear that you are not going to implement an AES
module in the foreseeable future,

The reason for that is that, as far as I can tell, even if I follow 100% of
your prescription of writing the module, releasing it independently
and supporting it for a year, and then offering to contribute it
complete with favorable user reviews and a promise of two years of
further support, the probability of it being accepted and included is
still close to zero. In more recent messages you've suggested that my
reading of that probability is wrong and that it's actually higher
than zero.

So let me just ask you one final question: suppose I do all that
stuff. The question: in your personal opinion, based on the best
information you have, what is your own subjective estimate of the
probability?

I won't say I'd immediately replace my estimate with yours, but if you
name a reasonably high number and tell me that you really believe that
number, then I could get interested again.
and as it also seems to be clear that you cannot talk me into
changing my views on how Python should be developed, I think further
discussing this entire thing is pointless.

Well, you can have whatever views you want, but one thing I've
realized from this thread is that many of the frustrations I often
encounter with Python are a direct result of a development process
that's successful in some ways but dysfunctional in others.
 
Skip Montanaro

Paul> No. Those are programs people have written in Python or as Python
Paul> extensions.

What's your point? That I have to download and perhaps install them to use
them? In that case, how are these two scenarios different:

* I have to download and build the MySQLdb package to talk to MySQL
servers from Python code

* I have to ensure that the readline library and include files are
installed on my system before the readline module (which is included
in the core distribution) can be built

I and many other people happily use external packages other people have
written as well as make stuff available. My guess is that you do as well.
If everyone adopted your position that it wasn't Python unless it had been
added to the core, we'd all be reinventing lots of wheels or tackling much
less challenging tasks, if we programmed in Python at all. Here's an
incomplete list of stuff not in the core I have used happily over the past
several years to do my jobs using Python:

* MySQLdb, Sqlite, pycopg, sybase-python - all database modules
* CSV, Object Craft's csv, DSV - csv modules predating csv in the core
* SpamBayes
* Quixote
* Docutils
* MoinMoin
* Pyrex
* Psyco
* PyInline
* PyGTK
* xmlrpclib before it was in the core
* MAL's mx.DateTime before the core datetime module was available
* timeout_socket before sockets supported timeouts

Many of those things I could never have written myself, either for lack of
time, lack of skill or both. I'm grateful they were available when I needed
them and feel no qualms about using them even though they are not
distributed with Python proper.

Notice another interesting feature of several of those items: csv,
xmlrpclib, mx.DateTime, timeout_socket. They were all modules I used that
eventually wound up in the core in some fashion. They didn't go in the core
first, then demonstrate their usefulness. It was the other way around.

Not everything that is useful belongs in the core distribution. I think you
are confusing "batteries included" with "everything, including the kitchen
sink".

Skip
 
Skip Montanaro

Martin> A module *is* typically more complex than a single function.

And one that deals with cryptography is likely to be even more complex.

Skip
 
Paul Rubin

Skip Montanaro said:
What's your point? That I have to download and perhaps install them to use
them? In that case, how are these two scenarios different:

* I have to download and build the MySQLdb package to talk to MySQL
servers from Python code

* I have to ensure that the readline library and include files are
installed on my system before the readline module (which is included
in the core distribution) can be built

The difference is that once Python is installed on your machine and
you can get a ">>>" prompt, you have readline available right away but
you have to download something to use MySQLdb. Whoever took care of
your Python installation, and it may not have been you, also took care
of readline. The past several OS distributions I've installed have
included Python and readline out of the box, so I never had to think
about readline. The last time I used a Python instance that didn't
come with the OS (on Windows XP at work), the IT department had
installed Python on my desktop before I started using it, so I still
didn't have to worry about readline. But any module that doesn't
come in the distro, I have to download myself.
I and many other people happily use external packages other people have
written as well as make stuff available. My guess is that you do as well.

No, I don't. I do use them sometimes but I'm unhappy about them. If
I can write something using a core module instead of an external
module, I prefer to use the core module. So I'll generally use dbm
instead of MySQL unless I really need MySQL, which I haven't yet in
Python (I've used MySQL with Perl dbi, but Perl, you know, shudder).
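
As a small illustration of the core-module route, here is a minimal sketch of a key/value round trip with the dbm module, assuming Python 3 (where dbm picks whichever backend is available). The function name roundtrip and the file name demo are hypothetical.

```python
import dbm
import os
import tempfile

def roundtrip(key, value):
    """Store one pair in a throwaway dbm file and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo")
        with dbm.open(path, "c") as db:    # "c": create if missing
            db[key] = value
        with dbm.open(path, "r") as db:    # reopen read-only
            return db[key]

print(roundtrip(b"spam", b"eggs"))
```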

Also, external module installation scripts often don't work properly,
so I end up having to wrestle the code to get it installed. And if a
geek like me has such trouble installing external modules, what hope
does a normal end-user have? Maybe if you're using Windows, that
stuff has been debugged better, but I've had poor results under
GNU/Linux.

I've had this discussion here before, maybe not with you. What I
really want is zero installations of anything. I just want to go to
the store and buy a new computer and have a complete OS install with
full sources and a full set of applications including Python already
installed when I first power it up. My practical approximation is to
buy a new computer, immediately reformat the HD to remove the icky
Redmond virus, and then install a GNU/Linux distro that includes
Python (and readline). If Python really aims for world domination,
that means it has to shoot for being preinstalled on almost every new
computer the way Windows is now. And all the interesting modules
should be there, maybe in a "contrib" directory that gets little or no
maintenance priority from the core team.
If everyone adopted your position that it wasn't Python unless it
had been added to the core, we'd all be reinventing lots of wheels
or tackling much less challenging tasks, if we programmed in Python
at all. Here's an incomplete list of stuff not in the core I have
used happily over the past several years to do my jobs using Python:

That makes no sense at all. That list is a list of programs written
in the Python language. They are Python programs, where Python is an
adjective. Python, the noun referring to a piece of software,
generally means the stuff in the Python distro. That doesn't stop
programs outside the distro from being useful. Mediawiki is a PHP
program. That doesn't mean Mediawiki is part of PHP.
* MySQLdb, Sqlite, pycopg, sybase-python - all database modules

These should all be in the core if Python wants to be a serious
competitor to PHP, which comes with interfaces for those db's and
several additional ones besides. That these modules are missing is a
significant library deficiency.
* CSV, Object Craft's csv, DSV - csv modules predating csv in the core

That's fixed now, csv is in the core.
* SpamBayes

I have the impression this is an application and not a module, or
anyway is written mainly to support one application. Should be
separate. Also, it's written in Python(?) rather than C, which means
the installation headaches from not being in the core aren't so bad.
* Quixote

Don't know what this is.
* Docutils

Should be in the core if it's what I think it is.
* MoinMoin

Application, should be separate. Also, GPL'd, I think. Can't be
distributed under PSF license.
* Pyrex

Sort of problematic, would be interesting to have something like this
in the core but maybe Pyrex as it currently stands isn't the answer.

I have the impression that PyPy is going to depend on Pyrex in a
fundamental way, so it will have to be in the core when we dump CPython.
* Psyco

I think this isn't ready for prime time yet. Should go into the core
once it is.
* PyInline

Not sure what this is.
* PyGTK

wxPython might be a better choice. wxpython.org quotes Guido as
saying "wxPython is the best and most mature cross-platform GUI
toolkit, given a number of constraints. The only reason wxPython isn't
the standard Python GUI toolkit is that Tkinter was there first". I
don't have direct experience with either wxPython or PyGTK, though.
If I can get by with Tkinter, I'd rather do that, since Tkinter is in
the core.
* xmlrpclib before it was in the core

1. Did you really need this, instead of some more reasonable rpc format?
xdrlib has been in the core forever.
2. Isn't xmlrpclib written in Python?
* MAL's mx.DateTime before the core datetime module was available

See, as Python improved, those things went into the core.
* timeout_socket before sockets supported timeouts

Could you use sigalarm instead?
Many of those things I could never have written myself, either for
lack of time, lack of skill or both. I'm grateful they were
available when I needed them and feel no qualms about using them
even though they are not distributed with Python proper.

Sure, it's fine if you have all those modules and you write a Python
program that uses, say, five of them. External modules aren't so bad
when the developer and the end user are the same person. What happens
if you send your Python program to a nonprogrammer friend who has just
a vanilla Python installation? Now he has to download and install
those five modules too. You send him the url's where you got the
modules a year ago. What are the chances that the 5 url's even all
still work, much less the chance of him being able to install and run
all 5 of the modules without needing help? What if the versions he
downloads (from separate developers) have gotten out of sync with each
other and can't interoperate any more?

If Python's maintainers are going to recommend those modules to people
as a substitute for providing those functions in the core, it would be
a big help if the modules were mirrored on python.org instead of
merely linked, since a lot of links turn into 404's over time.
Notice another interesting feature of several of those items: csv,
xmlrpclib, mx.DateTime, timeout_socket. They were all modules I used that
eventually wound up in the core in some fashion. They didn't go in the core
first, then demonstrate their usefulness. It was the other way around.

I'm not sure about timeout_socket and it sounds like it should have
just been a patch to the socket module, not a new module. csv is
quite a complex module and took a lot of tweaking and PEP editing
before standardization. But the need for it was obvious; the only
good reason it wasn't in the core ages ago was that no one had done
the work of writing it and shaking it out. xmlrpclib, not sure. How
long was it in separate distribution? Also, xmlrpc is a pretty new
protocol so it took a while before people wanted it. DES has been
around since the 1970's and AES has about the same users as DES, so
there's a known level of demand. That should be enough to say yes or
no to whether there should be a core module or not.

Also, your notion of trying to create a "category king" of AES modules
doesn't reflect how these things work. There are at least 10
different AES modules that provide the same AES function. If somebody
is using one, they have no reason to switch to another. If it takes
20 visible users for including a module to be worthwhile, and those 10
modules have 5 users each, those populations are going to stay stable
until one of them goes in the core and becomes the standard. (And
actually: mxCrypto is the most capable of these packages and might be
the one with the most users, but it's completely unsuitable for the
core because of its size).
Not everything that is useful belongs in the core distribution. I
think you are confusing "batteries included" with "everything,
including the kitchen sink".

Well, if you compare the Python stdlib with the toolkits that come
with competing languages, say PHP or Java, you can see that Python's
lib could be enhanced with considerably more stuff without being
excessive.
 
Paul Rubin

Skip Montanaro said:
And one that deals with cryptography is likely to be even more complex.

No. The AES module would have about the same complexity as the SHA module.
 
Paul Rubin

Paul Rubin said:
actually: mxCrypto is the most capable of these packages and might be
the one with the most users, but it's completely unsuitable for the
core because of its size).

Oops, I should say, mxCrypto itself isn't that large; the issue is
that it needs OpenSSL which is a big unwieldy program. Having
mxCrypto in the core as an OpenSSL interface is a legitimate notion.
But there should be something that doesn't depend on OpenSSL.
 
Martin v. Löwis

Paul said:
An AES or DES addition to an existing module that implements just one
call:
ECB(key, data, direction)
would be a huge improvement over what we have now.

Apparently, people disagree on what precisely the API should be. E.g.
cryptkit has

obj = aes(key)
obj.encrypt(data)

I think I would prefer explicit encrypt/decrypt methods over a
direction parameter. Whether or not selection of mode is a separate
parameter, or a separate method, might be debatable - I'd personally
prefer a separate method. However, we would have to ask users.
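
To make the two API shapes concrete, here is a minimal sketch that puts an object with explicit encrypt/decrypt methods next to a single call taking a direction argument. ToyCipher is a hypothetical stand-in (a plain XOR, not AES and not secure), and the ENCRYPT/DECRYPT constants are assumptions for the example; neither is cryptkit's actual code.

```python
class ToyCipher:
    """Object-style API: one instance per key, explicit methods."""
    block_size = 16

    def __init__(self, key):
        if len(key) != self.block_size:
            raise ValueError("key must be 16 bytes")
        self._key = key

    def encrypt(self, block):
        return bytes(b ^ k for b, k in zip(block, self._key))

    def decrypt(self, block):
        # XOR is its own inverse; a real cipher would differ here.
        return bytes(b ^ k for b, k in zip(block, self._key))

ENCRYPT, DECRYPT = 0, 1   # hypothetical direction flags

def ECB(key, data, direction):
    """Function-style API: the direction argument selects the operation."""
    cipher = ToyCipher(key)
    if direction == ENCRYPT:
        return cipher.encrypt(data)
    if direction == DECRYPT:
        return cipher.decrypt(data)
    raise ValueError("unknown direction")

key = b"swordfish banana"
block = b"spam and sausage"
obj = ToyCipher(key)
assert obj.decrypt(obj.encrypt(block)) == block
assert ECB(key, ECB(key, block, ENCRYPT), DECRYPT) == block
```

Either shape can be written in terms of the other, so the disagreement is about what reads best to users rather than about capability.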
If you think a function like that could be added to some existing
module with less hassle than adding a new module, then I can write one
and submit it.

I would trust my intuition more for a single function than for an
entire API. In this specific proposal, I think I would trust my
intuition and reject the ECB function because of the direction
argument.
Come on, you're being deliberately obtuse, we've discussed this over
and over. There are plenty of AES modules that people can get from
somewhere. The topic is what it takes to have an AES module that
people don't NEED to get from anywhere, because they already have it
from having Python installed. Do I have to keep repeating "batteries
included" until you understand what it means?

I fully understand what you desire - to include the module "as a
battery". What makes this decision difficult is that you fail to
understand that I don't want included batteries so much that I
would accept empty or leaking batteries.
Well, that code has been around for over a year, people are using it,
etc. Are you saying you'll support its inclusion if Bryan offers to
contribute it?

*Now* you get it. Precisely that. I would ask the users what they
think about the API (shouldn't be too difficult because the module
does have users) and what they think about other aspects (performance,
stability, and so on).
I've examined that module, I wouldn't consider it
ideal for the core (besides AES, it has some complicated additional
functions that aren't useful to most people)

Ok, that would be a problem. If this is a simple removal of functions
that you'd request (which functions?), I'd try to collect opinions
on that specific issue, and ask Bryan whether he could accept
removal of these functions.
So if the module was primarily written to be included in the core, I
would initially reject it for that very reason. After one year or so
in its life, and a recognizable user base, inclusion can be considered.

The context was new modules in general, not specifically an AES
module. Since "considered" means "thought about", you said
inclusion shouldn't even be thought about until the module is already
done. That's completely in conflict with the idea of inviting anyone
to work on a new module, since inviting means that there's been some
thought.

I rarely invite people to work on new modules. For new modules, I
normally propose that they develop the module, and ship it to users
for some time.

I may have made exceptions to this rule in the past, e.g. when the
proposal is to simply wrap an existing C API in a Python module
(like shadow passwords). In this case, both the interface and
the implementation are straight-forward, and I expect no surprises.
For an AES module (or most other modules), I do expect surprises.
I would say there's an implied promise of something more than a code
review. There's an implied statement that you agree that the proposed
new functionality is useful, which means the patch has a good chance
of being accepted to the stdlib if it's not too messy or cumbersome.

I have said many times that I am in favour of including an AES
implementation in the Python distribution, e.g. in

http://mail.python.org/pipermail/python-dev/2003-April/034963.html

What I cannot promise is to include *your* AES implementation,
not without getting user feedback first. The whole notion of
creating the module from scratch just to include it in the core
strikes me as odd - when there are so many AES implementations
out there already that have been proven useful to users.
So let me just ask you one final question: suppose I do all that
stuff. The question: in your personal opinion, based on the best
information you have, what is your own subjective estimate of the
probability?

Eventually, with hard work, I estimate the chances at, say, 90%. That
is, eventually, unless the code itself shows flaws, the module *will*
be included. However, initially, when first proposed, the chances are
rather like 10%. I.e. people will initially object. Decision processes
take their time, and valid concerns must be responded to. I personally
think that there is a good response to each concern, but it will take
time to find it. Before that, it will take time to find out what
precisely the concern is.

Regards,
Martin
 
Martin v. Löwis

Paul said:
(And
actually: mxCrypto is the most capable of these packages and might be
the one with the most users, but it's completely unsuitable for the
core because of its size).

mxCrypto is primarily unsuitable for the core because Marc-Andre Lemburg
will never ever contribute it. He is very concerned about including
crypto code with the Python distribution, so he certainly won't
contribute his own.

Regards,
Martin
 
Nick Craig-Wood

Paul Rubin said:
An AES or DES addition to an existing module that implements just one
call:
ECB(key, data, direction)
would be a huge improvement over what we have now. A more complete
crypto module would have some additional operations, but ECB is the
only one that's really essential.

I would hate to see a module which only implemented ECB. Sure, it's the
only operation necessary to build the others out of, but it's the least
secure mode of any block cipher.

If you don't offer users a choice, then they'll use ECB and just that
along with all its pitfalls, meanwhile thinking that they are secure
because they are using AES/DES...
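
The main pitfall can be shown in a few lines: under ECB, equal plaintext blocks always produce equal ciphertext blocks, so patterns in the data leak through, while a chained mode such as CBC hides them. The sketch below uses a hypothetical toy block function (XOR plus a byte rotation, not AES or DES and not secure), and all names in it are assumptions made for the example.

```python
BLOCK = 16

def toy_encrypt_block(key, block):
    """Hypothetical stand-in for a block cipher's encrypt step. NOT secure."""
    mixed = bytes(b ^ k for b, k in zip(block, key))
    return mixed[1:] + mixed[:1]

def blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key, data):
    # Every block is encrypted independently of its neighbours.
    return b"".join(toy_encrypt_block(key, blk) for blk in blocks(data))

def cbc_encrypt(key, iv, data):
    # Each block is XORed with the previous ciphertext block first.
    out, prev = [], iv
    for blk in blocks(data):
        chained = bytes(b ^ p for b, p in zip(blk, prev))
        prev = toy_encrypt_block(key, chained)
        out.append(prev)
    return b"".join(out)

key = b"swordfish banana"
message = b"spam and sausage" * 2      # two identical 16-byte blocks

ecb = blocks(ecb_encrypt(key, message))
cbc = blocks(cbc_encrypt(key, bytes(BLOCK), message))
print("ECB ciphertext blocks identical:", ecb[0] == ecb[1])
print("CBC ciphertext blocks identical:", cbc[0] == cbc[1])
```

With a real cipher the same structure holds: the repetition visible in the ECB output is independent of how strong the underlying block cipher is.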

For those people following along at home (I'm sure everyone who has
contributed to this thread knows this already): I tried to find a simple
link to why ECB is bad, and this PDF is the best I could come up with, via
Google's Cache.

http://www.google.com/search?q=cach...ers.se/Cs/Grundutb/Kurser/krypto/lect04_4.pdf
 
Skip Montanaro

Paul> I've had this discussion here before, maybe not with you. What I
Paul> really want is zero installations of anything.

Fine. Go build a sumo distribution and track the normal CPython. The
problem isn't all that new. (Take a look at scipy.org for one take on that
theme. Of course Linux distros have been doing their take on this forever.)

Paul> That makes no sense at all. That list is a list of programs
Paul> written in the Python language. They are Python programs, where
Paul> Python is an adjective.

No, many of them are just modules or programming frameworks.

Paul> I have the impression this is an application and not a module,

Yes, you're correct.

Paul> Don't know what this is.

Web app framework.

Paul> Should be in the core if it's what I think it is.

Probably will be someday.

Paul> Application, should be separate. Also, GPL'd, I think. Can't be
Paul> distributed under PSF license.

Sure.

Paul> I think this isn't ready for prime time yet. Should go into the
Paul> core once it is.

It's getting close for those of us with Intel chips in our boxes.

Paul> Not sure what this is.

A module for inlining C code within a Python module. Also see Weave from
the scipy.org folks. It was inspired by the Perl Inline::C module.

Paul> wxPython might be a better choice.

Doesn't matter. At work they decreed GTK as the GUI platform long before I
came along (they also use gtkmm for C++ apps). It's still an example of a
broadly useful package available outside the core distribution.

Paul> 1. Did you really need this, instead of some more reasonable rpc
Paul> format?

Yes, for several years I used a homegrown RPC solution behind the Musi-Cal
website that was Python only. Eventually Mojam (a Perl shop) bought
Musi-Cal (a Python shop). I switched to XML-RPC with little effort. At one
point we also had Java talking XML-RPC.

Paul> xdrlib has been in the core forever.

Sure. But it's somewhat lower level than XML-RPC and isn't really an RPC
protocol. It's just a marshalling protocol and is probably not as flexible
as XML-RPC at that.

Paul> 2. Isn't xmlrpclib written in Python?

Yes. The implementation language is just a detail. I also use Fredrik
Lundh's sgmlop library to accelerate XML-RPC and play some other games when
I know I'm talking Python-to-Python (marshal my args, then XML-RPC the
result passing a single argument between the client and server).
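
The Python-to-Python trick described above might look roughly like the following sketch. It assumes Python 3, where xmlrpclib lives on as xmlrpc.client, and the helper names pack_call and unpack_call are hypothetical; this is not Skip's actual code. Note that marshal data should only be exchanged between peers that trust each other.

```python
import marshal
import xmlrpc.client as xmlrpclib   # Python 3 name for xmlrpclib

def pack_call(method, *args):
    """Marshal all arguments into one blob and send it as a single
    XML-RPC base64 parameter."""
    blob = marshal.dumps(args)
    return xmlrpclib.dumps((xmlrpclib.Binary(blob),), methodname=method)

def unpack_call(request):
    """Reverse of pack_call: recover the method name and the arguments."""
    params, method = xmlrpclib.loads(request)
    (wrapped,) = params                  # exactly one Binary parameter
    return method, marshal.loads(wrapped.data)

request = pack_call("add_event", 1, [2, 3], {"venue": "club"})
print(unpack_call(request))
```

The XML layer then has only one value to encode and decode, which is where the speedup would come from.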

Paul> See, as Python improved, those things went into the core.

Sure, and that's what Martin has been trying to tell you about your AES
proposal. Put it out there, refine it, and get it into the core when it's
mature.

Paul> Could you use sigalarm instead?

I suppose. That's not the point though. I'm not married to the concept as
you seem to be that something has to be in the core distribution to be of
use to me. I'm perfectly happy incorporating solutions other people
provide. I believe you will find I am in the majority in this regard.

Paul> Sure, it's fine if you have all those modules and you write a
Paul> Python program that uses, say, five of them. External modules
Paul> aren't so bad when the developer and the end user are the same
Paul> person. What happens if you send your Python program to a
Paul> nonprogrammer friend who has just a vanilla Python installation?

I figure out some other packaging solution. In my world most of the
software I write is for my employer, so this is not a problem I face very
often. People use freeze, py2exe, py2app or other packaging solutions to
solve most/all of these problems.

Paul> Now he has to download and install those five modules too. You
Paul> send him the url's where you got the modules a year ago. What are
Paul> the chances that the 5 url's even all still work, much less the
Paul> chance of him being able to install and run all 5 of the modules
Paul> without needing help? What if the versions he downloads (from
Paul> separate developers) have gotten out of sync with each other and
Paul> can't interoperate any more?

This is the well-known "CPAN in Python" problem. People are working on it.
Perhaps you would like to spend some energy helping solve it. If so, join
the catalog-sig.

Paul> I'm not sure about timeout_socket and it sounds like it should
Paul> have just been a patch to the socket module, not a new module.

Sure, but a shim between the socket module and Python modules that used it
was a good first approximation to the problem. (I am also a firm believer
in successive approximation to problem solving, especially when I don't know
enough about the problem to know precisely what form the final solution will
take.)

Paul> csv is quite a complex module and took a lot of tweaking and PEP
Paul> editing before standardization. But the need for it was obvious;
Paul> the only good reason it wasn't in the core ages ago was that no
Paul> one had done the work of writing it and shaking it out.

Actually, there were at least two fairly mature implementations of CSV
modules out there before the PEP was a twinkle in anyone's eye. The authors
of those modules got together and wrote the current PEP and module from
scratch based upon their collective experience. I think the effort of
having a couple versions out in the field followed by joint effort to
produce something worthy of inclusion in the core is an excellent
demonstration of what Martin has been saying all along.

Paul> xmlrpclib, not sure. How long was it in separate distribution?

Not all that long. XML-RPC itself hadn't been around very long before
Fredrik wrote xmlrpclib. Both the protocol and xmlrpclib (as well as
similar modules for other languages) caught on pretty quickly.

Skip
 

Paul Rubin

Martin v. Löwis said:
Apparently, people disagree on what precisely the API should be. E.g.
cryptkit has

obj = aes(key)
obj.encrypt(data)

I don't disagree about the API. The cryptkit way is better than the ECB
example I gave, but the ECB example shows it's possible to do it in
one call.
I think I would prefer explicit encrypt/decrypt methods over a
direction parameter. Whether or not selection of mode is a separate
parameter, or a separate method, might be debatable

I prefer separate methods too, however if it was done with a direction
flag instead, it wouldn't really cause a problem. As long as the
functionality is there, I can use it.
I would trust my intuition more for a single function than for an
entire API. In this specific proposal, I think I would trust my
intuition and reject the ECB function because of the direction argument.

As an experienced user of a lot of these packages, I can tell you I've
seen it done both ways and I have a slight preference for separate
calls, but it really doesn't matter one way or the other and it's not
worth getting in a debate about it or having a committee design the
API and worry about such trivial issues.

BTW, the main reason to reject the example ECB function is that
creating a key object ("key schedule") from a string can take
significant computation (sort of like compiling a regexp) so the ECB
function for some ciphers would have to cache the object like the
regexp module does. Yuck.
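
The caching concern can be sketched roughly as follows. This is only an illustration: `expand_key` and `ecb_encrypt_block` are hypothetical names, the "key schedule" is simulated with repeated hashing, and XOR stands in for a real block cipher.

```python
import functools
import hashlib


@functools.lru_cache(maxsize=32)
def expand_key(key_data: bytes) -> bytes:
    """Stand-in for an expensive key schedule (NOT a real cipher's schedule)."""
    state = key_data
    for _ in range(10_000):  # simulate significant setup cost
        state = hashlib.sha256(state).digest()
    return state[:16]


def ecb_encrypt_block(key_data: bytes, block: bytes) -> bytes:
    """One-call API: the schedule must be looked up (or rebuilt) on every call."""
    if len(block) != 16:
        raise ValueError("block must be exactly 16 bytes")
    schedule = expand_key(key_data)  # served from the cache after the first call
    # Toy XOR "cipher", for illustration only.
    return bytes(b ^ k for b, k in zip(block, schedule))
```

Note that a cache like this keeps key material alive inside the module, which is part of why an explicit key object is the cleaner design.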

The direction flag question would normally be between:

key = aes.key(key_data)
ciphertext = key(plaintext, "e")

or
key = aes.key(key_data)
ciphertext = key.encrypt(plaintext)

FWIW, another way to do it, also sometimes preferable, is:

key = aes.ecb(key_data, "e") # e for encryption, or "d" for decryption
ciphertext = key(plaintext)

I think the module I proposed did it this last way, but I haven't
looked at it in a while.
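
For comparison, the three calling styles above can be put side by side in a small sketch. `ToyKey` and `ecb` are hypothetical names and the cipher is a repeating-key XOR used purely as a placeholder (so encryption and decryption happen to be the same operation here); none of this is the API of any real module.

```python
class ToyKey:
    """Hypothetical key object; XOR stands in for a real block cipher."""

    def __init__(self, key_data: bytes):
        if not key_data:
            raise ValueError("key_data must not be empty")
        self._key = key_data

    def _xor(self, data: bytes) -> bytes:
        key = self._key
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    # Style 2: separate methods.
    def encrypt(self, plaintext: bytes) -> bytes:
        return self._xor(plaintext)

    def decrypt(self, ciphertext: bytes) -> bytes:
        return self._xor(ciphertext)

    # Style 1: direction flag passed on each call.
    def __call__(self, data: bytes, direction: str) -> bytes:
        if direction == "e":
            return self.encrypt(data)
        if direction == "d":
            return self.decrypt(data)
        raise ValueError("direction must be 'e' or 'd'")


def ecb(key_data: bytes, direction: str):
    """Style 3: direction fixed at construction; returns a one-argument callable."""
    key = ToyKey(key_data)
    if direction == "e":
        return key.encrypt
    if direction == "d":
        return key.decrypt
    raise ValueError("direction must be 'e' or 'd'")
```

All three styles wrap the same underlying operation, so converting between them is a few lines of glue.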

The point is that when faced with yet another crypto package, I don't
get too concerned about which simple API variant it uses to do such
a basic operation. I just care that the operation is available. I
look in the docs to find that package's particular API for that
operation, and I do what the docs say.

I should make it clear that this module is like Python's low-level
"thread" module in that you have to know what you're doing in order to
use it directly without instantly getting in deep trouble. Most
applications would instead use it indirectly through one or more
intermediate layers.
I fully understand what you desire - to include the module "as a
battery". What makes this decision difficult is that you fail to
understand that I don't want included batteries so much that I
would accept empty or leaking batteries.

I do understand that, and the prospect of empty or leaking batteries
is vitally important when considering whether to include a particular
battery. But for the purposes of an included-battery discussion, the
characteristics of NON-included batteries are not relevant, given that
we know they exist.
Ok, that would be a problem. If this is a simple removal of functions
that you'd request (which functions?),

OK. First you have to decide whether you want a general crypto
toolkit, or just an AES module. I've been concentrating on just an
AES module (or rather, a generic block cipher module with AES and DES)
since I figure that creates fewer areas of controversy, etc. I think
it's too early to standardize a fancy toolkit. Once there's block
ciphers, we can think about adding more stuff afterwards.

For that module, I'd say remove everything except AES and maybe
SHA256, and ask that DES be added. SHA256 is possibly useful, but
isn't really part of an encryption package; it can be separated out
like the existing sha and md5 modules. Also, it should be brought
into PEP 247 compliance if it's not already.

Rationale: I'd get rid of the entropy module now that os.urandom is
available. Having the OS provide entropy is much better than trying
to do it in user code. I'd get rid of the elliptic curve stuff unless
there's some widely used standard or protocol that needs that
particular implementation. Otherwise, if I want ECC in a Python
program, I'd just do it on characteristic-p curves in pure Python
using Python longs. (Bryan's package uses characteristic-2 curves
which means the arithmetic is all boolean operations, that are more
efficient on binary CPU's, especially small ones. But that means the
module has to be written in C, because doing all those boolean
operations in Python is quite slow. It would be like trying to do
multi-precision arithmetic in Python with Python ints instead of
longs). Once there's a widely accepted standard for ECC like there is
for AES, then I'd want the stdlib to have an implementation of the
standard, but right now there are just a lot of incompatible,
nonstandard approaches running around.
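
As a rough illustration of the "characteristic-p curves in pure Python" idea, here is point arithmetic on a tiny textbook curve, y^2 = x^3 + 2x + 2 over the integers mod 17, with base point (5, 1). The parameters are toy-sized and chosen only so the results can be checked by hand; the function names are made up for this sketch, and it assumes Python 3.8+ (for `pow(x, -1, p)`), where plain ints are arbitrary precision.

```python
P = 17        # field prime (toy size; real curves use primes of 256 bits or so)
A, B = 2, 2   # curve: y^2 = x^3 + A*x + B (mod P)
G = (5, 1)    # base point on the curve


def add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # p2 is the negative of p1
    if p1 == p2:
        slope = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P  # tangent
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P         # chord
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def multiply(k, point):
    """Scalar multiplication by double-and-add."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result
```

Everything here is ordinary modular integer arithmetic, which is why no C extension is needed for this family of curves.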

If SHA256 is accepted then SHA512/SHA384 (these are basically the
same) might as well also be. Not many people are using any of these
hash functions right now. Usage will increase over time (they are US
federal standards like AES), and they'll probably be worth adding
eventually. I'm indifferent to whether they're added now.

I think I'd include RC4 under the "toolkit" approach, if it's not
already there. I'd also include a pair of functions like the ones in
p3.py, i.e. an utterly minimal API like:

ciphertext = encrypt_string(key_string, plaintext)
plaintext = decrypt_string(key_string, ciphertext)

that does both encryption and authentication, for key and data strings
of arbitrary size. This would be what people should use instead of
the rotor module. It would be about a 10 line Python function that
calls the block cipher API. The block cipher API itself is intended
for more advanced users.
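
A minimal sketch of what such a pair of functions might look like, using only what ships in the standard library. Since the stdlib has no block cipher, a SHA-256 based keystream stands in for the block cipher API here, so this is longer than the ten lines mentioned and is not p3.py's actual construction. The wire layout (nonce + body + tag) and the helper names are assumptions made for this example, and the code has had no cryptographic review.

```python
import hashlib
import hmac
import os


def _derive(key_string: bytes, label: bytes) -> bytes:
    # Separate subkeys for encryption and authentication.
    return hmac.new(key_string, label, hashlib.sha256).digest()


def _keystream(enc_key: bytes, nonce: bytes, length: int) -> bytes:
    # Hash-in-counter-mode keystream; a stand-in for a real block cipher.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(enc_key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_string(key_string: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(16)  # fresh random nonce per message
    stream = _keystream(_derive(key_string, b"enc"), nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_derive(key_string, b"mac"), nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def decrypt_string(key_string: bytes, ciphertext: bytes) -> bytes:
    if len(ciphertext) < 16 + 32:
        raise ValueError("ciphertext too short")
    nonce, body, tag = ciphertext[:16], ciphertext[16:-32], ciphertext[-32:]
    expected = hmac.new(_derive(key_string, b"mac"), nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # check the tag before decrypting
        raise ValueError("authentication failed")
    stream = _keystream(_derive(key_string, b"enc"), nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))
```

The caller sees only two functions and never has to choose a mode, manage a nonce, or remember to authenticate.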
I may have made exceptions to this rule in the past, e.g. when the
proposal is to simply wrap an existing C API in a Python module
(like shadow passwords). In this case, both the interface and
the implementation are straight-forward, and I expect no surprises.

I'd be happy with an AES module that simply wrapped a C API and that
should be pretty much surprise-free. It would be about like the SHA
module in terms of complexity. What I proposed tries to be a bit more
Pythonic but I can live without that.
For an AES module (or most other modules), I do expect surprises.

Well, the hmac module was added in 2.2 or 2.3, without any fuss. It's
written in Python and is somewhat slow, though. What kind of
development process should it take to replace it in the stdlib with a
C module with the exact same interface?

I think you're imagining a basic AES module to be more complicated
than it really is, because you're not so familiar with this type of
module. Also, possibly because I'm making it sound like a lot of work
to write. But that work is just for the C version, and assumes that
it's me personally writing it. What little experience I've had with
Python's C API has been painful, so I figure on having to spend
considerable time wrestling with it. Someone more adept with the C
API could probably implement the module with less effort than I'd need.
I have said many times that I am in favour of including an AES
implementation in the Python distribution, e.g. in

Oh, ok. Earlier you said you wanted user feedback before you could
conclude that there was reason to want an AES module at all.
What I cannot promise is to include *your* AES implementation,
not without getting user feedback first.

That's fine. But I think it's reasonable to actually approach some
users and say "this module is being considered for addition to the
core--could you try plugging it into your applications that now use
other external modules, and say whether you think it will fill your
needs".

That's impossible if consideration doesn't even start until testing is
complete. All one could say then is "here's yet another crypto
module, that does less than the toolkit you're already using, could
you please temporarily drop whatever you're doing and update your
programs to switch from your old module that works, to a new module
that MIGHT work?". If the new module is accepted into the core, of
course, it becomes worth retrofitting the existing toolkits to use it.
The whole notion of creating the module from scratch just to include
it in the core strikes me as odd - when there are so many AES
implementations out there already that have been proven useful to users.

We discussed this already. Here are three possible contexts for
designing a crypto module:

1) You're designing it to support some specific application you're
working on. The design will reflect the needs of that application
and might not be so great for a wider range of applications.
2) You're writing a general purpose module for distribution outside
the core (example: mxCrypto). You'll include lots of different
functions, pre-built implementations of a variety of protocols, etc.
You might include bindings that try to be compatible with other
packages, etc. Maybe this can get added to the core someday,
like numarray, but for now, that's a rather big step.
3) You're designing to add basic functionality to the core. Here,
you try to pick a general purpose API not slanted towards a
particular app, and provide just some standard building blocks
that other stuff can be built around. This is more like the
math module, which just does basics: sqrt, sin, cos, etc., with no
attempt at the stuff in a package like numarray. But if there's
a standard like FIPS 80 and if there's good reason to implement
a big subset of it (which there is), then you may as well
implement the standard completely unless there's a good reason
not to (which there isn't; the less important operations are a
few dozen lines of code total).

I think context #3 gets you something better suited for the core and
none of the existing crypto modules were written that way. The same
is in fact true for many of the non-crypto modules, that seem to have
been written in context #1 and would have been better under context #3.

Also, there's the plain fact that none of the authors of the existing
crypto modules have offered to contribute them. So somebody had to
step up and do something.
Eventually, with hard work, I estimate the chances at, say, 90%.

Hmm, this is very very interesting. I am highly confident that all
the purely technical questions (i.e. everything about the API and the
code quality, etc.) can converge to a consensus-acceptable solution
without much hassle. I had thought there were insurmountable
obstacles of a nontechnical nature, mainly caused by legal issues, and
that these are beyond any influence that I might have by writing or
releasing anything.
 

Paul Rubin

Martin v. Löwis said:
mxCrypto is primarily unsuitable for the core because Marc-Andre Lemburg
will never ever contribute it. He is very concerned about including
crypto code with the Python distribution, so he certainly won't
contribute his own.

Oh wait, I confused mxcrypto and m2crypto. Sorry. Anyway, the
technical considerations are similar.
 

Paul Rubin

Nick Craig-Wood said:
I would hate to see a module which only implemented ECB. Sure it's the
only operation necessary to build the others out of, but it's the least
secure mode of any block cipher.

It's intended as a building block for other modes. Most applications
shouldn't use it directly.
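
To make the "building block" point concrete, here is a sketch that builds CBC mode on top of a single-block primitive and shows the ECB pitfall along the way. The block cipher is a toy 4-round Feistel network with a SHA-256 round function, invented for this example because the stdlib has no AES; it is not secure, and all function names are hypothetical.

```python
import hashlib

BLOCK = 16  # block size in bytes, the same as AES


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel cipher (illustration only).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right


def _blocks(data: bytes):
    if len(data) % BLOCK:
        raise ValueError("data length must be a multiple of the block size")
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key: bytes, plaintext: bytes) -> bytes:
    # Each block is encrypted independently, so equal blocks stay equal.
    return b"".join(encrypt_block(key, b) for b in _blocks(plaintext))


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    prev, out = iv, []
    for block in _blocks(plaintext):
        prev = encrypt_block(key, _xor(block, prev))  # chain in previous ciphertext
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    prev, out = iv, []
    for block in _blocks(ciphertext):
        out.append(_xor(decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)
```

Encrypting two identical plaintext blocks with `ecb_encrypt` gives two identical ciphertext blocks, while `cbc_encrypt` does not, which is the leak the documentation would need to warn about.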
If you don't offer users a choice, then they'll use ECB and just that
along with all its pitfalls, meanwhile thinking that they are secure
because they are using AES/DES...

The documentation has to be written somewhat forcefully to tell users
what not to do. I can help with that. I've had to do that a lot,
supporting crypto packages in projects where the other programmers
haven't used crypto very much.
 

Paul Rubin

Skip Montanaro said:
Paul> Don't know what this is.

Web app framework.

I think Python should add a web app framework to its core, again since
it otherwise can't seriously begin to compete with PHP. However,
there are lots of approaches so this is an example of where your
suggested process of letting a bunch of different implementations
circulate before choosing something is a good idea.
Paul> Not sure what this is.

A module for inlining C code within a Python module. Also see Weave from
the scipy.org folks. It was inspired by the Perl Inline::C module.

Hmm, sounds like it has the same issues as Pyrex. I'm also not sure
why you'd want both PyInline and Pyrex.
Paul> wxPython might be a better choice.

Doesn't matter. At work they decreed GTK as the GUI platform long before I
came along (they also use gtkmm for C++ apps).

Can't wxPython use GTK?
It's still an example of a broadly useful package available outside
the core distribution.

I'd say if access to GTK is widely important functionality, then the
core should provide it in some way (e.g. through wxPython) and that's
enough. If your company wants some different (i.e. nonstandard, if
wxPython becomes the standard) form of access, then it can deal with
the consequences of not following standards.
Paul> 2. Isn't xmlrpclib written in Python?
Yes. The implementation language is just a detail.

I think it's more than a detail. If an external module is written in
Python, I can download it from wherever and include it with my own app
that I send to an end user. I do the work so the end user doesn't
have to. If it's written in C, then the end user has to deal with it.
Paul> See, as Python improved, those things went into the core.

Sure, then that's what Martin has been trying to tell you about your AES
proposal. Put it out there, refine it, and get it into the core when it's
mature.

What kind of refinements are you envisioning? This isn't a web
application framework we're talking about. It's more like the sha
module.
Paul> Could you use sigalarm instead?

I suppose. That's not the point though. I'm not married to the concept as
you seem to be that something has to be in the core distribution to be of
use to me. I'm perfectly happy incorporating solutions other people
provide.

So aren't you happier when the other person provides you with a
solution that installs with one command, instead of a solution that
requires you to download N different modules from who knows where, and
install them separately, all while hoping that they haven't been
tampered with? If I'm trying to provide someone else with a solution,
I'd rather use sigalarm than make the end-user download an extra
module, because I think they'll be happier that way.
Paul> What happens if you send your Python program to a
Paul> nonprogrammer friend who has just a vanilla Python installation?

I figure out some other packaging solution. In my world most of the
software I write is for my employer, so this is not a problem I face very
often. People use freeze, py2exe, py2app or other packaging solutions to
solve most/all of these problems.

Only those people who think that a cross-platform application is one
that works on both XP Home and XP Pro. That does simplify some
things. Life in a cult is often indeed simpler than life in the real
world.
Actually, there were at least two fairly mature implementations of
CSV modules out there before the PEP was a twinkle in anyone's eye.
The authors of those modules got together and wrote the current PEP
and module from scratch based upon their collective experience.

Yes, CSV is complicated and benefits from that process just like
web app frameworks do. Let's pick another example, the hmac module
that appeared in Python 2.2. It implements the RFC 2104 HMAC algorithm.

Where are the two mature implementations that circulated before the
hmac module was added? Where were the authors pooling their
collective wisdom? Where was the year of user feedback? The answer
is, nothing like that was needed. HMAC is simple enough for a module
author to read RFC 2104 and implement what it says, run some tests,
and declare the module good to go.
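
To give a sense of scale, the RFC 2104 construction fits in a few lines. The sketch below implements HMAC with SHA-1 by hand so it can be checked against the stdlib `hmac` module; the function name `hmac_sha1` is just a label for this example.

```python
import hashlib
import hmac


def hmac_sha1(key: bytes, message: bytes) -> bytes:
    """HMAC as described in RFC 2104, with SHA-1 as the hash."""
    block_size = 64  # SHA-1 processes 64-byte blocks
    if len(key) > block_size:
        key = hashlib.sha1(key).digest()      # long keys are hashed first
    key = key.ljust(block_size, b"\x00")      # then zero-padded to the block size
    ipad = bytes(b ^ 0x36 for b in key)       # inner padding
    opad = bytes(b ^ 0x5C for b in key)       # outer padding
    inner = hashlib.sha1(ipad + message).digest()
    return hashlib.sha1(opad + inner).digest()


# Cross-check against the standard library implementation.
assert hmac_sha1(b"key", b"message") == hmac.new(b"key", b"message", hashlib.sha1).digest()
```

"Run some tests" then amounts to comparing against the published test vectors or, as here, against an existing implementation.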
I think the effort of having a couple versions out in the field
followed by joint effort to produce something worthy of inclusion in
the core is an excellent demonstration of what Martin has been
saying all along.

Martin is saying the opposite: that he doesn't understand the point of
writing a new module that synthesizes from experiences with old
modules, instead of just using one of the old modules.

I don't think there's a one-size-fits-all answer to any of these
questions. You have to have your hands in the details of a specific
problem, to arrive at the best way to deal with that problem.
 

Martin v. Löwis

Paul said:
Oh, ok. Earlier you said you wanted user feedback before you could
conclude that there was reason to want an AES module at all.

I believe I never said that. I said that I wanted user feedback to
determine whether *this* AES module (where this is either your
from-scratch implementation, or any other specific implementation
contributed) is desirable.
Hmm, this is very very interesting. I am highly confident that all
the purely technical questions (i.e. everything about the API and the
code quality, etc.) can converge to a consensus-acceptable solution
without much hassle. I had thought there were insurmountable
obstacles of a nontechnical nature, mainly caused by legal issues, and
that these are beyond any influence that I might have by writing or
releasing anything.

These obstacles are indeed real. But I believe they are not
unsurmountable. For example, there is the valid complaint that,
in order to export the code from SF, we need to follow U.S.
export laws. 10 years ago, these would have been unsurmountable.
Today, it is still somewhat painful to comply with these laws,
but this is what the PSF is for, which can fill out the forms
necessary to allow exporting this code from the U.S.A.

Regards,
Martin
 

Nick Craig-Wood

Skip Montanaro said:
Fine. Go build a sumo distribution and track the normal CPython.
The problem isn't all that new. (Take a look at scipy.org for one
take on that theme. Of course Linux distros have been doing their
take on this forever.)

If I'm writing code just for fun, I'll be doing it on Debian Linux, so
I can do

apt-get install python-crypto

and I'm away.

However if I'm writing code for work, it has to work on Windows as
well, which introduces a whole extra barrier to using 3rd party
modules. Something else to compile. Something else to stick in the
installer. Basically a whole heap of extra work.

I think one of the special things about Python is its batteries
included approach, and a crypto library would seem to be an obvious
battery to install since it doesn't (or needn't) depend on any other
library or application.
 

Paul Rubin

Nick Craig-Wood said:
There is a PEP about this...

API for Block Encryption Algorithms v1.0
http://www.python.org/peps/pep-0272.html

Yes, I know about that and have been in contact with its author. He
and I are in agreement (or at least were in agreement some time ago)
that the proposed API of the new module is an improvement, at least
for a generic module. PEP 272 seems to document the interface of
something that had been implemented for some particular application.
 

Paul Rubin

Martin v. Löwis said:
I believe I never said that. I said that I wanted user feedback to
determine whether *this* AES module (where this is either your
from-scratch implementation, or any other specific implementation
contributed) is desirable.

If that's what you're saying now, I'll accept it and not bother
looking for your other posts that came across much differently.
These obstacles are indeed real. But I believe they are not
unsurmountable. For example, there is the valid complaint that,
in order to export the code from SF, we need to follow U.S.
export laws. 10 years ago, these would have been unsurmountable.

Well, yes, 10 years ago, SF didn't exist <wink>. But there was an ftp
site run by Michael Johnson that had some special server side checks
that made sure the client was in the USA. That was considered good
enough to comply with the export regs, and another guy and I
distributed crypto code (a program that let you use your PC as an
encrypted voice phone) through that site in 1995.

Of course, every time my co-author and I released a new version
through the controlled ftp site, within a day or so the code somehow
managed to show up on another ftp site in Italy with worldwide access.
We (the authors) always managed to be shocked when that happened. But
we had nothing to do with it, so it wasn't our problem.
Today, it is still somewhat painful to comply with these laws,
but this is what the PSF is for, which can fill out the forms
necessary to allow exporting this code from the U.S.A.

Well, complying is painful in the sense of being morally repugnant
(people shouldn't have to notify the government in order to exercise
their free speech rights), but the actual process is pretty easy in
terms of the work required. Python should qualify for the TSU
exception which means you just need to send an email to the BXA,
without needing to fill out any forms. I thought that's what you had
done for the rotor module.
 
