Socket module bug on OpenVMS

Irmen de Jong

Hi,

Recently I was bitten by an apparent bug in the BSD socket layer
on OpenVMS. Specifically, it appears that VMS defines MSG_WAITALL
in socket.h but does not implement it (it is not in the documentation).
And I use the socket.MSG_WAITALL flag on my recv() calls... and
then they crash on OpenVMS.

I don't have access to an OpenVMS machine myself so could someone
else that has (or has more knowledge of it) shed some light on it?



This also raises the question to what extent Python itself should
work around platform specific "peculiarities", such as this one.
There's another problem with socket code on Windows and VMS systems,
where you get strange exceptions when using a "too big" recv() buffer.

Things like this force me into writing all sorts of error checking
code or platform specific functions, to work around these bugs.
Just for what was supposed to be a simple socket recv() and a
simple socket send()...

In my opinion Python's socket module itself could implement these
workarounds. That would make user code a lot cleaner and less
error prone, and more portable. What do other people think?


Regards
--Irmen de Jong
 
Jean-François Piéronne

Irmen said:
Recently I was bitten by an apparent bug in the BSD socket layer
on OpenVMS. Specifically, it appears that VMS defines MSG_WAITALL
in socket.h but does not implement it (it is not in the documentation).
And I use the socket.MSG_WAITALL flag on my recv() calls... and
then they crash on OpenVMS.

Which Python version, OpenVMS version, IP stack and stack version?

If you think this is a Python on OpenVMS problem, send me a small
reproducer and I will take a look.

If you think it's an OpenVMS problem and if you can provide a simple
reproducer and have a support contract I suggest you call HP, but I
suspect that if it's not documented the reply would be "not (yet?) supported".


I don't have access to an OpenVMS machine myself so could someone
else that has (or has more knowledge of it) shed some light on it?

It appears that the only place in the Python source code where
MSG_WAITALL is used is socketmodule.c:
#ifdef MSG_WAITALL
PyModule_AddIntConstant(m, "MSG_WAITALL", MSG_WAITALL);
#endif

Maybe a workaround is not to use MSG_WAITALL on OpenVMS for now; in
the next version I will not define MSG_WAITALL in the socket module on
OpenVMS.

If you need access to an OpenVMS machine, let me know.
This also raises the question to what extent Python itself should
work around platform specific "peculiarities", such as this one.
There's another problem with socket code on Windows and VMS systems,
where you get strange exceptions when using a "too big" recv() buffer.

Old Python versions on OpenVMS have bugs in the socket module which have
been fixed in newer versions. Be sure to check using the latest kit.
Things like this force me into writing all sorts of error checking
code or platform specific functions, to work around these bugs.
Just for what was supposed to be a simple socket recv() and a
simple socket send()...

In my opinion Python's socket module itself could implement these
workarounds. That would make user code a lot cleaner and less
error prone, and more portable. What do other people think?


Regards
--Irmen de Jong

Jean-Francois Pieronne
 
Irmen de Jong

Jean-François Piéronne said:
Which Python version, OpenVMS version, IP stack and stack version?

OpenVMS 7.3-2, Python 2.3.5, no idea about IP stack version.
If you think this is a Python on OpenVMS problem, send me a small
reproducer and I will take a look.

I don't have any small case lying around (I can't reproduce it myself
because I lack access to an OpenVMS machine) but I think you should
be able to reproduce it by slightly altering one of the socket examples
that come with Python. Just add the MSG_WAITALL flag to the recv() call:

something = sock.recv(somesize, socket.MSG_WAITALL)

and you should see it crash with a socket exception.

Mail me offline if you still need running example code (that I think
would expose the problem).
If you think it's an OpenVMS problem and if you can provide a simple
reproducer and have a support contract I suggest you call HP, but I
suspect that if it's not documented the reply would be "not (yet?) supported".

I don't have anything to do with HP... the person that reported the
problem to me has, however. He's already aware of the problem.
Maybe a workaround is not to use MSG_WAITALL on OpenVMS for now; in
the next version I will not define MSG_WAITALL in the socket module on
OpenVMS.

How can I detect that I'm running on OpenVMS?


--Irmen
 
Jean-François Piéronne

Irmen de Jong wrote:
OpenVMS 7.3-2, Python 2.3.5, no idea about IP stack version.

Thanks, maybe upgrading to Python 2.5 will solve the problem.
I don't have any small case lying around (I can't reproduce it myself
because I lack access to an OpenVMS machine) but I think you should
be able to reproduce it by slightly altering one of the socket examples
that come with Python. Just add the MSG_WAITALL flag to the recv() call:

something = sock.recv(somesize, socket.MSG_WAITALL)

and you should see it crash with a socket exception.
Ok, I will try.
Mail me offline if you still need running example code (that I think
would expose the problem).

Ok


I don't have anything to do with HP... the person that reported the
problem to me has, however. He's already aware of the problem.

You or the other person can also try to use

http://forums1.itrc.hp.com/service/forums/familyhome.do?familyId=288

How can I detect that I'm running on OpenVMS?
for example

import sys
print sys.platform
'OpenVMS'
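
Building on that check, here is a minimal sketch of how calling code might decide whether to pass MSG_WAITALL at all. The helper name waitall_flag is hypothetical, and the case-insensitive prefix match on "openvms" is an assumption based on the value shown above. It is written for current Python rather than the 2.x releases discussed in this thread.

```python
import socket
import sys


def waitall_flag(platform=sys.platform):
    """Return MSG_WAITALL where it looks usable, otherwise 0 (no flags).

    Assumption: any platform string starting with "openvms" is treated as
    one where the flag is defined but not supported.
    """
    if platform.lower().startswith("openvms"):
        return 0
    # The constant may also be missing entirely on some builds.
    return getattr(socket, "MSG_WAITALL", 0)
```

A caller would then write something like sock.recv(somesize, waitall_flag()) and must still be prepared for a short read whenever the returned flag is 0.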


JF
 
Fredrik Lundh

Irmen said:
This also raises the question to what extent Python itself should
work around platform specific "peculiarities", such as this one.
There's another problem with socket code on Windows and VMS systems,
where you get strange exceptions when using a "too big" recv() buffer.

what's a "strange exception" and a "too big" buffer?
Things like this force me into writing all sorts of error checking
code or platform specific functions, to work around these bugs.
Just for what was supposed to be a simple socket recv() and a
simple socket send()...

if you want buffering, use makefile(). relying on platform-specific
behaviour isn't a good way to write portable code.
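
A rough sketch of what the makefile() route could look like. The length-prefixed framing (a 4-byte big-endian size followed by the payload) and the name read_message are assumptions made only for illustration; nothing in this thread says the application uses that framing, and whether this avoids the VMS or Windows problems described here is not established.

```python
import socket
import struct


def read_message(reader):
    """Read one length-prefixed message from a file object made with makefile("rb").

    Assumed framing: 4-byte big-endian length, then that many payload bytes.
    """
    header = reader.read(4)
    if len(header) < 4:
        raise EOFError("connection closed before a full header arrived")
    (length,) = struct.unpack("!I", header)
    body = reader.read(length)  # the buffered reader loops until done or EOF
    if len(body) < length:
        raise EOFError("connection closed in the middle of a message")
    return body
```

The file object should be created once per connection, for example reader = sock.makefile("rb"), and reused for every message, since it may buffer bytes beyond the message currently being read.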

</F>
 
Irmen de Jong

Fredrik said:
what's a "strange exception" and a "too big" buffer?

The exceptions are MemoryError (I know this one for sure)
and a socket.error I believe (can't remember exactly, and I don't
have a VMS machine to try to reproduce).
A "too big" buffer means anything above 64 kilobytes or so.

You can find a lot of reports about this happening on Windows at least.
From user reports I've learned that VMS also has similar problems with
recv buffer sizes above a certain size.
if you want buffering, use makefile(). relying on platform-specific
behaviour isn't a good way to write portable code.

I'm not sure if makefile() would shield me from the problems I experienced
(I could try I suppose) but your second remark is exactly my point!
I don't *want* to code around platform-specific behavior. I *want* my
code to be portable. And I expected it to be portable by just using
a regular recv() call... However as it is now, Python's socket module
exposes platform specific (and troublesome) behavior, so I have to write
various workarounds to make my code run without errors on multiple platforms...

Regards,

--Irmen de Jong
 
Martin v. Löwis

Irmen said:
In my opinion Python's socket module itself could implement these
workarounds. That would make user code a lot cleaner and less
error prone, and more portable. What do other people think?

It depends: when VMS defines MSG_WAITALL, but doesn't implement it
correctly, then Python probably shouldn't expose it.

Implementing recv() "splitting" is *not* something that Python
could do itself. The question is how you do error reporting
for partial results, and neither exception handling nor error
codes are particularly well-suited to such an error model: these
APIs all assume an all-or-none failure model (i.e. if receiving
fails, nothing has happened so far). If you were to split a
large recv call into multiple smaller ones, there wouldn't be
a good way to communicate both the partial result and the error
that you got on the last call.

Perhaps you had some different work-around in mind?

Regards,
Martin
 
Irmen de Jong

Martin said:
It depends: when VMS defines MSG_WAITALL, but doesn't implement it
correctly, then Python probably shouldn't expose it.

That would be one way of solving at least my troubles, because I
already check for the availability of MSG_WAITALL and revert to
a custom recv() loop otherwise.
As others have already reported, at this time Python (well, the build
process) merely checks for the availability of the MSG_WAITALL symbol
in the socket.h C-header file....
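
For reference, a minimal sketch of the kind of custom recv() loop meant here, not the actual code from my project. The name recv_exact and the 32 KB default chunk cap are assumptions chosen for illustration; the cap is only there to stay below the buffer sizes reported as troublesome.

```python
import socket


def recv_exact(sock, size, max_chunk=32 * 1024):
    """Receive exactly `size` bytes using plain recv() calls, without MSG_WAITALL."""
    chunks = []
    remaining = size
    while remaining > 0:
        # Never ask for more than max_chunk bytes in a single call.
        chunk = sock.recv(min(remaining, max_chunk))
        if not chunk:
            # An empty result means the peer closed the connection.
            raise EOFError("connection closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Note that if recv() raises partway through, the chunks already collected are lost to the caller.
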
Implementing recv() "splitting" is *not* something that Python
could do itself. The question is how you do error reporting
for partial results
[..snip..]

Mm, tricky. Hadn't thought of that.

Although I'm not yet convinced about the all-or-nothing failure
model you were talking about. How can I know if recv() fails
with a MemoryError (or socket error) that it actually didn't
receive anything? Is that even an assumption I can rely on?
If it isn't, splitting up the recv() internally (into smaller
blocks) wouldn't be any different, right?
Perhaps you had some different work-around in mind?

Nope, I was indeed thinking about splitting up the recv() into
smaller blocks internally.
Maybe a more elaborate check in Python's build system (to remove
the MSG_WAITALL on VMS) would qualify as some sort of work-around
as well. Although that would not yet solve the errors you get when
using too big recv() buffer sizes.


Cheers,
--Irmen.
 
Martin v. Löwis

Irmen said:
Although I'm not yet convinced about the all-or-nothing failure
model you were talking about. How can I know if recv() fails
with a MemoryError (or socket error) that it actually didn't
receive anything? Is that even an assumption I can rely on?

Yes, you can. It can't produce a MemoryError (AFAICT, there is
no code path which possibly could (*)). If there is a socket
error, then the operating system guarantees that no data
were written into the buffer, and that subsequent calls
will yield any data that might have been available.

(*) It can, of course, produce a MemoryError if the result buffer
can't be allocated. If that happens, no recv(2) call has been
made, so the MemoryError indicates that nothing has happened -
you need to provide the result buffer before calling recv; that's
why you need to provide the size of the message you want to
receive.
If it isn't, splitting up the recv() internally (into smaller
blocks) wouldn't be any different, right?

Right. However, directly calling operating system calls and directly
exposing their results allows for such all-or-nothing error model,
as the operating system API itself is designed with the goal of
atomicity for each operation (or, if there are partial results,
has a way of reporting them, e.g. for partial writes).
Nope, I was indeed thinking about splitting up the recv() into
smaller blocks internally.

Look at shutil.copytree for an example of what this leads to.
This routine originally traversed the input tree, copied all
files it could, and skipped over those it couldn't. It ignored
the basic principle "errors should never pass silently".

I then tried fixing it, and the result is really ugly: It
keeps a list of all exceptions that occurred, and then
raises a single exception that contains them all. The
caller is supposed to know this, and perhaps print out
multiple error messages.

shutil.rmtree takes a different approach, which could be
considered just as ugly: it provides an onerror call-back
function which is invoked whenever some remove operation
fails.

I also think there is some API that returns the partial
result inside the exception if there is an exception.
The caller has to know this, and look at any data that
are already in the exception. Of course, changing an
existing API to switch to such a convention should
be considered as an incompatible change.
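
To make that last convention concrete, here is a hedged sketch of a split recv() that puts the partial result inside the exception. PartialReceiveError and recv_split are hypothetical names, not part of Python's socket module, and the 32767-byte chunk cap is an arbitrary choice for the example.

```python
import socket


class PartialReceiveError(Exception):
    """Hypothetical exception carrying the bytes received before a failure."""

    def __init__(self, partial, cause):
        Exception.__init__(
            self, "receive failed after %d bytes: %s" % (len(partial), cause)
        )
        self.partial = partial  # data read successfully before the failure
        self.cause = cause      # the underlying error


def recv_split(sock, size, max_chunk=32767):
    """Receive `size` bytes in pieces; on failure, raise with the partial data."""
    received = []
    remaining = size
    while remaining > 0:
        try:
            chunk = sock.recv(min(remaining, max_chunk))
        except OSError as exc:
            raise PartialReceiveError(b"".join(received), exc)
        if not chunk:
            raise PartialReceiveError(
                b"".join(received), EOFError("peer closed the connection")
            )
        received.append(chunk)
        remaining -= len(chunk)
    return b"".join(received)
```

Every caller now has to catch PartialReceiveError and decide what to do with exc.partial, which is exactly the extra burden described above.
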
Maybe a more elaborate check in Python's build system (to remove
the MSG_WAITALL on VMS) would qualify as some sort of work-around
as well.

Before that is done, somebody should confirm that MSG_WAITALL
is really not behaving correctly on VMS. Then, the check doesn't
have to be all that elaborate: a simple #ifdef would do.

Regards,
Martin
 
Jean-François Piéronne

Irmen said:
Martin v. Löwis wrote: [snip]
Perhaps you had some different work-around in mind?

Nope, I was indeed thinking about splitting up the recv() into
smaller blocks internally.
Maybe a more elaborate check in Python's build system (to remove
the MSG_WAITALL on VMS) would qualify as some sort of work-around
as well. Although that would not yet solve the errors you get when
using too big recv() buffer sizes.

I have contacted some people at HP; the unofficial reply is that if it's
not documented, it's not supported. It may be defined in socket.h
because HP's TCP/IP stack on OpenVMS is a port of the Tru64 Unix
stack, and the flag was present in Tru64.
We will also investigate whether the behaviour of MSG_WAITALL on
OpenVMS is broken or not.

So in a forthcoming kit I will not export this symbol under OpenVMS
(if it's not documented, it's not supported).
I have also taken a look at the problem with big recv() buffers and
found a bug: when the buffer size is > 32767, only the first 32767 bytes
are read. This will also be fixed in the next kit (Python 2.5 only).

I expect a new kit including all these fixes to be available
before the end of the week.


JF
 
