Perl's read() vs. sysread()

J. Romano

Hi,

I was wondering if anyone could tell me the differences between
Perl's read() function and its sysread() function. Now, by reading
the perldocs I know that Perl's read() function implements the
system's fread() call and that Perl's sysread() function implements
the system's read() call, but I really don't know what that means. I
tried reading the man pages on fread() and read(), but that didn't
help me much.

Here is what else I know about read and sysread (please correct me
if I'm wrong):

* read belongs to a group of functions that includes read, print,
write, seek, tell, eof, and the angle-bracket-filehandle-operator.

* sysread belongs to a group of functions that includes sysread,
syswrite, and sysseek.

* The functions in these two groups should not be mixed (unless, as
the Camel book says, I am into wizardry and/or pain). (Just an aside
note: I once accidentally used a print statement on a socket that I
had used sysread() on. It worked fine the first day, but froze up on
the print statement the next day. When I finally found the error and
changed the print statement to syswrite(), the program no longer froze
on me. This was a classic case of "the program worked fine for me
yesterday.")

* The functions open, close, and binmode can be used safely with
functions of both groups.

That's all I know about the difference between Perl's read() and
sysread() functions. What I would still like to know is:

* When should I use read() over sysread() (or sysread() over
read())?

* What differences in program execution can I expect if I switch my
read() statements to sysread() (or vice-versa)?

* If I were to open a socket over the internet using IO::Socket,
would it be best to use the read() group of functions or the sysread()
group of functions?

* Since Perl's read() function uses the system's fread() call and
Perl's sysread() function uses the system's read() call, what does
that mean to me if I'm using those functions on a non-Unix system,
like Win32 using ActiveState Perl? I would imagine that, in that
particular circumstance, there would be no difference between Perl's
read() and sysread() functions, but the mix-up I mentioned above about
using a print statement with a sysread() function was done on a Perl
program running on a Windows XP machine, so something different must
be happening under the hood even on Win32 operating systems.

Thanks in advance for any input,

Jean-Luc
 
Paul Lalli

> Hi,
>
> I was wondering if anyone could tell me the differences between
> Perl's read() function and its sysread() function. Now, by reading
> the perldocs I know that Perl's read() function implements the
> system's fread() call and that Perl's sysread() function implements
> the system's read() call, but I really don't know what that means. I
> tried reading the man pages on fread() and read(), but that didn't
> help me much.

This is far from a complete answer, but I know that one difference is that
the sys* family of functions operates on unbuffered data, whereas
read(), print(), etc. buffer their I/O. This is because this
second group goes through the stdio or perlio layers, whereas the sys*
functions bypass those layers and interface directly with the system.
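
The buffering described above can be made visible with a small experiment. This is only a sketch under assumptions: it uses a 100-byte scratch file made with File::Temp (not something from this thread) and a default Perl build with the usual buffered layers. It asks read() for one byte, then compares the position the buffered layer reports via tell() with the position of the underlying OS file pointer via sysseek().

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_CUR);
use File::Temp qw(tempfile);

# Scratch file with 100 bytes of data (assumption for the demo).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} 'x' x 100;
close $out or die "close: $!";

open my $in, '<', $path or die "open: $!";
binmode $in;

# Ask the buffered layer for a single byte.
my $got = read($in, my $byte, 1);
die "read: $!" unless defined $got;

# tell() reports the logical position as seen through the buffer.
my $logical_pos = tell($in);

# sysseek() with a zero offset reports where the OS file pointer is.
my $os_pos = sysseek($in, 0, SEEK_CUR);

print "buffered position: $logical_pos\n";
print "OS-level position: $os_pos\n";
close $in;
```

On a typical build the OS-level position is well past byte 1, because the buffered layer has already pulled in more of the file than was asked for. That gap is also why a sysread() issued after a read() on the same handle would skip data.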

I'm sure someone else can give more details.

Paul Lalli
 
Ben Morrow

> I was wondering if anyone could tell me the differences between
> Perl's read() function and its sysread() function. Now, by reading
> the perldocs I know that Perl's read() function implements the
> system's fread() call and that Perl's sysread() function implements
> the system's read() call, but I really don't know what that means. I
> tried reading the man pages on fread() and read(), but that didn't
> help me much.
>
> Here is what else I know about read and sysread (please correct me
> if I'm wrong):
>
> * read belongs to a group of functions that includes read, print,
> write, seek, tell, eof, and the angle-bracket-filehandle-operator.

Yup.

> * sysread belongs to a group of functions that includes sysread,
> syswrite, and sysseek.

And, most importantly, select (the four-argument form).

> * The functions open, close, and binmode can be used safely with
> functions of both groups.

Yup.


> That's all I know about the difference between Perl's read() and
> sysread() functions. What I would still like to know is:
>
> * When should I use read() over sysread() (or sysread() over
> read())?

Personally, I'd never use read.

The difference between the two sets is that read, print, etc. all
buffer their IO. What this means is that when you say read(...), Perl
actually reads rather more than you ask for, and returns the rest on
later read calls. This means fewer low-level calls into
the operating system, which makes things more efficient. Similarly,
when you print something, it actually only goes into a buffer. The
whole buffer is then written out in one go when it reaches a certain size
(or when you print a newline if the output is line-buffered). Output
buffering can be turned off by setting $| to a true value, which makes
the currently selected output handle flush after every print.
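
A rough way to watch the output buffer at work is to check a file's size on disk before and after turning on autoflush. This sketch assumes a scratch directory from File::Temp and a hypothetical file name out.txt. IO::Handle's autoflush(1) is used as the per-handle way of setting $| for that handle.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempdir);

# Scratch location (assumption for the demo).
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/out.txt";

open my $fh, '>', $path or die "open: $!";

# These five bytes go into Perl's buffer, not yet to the file.
print {$fh} 'hello';
my $size_buffered = (-s $path) || 0;

# Turn on autoflush for this handle, then print again:
# everything buffered so far is written out along with the new data.
$fh->autoflush(1);
print {$fh} ' world';
my $size_autoflush = (-s $path) || 0;

print "size while buffered:  $size_buffered\n";
print "size after autoflush: $size_autoflush\n";
close $fh or die "close: $!";
```

Note that $| only affects output. It does nothing about the read-ahead that read() and <FH> do on input.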

The only time to use sys* is when using select. select waits for data
to be ready on a filehandle, and sysread then lets you read whatever data
is there without waiting for more. Obviously, if there's a great big
buffer between you and the filehandle, this isn't going to work.
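
The select-then-sysread pattern might look like the following minimal sketch. It assumes a Unix-like system, where select works on pipes, and uses a pipe within one process purely to keep the example self-contained. IO::Select is the object wrapper around four-argument select.

```perl
use strict;
use warnings;
use IO::Select;

pipe(my $reader, my $writer) or die "pipe: $!";

# Put three bytes into the pipe with an unbuffered write.
defined(syswrite($writer, 'abc')) or die "syswrite: $!";

my $sel = IO::Select->new($reader);

# Wait up to one second for the read end to become readable.
my @ready = $sel->can_read(1);
die "nothing readable" unless @ready;

# Ask for up to 4096 bytes; sysread hands back what is available
# right now instead of blocking until 4096 bytes have arrived.
my $n = sysread($ready[0], my $buf, 4096);
die "sysread: $!" unless defined $n;

print "got $n bytes: $buf\n";
```

Had the three bytes been pulled into a buffer by an earlier read() or <FH>, select would report nothing waiting on the handle even though data was still available to the program.
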
> * What differences in program execution can I expect if I switch my
> read() statements to sysread() (or vice-versa)?

Buffered IO is always more efficient when you can use it.
> * If I were to open a socket over the internet using IO::Socket,
> would it be best to use the read() group of functions or the sysread()
> group of functions?

The only important thing is not to mix them. If you are using select
(or IO::Select), you *must* use the sys* functions. If you are waiting
for a response from the other end, you'd be better off using the sys*
functions, as otherwise you may find that you're waiting for a response
to a request that's still sitting in your buffer (though this can be dealt
with using $|). Otherwise, you're probably best off using buffered IO
for efficiency.
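
The "request still sitting in your buffer" situation can be reproduced without a network. This sketch assumes a Unix-like system and uses the built-in socketpair() rather than IO::Socket, since handles made by the built-in do not flush automatically (IO::Socket's documentation says its objects have autoflush turned on by default in current versions). The names $client and $server and the PING text are made up for the demo.

```perl
use strict;
use warnings;
use Socket qw(AF_UNIX SOCK_STREAM PF_UNSPEC);
use IO::Handle;
use IO::Select;

socketpair(my $client, my $server, AF_UNIX, SOCK_STREAM, PF_UNSPEC)
    or die "socketpair: $!";

my $sel = IO::Select->new($server);

# Buffered print: the request stays in $client's output buffer.
print {$client} "PING\n";
my @before = $sel->can_read(0.2);   # peer has nothing to read yet

# Push the buffer out to the socket.
$client->flush or die "flush: $!";
my @after = $sel->can_read(1);      # now the peer sees the request

my $n = sysread($server, my $buf, 1024);
die "sysread: $!" unless defined $n;

print "readable before flush: ", scalar(@before), "\n";
print "readable after flush:  ", scalar(@after), "\n";
print "request received: $buf";
```

A program that printed the request and then immediately waited for a reply, without the flush, would wait on a peer that never saw the request.
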
> * Since Perl's read() function uses the system's fread() call and
> Perl's sysread() function uses the system's read() call, what does
> that mean to me if I'm using those functions on a non-Unix system,
> like Win32 using ActiveState Perl? I would imagine that, in that
> particular circumstance, there would be no difference between Perl's
> read() and sysread() functions, but the mix-up I mentioned above about
> using a print statement with a sysread() function was done on a Perl
> program running on a Windows XP machine, so something different must
> be happening under the hood even on Win32 operating systems.

Any system that supports ANSI C (read: any system perl builds on)
supports fread(3). With 5.8, in fact, the buffering fread(3) does is
re-implemented inside perl, as this gives both more flexibility and a
measure of protection from certain OSes' broken stdio libraries.

'Most any OS will also support either read(2) or some equivalent. Most
support read(2) directly: Win32 does, though it also has its own set
of functions, in the classic Microsoft fashion of not doing a thing
well once when you can do it badly five times.
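
On 5.8 and later, one way to see which I/O layers sit under a handle is PerlIO::get_layers. A small sketch, with the caveat that the layer names printed depend on the platform and on how perl was built:

```perl
use strict;
use warnings;

# List the PerlIO layers stacked on STDIN for this perl.
my @layers = PerlIO::get_layers(\*STDIN);

print "layers on STDIN: @layers\n";
```

The buffered functions (read, print, <FH>) go through whatever layers are listed; sysread and syswrite go underneath them to the file descriptor.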

Ben
 
Walter Roberson

:[email protected] (J. Romano) wrote:
:> I was wondering if anyone could tell me the differences between
:> Perl's read() function and its sysread() function.

:The difference between the two sets is that read, print, etc. all
:buffer their IO.

That's certainly an important difference. There are sometimes other
differences as well.


:The only time to use sys* is when using select.

That's not the *only* time. Sometimes, some of the functionality
available via a system's fcntl() call is only available when you
use the sys* calls.
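
One plausible instance of this (an assumption on my part, not necessarily the case meant above) is non-blocking I/O. In this sketch, for a Unix-like system, fcntl() sets O_NONBLOCK on the read end of a pipe, and sysread() then reports "nothing there yet" through $! instead of waiting.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

pipe(my $reader, my $writer) or die "pipe: $!";

# Fetch the current flags and add O_NONBLOCK to them.
my $flags = fcntl($reader, F_GETFL, 0) or die "F_GETFL: $!";
fcntl($reader, F_SETFL, $flags | O_NONBLOCK) or die "F_SETFL: $!";

# Nothing has been written yet, so a blocking read would hang here.
# With O_NONBLOCK, sysread returns undef and sets $! instead.
my $n = sysread($reader, my $buf, 1024);
my $would_block = !defined($n) && ($!{EAGAIN} || $!{EWOULDBLOCK});

print $would_block ? "no data yet, not blocking\n" : "unexpected result\n";
```

With sysread the program sees each "would block" result directly and can decide what to do next, which the buffered functions do not expose as cleanly.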


:Buffered IO is always more efficient when you can use it.

Not completely correct. In fact, not at all correct if you are
into low-level I/O wizardry. Buffered I/O -always- copies the data,
and there are situations where that data copy just isn't fast enough
(e.g., for streaming video.) If you want top efficiency, you have
to use the sysread() functions... or more likely, you have to drop
into XS and do some magic down there for proper buffer alignment.

If you want efficiency, you want *unbuffered* reads, and lots of
good reference manuals on hand... and you probably want DMA, and
direct I/O, and scatter-gather, and you want filesystems that
support real-time I/O and guaranteed bandwidth (e.g., xfs) and
you want ....
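
Staying within plain Perl, an unbuffered read loop with a large request size might look like this sketch. It assumes a 300,000-byte scratch file and a 1 MiB request size, both chosen arbitrarily for the demo, and makes no claim about measured speed. The one detail that matters is that sysread may return fewer bytes than requested, so the loop runs until it returns 0.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file to read back (assumption for the demo).
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} 'x' x 300_000;
close $out or die "close: $!";

open my $in, '<:raw', $path or die "open: $!";

my ($total, $calls) = (0, 0);
while (1) {
    # Request up to 1 MiB straight from the OS, with no Perl-level buffer.
    my $n = sysread($in, my $chunk, 1 << 20);
    die "sysread: $!" unless defined $n;
    last if $n == 0;                  # 0 means end of file
    $total += $n;
    $calls++;
}
close $in;

print "read $total bytes in $calls sysread call(s)\n";
```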
 
