Circular buffering using substr()?

arz

Hi,

Can I use substr() to do circular buffering? I'm reading a binary data
stream from a pipe, which I need to send out on another pipe, but
since speeds may differ, I need to do some intermediate buffering (up
to a maximum amount).

I have something like the following (simplified):

my $buffer = "";
my $offset = 0;

while (my $cb = read(INPIPE, my $data, 32768)) {
    $buffer .= $data;
    $bytes = syswrite(OUTPIPE, $buffer, $offset, length($buffer) - $offset); # OUTPIPE is non-blocking
    $offset += $bytes;
    $buffer .= substr($buffer, $offset); # move buffer pointer to offset
}

It works as intended, but the script is eating memory... is $buffer
internally actually growing and growing because it does not go out of
scope and the substr() does not 'reset' it when the new string is
assigned to $buffer? What would be the best way to do this?

Thanks,

--arz
 
John W. Krahn

arz said:
Can I use substr() to do circular buffering?

Probably. It may be easier to use an array. But your example isn't
clear on what you mean by "circular buffering".

I'm reading a binary data
stream from a pipe, which I need to send out on another pipe, but
since speeds may differ, I need to do some intermediate buffering (up
to a maximum amount).

I have something like the following (simplified):

my $buffer = "";
my $offset = 0;

while (my $cb = read(INPIPE, my $data, 32768)) {
    $buffer .= $data;
    $bytes = syswrite(OUTPIPE, $buffer, $offset, length($buffer) - $offset); # OUTPIPE is non-blocking

perldoc -f syswrite
syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET
syswrite FILEHANDLE,SCALAR,LENGTH
syswrite FILEHANDLE,SCALAR
Attempts to write LENGTH bytes of data from variable
SCALAR to the specified FILEHANDLE, using the system call
write(2). If LENGTH is not specified, writes whole
SCALAR. It bypasses buffered IO, so mixing this with
reads (other than sysread()), "print", "write", "seek",
"tell", or "eof" may cause confusion because the perlio
and stdio layers usually buffers data. Returns the
number of bytes actually written, or "undef" if there was
an error (in this case the errno variable $! is also
set). If the LENGTH is greater than the available data
in the SCALAR after the OFFSET, only as much data as is
available will be written.

You have the offset and length arguments in the wrong order.
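
A minimal sketch of the documented argument order, for illustration only. It assumes a local pipe standing in for OUTPIPE; $reader, $writer and the sample data are made up for the demo.

```perl
use strict;
use warnings;

# Hypothetical stand-in for OUTPIPE: a local pipe we can read back from.
pipe(my $reader, my $writer) or die "pipe failed: $!";

my $buffer = "0123456789";
my $offset = 4;    # pretend the first 4 bytes were already sent

# Argument order is FILEHANDLE, SCALAR, LENGTH, OFFSET.
my $bytes = syswrite($writer, $buffer, length($buffer) - $offset, $offset);
die "syswrite failed: $!" unless defined $bytes;
close $writer;

my $got = '';
my $n = sysread($reader, $got, 1024);
die "sysread failed: $!" unless defined $n;

print "wrote $bytes bytes: $got\n";    # wrote 6 bytes: 456789
```

With the arguments swapped, as in the quoted code, $offset would be taken as the length and the remaining byte count as the offset.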

    $offset += $bytes;
    $buffer .= substr($buffer, $offset); # move buffer pointer to offset

You are not moving the "buffer pointer", you are just appending data on
to the end of the buffer, data that was already in the buffer.

}

It works as intended, but the script is eating memory... is $buffer
internally actually growing and growing because it does not go out of
scope and the substr() does not 'reset' it when the new string is
assigned to $buffer? What would be the best way to do this?

Could you try and explain more clearly how you want these buffers to work?



John
 
Martijn Lievaart

Hi,

Can I use substr() to do circular buffering? I'm reading a binary data
stream from a pipe, which I need to send out on another pipe, but since
speeds may differ, I need to do some intermediate buffering (up to a
maximum amount).

I have something like the following (simplified):

my $buffer = "";
my $offset = 0;

while (my $cb = read(INPIPE, my $data, 32768)) {
    $buffer .= $data;
    $bytes = syswrite(OUTPIPE, $buffer, $offset, length($buffer) - $offset); # OUTPIPE is non-blocking
    $offset += $bytes;
    $buffer .= substr($buffer, $offset); # move buffer pointer to offset
}

It works as intended, but the script is eating memory... is $buffer
internally actually growing and growing because it does not go out of
scope and the substr() does not 'reset' it when the new string is
assigned to $buffer? What would be the best way to do this?

You are only growing $buffer (hint, you never assign to $buffer, you only
append), so eating memory is the expected result from your program.

It is also unclear to me what you want to accomplish. I smell an XY
problem here.

M4
 
arz

You are only growing $buffer (hint, you never assign to $buffer, you only
append), so eating memory is the expected result from your program.

Ah I'm sorry, I was typing the code from memory, just outlining the
idea. The statement
$buffer .= substr($buffer, $offset) should of course be an
assignment:
$buffer = substr($buffer, $offset).
That was a typo. Same for the syswrite(), which I do use correctly in
the code.

The 'circular' lies in the fact that I assign $buffer to a substring
of itself, thereby 'moving' the buffer pointer, while at the same time
appending data to the buffer on every read... I agree it's not really
circular, but the idea is to have a buffer with moving read and write
pointers.

What I want to accomplish is saving the data that I read in a buffer
(max. say 4 MB, if it goes beyond 4MB I can discard the data, it's
just that I need to take some jitter into account with reader/writer
having different speeds). Data is coming in over one pipe, and needs
to be sent out over another pipe. I currently use the $buffer variable
to add data to ($buffer .= $data), and I move the buffer pointer with
($buffer = substr($buffer, $offset)).

Is that a good idea or would you suggest a different way?

Thanks,

--arz
 
RocketMan

arz said:
Hi,

Can I use substr() to do circular buffering? I'm reading a binary data
stream from a pipe, which I need to send out on another pipe, but
since speeds may differ, I need to do some intermediate buffering (up
to a maximum amount).

I have something like the following (simplified):

my $buffer = "";
my $offset = 0;

while (my $cb = read(INPIPE, my $data, 32768)) {
    $buffer .= $data;
    $bytes = syswrite(OUTPIPE, $buffer, $offset, length($buffer) - $offset); # OUTPIPE is non-blocking
    $offset += $bytes;
    $buffer .= substr($buffer, $offset); # move buffer pointer to offset
}

It works as intended, but the script is eating memory... is $buffer
internally actually growing and growing because it does not go out of
scope and the substr() does not 'reset' it when the new string is
assigned to $buffer? What would be the best way to do this?

Thanks,

--arz

The operating system provides a FIFO for just this: a named pipe, which Perl can open like a file.
 
xhoster

arz said:
Ah I'm sorry, I was typing the code from memory,

Yeah, don't do that. Since we have no idea what your actual code is,
we can have no idea what the problem is.

just outlining the
idea. The statement
$buffer .= substr($buffer, $offset) should of course be an
assignment:
$buffer = substr($buffer, $offset).

In my hands, when correctly incorporated into a loop, this does not leak
memory.

That was a typo. Same for the syswrite(), which I do use correctly in
the code.

The 'circular' lies in the fact that I assign $buffer to a substring
of itself, thereby 'moving' the buffer pointer, while at the same time
appending data to the buffer on every read...

You do not move the buffer pointer, you copy a certain part of the buffer
to someplace else. Perl does not look deeply enough to realize that the
left-hand side and the right-hand side of the assignment both involve the
same variable in a way that allows it to be optimized.

To do the Perl equivalent of moving the pointer without moving the data
itself, you need to make what you want more explicit:

substr($buffer, 0, $offset, "");

At some point, this copies/moves the whole buffer to avoid leaking memory,
but it only does so occasionally and so is much faster (when $buffer is
big) than the assignment method, which does it every time.
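
To make the difference concrete, here is a small sketch comparing the two forms on a made-up string. Both leave the same contents behind; the sample data and the $copy and $removed variables are only for illustration.

```perl
use strict;
use warnings;

my $buffer = "HEADERpayload";
my $offset = 6;

# Assignment form: builds a new string from the tail and copies it back.
my $copy = $buffer;
$copy = substr($copy, $offset);

# Four-argument form: replaces the first $offset bytes with '' in place
# and returns the bytes it removed.
my $removed = substr($buffer, 0, $offset, '');

print "removed=$removed buffer=$buffer same=", ($buffer eq $copy ? 1 : 0), "\n";
# removed=HEADER buffer=payload same=1
```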

As for your "original" code, offset should not be used in the syswrite.
The whole point of the substr thing is to throw away the data once written,
so the data yet to be written is always the first stuff in the string,
needing no offset (or an offset of zero).
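
Put together, one pass could look roughly like the sketch below. The helper names enqueue and drain are hypothetical, the 4 MB cap is the figure arz mentioned, and dropping the oldest bytes when over the cap is an assumed policy (the thread does not say which data to discard). A local pipe stands in for the real output handle.

```perl
use strict;
use warnings;

my $MAX_BUFFER = 4 * 1024 * 1024;    # the 4 MB cap mentioned in the thread

# Hypothetical helper: queue new data, dropping the oldest bytes over the cap
# (assumed policy; adjust if newer data should be discarded instead).
sub enqueue {
    my ($buf_ref, $data) = @_;
    $$buf_ref .= $data;
    my $excess = length($$buf_ref) - $MAX_BUFFER;
    substr($$buf_ref, 0, $excess, '') if $excess > 0;
    return;
}

# Hypothetical helper: write from the front of the buffer (no offset),
# then discard whatever the handle accepted.
sub drain {
    my ($fh, $buf_ref) = @_;
    return 0 unless length $$buf_ref;
    my $bytes = syswrite($fh, $$buf_ref);
    # undef covers EAGAIN on a non-blocking handle as well as real errors;
    # production code should inspect $! to tell them apart.
    return 0 unless defined $bytes;
    substr($$buf_ref, 0, $bytes, '');
    return $bytes;
}

pipe(my $reader, my $writer) or die "pipe failed: $!";
my $buffer = '';
enqueue(\$buffer, 'hello ');
enqueue(\$buffer, 'world');
my $sent = drain($writer, \$buffer);
print "sent $sent, ", length($buffer), " left\n";    # sent 11, 0 left
```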

Xho

 
arz

To do the Perl equivalent of moving the pointer without moving the data
itself, you need to make what you want more explicit:

substr($buffer, 0, $offset, "");

At some point, this copies/moves the whole buffer to avoid leaking memory,
but it only does so occasionally and so is much faster (when $buffer is
big) than the assignment method, which does it every time.

Thanks, this is good to know, that's the use of substr() that I was
looking for. Looking further, it appears there's indeed no memory
problem here, it must be somewhere else. Thanks, now I can be
confident at least this is not the issue.

--arz
 
Uri Guttman

PV> Maybe it would be better to use an array. When you get data, you
PV> push @buffer, $data;
PV> and when you want to send data, you
PV> print OUT shift @buffer;
PV> or you can send the whole buffer with
PV> print OUT @buffer;
PV> or, more clearly,
PV> print OUT join('', @buffer);

PV> This is a true circular buffer with unlimited size ;-)

and a string isn't unlimited in size?

your design has a major flaw in it. what if the write of the first array
element is only partial (remember, it is non-blocking I/O)? then you
have to truncate that part to remove the section that was written. this
means you have to implement the same substr logic as he has for that
element. the opposite is true too. what if an element is fully written,
do you loop and write the next element (and so on)? this means even more
logic. so your array idea has to have at least the same code as the
substr idea plus much more to handle the array elements. all for no
actual gain over a single buffer string.
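
for illustration, a rough sketch of what the array version ends up needing. drain_array is a hypothetical name and a local pipe stands in for the real handle; note that the substr trimming is still there, plus a loop over the elements.

```perl
use strict;
use warnings;

# Hypothetical array-based drain: writes queued chunks in order.
sub drain_array {
    my ($fh, $chunks) = @_;
    my $total = 0;
    while (@$chunks) {
        my $bytes = syswrite($fh, $chunks->[0]);
        last unless defined $bytes;          # would block, or an error
        $total += $bytes;
        if ($bytes < length $chunks->[0]) {
            # partial write: same trimming logic as the single-string buffer
            substr($chunks->[0], 0, $bytes, '');
            last;
        }
        shift @$chunks;                      # fully written: go to next element
    }
    return $total;
}

pipe(my $reader, my $writer) or die "pipe failed: $!";
my @queue = ('ab', 'cd', 'ef');
my $written = drain_array($writer, \@queue);
print "wrote $written bytes, ", scalar(@queue), " chunks left\n";
# wrote 6 bytes, 0 chunks left
```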

uri
 
Uri Guttman

PV> OK, you can accumulate non-blocking I/O into a string variable until you
PV> get "\n" or another marker meaning the data is complete, and push this
PV> string when completed.

that still won't work unless your protocol is line oriented. and with
differing stream speeds (the OP's real problem) that will not likely be
workable.

PV> No, push increases the array size (adds a new element at the end) and shift
PV> decreases the array size (removes the first element).

no, i know what you mean. please don't tell me about how to do
buffering. the array is MORE complex and slower than a single string
buffer. period. as i said you need all the string buffer logic and also
the array buffer logic.

you don't understand non-blocking stream I/O. it is not line oriented
and you can't just push/pop single bits of i/o at will. you have to
maintain your buffers with finer granularity than lines.

uri
 
Uri Guttman

PV> Yes, arrays can be a problem in time-critical applications.
PV> I understand what non-blocking stream I/O is, but I have never worked with
PV> bit streams, only with byte streams. In other words, I never read bits
PV> from a single wire but always bytes from PIO or SIO chips, a FIFO buffer,
PV> or an 8-bit port on a measuring card in asynchronous mode.
PV> In these cases the end of data was defined as 0x00 or 0x0d or some
PV> other byte, and I could accumulate bytes in a string buffer until I
PV> got the end-of-data byte. At that moment I pushed the string buffer onto
PV> the array and emptied the string for the next receive.

in the tcp world there are no bits, only bytes. async on a serial line
is not the same as async (non-blocking) on a socket. don't confuse them.

and you don't need end of data chars in a protocol. length and data work
fine and many protocols do that or use fixed sized records.
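
a minimal sketch of the length-and-data idea on a single string buffer. the 4-byte big-endian length prefix and the names frame and extract_records are assumptions made for the example, not any particular protocol.

```perl
use strict;
use warnings;

# Hypothetical framing: 4-byte big-endian length, then that many bytes of data.
sub frame {
    my ($payload) = @_;
    return pack('N', length $payload) . $payload;
}

# Pull every complete record off the front of the buffer and leave any
# partial record in place for the next read.
sub extract_records {
    my ($buf_ref) = @_;
    my @records;
    while (length($$buf_ref) >= 4) {
        my $len = unpack('N', $$buf_ref);
        last if length($$buf_ref) < 4 + $len;    # record not complete yet
        substr($$buf_ref, 0, 4, '');             # drop the length prefix
        push @records, substr($$buf_ref, 0, $len, '');
    }
    return @records;
}

my $buffer  = frame('abc') . frame('hello');
my $partial = substr($buffer, -2, 2, '');    # simulate a read that stopped early
my @first   = extract_records(\$buffer);     # only 'abc' is complete
$buffer .= $partial;                         # the rest arrives later
my @second  = extract_records(\$buffer);     # now 'hello' is complete
print "@first | @second\n";                  # abc | hello
```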

you don't seem to have grasped why arrays are poor for this. they offer no
benefits other than record boundaries which may not be needed. a single
string used as a rotating buffer works very well and is trivial
code. here is a simple example from Stem::AsyncIO (some code is trimmed
from this):

this sub is called when the socket in its object is writeable (there is
room in the socket output buffer). note that it shuts down the write
event if there is no more data to write

sub writeable {

    my( $self ) = @_ ;

    return if $self->{'shut_down'} ;

    my $buf_ref = \$self->{'write_buf'} ;
    my $buf_len = length $$buf_ref ;

    unless ( $buf_len ) {

        $self->{'write_event'}->stop() ;
        return ;
    }

    my $bytes_written = syswrite( $self->{'write_fh'}, $$buf_ref ) ;

    unless( defined( $bytes_written ) ) {

        return ;
    }

    # remove the part of the buffer that was written

    substr( $$buf_ref, 0, $bytes_written, '' ) ;

    return if length( $$buf_ref ) ;

    $self->write_shut_down() if $self->{'shut_down_when_empty'} ;
}

that is a rotating string buffer (or a ring buffer) with one call to
syswrite and one call to substr. the rest is error or condition
checking.

simple.

uri
 
