state of blocking/nonblocking I/O

Discussion in 'Ruby' started by Joshua Haberman, Oct 2, 2005.

  1. Here is my understanding about the current state of I/O in Ruby.
    Please correct me where I am mistaken.

    - by default, ruby i/o operations block, but only block the calling
    Ruby thread. Ruby does this by scheduling a thread out if the fd is
    not read-ready/write-ready. If there is more than one Ruby thread,
    Ruby won't do a read(2) or write(2) on an fd unless select() says it
    is ready, to prevent blocking the entire process.

    - the one flaw with this scheme is that write(2) can block even if an
    fd is write-ready, if you try to write too much data. This will
    cause such a write to lock the entire process and all Ruby threads
    therein ([0] is a simple test program that displays the problem).

    - You can try setting O_NONBLOCK on your IO objects with fcntl. That
    will help you in the case where you only have one Ruby thread -- now
    read and write will raise Errno::EAGAIN if the fd isn't ready. But
    in the case where there is more than one Ruby thread, this won't work
    because Ruby won't perform the read(2) or write(2) until the fd is
    ready. So even though you have O_NONBLOCK set, you block your Ruby
    thread. (See [1] for an example.)

    Is this right? What is the current state of support for nonblocking
    I/O in Ruby?

    One other question: are the buffered fread()/fwrite() functions
    guaranteed to work correctly if O_NONBLOCK is set on the underlying
    descriptor? I have not been able to find a good answer to this.

    Josh

    Example [0]:

    thread = Thread.new {
      while true
        puts "Background thread running..."
        sleep 1
      end
    }

    # Give the background thread a few chances to show that it's running
    sleep 2

    (read_pipe, write_pipe) = IO::pipe

    # this will stall the entire process, including the background thread.
    # change the length to 4096 and everything is fine.
    write_pipe.write(" " * 4097)

    thread.join


    Example [1]:

    require 'fcntl'

    thread = Thread.new {
      while true
        puts "Background thread running..."
        sleep 1
      end
    }

    (read_pipe, write_pipe) = IO::pipe
    read_pipe.fcntl(Fcntl::F_SETFL,
                    read_pipe.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)

    # this will block our thread, even though the fd is set to nonblocking.
    # however, if you eliminate the background thread, this call will
    # give you EAGAIN, which is what you want.
    read_pipe.read

    # we will never get here
    puts "Finished read!"
     
    Joshua Haberman, Oct 2, 2005
    #1

  2. Joshua Haberman

    Tanaka Akira Guest

    In article <>,
    Joshua Haberman <> writes:

    > - by default, ruby i/o operations block, but only block the calling
    > Ruby thread. Ruby does this by scheduling a thread out if the fd is
    > not read-ready/write-ready. If there is more than one Ruby thread,
    > Ruby won't do a read(2) or write(2) on an fd unless select() says it
    > is ready, to prevent blocking the entire process.


    Right.

    > - the one flaw with this scheme is that write(2) can block even if an
    > fd is write-ready, if you try to write too much data. This will
    > cause such a write to lock the entire process and all Ruby threads
    > therein ([0] is a simple test program that displays the problem).


    Right.

    > - You can try setting O_NONBLOCK on your IO objects with fcntl. That
    > will help you in the case where you only have one Ruby thread -- now
    > read and write will raise Errno::EAGAIN if the fd isn't ready.


    No.

    IO#write doesn't raise Errno::EAGAIN; it retries until all data is written.

    IO#read also retries since Ruby 1.9.

    So IO#write and IO#read may block the calling thread.

    > But
    > in the case where there is more than one Ruby thread, this won't work
    > because Ruby won't perform the read(2) or write(2) until the fd is
    > ready. So even though you have O_NONBLOCK set, you block your Ruby
    > thread. (See [1] for an example]).


    Right.

    > One other question: are the buffered fread()/fwrite() functions
    > guaranteed to work correctly if O_NONBLOCK is set on the underlying
    > descriptor? I have not been able to find a good answer to this.


    fwrite(3) may lose data.

    So Ruby 1.8 may lose data.

    % ruby-1.8.3 -v
    ruby 1.8.3 (2005-09-21) [i686-linux]
    % ruby-1.8.3 -rfcntl -e '
    w = STDOUT
    w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
    w << "a" * 4096
    w.flush
    w << "b"
    w.flush
    ' | ruby -e 'sleep 1; p STDIN.read.length'
    4096

    However no data is lost if IO#sync = true since Ruby 1.8.2.
    It's because stdio is bypassed.

    % ruby-1.8.3 -rfcntl -e '
    w = STDOUT
    w.sync = true
    w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
    w << "a" * 4096
    w.flush
    w << "b"
    w.flush
    ' | ruby -e 'sleep 1; p STDIN.read.length'
    4097

    Ruby 1.9 doesn't have the problem because it has its own
    buffering mechanism.

    > # this will block our thread, even though the fd is set to nonblocking.
    > # however, if you eliminate the background thread, this call with
    > give you EAGAIN,
    > # which is what you want.
    > read_pipe.read


    If you want to test whether some data is available, use IO.select.
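
    A minimal sketch of that readiness test, assuming a pipe as the
    descriptor. The helper name readable_now? is hypothetical; the only
    library call is IO.select, polled with a timeout of 0 so that it
    returns immediately. It says whether a read can start, not how much
    data is waiting.

```ruby
# Returns true if a read on io could start right now without blocking
# (some data, or end of file, is pending). A timeout of 0 makes IO.select
# poll instead of wait; it returns nil when nothing is ready.
def readable_now?(io)
  ready, = IO.select([io], nil, nil, 0)
  !ready.nil?
end

reader, writer = IO.pipe
puts readable_now?(reader)   # nothing has been written yet
writer.write("hello")
puts readable_now?(reader)   # data is now waiting in the pipe
```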
    --
    Tanaka Akira
     
    Tanaka Akira, Oct 3, 2005
    #2

  3. Tanaka,

    Thanks for your helpful answers!

    On Oct 2, 2005, at 5:11 PM, Tanaka Akira wrote:

    > In article <>,
    > Joshua Haberman <> writes:
    >> - You can try setting O_NONBLOCK on your IO objects with fcntl. That
    >> will help you in the case where you only have one Ruby thread -- now
    >> read and write will raise Errno::EAGAIN if the fd isn't ready.
    >>

    >
    > No.
    >
    > IO#write doesn't raise Errno::EAGAIN but retry until all data is
    > written.
    >
    > IO#read also retry since Ruby 1.9.
    >
    > So IO#write and IO#read may block calling thread.


    Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
    have to write IO#nonblock_read and IO#nonblock_write, that do not
    have this retry behavior?

    >> One other question: are the buffered fread()/fwrite() functions
    >> guaranteed to work correctly if O_NONBLOCK is set on the underlying
    >> descriptor? I have not been able to find a good answer to this.
    >>

    >
    > fwrite(3) may lost data.
    >
    > So Ruby 1.8 may lost data.
    >
    > % ruby-1.8.3 -v
    > ruby 1.8.3 (2005-09-21) [i686-linux]
    > % ruby-1.8.3 -rfcntl -e '
    > w = STDOUT
    > w.fcntl(Fcntl::F_SETFL, w.fcntl(Fcntl::F_GETFL) | Fcntl::O_NONBLOCK)
    > w << "a" * 4096
    > w.flush
    > w << "b"
    > w.flush
    > ' | ruby -e 'sleep 1; p STDIN.read.length'
    > 4096


    Ooh, that's bad. What's the explanation for that?

    >> # this will block our thread, even though the fd is set to
    >> nonblocking.
    >> # however, if you eliminate the background thread, this call with
    >> give you EAGAIN,
    >> # which is what you want.
    >> read_pipe.read
    >>

    >
    > If you want to test some data available, use IO.select.


    Yes, but IO.select can't tell me how *much* data I can read or
    write. IO#read and IO#write can still block if I try to read or
    write too much data, which is what I want to avoid.

    Thanks,
    Josh

     
    Joshua Haberman, Oct 3, 2005
    #3
  4. Joshua Haberman

    Tanaka Akira Guest

    In article <>,
    Joshua Haberman <> writes:

    > Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
    > have to write IO#nonblock_read and IO#nonblock_write, that do not
    > have this retry behavior?


    IO#sysread and IO#syswrite are possible candidates.
    However, they may block when multithreaded because of select.
    Also, they cannot be combined with buffering methods.

    Nonblocking methods such as IO#nonblock_read and
    IO#nonblock_write are a good idea. If matz accepts it, I'll
    definitely implement them. However, I'm not sure that matz
    thinks the method names are good enough.

    > Ooh, that's bad. What's the explanation for that?


    R. Stevens says

    using standard I/O with nonblocking descriptors,
    a recipe for disaster

    UNIX Network Programming Vol1, p.399

    For more information, read the source of fflush in stdio.
    Version 7, 4.4BSD and glibc have the problem as far as I
    know. I feel it's portable behavior.

    > Yes, but IO.select can't tell me how *much* data I can read or
    > write. IO#read and IO#write can still block if I try to read or
    > write too much data, which is what I want to avoid.


    IO#readpartial is available since Ruby 1.8.3.
    It doesn't block if some data is available.

    For writing, I think IO#syswrite is required.
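
    A small sketch of the readpartial behaviour described above, assuming
    a pipe. The wrapper name read_available is hypothetical; it only
    calls IO#readpartial and turns EOFError into nil.

```ruby
# Returns whatever is available right now (up to max bytes), or nil at
# end of file. It blocks only when no data at all is available yet.
def read_available(io, max = 4096)
  io.readpartial(max)
rescue EOFError
  nil
end

reader, writer = IO.pipe
writer.write("abc")
p read_available(reader)   # asks for up to 4096 bytes, gets what is there
writer.close
p read_available(reader)   # the writer is closed, so this is end of file
```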
    --
    Tanaka Akira
     
    Tanaka Akira, Oct 3, 2005
    #4
  5. Joshua Haberman

    Ara.T.Howard Guest

    On Mon, 3 Oct 2005, Tanaka Akira wrote:

    > In article <>,
    > Joshua Haberman <> writes:
    >
    >> Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
    >> have to write IO#nonblock_read and IO#nonblock_write, that do not
    >> have this retry behavior?

    >
    > IO#sysread and IO#syswrite is possible candidates.
    > However they may block when multithreaded because select.
    > Also they cannot be combined with buffering methods.
    >
    > Nonblocking methods such as IO#nonblock_read and
    > IO#nonblock_write is good idea. If matz accept it, I'll
    > implement them definitely. However I'm not sure that matz
    > think the method names are good enough.


    thanks so much for doing this work!

    suggestions:

    IO#nb_read
    IO#nb_write

    or objectify:

    nbio = NBIO::new an_io

    nbio.read 42 #=> will not block
    nbio.write 42 #=> will not block

    etc.

    this would be a great addition - a good name must be found! ;-)

    -a
    --
    ===============================================================================
    | email :: ara [dot] t [dot] howard [at] noaa [dot] gov
    | phone :: 303.497.6469
    | Your life dwells amoung the causes of death
    | Like a lamp standing in a strong breeze. --Nagarjuna
    ===============================================================================
     
    Ara.T.Howard, Oct 3, 2005
    #5
  6. On Oct 2, 2005, at 7:07 PM, Tanaka Akira wrote:

    > In article <>,
    > Joshua Haberman <> writes:
    >
    >
    >> Hrm, so I guess that if I want to do real nonblocking I/O in Ruby, I
    >> have to write IO#nonblock_read and IO#nonblock_write, that do not
    >> have this retry behavior?
    >>

    >
    > IO#sysread and IO#syswrite is possible candidates.
    > However they may block when multithreaded because select.


    It seems that Ruby should keep track of whether a descriptor has
    O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
    On the other hand, that will break if O_NONBLOCK is set by a C
    extension, or by another process that has the same ofile open. Sigh.

    > Nonblocking methods such as IO#nonblock_read and
    > IO#nonblock_write is good idea. If matz accept it, I'll
    > implement them definitely. However I'm not sure that matz
    > think the method names are good enough.


    Well, I don't know if it will help convince matz, but djb advocates that
    naming scheme as well, for C:

    http://cr.yp.to/unix/nonblock.html

    Now that I think of it, implementing IO#nonblock_read and
    IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
    since it uses standard I/O which is incompatible with O_NONBLOCK. Sigh.

    I guess for now I'll have to use sysread/syswrite, along with a
    home-rolled buffering layer.
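
    A rough sketch of what such a home-rolled write buffer could look
    like. This is not code from the thread: the class name BufferedWriter
    and its methods are hypothetical, and it assumes a later Ruby release
    that provides IO#write_nonblock and IO::WaitWritable rather than the
    1.8.3 discussed here.

```ruby
# Hypothetical helper (not part of Ruby): queues outgoing data and hands
# the descriptor only as much as it accepts without blocking.
class BufferedWriter
  def initialize(io)
    @io = io
    @pending = String.new   # bytes not yet accepted by the kernel
  end

  # Queue data for writing; nothing is written until flush_some is called.
  def <<(data)
    @pending << data.b      # keep the buffer binary so byte offsets are exact
    self
  end

  def pending?
    !@pending.empty?
  end

  # Attempts one nonblocking write. Returns the number of bytes written,
  # or 0 if the descriptor was not ready or nothing was queued.
  def flush_some
    return 0 if @pending.empty?
    written = @io.write_nonblock(@pending)
    @pending = @pending.byteslice(written..-1)
    written
  rescue IO::WaitWritable
    0
  end
end

reader, writer = IO.pipe
out = BufferedWriter.new(writer)
out << "x" * 1_000_000
puts out.flush_some   # only part of the data fits into the pipe
puts out.pending?     # the rest stays queued for a later attempt
```

    An event loop would call flush_some whenever IO.select reports the
    descriptor as writable, and stop watching it once pending? is false.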

    >> Ooh, that's bad. What's the explanation for that?
    >>

    >
    > R. Stevens says
    >
    > using standard I/O with nonblocking descriptors,
    > a recipe for disaster


    I guess that says it all. :)

    Josh
     
    Joshua Haberman, Oct 3, 2005
    #6
  7. Joshua Haberman

    Tanaka Akira Guest

    In article <>,
    Joshua Haberman <> writes:

    > It seems that Ruby should keep track of whether a descriptor has
    > O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
    > On the other hand, that will break if O_NONBLOCK is set by a C
    > extension, or by another process that has the same ofile open. Sigh.


    Yes. The shared fd is a hard problem to solve.

    > Now that I think of it, implementing IO#nonblock_read and
    > IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
    > since it uses standard I/O which is incompatible with O_NONBLOCK. Sigh.


    They are not a problem if IO#sync = true.

    Since streams created by Ruby (IO.pipe, TCPSocket.open, etc.)
    have IO#sync = true by default, the problem does not occur in
    most cases.

    > I guess for now I'll have to use sysread/syswrite, along with a home-
    > rolled buffering layer.


    You need your own buffering layer if O_NONBLOCK is
    used on Ruby 1.8. However, IO#sync = true is enough if
    buffering is not required.
    --
    Tanaka Akira
     
    Tanaka Akira, Oct 3, 2005
    #7
  8. Tanaka Akira <> wrote:
    > In article <>,
    > Joshua Haberman <> writes:
    >
    >> It seems that Ruby should keep track of whether a descriptor has
    >> O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
    >> On the other hand, that will break if O_NONBLOCK is set by a C
    >> extension, or by another process that has the same ofile open. Sigh.

    >
    > Yes. The shared fd is a problem hard to solve.
    >
    >> Now that I think of it, implementing IO#nonblock_read and
    >> IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
    >> since it uses standard I/O which is incompatible with O_NONBLOCK.
    >> Sigh.

    >
    > They are not problem if IO#sync = true.
    >
    > Since streams created by Ruby (IO.pipe, TCPSocket.open, etc)
    > are IO#sync = true by default, the problem is not occur in
    > most cases.
    >
    >> I guess for now I'll have to use sysread/syswrite, along with a home-
    >> rolled buffering layer.

    >
    > You need your buffering layer if O_NONBLOCK is
    > used on ruby 1.8. However IO#sync = true is enough if
    > buffering is not required.


    I have one question on this matter which I still don't understand (I'm not
    so deep into C stdlib IO variants so please bear with me): why would anybody
    want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
    read anything on return even if the stream is not closed) in the light of
    Ruby threads? I mean, with that one would have to build the multiplexing in
    Ruby which is already present in the interpreter with multiple Ruby threads?
    Are there situations that I'm not aware of where this is useful / needed?
    Thanks!

    Kind regards

    robert
     
    Robert Klemme, Oct 3, 2005
    #8
  9. Joshua Haberman

    Tanaka Akira Guest

    In article <>,
    "Robert Klemme" <> writes:

    > I have one question on this matter which I still don't understand (I'm not
    > so deep into C stdlib IO variants so please bear with me): why would anybody
    > want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
    > read anything on return even if the stream is not closed) in the light of
    > Ruby threads? I mean, with that one would have to build the multiplexing in
    > Ruby which is already present in the interpreter with multiple Ruby threads?
    > Are there situations that I'm not aware of where this is useful / needed?


    It is an interesting question that I also have.

    I have asked it several times, so I know some answers.

    1. A GUI framework has its own event-driven framework.

    If a callback blocks, it blocks the entire GUI. That is not
    acceptable.

    2. A high performance network server has its own event-driven
    framework.

    Some high performance network servers use an application
    level event-driven framework. If an event handler blocks,
    it blocks the entire application. That is not acceptable.

    However, I'm not sure that it is appropriate to implement
    a high performance server in Ruby.

    If an application level event-driven framework is used,
    application level nonblocking I/O operations are required.

    If there are other usages, I'd like to know.
    --
    Tanaka Akira
     
    Tanaka Akira, Oct 4, 2005
    #9
  10. On Oct 3, 2005, at 9:21 PM, Tanaka Akira wrote:
    > In article <>,
    > "Robert Klemme" <> writes:
    >
    >
    >> I have one question on this matter which I still don't understand
    >> (I'm not
    >> so deep into C stdlib IO variants so please bear with me): why
    >> would anybody
    >> want to use nonblocking IO (on the Ruby level, e.g. IO#read might
    >> not have
    >> read anything on return even if the stream is not closed) in the
    >> light of
    >> Ruby threads? I mean, with that one would have to build the
    >> multiplexing in
    >> Ruby which is already present in the interpreter with multiple
    >> Ruby threads?
    >> Are there situations that I'm not aware of where this is useful /
    >> needed?
    >>

    >
    > It is an interesting question I also have.
    >
    > I asked it several times, so I know some answers.
    >
    > 1. GUI framework has its own event driven framework.
    >
    > If a callback blocks, it blocks entire GUI. It is not
    > acceptable.
    >
    > 2. High performance network server has its own event driven
    > framework.
    >
    > Some high performance network servers use an application
    > level event driven framework. If an event handler blocks,
    > it blocks entire application. It is not acceptable.
    >
    > However I'm not sure that it is appropriate to implement
    > a high performance server in Ruby.
    >
    > If an application level event driven framework is used,
    > application level nonblocking I/O operations are required.
    >
    > If there are other usages, I'd like to know.


    Nonblocking I/O is useful if you are a server with some kind of
    complex, global state, and lots of clients that can act on that
    state. A good example would be a gaming server. If you handle every
    client in its own thread, you need a big, coarse lock around your
    global state. Once you're doing that, what's the point of
    multithreading? It just makes things more complicated, and your
    program's execution more difficult to understand.

    You might have many IO objects open that are interrelated. Say your
    program logic is something like:

    when there's data available on object A, process it and send the
    results to B and C
    when there's data available on object B, process it and send the
    results to A and C
    when there's data available on object C, process it and send the
    results to A and B

    How should I break this down into threads? Three threads that
    block-on-read for A, B, and C? But what if A and B get data at the same
    time? They might interleave their writes to C. Do I put a mutex
    around C?

    For this case, it's a lot easier and more natural to write a main
    loop like:

    while true
      (read_ready, write_ready, err) = IO.select([A, B, C])
      read_ready.each { |io|
        output = process(io.read)
        [A, B, C].each { |client| client.write(output) unless client == io }
      }
    end

    Nonblocking I/O gives you more control over the execution of your
    program, and frees you from the worries of synchronizing between
    threads. And it's simpler than using threads for programs that
    follow certain patterns.
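
    For anyone who wants to run the idea, here is a hedged variant of one
    pass of that loop. The names relay_once and process are hypothetical,
    process is a stand-in for real application logic, and UNIXSocket
    pairs play the part of the clients. It uses readpartial so a read
    returns whatever is available instead of waiting for end of file; the
    writes are still ordinary blocking writes.

```ruby
require 'socket'

# Placeholder for the application logic (here: upcase the input).
def process(data)
  data.upcase
end

# One pass of the relay loop: wait until some client is readable, read
# what is available, and forward the result to every other client.
# Clients that reached end of file are removed from the list.
# Returns the array of clients that were readable ([] on timeout).
def relay_once(clients, timeout = nil)
  read_ready, = IO.select(clients, nil, nil, timeout)
  return [] if read_ready.nil?
  read_ready.each do |io|
    begin
      output = process(io.readpartial(4096))
    rescue EOFError
      clients.delete(io)
      next
    end
    clients.each { |client| client.write(output) unless client.equal?(io) }
  end
  read_ready
end

a_srv, a_cli = UNIXSocket.pair
b_srv, b_cli = UNIXSocket.pair
c_srv, c_cli = UNIXSocket.pair
a_cli.write("hello")
relay_once([a_srv, b_srv, c_srv], 1)
puts b_cli.readpartial(4096)
puts c_cli.readpartial(4096)
```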

    Josh

     
    Joshua Haberman, Oct 4, 2005
    #10
  11. Joshua Haberman

    David Gurba Guest

    Joshua Haberman wrote:

    > On Oct 3, 2005, at 9:21 PM, Tanaka Akira wrote:
    >
    >> In article <>,
    >> "Robert Klemme" <> writes:
    >>
    >>
    >>> I have one question on this matter which I still don't understand
    >>> (I'm not
    >>> so deep into C stdlib IO variants so please bear with me): why
    >>> would anybody
    >>> want to use nonblocking IO (on the Ruby level, e.g. IO#read might
    >>> not have
    >>> read anything on return even if the stream is not closed) in the
    >>> light of
    >>> Ruby threads? I mean, with that one would have to build the
    >>> multiplexing in
    >>> Ruby which is already present in the interpreter with multiple Ruby
    >>> threads?
    >>> Are there situations that I'm not aware of where this is useful /
    >>> needed?
    >>>

    >>
    >> It is an interesting question I also have.
    >>
    >> I asked it several times, so I know some answers.
    >>
    >> 1. GUI framework has its own event driven framework.
    >>
    >> If a callback blocks, it blocks entire GUI. It is not
    >> acceptable.
    >>
    >> 2. High performance network server has its own event driven
    >> framework.
    >>
    >> Some high performance network servers use an application
    >> level event driven framework. If an event handler blocks,
    >> it blocks entire application. It is not acceptable.
    >>
    >> However I'm not sure that it is appropriate to implement
    >> a high performance server in Ruby.
    >>
    >> If an application level event driven framework is used,
    >> application level nonblocking I/O operations are required.
    >>
    >> If there are other usages, I'd like to know.

    >
    >
    > Nonblocking I/O is useful if you are a server with some kind of
    > complex, global state, and lots of clients that can act on that
    > state. A good example would be a gaming server. If you handle every
    > client in its own thread, you need a big, coarse lock around your
    > global state. Once you're doing that, what's the point of
    > multithreading? It just makes things more complicated, and your
    > program's execution more difficult to understand.
    >
    > You might have many IO objects open that are interrelated. Say your
    > program logic is something like:
    >
    > when there's data available on object A, process it and send the
    > results to B and C
    > when there's data available on object B, process it and send the
    > results to A and C
    > when there's data available on object C, process it and send the
    > results to A and B
    >
    > How should I break this down into threads? Three threads that block-
    > on-read for A, B, and C? But what if A and B get data at the same
    > time? They might interleave their writes to C. Do I put a mutex
    > around C?
    >
    > For this case, it's a lot easier and more natural to write a main
    > loop like:
    >
    > while true
    > (read_ready, write_ready, err) = IO.select([A, B, C])
    > read_ready.each { |io|
    > output = process(io.read)
    > [A, B, C].each { |client| client.write(output) unless client
    > == io }
    > }
    > end
    >
    > Nonblocking I/O gives you more control over the execution of your
    > program, and frees you from the worries of synchronizing between
    > threads. And it's simpler than using threads for programs that
    > follow certain patterns.
    >
    > Josh
    >

    This sounds really interesting, but I don't fully understand the while
    loop. Nonblocking IO sends/receives data when it's ready/requested, i.e.
    it doesn't block for the data, right?

    I have written some threaded applications: a Java tic-tac-toe game which
    had players and observers of a game that all viewed a global 'board'
    state. Methods to modify the game state were thread safe with mutexes,
    so how is what you're saying different? Any info appreciated...

    ooooo my 1st post to the mailing list :)
     
    David Gurba, Oct 4, 2005
    #11
  12. Joshua Haberman wrote:
    > On Oct 3, 2005, at 9:21 PM, Tanaka Akira wrote:
    >> In article <>,
    >> "Robert Klemme" <> writes:
    >>
    >>
    >>> I have one question on this matter which I still don't understand
    >>> (I'm not
    >>> so deep into C stdlib IO variants so please bear with me): why
    >>> would anybody
    >>> want to use nonblocking IO (on the Ruby level, e.g. IO#read might
    >>> not have
    >>> read anything on return even if the stream is not closed) in the
    >>> light of
    >>> Ruby threads? I mean, with that one would have to build the
    >>> multiplexing in
    >>> Ruby which is already present in the interpreter with multiple
    >>> Ruby threads?
    >>> Are there situations that I'm not aware of where this is useful /
    >>> needed?
    >>>

    >>
    >> It is an interesting question I also have.
    >>
    >> I asked it several times, so I know some answers.
    >>
    >> 1. GUI framework has its own event driven framework.
    >>
    >> If a callback blocks, it blocks entire GUI. It is not
    >> acceptable.
    >>
    >> 2. High performance network server has its own event driven
    >> framework.
    >>
    >> Some high performance network servers use an application
    >> level event driven framework. If an event handler blocks,
    >> it blocks entire application. It is not acceptable.
    >>
    >> However I'm not sure that it is appropriate to implement
    >> a high performance server in Ruby.
    >>
    >> If an application level event driven framework is used,
    >> application level nonblocking I/O operations are required.
    >>
    >> If there are other usages, I'd like to know.

    >
    > Nonblocking I/O is useful if you are a server with some kind of
    > complex, global state, and lots of clients that can act on that
    > state. A good example would be a gaming server. If you handle every
    > client in its own thread, you need a big, coarse lock around your
    > global state. Once you're doing that, what's the point of
    > multithreading? It just makes things more complicated, and your
    > program's execution more difficult to understand.
    >
    > You might have many IO objects open that are interrelated. Say your
    > program logic is something like:
    >
    > when there's data available on object A, process it and send the
    > results to B and C
    > when there's data available on object B, process it and send the
    > results to A and C
    > when there's data available on object C, process it and send the
    > results to A and B
    >
    > How should I break this down into threads? Three threads that block-
    > on-read for A, B, and C? But what if A and B get data at the same
    > time? They might interleave their writes to C. Do I put a mutex
    > around C?
    >
    > For this case, it's a lot easier and more natural to write a main
    > loop like:
    >
    > while true
    > (read_ready, write_ready, err) = IO.select([A, B, C])
    > read_ready.each { |io|
    > output = process(io.read)
    > [A, B, C].each { |client| client.write(output) unless client
    > == io }
    > }
    > end
    >
    > Nonblocking I/O gives you more control over the execution of your
    > program, and frees you from the worries of synchronizing between
    > threads. And it's simpler than using threads for programs that
    > follow certain patterns.


    Thanks for the feedback. Even in this case I'd probably choose a
    different architecture. I dunno which of these is easier but here's how
    I'd do it:

    Have a thread per open client connection that reads requests. Requests
    are put into a queue (thread safe!). Then I'd have a number of workers
    that fetch from the task queue and do the work. Either each worker sends
    results directly to affected clients or puts results into a second queue
    from which a number of sender threads fetch their tasks and send
    responses. There could also be dedicated sender threads per client.

    If there are no dedicated sender threads you would need just a single
    point of synchronization (apart from what queue does internally) for the
    sending socket in order to prevent multiple responses to interfere (if
    it's possible that a request is received while the answer to another
    request is being processed).
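
    A minimal sketch of the request-queue half of that design, under the
    assumption that requests and responses travel as [client_id, payload]
    pairs. The name start_workers, the :stop marker and the string
    reversal standing in for real work are all hypothetical; Queue and
    Thread are the standard Ruby classes.

```ruby
require 'thread'  # Queue; already loaded on recent Rubies

# Starts worker_count threads. Each takes [client_id, request] pairs from
# the requests queue and pushes [client_id, response] pairs onto the
# responses queue. A :stop item tells one worker to exit.
def start_workers(requests, responses, worker_count)
  Array.new(worker_count) do
    Thread.new do
      while (task = requests.pop) != :stop
        client_id, request = task
        responses << [client_id, request.reverse]  # stand-in for real work
      end
    end
  end
end

requests  = Queue.new
responses = Queue.new
workers   = start_workers(requests, responses, 2)

requests << [:a, "ping"]
requests << [:b, "pong"]
2.times { requests << :stop }   # one stop marker per worker
workers.each(&:join)

puts responses.size
```

    Reader threads (one per client) would push onto requests, and sender
    threads would pop from responses; Queue does the locking internally.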

    Kind regards

    robert
     
    Robert Klemme, Oct 4, 2005
    #12
  13. Joshua Haberman

    snacktime Guest


    > This sounds really interesting, but I don't fully understand the while
    > loop. Nonblocking IO sends/recieves data when its ready/requested...eg.
    > it doesn't block for the data, right?
    >
    > I have written some threaded applications. A java tic-tac-toe game which
    > had players and observers of a game that all viewed a global 'board'
    > state. Methods to modify the game state were thread safe with mutexes,
    > how is what your saying different...? Any info appreciated...
    >
    > ooooo my 1st post to the mailing list :)

    In simple terms, with an event framework you have one main event loop that
    keeps a state engine of sorts for all the current IO operations going on. In
    your code when you need to do an IO operation, you send it to the event
    loop, register a callback, and then when there is something to read the
    event loop fires the callback. Event frameworks such as python's twisted
    provide a lot of the internal non blocking IO functions for you so you don't
    have to implement them yourself. For example writing to a file, waiting on a
    socket, etc.. You call the higher level function, register a callback, and
    continue on your way.
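
    To make the "register a callback" idea concrete, here is a hedged toy
    event loop in Ruby. TinyReactor and its methods are hypothetical names
    made up for illustration, not an existing library; the loop itself is
    built on IO.select only.

```ruby
# Hypothetical minimal event loop: maps IO objects to callbacks and fires
# a callback when its IO becomes readable.
class TinyReactor
  def initialize
    @readers = {}
  end

  # Register a block to be called with io whenever io is readable.
  def on_readable(io, &callback)
    @readers[io] = callback
  end

  def remove(io)
    @readers.delete(io)
  end

  # Runs one iteration of the loop and returns how many IOs were ready.
  # With timeout nil it waits until at least one IO is readable.
  def run_once(timeout = nil)
    return 0 if @readers.empty?
    ready, = IO.select(@readers.keys, nil, nil, timeout)
    return 0 if ready.nil?
    ready.each do |io|
      callback = @readers[io]       # may have been removed by a callback
      callback.call(io) if callback
    end
    ready.size
  end
end

reader, writer = IO.pipe
reactor = TinyReactor.new
reactor.on_readable(reader) { |io| puts "got: #{io.readpartial(4096)}" }

writer.write("event")
reactor.run_once(1)
```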

    Chris

     
    snacktime, Oct 4, 2005
    #13
  14. Joshua Haberman

    snacktime Guest


    On 10/3/05, Robert Klemme <> wrote:
    >
    > Tanaka Akira <> wrote:
    > > In article <>,
    > > Joshua Haberman <> writes:
    > >
    > >> It seems that Ruby should keep track of whether a descriptor has
    > >> O_NONBLOCK set (like in OpenFile.mode) and not do the select if so.
    > >> On the other hand, that will break if O_NONBLOCK is set by a C
    > >> extension, or by another process that has the same ofile open. Sigh.

    > >
    > > Yes. The shared fd is a problem hard to solve.
    > >
    > >> Now that I think of it, implementing IO#nonblock_read and
    > >> IO#nonblock_write as extensions isn't feasible for the 1.8 branch,
    > >> since it uses standard I/O which is incompatible with O_NONBLOCK.
    > >> Sigh.

    > >
    > > They are not problem if IO#sync = true.
    > >
    > > Since streams created by Ruby (IO.pipe, TCPSocket.open, etc)
    > > are IO#sync = true by default, the problem is not occur in
    > > most cases.
    > >
    > >> I guess for now I'll have to use sysread/syswrite, along with a home-
    > >> rolled buffering layer.

    > >
    > > You need your buffering layer if O_NONBLOCK is
    > > used on ruby 1.8. However IO#sync = true is enough if
    > > buffering is not required.

    >
    > I have one question on this matter which I still don't understand (I'm not
    > so deep into C stdlib IO variants so please bear with me): why would
    > anybody
    > want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
    > read anything on return even if the stream is not closed) in the light of
    > Ruby threads? I mean, with that one would have to build the multiplexing
    > in
    > Ruby which is already present in the interpreter with multiple Ruby
    > threads?
    > Are there situations that I'm not aware of where this is useful / needed?
    > Thanks!
    >
    > Kind regards
    >
    > robert



    I don't know exactly why this is, but an event framework using a single
    event loop is far more efficient than a bunch of threads each doing their
    own IO. Now this is with Python and Perl; Ruby could be different, although
    that would surprise me. On a number of applications that I have converted
    from threads to an event loop, CPU usage dropped like a rock.

    Not to mention that the issue of synchronization pretty much goes away.

    Chris

     
    snacktime, Oct 4, 2005
    #14
  15. Joshua Haberman

    Tanaka Akira Guest

    In article <>,
    Joshua Haberman <> writes:

    > Nonblocking I/O is useful if you are a server with some kind of
    > complex, global state, and lots of clients that can act on that
    > state. A good example would be a gaming server. If you handle every
    > client in its own thread, you need a big, coarse lock around your
    > global state. Once you're doing that, what's the point of
    > multithreading? It just makes things more complicated, and your
    > program's execution more difficult to understand.


    I see.

    > while true
    >   (read_ready, write_ready, err) = IO.select([A, B, C])
    >   read_ready.each { |io|
    >     output = process(io.read)
    >     [A, B, C].each { |client| client.write(output) unless client == io }
    >   }
    > end


    It seems too simplified to explain nonblocking I/O problem.

    O_NONBLOCK is required to avoid that write(2) blocks entire
    process. But if write(2) doesn't block due to O_NONBLOCK,
    some data are not written. So the result of write(2) should
    be checked and remaining data should be try to write later.

    It can be implemented by two ways.

    1. Using event driven framework and register an event
    handler for writability to the client. Since the event
    handler must not block, it needs a nonblocking write
    operation.

    2. Using a writing thread dedicated for the client.
    Since the thread is dedicated for the writing, it can use
    a blocking write operation.
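
    The "check the result and write the rest later" step can be sketched
    roughly as below. This is only an illustration, assuming a current Ruby
    where IO#write_nonblock and IO::WaitWritable are available (they may not
    exist in the 1.8 releases discussed in this thread). The helper names
    write_some and flush_pending are made up for the example.

```ruby
# Write as much of +data+ as the OS accepts right now.
# Returns the unwritten remainder ("" when everything was written).
def write_some(io, data)
  written = io.write_nonblock(data)
  data.byteslice(written, data.bytesize - written)
rescue IO::WaitWritable
  # Nothing could be written at all (EAGAIN/EWOULDBLOCK); keep everything.
  data
end

# Hypothetical helper: keep writing +pending+ to +io+, waiting in
# IO.select whenever the descriptor is not writable.
def flush_pending(io, pending)
  until pending.empty?
    IO.select(nil, [io])
    pending = write_some(io, pending)
  end
end
```

    In an event-driven program you would call write_some from the
    writability handler and keep the remainder around; flush_pending is
    closer to what a dedicated writer thread would do.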
    --
    Tanaka Akira
     
    Tanaka Akira, Oct 4, 2005
    #15
  16. Joshua Haberman

    Ara.T.Howard Guest

    On Tue, 4 Oct 2005, Tanaka Akira wrote:

    > In article <>,
    > "Robert Klemme" <> writes:
    >
    >> I have one question on this matter which I still don't understand (I'm not
    >> so deep into C stdlib IO variants so please bear with me): why would anybody
    >> want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
    >> read anything on return even if the stream is not closed) in the light of
    >> Ruby threads? I mean, with that one would have to build the multiplexing in
    >> Ruby which is already present in the interpreter with multiple Ruby threads?
    >> Are there situations that I'm not aware of where this is useful / needed?

    >
    > It is an interesting question I also have.
    >
    > I asked it several times, so I know some answers.
    >
    > 1. GUI framework has its own event driven framework.
    >
    > If a callback blocks, it blocks entire GUI. It is not
    > acceptable.
    >
    > 2. High performance network server has its own event driven
    > framework.
    >
    > Some high performance network servers use an application
    > level event driven framework. If an event handler blocks,
    > it blocks entire application. It is not acceptable.
    >
    > However I'm not sure that it is appropriate to implement
    > a high performance server in Ruby.
    >
    > If an application level event driven framework is used,
    > application level nonblocking I/O operations are required.
    >
    > If there are other usages, I'd like to know.


    it's sort of the same thing as 2, but network intense clients might be written
    more easily too... i've written code that was managing 100s of ssh connections
    for example. i could have used threads but it was easier/more responsive to
    just have an array of pipes and non-blocking reads.

    the only other thing i can think of is any time you may actually be fine
    blocking on a read, but wish to remain aware of time - eg. you don't want to
    block too long - and here something like readpartial is great.
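
    a rough sketch of that "block, but not for too long" case, combining
    IO.select with a timeout and readpartial. the method name
    read_with_timeout and its return values (nil on timeout, :eof at end of
    stream) are just assumptions for the example.

```ruby
# Wait up to +timeout+ seconds for data. Returns whatever is available
# (at most +maxlen+ bytes), nil on timeout, or :eof at end of stream.
def read_with_timeout(io, maxlen, timeout)
  ready, = IO.select([io], nil, nil, timeout)
  return nil unless ready      # select returned nil: the timeout elapsed
  io.readpartial(maxlen)       # returns as soon as some data is available
rescue EOFError
  :eof
end
```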

    -a
    --
    ===============================================================================
    | email :: ara [dot] t [dot] howard [at] noaa [dot] gov
    | phone :: 303.497.6469
    | Your life dwells amoung the causes of death
    | Like a lamp standing in a strong breeze. --Nagarjuna
    ===============================================================================
     
    Ara.T.Howard, Oct 4, 2005
    #16
  17. On Oct 4, 2005, at 4:06 AM, Tanaka Akira wrote:

    > In article <>,
    > Joshua Haberman <> writes:
    >
    >
    >> Nonblocking I/O is useful if you are a server with some kind of
    >> complex, global state, and lots of clients that can act on that
    >> state. A good example would be a gaming server. If you handle every
    >> client in its own thread, you need a big, coarse lock around your
    >> global state. Once you're doing that, what's the point of
    >> multithreading? It just makes things more complicated, and your
    >> program's execution more difficult to understand.
    >>

    >
    > I see.
    >
    >
    >> while true
    >>   (read_ready, write_ready, err) = IO.select([A, B, C])
    >>   read_ready.each { |io|
    >>     output = process(io.read)
    >>     [A, B, C].each { |client| client.write(output) unless client == io }
    >>   }
    >> end
    >>

    >
    > It seems too simplified to explain nonblocking I/O problem.
    >
    > O_NONBLOCK is required to avoid that write(2) blocks entire
    > process. But if write(2) doesn't block due to O_NONBLOCK,
    > some data are not written. So the result of write(2) should
    > be checked and remaining data should be try to write later.


    Yes, the code I posted made some simplifying assumptions. If some of
    the data was not written, you need to account for that somehow. The
    options you suggest would work for ensuring that the write happens
    eventually. But a nice thing about doing everything in a single
    thread is that you can do better if you choose.

    Imagine that you start with world W1. You get a message from client
    A that updates the state of the world to W2. Call DELTA1 the message
    you have to send to B and C to update them to W2. You try to write
    DELTA1 to B and C; B accepts it, but C is not ready. The next time through your event
    loop, you get a message from client B updating the state of the world
    to W3. DELTA2 updates W2 to W3, so you send that to A, but C is
    still not ready.

    Once C becomes ready, you could send C DELTA1 and DELTA2, or you
    could be smart and combine those into a single DELTA3 that updates W1
    to W3. DELTA3 will likely be smaller than DELTA1 + DELTA2. If you
    had initially blocked-on-write to send DELTA1, you would not have
    that option.
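
    As a rough sketch of the combining step, assume (purely for
    illustration) that a delta is a Hash of changed keys to new values, so
    that applying DELTA1 and then DELTA2 gives the same result as applying
    their merge. The class name PendingUpdates and its methods are made up
    for this example.

```ruby
# Hypothetical model: per-client queue of updates that could not be sent yet.
class PendingUpdates
  def initialize
    @pending = Hash.new { |hash, client| hash[client] = {} }
  end

  # Queue a delta for a client that is not ready; later values win,
  # so queued deltas collapse into one combined delta.
  def queue(client, delta)
    @pending[client].merge!(delta)
  end

  # Hand back the single combined delta and clear the client's queue.
  def take(client)
    @pending.delete(client) || {}
  end
end
```

    When C finally shows up as writable in IO.select, the event loop would
    call take(C) and send that one combined delta.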

    Josh
     
    Joshua Haberman, Oct 4, 2005
    #17
  18. Joshua Haberman

    Mark Cotner Guest

    Here's an example that I tested recently. The application is a network
    management framework for polling millions of devices per hour. Threads are
    nice, but context switching between enough threads to get the job done (175
    per my testing) generates a ton of CPU load (load jumps to ~30 and 0% idle on
    a 4 CPU box) when trying to maintain this many threads. It was written in the
    producer/consumer pattern so thread startup isn't compounding the issue.
    However, the same application using Perl ('cause Ruby can't do this just yet)
    asynchronous SNMP polls twice as many devices in the same amount of time
    with 2-4 processes and the system load is ~1 and ~70% idle.

    Granted, I'm doing some fairly extreme things, but it does help answer the
    question of threads vs async IO.

    'njoy,
    Mark


    On 10/4/05 9:40 AM, "Ara.T.Howard" <> wrote:

    > On Tue, 4 Oct 2005, Tanaka Akira wrote:
    >
    >> In article <>,
    >> "Robert Klemme" <> writes:
    >>
    >>> I have one question on this matter which I still don't understand (I'm not
    >>> so deep into C stdlib IO variants so please bear with me): why would anybody
    >>> want to use nonblocking IO (on the Ruby level, e.g. IO#read might not have
    >>> read anything on return even if the stream is not closed) in the light of
    >>> Ruby threads? I mean, with that one would have to build the multiplexing in
    >>> Ruby which is already present in the interpreter with multiple Ruby threads?
    >>> Are there situations that I'm not aware of where this is useful / needed?

    >>
    >> It is an interesting question I also have.
    >>
    >> I asked it several times, so I know some answers.
    >>
    >> 1. GUI framework has its own event driven framework.
    >>
    >> If a callback blocks, it blocks entire GUI. It is not
    >> acceptable.
    >>
    >> 2. High performance network server has its own event driven
    >> framework.
    >>
    >> Some high performance network servers use an application
    >> level event driven framework. If an event handler blocks,
    >> it blocks entire application. It is not acceptable.
    >>
    >> However I'm not sure that it is appropriate to implement
    >> a high performance server in Ruby.
    >>
    >> If an application level event driven framework is used,
    >> application level nonblocking I/O operations are required.
    >>
    >> If there are other usages, I'd like to know.

    >
    > it's sort of the same thing as 2, but network intense clients might be written
    > more easily too... i've written code that was managing 100s of ssh connections
    > for example. i could have used threads but it was easier/more responsive to
    > just have an array of pipes and non-blocking reads.
    >
    > the only other thing i can think of is any time you may actually be fine
    > blocking on a read, but wish to remain aware of time - eg. you don't want to
    > block too long - and here something like readpartial is great.
    >
    > -a
     
    Mark Cotner, Oct 4, 2005
    #18
  19. On Oct 4, 2005, at 12:13 AM, David Gurba wrote:

    > Joshua Haberman wrote:
    >> while true
    >>   (read_ready, write_ready, err) = IO.select([A, B, C])
    >>   read_ready.each { |io|
    >>     output = process(io.read)
    >>     [A, B, C].each { |client| client.write(output) unless client == io }
    >>   }
    >> end
    >>
    >> Nonblocking I/O gives you more control over the execution of your
    >> program, and frees you from the worries of synchronizing between
    >> threads. And it's simpler than using threads for programs that
    >> follow certain patterns.
    >>
    >> Josh
    >>
    >>

    > This sounds really interesting, but I don't fully understand the
    > while loop. Nonblocking IO sends/receives data when it's ready/
    > requested...eg. it doesn't block for the data, right?


    I'm not sure exactly what you're asking. Nonblocking I/O basically
    tells the OS: "when I do a read() or write(), only perform as much of
    the operation as you can without making me wait." If the OS cannot
    perform *any* of the operation (because there is no data waiting to
    read, or no buffer space available to write), the call errors with
    EAGAIN.

    IO.select is what you use to ask the OS what file descriptors are
    available for reading or writing. IO.select is what blocks, until
    one of your fds is available, or a timeout has elapsed. If you
    didn't use select, you'd have to busy-wait by reading from the fd
    over and over (getting EAGAIN every time). That would waste the
    CPU. Instead, you ask select to block until a file descriptor is
    available.
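
    To make the EAGAIN behaviour concrete, here is a small sketch. It
    assumes a current Ruby, where IO#read_nonblock exists and signals
    "nothing to read yet" with an exception that includes IO::WaitReadable
    (this may not be available in the 1.8 releases discussed here). The
    method name try_read and the :again / :eof return values are made up
    for the example.

```ruby
# Try to read without waiting. Returns the data, :again when the OS has
# nothing for us yet (EAGAIN), or :eof when the other end has closed.
def try_read(io, maxlen = 4096)
  io.read_nonblock(maxlen)
rescue IO::WaitReadable
  :again
rescue EOFError
  :eof
end
```

    Rather than calling try_read in a tight loop, you would call
    IO.select([io]) first and only read from descriptors it reports as
    ready.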

    > I have written some threaded applications. A java tic-tac-toe game
    > which had players and observers of a game that all viewed a global
    > 'board' state. Methods to modify the game state were thread safe
    > with mutexes, how is what you're saying different...? Any info
    > appreciated...


    If you follow the pattern above, you don't have to make anything
    thread-safe. You don't have to use mutexes. You don't have to think
    about possibly bad interactions between threads like deadlock.
    Everything happens in the same thread.

    Josh
     
    Joshua Haberman, Oct 5, 2005
    #19