NIO and accepts()

Discussion in 'Java' started by Cyrille "cns" Szymanski, Dec 14, 2003.

  1. Hello,

    I'm benchmarking several server io strategies and for that purpose I've
    built two simplistic Java ECHO servers.

    One of the server implementations takes advantage of the java.nio API.
    However it (my implementation) is slower than the classic 1 thread /
    client server. I've managed to find out (thanks to the profiler) that the
    accept() function call was slowing down the process. The strange thing is
    that I'm calling accept() only when SelectionKey.isAcceptable(), and thus
    this operation should be fast, right? Any issues?

    To test this behaviour I used a program that sequentially creates N
    connections to the server. The first server I wrote used an infinite loop
    that accepts sockets from a ServerSocket. The second server I wrote uses
    nio, selects on OP_ACCEPT and does a SelectionKey.accept(). This takes
    about 10 times longer.

    I'd also like to take advantage of multiprocessor architectures and spawn
    as many "worker threads" (taken from the IOCP voc.) as there are CPUs
    installed. Has anybody done this already ?

    Is it good practice to have multiple threads waiting on select() on the
    same Selector ?

    How can I register a Channel with a selector while one thread is in
    Selector.select() and have this thread process incoming events? What
    I've done so far is loop on select() with a timeout, but this surely
    isn't good practice.

    I'm saddened to see that as I wrote it the 1 thread/client outperforms
    the nio one...

    Here comes the code :


    import java.io.*;
    import java.lang.*;
    import java.net.*;
    import java.nio.*;
    import java.nio.channels.*;
    import java.util.*;

    public class javaenh
    {
        public static void main( String args[] ) throws Exception
        {
            // incoming connection channel
            ServerSocketChannel channel = ServerSocketChannel.open();
            channel.configureBlocking(false);
            channel.socket().bind( new InetSocketAddress( 1234 ) );

            // Register interest in incoming connections
            Selector selector = Selector.open();
            channel.register( selector, SelectionKey.OP_ACCEPT );

            System.out.println( "Ready" );
            // Wait for something of interest to happen
            while( selector.select()>0 )
            {
                // Get set of ready objects
                Iterator readyItor = selector.selectedKeys().iterator();

                // Walk through set
                while( readyItor.hasNext() )
                {
                    // Get key from set
                    SelectionKey key = (SelectionKey)readyItor.next();
                    readyItor.remove();

                    if( key.isReadable() )
                    {
                        // Get channel and context
                        SocketChannel keyChannel = (SocketChannel)key.channel();
                        ByteBuffer buffer = (ByteBuffer)key.attachment();
                        buffer.clear();

                        // Get the data
                        if( keyChannel.read( buffer )==-1 ) {
                            keyChannel.socket().close();
                            buffer = null;
                        } else {
                            // Send the data
                            buffer.flip();
                            keyChannel.write( buffer );

                            // wait for data to be sent
                            keyChannel.register( selector, SelectionKey.OP_WRITE, buffer );
                        }
                    }
                    else if( key.isWritable() )
                    {
                        // Get channel and context
                        SocketChannel keyChannel = (SocketChannel)key.channel();
                        ByteBuffer buffer = (ByteBuffer)key.attachment();

                        // data sent, read again
                        keyChannel.register( selector, SelectionKey.OP_READ, buffer );
                    }
                    else if( key.isAcceptable() )
                    {
                        // Get channel
                        ServerSocketChannel keyChannel = (ServerSocketChannel)key.channel();

                        // accept incoming connection
                        SocketChannel clientChannel = keyChannel.accept();

                        // create a client context
                        ByteBuffer buffer = ByteBuffer.allocateDirect( 1024 );

                        // register it in the selector
                        clientChannel.configureBlocking(false);
                        clientChannel.register( selector, SelectionKey.OP_READ, buffer );
                    }
                    else
                    {
                        System.err.println("Ooops");
                    }
                }
            }
        }
    }

    --
    _|_|_| CnS
    _|_| for(n=0;b;n++)
    _| b&=b-1; /*pp.47 K&R*/
    Cyrille "cns" Szymanski, Dec 14, 2003
    #1

  2. Douwe (Guest), in reply to Cyrille "cns" Szymanski:

    > One of the server implementation takes advantage of the java.nio API.
    > However it (my implementation) is slower than the classic 1 thread /
    > client server. I've managed to find out (thanks to the profiler) that the
    > accept() function call was slowing down the process. The strange thing is
    > that I'm calling accept() only when SelectionKey.isAcceptable() and thus
    > this operation should be fast, right ? Issues ?


    I'm not sure that NIO was written to outperform the classic IO
    (specifically Socket). The idea behind NIO is that you do not have to
    start a new Thread for every client, since the underlying operating
    system more or less already created a thread for that client (this I
    think depends on the platform Java is running on). Not creating a
    separate thread for each client has one very big advantage: it
    simplifies all data handling. Say you want to write the data received
    from a client to a data file. In a multithreaded program you have
    to make sure that you are the only one writing to that file; in a
    single-threaded program you just write your data (since you are
    already sure you are the only thread writing data to the disk at that
    moment). Unfortunately a single-threaded program has some
    disadvantages as well: if one client sends erroneous data and causes
    the thread to go into a locked state, then all other client
    handling is blocked as well. You say that the accept method is slow
    and you've probably expected that NIO would solve this. Unfortunately
    you still have to call the accept method; although you are sure (by
    using the selector) it will not block, it still has to initialize the
    socket structure (which I think takes some time), and since the program
    is single-threaded all your clients have to wait.

    > To test this behaviour I used a program that sequentially creates N
    > connections to the server. The first server I wrote used an infinite loop
    > that accepts sockets from a ServerSocket. The second server I wrote uses
    > nio, selects on OP_ACCEPT and does a SelectionKey.accept(). This takes
    > about 10 times longer.
    >
    > I'd also like to take advantage of multiprocessor architectures and spawn
    > as many "worker threads" (taken from the IOCP voc.) as there are CPUs
    > installed. Has anybody done this already ?


    Don't know? At least I have not :)

    > Is it good practice to have multiple threads waiting on select() on the
    > same Selector ?


    No.....
    I don't understand why you want to use a combination of a Selector and
    also use multiple Threads. In a multiprocessor environment a multi-
    threaded program will almost always outperform a single-threaded
    program (depending on the design of the programs and on the program's
    algorithm). If you have already created multiple threads for different
    connections and want to use one selector for that, then it means that
    you more or less block all threads until the selector wakes up again
    and notifies the threads needed. To do this you have to create an
    extra thread to handle the selector and have to create some
    synchronized methods so that client threads can control this thread.
    You've then created a complex system that uses a Selector.

    I think the best practice is to handle each client in a separate
    thread. To avoid the overhead of creating the threads you could create
    a system where a thread can be reused over and over again. Depending
    on the number of clients and the number of processors (if these are
    more or less static) you could use a selector in each thread where you
    handle multiple clients. This you should only do if you have a very
    large number of clients connecting and a small number of CPUs.
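
    A minimal sketch of what I mean by reusing threads, hand-rolled with
    wait/notify (the class name SimpleWorkerPool is made up for illustration,
    and this is only a sketch, not a tested implementation):

    import java.util.LinkedList;

    // Reusable worker threads: they sleep until a job is queued, run it,
    // then go back to waiting.
    public class SimpleWorkerPool
    {
        private final LinkedList queue = new LinkedList();

        public SimpleWorkerPool( int nThreads )
        {
            for( int i = 0; i < nThreads; i++ )
            {
                Thread t = new Thread()
                {
                    public void run()
                    {
                        for( ;; )
                        {
                            Runnable job;
                            synchronized( queue )
                            {
                                while( queue.isEmpty() )
                                {
                                    try { queue.wait(); }
                                    catch( InterruptedException e ) { return; }
                                }
                                job = (Runnable)queue.removeFirst();
                            }
                            job.run();    // e.g. handle one client's IO event
                        }
                    }
                };
                t.setDaemon( true );
                t.start();
            }
        }

        public void submit( Runnable job )
        {
            synchronized( queue )
            {
                queue.addLast( job );
                queue.notify();    // wake one sleeping worker
            }
        }
    }

    The dispatching code then calls submit() with a small Runnable per client
    event instead of creating a new Thread every time.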


    > How can I register a Channel with a selector while one thread is in
    > Selector.select() and have this thread process incoming events ? What
    > I've done so far is loop on selects() with a timeout but this surely
    > isn't good practice.


    I don't think you want to access a Selector with different threads
    (this is IMO absolutely BAD practice). You could create an extra class
    with a thread handling the selector and have the other threads communicate
    with this thread via methods (as described above), but try to avoid
    multiple access on the Selector object itself. Maybe you can think of
    a Selector as an object to coordinate your data handling and not to
    handle the data itself.

    > I'm saddened to see that as I wrote it the 1 thread/client outperforms
    > the nio one...


    Don't think this has to do with NIO ... this has to do with the use of
    the Selector (which is just one part of NIO).

    > Here comes the code :


    And I removed it :)
    Douwe, Dec 15, 2003
    #2

  3. Cyrille "cns" Szymanski wrote:
    > I'm benchmarking several server io strategies and for that purpose I've
    > built two simplistic Java ECHO servers.


    Good move. Test, don't assume.

    > One of the server implementation takes advantage of the java.nio API.
    > However it (my implementation) is slower than the classic 1 thread /
    > client server. I've managed to find out (thanks to the profiler) that the
    > accept() function call was slowing down the process. The strange thing is
    > that I'm calling accept() only when SelectionKey.isAcceptable() and thus
    > this operation should be fast, right ? Issues ?


    The actual profiler output might be useful here. It may be the case
    that your implementation is buggy; I am not an NIO expert, but my
    analysis of your code shows at least one or two possible problems (see
    below). The problems may or may not have anything to do with your slow
    accepts.

    More importantly, however, you should consider whether your test
    scenario is a good model for the application you plan. Slow accepts are
    a problem only if accepting new connections is expected to be a
    significant part of your service's work, which might not be the case.

    > To test this behaviour I used a program that sequentially creates N
    > connections to the server. The first server I wrote used an infinite loop
    > that accepts sockets from a ServerSocket. The second server I wrote uses
    > nio, selects on OP_ACCEPT and does a SelectionKey.accept(). This takes
    > about 10 times longer.
    >
    > I'd also like to take advantage of multiprocessor architectures and spawn
    > as many "worker threads" (taken from the IOCP voc.) as there are CPUs
    > installed. Has anybody done this already ?
    >
    > Is it good practice to have multiple threads waiting on select() on the
    > same Selector ?


    Per the API docs, Selectors themselves are thread-safe, but their various
    key sets are not. I'm not sure what you would expect the behavior to be
    with multiple threads selecting on the same selector concurrently, in any
    case. Because the key sets are _not_ thread-safe, you can't have multiple
    threads processing them concurrently, at least if any of the threads
    attempt to modify the sets.

    > How can I register a Channel with a selector while one thread is in
    > Selector.select() and have this thread process incoming events ? What
    > I've done so far is loop on selects() with a timeout but this surely
    > isn't good practice.


    If you are doing it all in one thread then you can only register a
    channel when that thread is not doing something else (e.g. blocking on
    selection). You must therefore ensure that the selection loop will
    cycle periodically, which would be done exactly as you describe if you
    generally have little else to do in that thread, or by using selectNow()
    instead of select() if that thread generally has enough other work to do
    to only check the selector periodically.
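
    To make that concrete, here is a sketch of the two single-threaded
    variants (running is a flag I'm assuming; the dispatch body is elided):

    // Variant 1: select with a timeout, so the loop cycles periodically and
    // you get a safe point between selects to register channels, change
    // interest ops, shut down, etc.
    while( running )
    {
        if( selector.select( 100 ) > 0 )    // wait at most 100 ms
        {
            Iterator it = selector.selectedKeys().iterator();
            while( it.hasNext() )
            {
                SelectionKey key = (SelectionKey)it.next();
                it.remove();
                // ... dispatch on isAcceptable()/isReadable()/isWritable() ...
            }
        }
        // ... perform any pending registrations here ...
    }

    // Variant 2: the thread mostly does other work and only polls:
    //     if( selector.selectNow() > 0 ) { /* same dispatch as above */ }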

    If you have a separate thread in which you intend to perform the
    registration then you should be able to do that without fear, but it is
    not clear to me whether the registration would block, or whether the new
    channel would be eligible for selection during the current invocation of
    select(). (My guesses would be yes, it would block, and no, it wouldn't
    be immediately eligible.)

    > I'm saddened to see that as I wrote it the 1 thread/client outperforms
    > the nio one...


    The thread per client approach is tried and true. I wouldn't give up on
    the selection approach just yet, however. As long as you are looking
    into this sort of thing, it's worthwhile to try to tune your code a bit
    to get the best performance out of each technique. The selector
    variation is harder to get right (in other languages too).

    [...]

    > public class javaenh
    > {
    > public static void main(String args[]) throws Exception
    > {
    > // incoming connection channel
    > ServerSocketChannel channel = ServerSocketChannel.open();
    > channel.configureBlocking(false);
    > channel.socket().bind( new InetSocketAddress( 1234 ) );
    >
    > // Register interest in when connection
    > Selector selector = Selector.open();
    > channel.register( selector, SelectionKey.OP_ACCEPT );


    Looks good so far....

    > System.out.println( "Ready" );
    > // Wait for something of interest to happen
    > while( selector.select()>0 )
    > {


    This while condition is fine for testing, but is probably not what you
    would want to use in a real app. The select() method will return zero
    if the Selector's wakeup() method is invoked or if the thread in which
    select() is blocking is interrupted (from another thread in either case)
    without any selectable channels being ready.
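
    A more robust outer loop simply treats a zero return as "nothing ready
    this time" rather than as a reason to exit (shutdownRequested is an
    assumed flag):

    while( !shutdownRequested )
    {
        int readyCount = selector.select();
        if( readyCount == 0 )
            continue;    // woken up (or interrupted) with nothing ready
        // ... iterate selector.selectedKeys() as before ...
    }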

    > // Get set of ready objects
    > Iterator readyItor = selector.selectedKeys().iterator();
    >
    > // Walk through set
    > while( readyItor.hasNext() )
    > {
    > // Get key from set
    > SelectionKey key = (SelectionKey)readyItor.next();
    > readyItor.remove();


    This is fine here, but would be buggy if the Selector were concurrently
    accessed by multiple threads as you proposed doing. It does appear that
    this is necessary to indicate that you have handled the operation that
    was selected for.

    > if( key.isReadable() )
    > {
    > // Get channel and context
    > SocketChannel keyChannel = (SocketChannel)key.channel();
    > ByteBuffer buffer = (ByteBuffer)key.attachment();
    > buffer.clear();
    >
    > // Get the data
    > if( keyChannel.read( buffer )==-1 ) {
    > keyChannel.socket().close();
    > buffer = null;


    Setting the local buffer variable to null is pointless. The Buffer will
    remain reachable (and thus not be deallocated or GC'd) at least until
    the SelectionKey with which it is associated becomes unreachable. If
    you wanted to reuse the buffer (via a buffer pool, for instance) then
    you would want to disassociate it from the key and return it to the pool
    here, but probably you can just forget about it.
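
    If you did want to recycle buffers, the end-of-stream branch might look
    something like this (bufferPool and its release() method are hypothetical,
    not part of java.nio; exception handling omitted):

    if( keyChannel.read( buffer ) == -1 )
    {
        key.attach( null );            // disassociate the buffer from the key
        bufferPool.release( buffer );  // hypothetical pool, just for illustration
        key.cancel();                  // deregister the channel from the selector
        keyChannel.close();            // closes the underlying socket too
    }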

    > } else {
    > // Send the data
    > buffer.flip();
    > keyChannel.write( buffer );


    This is buggy. The channel is in non-blocking mode, so you are not
    assured that all the available data (or even any of it) will be written
    during this invocation of write().

    >
    > // wait for data to be sent
    > keyChannel.register( selector,
    > SelectionKey.OP_WRITE, buffer );


    This is suboptimal. Rather than register the channel again, you should
    be changing the key's interest set. The same buffer will even remain
    associated. Moreover, if you have successfully written all the buffer
    contents then you don't need to select for writing at all, just again
    for reading.
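
    Put together, the send part of the readable branch might look more like
    this (a sketch, not tested):

    buffer.flip();
    keyChannel.write( buffer );                    // may write only part of the data
    if( buffer.hasRemaining() )
    {
        key.interestOps( SelectionKey.OP_WRITE );  // finish sending when writable
    }
    else
    {
        key.interestOps( SelectionKey.OP_READ );   // all sent; go back to reading
    }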

    > }
    > }
    > else if( key.isWritable() )
    > {
    > // Get channel and context
    > SocketChannel keyChannel = (SocketChannel)key.channel();
    > ByteBuffer buffer = (ByteBuffer)key.attachment();
    >
    > // data sent, read again
    > keyChannel.register( selector, SelectionKey.OP_READ,
    > buffer );


    As above, this is suboptimal -- just change the interest set. Before
    doing so, however, attempt to write the remaining bytes from the buffer;
    only switch back to selecting for reading once you have written all the
    data available.

    > }
    > else if( key.isAcceptable() )
    > {
    > // Get channel
    > ServerSocketChannel keyChannel =
    > (ServerSocketChannel)key.channel();
    >
    > // accept incoming connection
    > SocketChannel clientChannel = keyChannel.accept();
    >
    > // create a client context
    > ByteBuffer buffer = ByteBuffer.allocateDirect( 1024 );


    Have you read the API docs' recommendations about direct vs. non-direct
    buffers? In particular their warning that allocating a direct buffer
    takes longer, and their recommendation that direct buffers only be used
    for large, long-lived buffers and that they only be used when they yield
    a measurable performance gain?
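
    For a small, per-connection echo buffer like this one, a plain heap
    buffer is probably the safer default; the direct allocation is a one-line
    change you can make later if measurement justifies it:

    ByteBuffer buffer = ByteBuffer.allocate( 1024 );          // heap buffer: cheap to create
    // ByteBuffer buffer = ByteBuffer.allocateDirect( 1024 ); // direct: only if measured faster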

    > // register it in the selector
    > clientChannel.configureBlocking(false);
    > clientChannel.register( selector,
    > SelectionKey.OP_READ, buffer );


    Unlike some of the above, this is a new channel registration, so it's okay.

    > }
    > else
    > {
    > System.err.println("Ooops");
    > }
    > }
    > }
    > }
    > }



    John Bollinger
    John C. Bollinger, Dec 15, 2003
    #3
  4. > I´m not sure that NIO was written to outperform the classic IO
    > (specific Socket).


    The classic blocking socket scheme does not scale well and this is why
    writing a powerful server in Java wasn't reasonable. I thought that NIO
    had been written to solve this problem.


    > In a multiple threaded program you have to make sure that you are the
    > only one writing to that file; in a single threaded program you just
    > write your data (since you are already sure you are the only thread
    > writing data to the disk at that moment).


    I've written servers in which only one thread at a time handles a client.

    I have the program spawn N "worker threads" (typically N=2*CPU) which
    enter a sleeping state. Handles (sockets, files, memory...) are
    registered with a queue and when something happens on one of the handles
    (the queue for that handle isn't empty), the operating system awakens one
    of the worker threads which handles the event.

    If a resource has to be shared within several threads (for instance you
    wish to count bytes sent/recv) then the thread posts its job to the queue
    associated with the resource and asynchronously waits for it to complete.


    > Unfortunately a single threaded program has some disadvantages as well:

    if one client sends erronous data and causes
    > the thread to go into a locked state then this means all other client
    > handling is blocked as well.


    Right. So dead threads in an MT program are a vulnerability, and for this
    reason I happen to think that single-threaded models are better, because
    you can't get away with that sort of problem.


    > You say that the accept method is slow and you've probably expected that
    > NIO would solve this. Unfortunately
    > you still have to call the accept method although you are sure (by
    > using the selector) it will not block, it still has to initialize the
    > socket structure (which I think takes some time) and since the program
    > is single threaded all your clients have to wait.


    Then Java lacks an asynchronous accept() method.


    >> Is it good practice to have multiple threads waiting on select() on
    >> the same Selector ?

    >
    > No.....
    > I don´t understand why you want to use a combination of a Selector and
    > also use multiple Threads. In a multiprocessor environment a multi
    > threaded program will almost always outperform a single threaded
    > program (depending on the design of the programs and on the programs
    > algorithm).


    On multiprocessor architectures the 1 thread per client model doesn't
    scale well either. Even though the maximum number of clients is higher,
    it is still too small.

    On a 4 CPU machine, I'd typically want to have 8 threads processing IO
    requests. If I use a single-threaded program, the thread would only run on
    one CPU at a time, which does not take advantage of the 3 other CPUs.


    > you could use a selector in each thread where you handle multiple
    > clients. This you should only do if you have a very
    > large number of clients connecting and a small number of CPUs.


    You mean if I have N threads and M clients, I'd give M/N clients to each
    thread to handle? That doesn't solve the accept issue (which can only be
    done by one thread) and I'd rather have N threads handling M clients.


    Thanks for your helpful thoughts.

    --
    _|_|_| CnS
    _|_| for(n=0;b;n++)
    _| b&=b-1; /*pp.47 K&R*/
    Cyrille "cns" Szymanski, Dec 15, 2003
    #4
  5. >> I'm benchmarking several server io strategies and for that purpose
    >> I've built two simplistic Java ECHO servers.

    >
    > Good move. Test, don't assume.


    My goal is to write the best ECHO server I can for various platforms (Java,
    win32, .NET...) as long as the code remains simple, and to assume that
    fine-tuning it (which I will not do) would improve performance by, say, 10%
    on each platform. This should be a good starting point for comparisons.


    > More importantly, however, you should consider whether your test
    > scenario is a good model for the application you plan. Slow accepts
    > are a problem only if accepting new connections is expected to be a
    > significant part of your service's work, which might not be the case.


    Since I am planning an HTTP proxy server, I think it is reasonable to
    assume that connections will not last long, especially with lossy web
    clients.


    >> Is it good practice to have multiple threads waiting on select() on
    >> the same Selector ?

    >
    > Per the API docs, Selectors are thread-safe but their various key sets
    > are not. I'm not sure what you would expect the behavior to be with
    > multiple threads selecting on the same selector concurrently, in any
    > case.


    In fact I think I've confused NIO with Microsoft's IO Completion Ports
    (IOCP). The selector is nothing more than the Java implementation of
    Berkeley's socket select().

    If you are not aware of what IOCP is, here is a brief explanation :

    The idea is to spawn N threads (typically N=2*CPU) that will process IO
    requests. The programmer then registers the handles he wishes to use with
    the iocp.

    The worker threads wait for the IOCP to wake them up when an IO operation
    completes on one of those handles, so they can process the received data,
    then issue another asynchronous IO request and re-enter the sleeping state.


    Here is how things typically happen with an echo server:

    The listening socket is registered with the IOCP and an (asynchronous)
    call to accept is made, then the thread sleeps. When a connection is
    established and the accept finishes, the thread wakes up (it can have
    handled other IO requests in the meantime), finds out that an accept
    has finished (context information is associated with the asynchronous
    call) and typically issues an (asynchronous) read request.

    When the read request completes, the thread wakes up, finds out that a
    read has finished and issues a send request on the received buffer.

    When the send completes, either all data has been sent, in which case a
    new read is done, or there is still data to send, in which case a new
    send is done.


    The good thing about IOCP is that every lengthy operation (accept,
    connect, read, write...) is overlapped. I believe that socket acceptance
    is time consuming because a new socket descriptor has to be allocated (I
    bet most of the time is spent in thread synchronisation calls to ensure
    the socket implementation is thread safe) and SYN/ACK packets have to be
    sent. Thus it is time consuming but not CPU consuming, which makes it a
    good candidate for overlapped operation.


    My requirements are simple: I do not want 1 thread per client as this
    does not scale well (so classic IO is out), and I need several threads to
    handle IO requests to take advantage of multiprocessor machines.

    I wonder if those requirements are compatible with NIO... since they are
    not compatible with select()...


    > If you have a seperate thread in which you intend to perform the
    > registration then you should be able to do that without fear, but it
    > is not clear to me whether the registration would block, or whether
    > the new channel would be eligible for selection during the current
    > invocation of select(). (My guesses would be yes, it would block, and
    > no, it wouldn't be immediately eligible.)


    The threads that perform Channel registrations also call select(). But as
    long as the others do not cycle, there will only be one thread able to
    process the newly registered channels.

    Besides, your guesses seem to be correct.


    > The thread per client approach is tried and true. I wouldn't give up
    > on the selection approach just yet, however. As long as you are
    > looking into this sort of thing, it's worthwhile to try to tune your
    > code a bit to get the best performance out of each technique. The
    > selector variation is harder to get right (in other languages too).


    I'm a strong believer in the Selector approach. However I'd rather have
    had "completion" selects (as is done in IOCP) because they make
    MT programs easier to write.


    The approach this thread made me think of is having one thread loop on
    select() and dispatch work to idle worker threads of a thread pool. I
    thought that the JVM would do the dispatching for me if I had several
    threads waiting on select(), but it doesn't seem to be the case.


    >> public class javaenh
    >> {
    >> public static void main(String args[]) throws Exception
    >> {
    >> // incoming connection channel
    >> ServerSocketChannel channel = ServerSocketChannel.open();
    >> channel.configureBlocking(false);
    >> channel.socket().bind( new InetSocketAddress( 1234 ) );
    >>
    >> // Register interest in when connection
    >> Selector selector = Selector.open();
    >> channel.register( selector, SelectionKey.OP_ACCEPT );

    >
    > Looks good so far....
    >
    >> System.out.println( "Ready" );
    >> // Wait for something of interest to happen
    >> while( selector.select()>0 )
    >> {

    >
    > This while condition is fine for testing, but is probably not what you
    > would want to use in a real app. The select() method will return zero
    > if the Selector's wakeUp() method is invoked or if the thread in which
    > select() is blocking is interrupted (from another thread in either
    > case) without any selectable channels being ready.


    Great. There is a way to wake up the selector without an IO operation
    being triggered.

    >
    >> // Get set of ready objects
    >> Iterator readyItor = selector.selectedKeys().iterator();
    >>
    >> // Walk through set
    >> while( readyItor.hasNext() )
    >> {
    >> // Get key from set
    >> SelectionKey key = (SelectionKey)readyItor.next();
    >> readyItor.remove();

    >
    > This is fine here, but would be buggy if the Selector were
    > concurrently accessed by multiple threads as you proposed doing. It
    > does appear that this is necessary to indicate that you have handled
    > the operation that was selected for.
    >
    >> if( key.isReadable() )
    >> {
    >> // Get channel and context
    >> SocketChannel keyChannel = (SocketChannel)key.channel();
    >> ByteBuffer buffer = (ByteBuffer)key.attachment();
    >> buffer.clear();
    >>
    >> // Get the data
    >> if( keyChannel.read( buffer )==-1 ) {
    >> keyChannel.socket().close();
    >> buffer = null;

    >
    > Setting the local buffer variable to null is pointless. The Buffer
    > will remain reachable (and thus not be deallocated or GC'd) at least
    > until the SelectionKey with which it is associated becomes
    > unreachable. If you wanted to reuse the buffer (via a buffer pool,
    > for instance) then you would want to disassociate it from the key and
    > return it to the pool here, but probably you can just forget about it.


    Ok. I wanted the buffer to be marked for GC but indeed it is still
    referenced by the SelectionKey.

    >
    >> } else {
    >> // Send the data
    >> buffer.flip();
    >> keyChannel.write( buffer );

    >
    > This is buggy. The channel is in non-blocking mode, so you are not
    > assured that all the available data (or even any of it) will be
    > written during this invocation of write().


    I want this write operation to be overlapped. What I wish is to be
    notified when the write operation completes and how much data has been
    sent.

    >
    >>
    >> // wait for data to be sent
    >> keyChannel.register( selector,
    >> SelectionKey.OP_WRITE, buffer );

    >
    > This is suboptimal. Rather than register the channel again, you
    > should be changing the key's interest set. The same buffer will even
    > remain associated. Moreover, if you have successfully written all the
    > buffer contents then you don't need to select for writing at all, just
    > again for reading.


    If I get it right, I'd rather write
    keyChannel.keyFor().interestOps( SelectionKey.OP_WRITE );
    I need to be notified when the previous write operation completes.


    >
    >> }
    >> }
    >> else if( key.isWritable() )
    >> {
    >> // Get channel and context
    >> SocketChannel keyChannel = (SocketChannel)key.channel();
    >> ByteBuffer buffer = (ByteBuffer)key.attachment();
    >>
    >> // data sent, read again
    >> keyChannel.register( selector,
    >> SelectionKey.OP_READ,
    >> buffer );

    >
    > As above, this is suboptimal -- just change the interest set. Before
    > doing so, however, attempt to write the remaining bytes from the
    > buffer; only switch back to selecting for reading once you have
    > written all the data available.


    if( buffer.length()>0 ) {
    keyChannel.write();
    } else {
    keyChannel.keyFor().interestOps( SelectionKey.OP_READ );
    }


    >
    >> }
    >> else if( key.isAcceptable() )
    >> {
    >> // Get channel
    >> ServerSocketChannel keyChannel =
    >> (ServerSocketChannel)key.channel();
    >>
    >> // accept incoming connection
    >> SocketChannel clientChannel =
    >> keyChannel.accept();
    >>
    >> // create a client context
    >> ByteBuffer buffer = ByteBuffer.allocateDirect( 1024 );

    >
    > Have you read the API docs' recommendations about direct vs.
    > non-direct buffers? In particular their warning that allocating a
    > direct buffer takes longer, and their recommendation that direct
    > buffers only be used for large, long-lived buffers and that they only
    > be used when they yield a measurable performance gain?


    Ok. I was not aware of that issue.

    >
    >> // register it in the selector
    >> clientChannel.configureBlocking(false);
    >> clientChannel.register( selector,
    >> SelectionKey.OP_READ, buffer );

    >
    > Unlike some of the above, this a new channel registration, so okay.
    >
    >> }
    >> else
    >> {
    >> System.err.println("Ooops");
    >> }
    >> }
    >> }
    >> }
    >> }

    >
    >
    > John Bollinger
    >
    >


    John, thanks for your helpful advice.


    --
    _|_|_| CnS
    _|_| for(n=0;b;n++)
    _| b&=b-1; /*pp.47 K&R*/
    Cyrille "cns" Szymanski, Dec 15, 2003
    #5
  6. Douwe (Guest):

    "Cyrille \"cns\" Szymanski" <> wrote in message news:<Xns9452D075C830Dcns2cnsinvalid@213.228.0.33>...
    > > I´m not sure that NIO was written to outperform the classic IO
    > > (specific Socket).

    >
    > The classic blocking socket scheme does not scale well and this is why
    > writing a powerful server in Java wasn't reasonable. I thought that NIO
    > had been written to solve this problem.


    Don't know exactly what you mean by scaling, but as far as I know
    Swing is largely based on AWT, and therefore you can do the same
    things with AWT as you can with Swing.

    > > In a multiple threaded program you have to make sure that you are the
    > > only one writing to that file; in a single threaded program you just
    > > write your data (since you are already sure you are the only thread
    > > writing data to the disk at that moment).

    >
    > I've written servers in which only one thread at a time handles a client.
    >
    > I have the program spawn N "worker threads" (typically N=2*CPU) which
    > enter a sleeping state. Handles (sockets, files, memory...) are
    > registered with a queue and when something happens on one of the handles
    > (the queue for that handle isn't empty), the operating system awakens one
    > of the worker threads which handles the event.
    >
    > If a resource has to be shared within several threads (for instance you
    > wish to count bytes sent/recv) then the thread posts its job to the queue
    > associated with the resource and asynchronously waits for it to complete.


    The question is why you then created multiple threads ... if there is only
    one queue that is dispatching the enlisted information one by one to
    the different Threads, you might as well implement a single Thread (IMO
    this is just overkill).

    > > Unfortunately a single threaded program has some disadvantages as well:
    > > if one client sends erroneous data and causes
    > > the thread to go into a locked state then this means all other client
    > > handling is blocked as well.

    >
    > Right. So are dead threads in a MT program a vulnerability, and for this
    > reason I happen to think that single threaded models are better because
    > you can't go away with that sort of problem.


    Depends on what you mean by dead ... a dead thread could be a thread
    that just waits for data which will NEVER arrive; a really dead thread is
    a thread that cannot be reached at all anymore. A thread waiting on
    data can be interrupted (if Thread.interrupt() does not work, closing the
    socket will), and therefore a cleaner could remove that kind of
    'dead' thread. In a single-threaded program you cannot do so.

    > > You say that the accept method is slow and you've probably expected that
    > > NIO would solve this. Unfortunately
    > > you still have to call the accept method although you are sure (by
    > > using the selector) it will not block, it still has to initialize the
    > > socket structure (which I think takes some time) and since the program
    > > is single threaded all your clients have to wait.

    >
    > Then Java lacks an asynchronous accept() method.


    Would an asynchronous accept help to speed up the initialisation
    process? If you can answer this with no, then an asynchronous accept
    doesn't bring much.

    > >> Is it good practice to have multiple threads waiting on select() on
    > >> the same Selector ?

    > >
    > > No.....
    > > I don´t understand why you want to use a combination of a Selector and
    > > also use multiple Threads. In a multiprocessor environment a multi
    > > threaded program will almost always outperform a single threaded
    > > program (depending on the design of the programs and on the programs
    > > algorithm).

    >
    > On multiprocessor architectures the 1 thread per client model doesn't
    > scale well either. Even though the maximum number of clients is higher,
    > it is still too small.
    >
    > On a 4 CPU machine, I'd typically want to have 8 threads processing IO
    > requests. If I use a single threaded progam, the thread would only run on
    > one CPU at a time which does not take advantage of the 3 other CPUs.
    >
    >
    > > you could use a selector in each thread where you handle multiple
    > > clients. This you should only do if you have a very
    > > large number of clients connecting and a small number of CPUs.

    >
    > You mean if have N threads and M clients, I'd give M/N clients to each
    > thread to handle ? That doesn't solve the accept issue (which can be only
    > done by one thread) and I'd rather have N threads handling M clients.


    That indeed doesn't solve the accept issue ... as far as I can see you
    don't need to solve the slow accept() initializing ... all you need to
    ensure is that the slow accept does not interfere with the other
    clients that are being handled. But using a single thread you cannot
    solve this problem. And if you use one Selector handling all
    connections, then you should handle acceptance of connections in
    another Thread.
    Douwe, Dec 19, 2003
    #6
  7. >> The classic blocking socket scheme does not scale well and this is
    >> why writing a powerful server in Java wasn't reasonable. I thought
    >> that NIO had been written to solve this problem.

    >
    > Dont know exactly what you mean with scaling


    Quoting webopedia: "A popular buzzword that refers to how well a
    hardware or software system can adapt to increased demands."

    This has something to do with the asymptotic behaviour of functions as
    well (response time = f(nb clients)). Typically a system which responds
    in O(n^2), where n is the number of clients, isn't scalable, while one
    that responds in O(n) is scalable.

    In a nutshell the idea is that an increasing number of clients will slow
    down the server but not overwhelm it.

    For instance, with fewer than 50 clients a 1-thread-per-client
    server and an IOCP server give almost the same results; with about 2000
    clients the 1-thread-per-client server is overwhelmed (it does not respond
    anymore) whereas the IOCP server still works.


    > question is why you then created multiple threads ... if their is only
    > one queue that is dispatching the enlisted information one by one to
    > the different Threads you could better implement a single Thread (IMO
    > this is just overkill)


    The fact is that worker threads take more time to complete than the
    dispatcher thread takes to cycle, because for example they have to parse
    an HTTP request when one is sent. And the dispatching is done automatically
    by the operating system under Windows (the IOCP server model).

    This method has been tried and tested and in multiprocessor environments
    it has been proven to yield a significant performance gain.


    It is my goal to compare different io strategies and if you're right, the
    benchmarks should show it.



    >> Then Java lacks an asynchronous accept() method.

    >
    > Would an asynchronous accept help to speed up the initialisation
    > process ?? If you can answer thiw with no then an asynchronous accept
    > doesn´t bring much.


    I bet it will, since the accept() operation is time consuming but not CPU
    consuming; it will allow the system to do something in the meantime.

    As I explained in another post, upon acceptance the operating system
    has to allocate a new socket descriptor (which involves thread
    synchronization with the socket subsystem) and perhaps send SYN/ACK
    packets, which leaves the CPU with many cycles to spare.

    Again, it is my goal to see whether or not this would lead to a gain in
    performance.


    > That indeed doesn´t solve the accept issue ... as far as I can see you
    > don't need to solve the slow accept() initilzing .. all you need to
    > solve is that the slow accept is not interfering with the other
    > clients that are being handled.


    ... and with other clients being accepted.

    > But using a single thread you can not solve this problem.


    It is not my goal to use only one thread but an arbitrary number of
    threads that I can change at will to take advantage of multiprocessor
    architectures.

    --
    _|_|_| CnS
    _|_| for(n=0;b;n++)
    _| b&=b-1; /*pp.47 K&R*/
    Cyrille "cns" Szymanski, Dec 20, 2003
    #7
  8. Douwe (Guest), in reply to Cyrille "cns" Szymanski:

    > >> The classic blocking socket scheme does not scale well and this is
    > >> why writing a powerful server in Java wasn't reasonable. I thought
    > >> that NIO had been written to solve this problem.

    > >
    > > Dont know exactly what you mean with scaling

    >
    > Quoting webopedia : "A popular buzzword that refers to how well a
    > hardware or software system can adapt to increased demands."
    >
    > This has something to do with the asymptotic behaviour of functions as
    > well (response time = f(nb flients) ). Typically a system which responds
    > in o(n^2) where n is the number of clients isn't scalable while one that
    > responds in o(n) is scalable.
    >
    > In a nutshell the idea is that an increasing number of clients will slow
    > down the server but not overwhelm it.
    >
    > For instance, with less than 50 clients a 1-thread-per-client
    > server and a iocp server give almost the same results, with about 2000
    > clients the 1-thread-per-client is overwhelmed (it does not respond
    > anymore) whereas the iocp server still works.


    Thanks for your fine definition ... :)

    > > question is why you then created multiple threads ... if their is only
    > > one queue that is dispatching the enlisted information one by one to
    > > the different Threads you could better implement a single Thread (IMO
    > > this is just overkill)

    >
    > The fact is that worker threads take more time to complete than the
    > dispatcher thread to cycle because for example they have to parse a HTTP
    > request when it's sent. And it's automatically done by the operating
    > system under windows (IOCP server model).
    >
    > This method has been tried and tested and in multiprocessor environments
    > it has been proven to yield a significant performance gain.
    >
    >
    > It is my goal to compare different io strategies and if you're right, the
    > benchmarks should show it.


    Could be that I've misunderstood you ... I thought you had built a
    queue thread that dispatches its actions to different worker threads
    one by one, waiting for each separate worker thread (and so producing
    a sequential program with multiple threads) ... I thought so because
    you wrote

    >>>> I've written servers in which only one thread at a time handles a
    >>>> client.


    > >> Then Java lacks an asynchronous accept() method.

    > >
    > > Would an asynchronous accept help to speed up the initialisation
    > > process ?? If you can answer thiw with no then an asynchronous accept
    > > doesn´t bring much.

    >
    > I bet it will since the accept() operation is time consuming but not cpu
    > consuming it will allow the system to do something in the meantime.
    >
    > As I explained in another post, upon accpetance the operating system
    > has to allocate a new socket descriptor (which involves thread
    > synchronization with the socket subsystem) and perhaps send SYN/ACK
    > packets which leaves the cpu with many cyles to spare.
    >
    > Again, it is my goal to see whether or not this would lead to a gain in
    > performance.


    Assuming the accept is slow for the reasons you described above:
    if the new socket has to be synchronized with the sub-system, the
    Thread handling this will make its calls to the socket subsystem ... go
    into a WAITING state ... [subsystem sends SYN and waits for ACK and
    maybe does other stuff] ... WAKE up again (being signaled by the
    subsystem) ... and return from the accept method. As far as I can see
    two threads are involved here, where the "outer" thread is a Java Thread
    and the inner thread is a system thread (owned by the JVM). The outer
    Thread goes into a WAITING state and will not use any CPU cycles.
    The inner Thread will (in most cases) go into a WAITING state as well
    as soon as it has sent the SYN to the IO device/network card, and it
    will wake up as soon as the IO device has new data. The second
    (system) Thread therefore doesn't consume much time either while being
    asleep. This is the situation if the non-asynchronous accept() is
    used.

    In the asynchronous way there is not much difference ... the Selector
    could be seen as the first thread ... the second thread stays more or
    less the same ... but now the first thread can be notified by multiple
    events from different Threads. I could even imagine that after being
    notified by one Thread the Selector waits for a few milliseconds in
    the hope that more threads will notify it (but for that I would have to
    look into the implementation).

    In both situations you have the system thread that does the actual
    work; this cannot be changed/improved. The first situation uses
    an older implementation for IO than the second one, but it does not
    waste CPU cycles. I can't tell you which of the two implementations
    will be faster; that is just trial and error.

    > > That indeed doesn´t solve the accept issue ... as far as I can see you
    > > don't need to solve the slow accept() initilzing .. all you need to
    > > solve is that the slow accept is not interfering with the other
    > > clients that are being handled.

    >
    > ... and with other clients being accepted.


    Not sure what you mean, but it could be that if two clients are being
    accepted at the same moment the Socket implementation will handle the
    accepts sequentially ... but this is IMO OS dependent and has nothing
    to do with the Java Socket API, nor with the accept being
    asynchronous or not.

    > > But using a single thread you can not solve this problem.

    >
    > It is not my goal to use only one thread but an arbitrary number of
    > threads that I can change at will to take advantage of multiprocessor
    > architectures.
    Douwe, Dec 22, 2003
    #8
  9. Cyrille "cns" Szymanski wrote:

    >>More importantly, however, you should consider whether your test
    >>scenario is a good model for the application you plan. Slow accepts
    >>are a problem only if accepting new connections is expected to be a
    >>significant part of your service's work, which might not be the case.

    >
    >
    > Since I am planning a HTTP proxy server I think it is reasonable to
    > assume that connections will not last long specially with lossy web
    > clients.


    Well, for some clients connections might not last long, but they will in
    general last longer than for a locally-generated echo request /
    response, even for ill-behaved clients. Much longer in many cases.

    >>>Is it good practice to have multiple threads waiting on select() on
    >>>the same Selector ?

    >>
    >>Per the API docs, Selectors are thread-safe but their various key sets
    >>are not. I'm not sure what you would expect the behavior to be with
    >>multiple threads selecting on the same selector concurrently, in any
    >>case.

    >
    >
    > In fact I think I've mistaken NIO with Microsoft's IO Completion Ports
    > (IOCP). The selector is nothing more than the Java implementation of
    > Berkeley's socket select().


    Yes.

    > If you are not aware of what IOCP is, here is a brief explanation :
    >
    > The idea is to spawn N threads (typically N=2*CPU) that will process IO
    > requests. The programmer then registers the handles he wishes to use with
    > the iocp.
    >
    > The worker threads wait for the IOCP to wake them up when an io operation
    > completes on one of those handles so it can process the received data,
    > then issue another asynchronous io request and re-enter sleeping state.


    I was not aware. One could certainly build an equivalent in Java,
    presumably on top of NIO, based on one thread to deal directly with the
    Selector and an associated thread pool to handle the actual operations.
    (Along the general lines you mentioned yourself.)

    [...]

    > The good thing about IOCP is that every lengthy operation (accept,
    > connect, read, write...) is overlapped. I believe that socket acceptance
    > is time consuming because a new socket descriptor has to be allocated (I
    > bet most of the time is spent in thread synchrinosation calls to ensure
    > the socket implementation is thread safe) and SYN ACK packets have to be
    > sent. Thus it is time consuming and not cpu consuming which makes it a
    > good candidate for overlapped operation.


    Having done a little socket programming in C, but not claiming to be
    expert, I don't see how you could overlap two accepts on the same
    listening socket. Don't you have to accept connections serially, even
    at a low level? I guess it's a function of the TCP stack; do some
    stacks allow concurrent accepts on the same socket?

    > My requirements are simple : I do not want 1 thread per client as this
    > does not scale well (exit classical io) and I need several threads to
    > handle io requests to take advantage of multiprocessor machines.
    >
    > I wonder if those requirements are comatible with NIO... since they are
    > not compatible with select()...


    I think so. Here's the scheme, based on your idea about dispatching
    work to a thread pool (a rough sketch in code follows the list):
    () One thread manages the Selector, much as you already have.
    () When it detects one or more ready IO operations, it iterates through
    the selected keys and assigns the appropriate IO operation on the
    associated Channel to a thread from a thread pool, after first clearing
    the key's interest ops.
    () After processing the whole list, the selector thread invokes a new
    select().

    () The threads from your pool, upon being awakened and assigned a new
    SelectionKey, retrieve the channel, perform as much of the required
    operation as they can without blocking, set the appropriate interest
    operations on the key, and then wakeup() the Selector before going back
    to sleep. (The wakeup is essential to make the Selector notice the
    change in the key's interest operations.)
    () After a read, as much of the data read as possible should be
    written; if that's all of it then the new interest set is OP_READ;
    otherwise it is OP_WRITE.
    () Remember to close() the _channel_ (which will also implicitly cancel
    all associated selection keys) when closure of the remote side is
    detected. It is not clear to me from the API docs whether closing the
    underlying Socket causes the channel to be closed (or the reverse).

    () The selector thread must be prepared for the possibility that no
    selection keys are ready when select() returns, but that shouldn't be hard.
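
    A rough sketch of the selector thread's loop under that scheme (handle()
    stands for the worker's non-blocking read/write logic, and pool is a
    worker pool of the kind discussed earlier in the thread; both are
    assumptions, not something you can drop in as-is):

    for( ;; )
    {
        selector.select();    // may return zero after a wakeup(); that's fine
        Iterator it = selector.selectedKeys().iterator();
        while( it.hasNext() )
        {
            final SelectionKey key = (SelectionKey)it.next();
            it.remove();
            key.interestOps( 0 );    // stop selecting this key while a worker owns it
            pool.submit( new Runnable()
            {
                public void run()
                {
                    handle( key );              // non-blocking read/write; sets new interest ops
                    key.selector().wakeup();    // make the selector notice the change
                }
            } );
        }
    }

    The important details are clearing the interest ops before handing the
    key to a worker, and calling wakeup() afterwards so the restored ops take
    effect.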

    >>If you have a seperate thread in which you intend to perform the
    >>registration then you should be able to do that without fear, but it
    >>is not clear to me whether the registration would block, or whether
    >>the new channel would be eligible for selection during the current
    >>invocation of select(). (My guesses would be yes, it would block, and
    >>no, it wouldn't be immediately eligible.)

    >
    >
    > The threads that perform Channel registrations also call select(). But as
    > long as the others do not cycle there will only be one thread able to
    > process the newly registered channels.
    >
    > Besides your guesses seems to be correct.


    I don't think it necessary for multiple threads to call select(), as
    long as you wakeup() the Selector at appropriate points. You might need
    to apply a bit of synchronization (for instance, so that the Selector
    doesn't go back into select() too soon) but I think it could be worked
    out. Rather than synchronizing on the Selector itself you might want to
    create a simple mutex.

    > I'm a strong believer in the Selector approach. However i'd rather have
    > implemented "completion" selects (as it is done in IOCP) because it makes
    > MT programs easier to write.


    In other words, IOCP already provides a packaged equivalent to the
    approach I describe above? Or is there something I missed that it does
    and the above doesn't?

    >>> // Wait for something of interest to happen
    >>> while( selector.select()>0 )
    >>> {

    >>
    >>This while condition is fine for testing, but is probably not what you
    >>would want to use in a real app. The select() method will return zero
    >>if the Selector's wakeUp() method is invoked or if the thread in which
    >>select() is blocking is interrupted (from another thread in either
    >>case) without any selectable channels being ready.

    >
    >
    > Great. There is a way to wake up the selector without io operation being
    > triggered.


    Many people would consider that a good thing. For instance, it makes it
    easier to cleanly shut down. It also makes it possible to make the
    Selector take notice of changes to its keys' interest op sets, a
    facility that my suggested approach makes use of.

    >>> // Send the data
    >>> buffer.flip();
    >>> keyChannel.write( buffer );

    >>
    >>This is buggy. The channel is in non-blocking mode, so you are not
    >>assured that all the available data (or even any of it) will be
    >>written during this invocation of write().

    >
    >
    > I want this write operation to be overlapped. What I wish is to be
    > notified when the write operation completes and how much data has been
    > sent.


    The operation will not block on I/O. When it returns you can tell
    whether or not more remains to write by checking buffer.remaining().

    >>> // wait for data to be sent
    >>> keyChannel.register( selector,
    >>>SelectionKey.OP_WRITE, buffer );

    >>
    >>This is suboptimal. Rather than register the channel again, you
    >>should be changing the key's interest set. The same buffer will even
    >>remain associated. Moreover, if you have successfully written all the
    >>buffer contents then you don't need to select for writing at all, just
    >>again for reading.

    >
    >
    > If I get it right, I'd rather write
    > keyChannel.keyFor().interestOps( SelectionKey.OP_WRITE );
    > I need to be notified when the previous write operation completes.


    For the single-threaded approach you want, right after the
    keyChannel.write() above,

    if (buffer.remaining() > 0) {
    key.interestOps(SelectionKey.OP_WRITE);
    }

    [...]

    >>> ByteBuffer buffer = (ByteBuffer)key.attachment();
    >>>
    >>> // data sent, read again
    >>> keyChannel.register( selector,
    >>> SelectionKey.OP_READ,
    >>>buffer );

    >>
    >>As above, this is suboptimal -- just change the interest set. Before
    >>doing so, however, attempt to write the remaining bytes from the
    >>buffer; only switch back to selecting for reading once you have
    >>written all the data available.

    >
    >
    > if( buffer.length()>0 ) {
    > keyChannel.write();
    > } else {
    > keyChannel.keyFor().interestOps( SelectionKey.OP_READ );
    > }


    Make that

    if (buffer.remaining() > 0) {
    keyChannel.write(buffer);
    } else {
    // You already have the key; no need to look it up
    key.interestOps(SelectionKey.OP_READ);
    }

    >>> }
    >>> else if( key.isAcceptable() )
    >>> {
    >>> // Get channel
    >>> ServerSocketChannel keyChannel =
    >>>(ServerSocketChannel)key.channel();
    >>>
    >>> // accept incoming connection
    >>> SocketChannel clientChannel =
    >>> keyChannel.accept();


    As described above, you could attempt to perform this in a separate
    thread. In fact, I think you safely could do so as long as you clear
    the key's interest set before submitting the accept to another thread.
    I don't think you can overlap multiple accepts, but I'm prepared to be
    shown wrong.

    >>> // register it in the selector
    >>> clientChannel.configureBlocking(false);
    >>> clientChannel.register( selector,
    >>>SelectionKey.OP_READ, buffer );

    >>
    >>Unlike some of the above, this a new channel registration, so okay.


    But if the registration would block on completion of the Selector's
    current select() then you need to wakeup() the Selector first, and make
    sure it doesn't go back into select() until the registration is done.
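
    One way to arrange that is to queue the new channel, call wakeup(), and
    let the selector thread itself perform the register() before it goes back
    into select(). A minimal sketch, with names of my own choosing:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.SocketChannel;
    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // Assumed pattern: other threads queue channels; the selector thread
    // registers them itself between select() calls.
    class RegistrationQueue {
        private final Selector selector;
        private final List pending = new ArrayList();   // of SocketChannel

        RegistrationQueue(Selector selector) { this.selector = selector; }

        // Called from any thread that has accepted a new connection.
        void requestRegistration(SocketChannel channel) {
            synchronized (pending) {
                pending.add(channel);
            }
            selector.wakeup();   // get the selector out of select()
        }

        // Called by the selector thread right after select() returns,
        // before it calls select() again.
        void registerPending() throws IOException {
            synchronized (pending) {
                for (Iterator it = pending.iterator(); it.hasNext(); ) {
                    SocketChannel channel = (SocketChannel) it.next();
                    channel.configureBlocking(false);
                    channel.register(selector, SelectionKey.OP_READ,
                                     ByteBuffer.allocate(1024));
                }
                pending.clear();
            }
        }
    }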


    John Bollinger
    John C. Bollinger, Dec 22, 2003
    #9
  10. Cyrille \cns\ Szymanski

    Douwe Guest

    Don't know if it helps, but for ANSI C there is a really good, simple HTTP
    server that also uses a Selector. Though it is the C version of the
    sockets implementation and you need to be able to read it, I think it
    acts pretty much the same as the implementation in Java.

    http://www.acme.com/software/thttpd/
    Douwe, Dec 23, 2003
    #10
  11. Cyrille \cns\ Szymanski

    Esmond Pitt Guest

    John C. Bollinger wrote:
    > Having done a little socket programming in C, but not claiming to be
    > expert, I don't see how you could overlap two accepts on the same
    > listening socket. Don't you have to accept connections serially, even
    > at a low level? I guess it's a function of the TCP stack; do some
    > stacks allow concurrent accepts on the same socket?


    Java serializes accepts via synchronization, so yes you can do this.
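
    As a minimal illustration (class name and port are my own), several
    threads can all block in accept() on the same listening channel; the JDK
    serializes the calls, and each incoming connection is handed to exactly
    one of the waiting threads:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;

    public class MultiAcceptSketch {
        public static void main(String[] args) throws IOException {
            final ServerSocketChannel server = ServerSocketChannel.open();
            server.socket().bind(new InetSocketAddress(1234));  // blocking mode

            for (int i = 0; i < 4; i++) {
                new Thread(new Runnable() {
                    public void run() {
                        while (true) {
                            try {
                                SocketChannel client = server.accept(); // blocks
                                System.out.println(Thread.currentThread().getName()
                                        + " accepted " + client.socket());
                                client.close();
                            } catch (IOException e) {
                                return;
                            }
                        }
                    }
                }).start();
            }
        }
    }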
    Esmond Pitt, Dec 23, 2003
    #11
  12. Esmond Pitt wrote:

    > John C. Bollinger wrote:
    >
    >> Having done a little socket programming in C, but not claiming to be
    >> expert, I don't see how you could overlap two accepts on the same
    >> listening socket. Don't you have to accept connections serially, even
    >> at a low level? I guess it's a function of the TCP stack; do some
    >> stacks allow concurrent accepts on the same socket?

    >
    >
    > Java serializes accepts via synchronization, so yes you can do this.
    >


    If Java serializes the accepts then that is specifically contrary to the
    behavior I was asking about (although consistent with the way I thought
    things needed to work). You can have multiple threads blocking on
    accept() on the same socket, but you cannot have multiple threads
    concurrently executing accept on the same socket. I was wondering
    whether there were any environments wherein one could successfully and
    usefully have _concurrent_ accepts on one listening socket. If accept
    is not thread-safe at the level of the TCP stack then the answer is
    effectively no.


    John Bollinger
    John C. Bollinger, Dec 24, 2003
    #12
  13. Cyrille \cns\ Szymanski

    Esmond Pitt Guest

    John C. Bollinger wrote:
    >
    > You can have multiple threads blocking on
    > accept() on the same socket, but you cannot have multiple threads
    > concurrently executing accept on the same socket. I was wondering
    > whether there were any environments wherein one could successfully and
    > usefully have _concurrent_ accepts on one listening socket. If accept
    > is not thread-safe at the level of the TCP stack then the answer is
    > effectively no.


    Isn't this a distinction without a difference? I'm not at all clear
    about the effective difference between these two conditions. Connections
    (accepted sockets) will be fed to the callers of accept() one at a time
    in either case: while there aren't any connections to accept you would
    expect either to block in accept() or to get a null return from it in
    non-blocking mode, and both of these occur in Java as expected. In other
    words I don't see what 'usefully' means above.
    Esmond Pitt, Dec 31, 2003
    #13
  14. > Don't know if it helps, but for ANSI C there is a really good, simple HTTP
    > server that also uses a Selector. Though it is the C version of the
    > sockets implementation and you need to be able to read it, I think it
    > acts pretty much the same as the implementation in Java.
    >
    > http://www.acme.com/software/thttpd/
    >


    Thank you for that valuable link. The code is sometimes hard to follow
    because of its cross-platform nature, but I managed to understand the key
    parts. AFAIK this server could easily be implemented in Java.

    However the IO model has one small drawback: it doesn't take advantage of
    multiprocessor machines since it isn't multithreaded. You seem to
    believe that this isn't very important since the OS (or VM) spawns separate
    threads to handle IO transfers.

    Is the Java Selector class limited to a certain number of registered
    channels, like the native select() often is? How does the JVM address such
    a problem? FYI thttpd limits the number of concurrent clients.

    --
    _|_|_| CnS
    _|_| for(n=0;b;n++)
    _| b&=b-1; /*pp.47 K&R*/
    Cyrille \cns\ Szymanski, Jan 3, 2004
    #14
  15. >> >> Then Java lacks an asynchronous accept() method.

    [...]

    > Assuming the accept is slow for the reasons you described above.
    > If the new socket has to be synchronized with the sub-system, the
    > Thread handling this will make its calls to the socket subsystem ... go
    > into a WAITING state ... [subsystem sends SYN and waits for ACK and
    > maybe does other stuff] ... wake up again (being signaled by the
    > subsystem) ... and return from the accept method. As far as I can see,
    > two threads are involved here, where the "outer" thread is a Java Thread
    > and the inner thread is a system thread (owned by the JVM). The outer
    > Thread goes into a WAITING state and will not use any CPU cycles.
    > The inner thread will (in most cases) go into a WAITING state as well,
    > as soon as it has sent the SYN to the IO-Device/network card, and it
    > will wake up as soon as the IO-Device has new data. The second
    > (system) thread therefore doesn't consume much time either while it is
    > asleep. This is the situation if the non-asynchronous accept() is
    > used.
    >
    > In the asynchronous way there is not much difference ... the Selector
    > could be seen as the first thread ... the second thread stays more or
    > less the same ... but now the first thread can be notified by multiple
    > events from different threads. I could even imagine that after being
    > notified by one thread the Selector waits for a few milliseconds in
    > the hope that more threads will notify it (but for that I would have to
    > look into the implementation).
    >
    > In both situations you have the system thread that does the actual
    > work; this cannot be changed/improved. The first situation uses an
    > older implementation of IO than the second one, but it does not
    > waste CPU cycles. I can't tell you which of the two implementations
    > will be faster; that is just trial and error.


    I guess the problem can be summarized as follows:

    The accept() operation takes a long time to complete, so it must either be
    handled in a separate thread, or there should be an asynchronous version
    of the function.

    The server must be capable of handling many clients. One thread per client
    is not reasonable, so there must be a limited number of worker threads.
    Since some operations involve synchronization, it is best to have at least
    two worker threads (so one thread can do number crunching while another
    is sleeping).

    --
    _|_|_| CnS
    _|_| for(n=0;b;n++)
    _| b&=b-1; /*pp.47 K&R*/
    Cyrille \cns\ Szymanski, Jan 3, 2004
    #15
  16. > I think so. Here's the scheme, based on your idea about dispatching
    > work to a thread pool:
    > () One thread manages the Selector, much as you already have.
    > () When it detects one or more ready IO operations, it iterates through
    > the selected keys and assigns the appropriate IO operation on the
    > associated Channel to a thread from a thread pool, after first
    > clearing the key's interest ops.
    > () After processing the whole list, the selector thread invokes a new
    > select().


    Seems good to me so far.


    > () The threads from your pool, upon being awakened and assigned a new
    > SelectionKey, retrieve the channel, perform as much of the required
    > operation as they can without blocking, set the appropriate interest
    > operations on the key, and then wakeup() the Selector before going
    > back to sleep. (The wakeup is essential to make the Selector notice
    > the change in the key's interest operations.)


    Won't the worker thread block on the attempt to set the interest ops? I
    guess so.


    > () After a read, as much of the data read as possible should be
    > written; if that's all of it then the new interest set is OP_READ;
    > otherwise it is OP_WRITE.
    > () Remember to close() the _channel_ (which will also implicitly cancel
    > all associated selection keys) when closure of the remote side is
    > detected. It is not clear to me from the API docs whether closing the
    > underlying Socket causes the channel to be closed (or the reverse).
    >
    > () The selector thread must be prepared for the possibility that no
    > selection keys are ready when select() returns, but that shouldn't be
    > hard.


    Fine
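
    So the dispatch loop might look roughly like the sketch below. All names
    are my own, the worker logic is just the echo handling inlined, OP_ACCEPT
    keys are assumed to be handled elsewhere, and the java.util.concurrent
    thread pool only arrives with Java 5:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.SelectionKey;
    import java.nio.channels.Selector;
    import java.nio.channels.SocketChannel;
    import java.util.Iterator;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    class DispatchLoop implements Runnable {
        private final Selector selector;
        private final ExecutorService workers =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

        DispatchLoop(Selector selector) { this.selector = selector; }

        public void run() {
            try {
                while (true) {
                    selector.select();   // may return with no ready keys
                    Iterator it = selector.selectedKeys().iterator();
                    while (it.hasNext()) {
                        final SelectionKey key = (SelectionKey) it.next();
                        it.remove();
                        key.interestOps(0);   // clear before handing off
                        workers.execute(new Runnable() {
                            public void run() { handle(key); }
                        });
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // Worker side: do as much non-blocking I/O as possible, set the new
        // interest ops, then wake the selector so it notices the change.
        private void handle(SelectionKey key) {
            SocketChannel channel = (SocketChannel) key.channel();
            ByteBuffer buffer = (ByteBuffer) key.attachment();
            try {
                if (key.isReadable()) {
                    buffer.clear();
                    if (channel.read(buffer) == -1) {
                        channel.close();          // implicitly cancels the key
                        return;
                    }
                    buffer.flip();
                }
                channel.write(buffer);            // echo what we can
                key.interestOps(buffer.hasRemaining()
                        ? SelectionKey.OP_WRITE : SelectionKey.OP_READ);
                selector.wakeup();
            } catch (IOException e) {
                try { channel.close(); } catch (IOException ignored) { }
            }
        }
    }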


    >> I'm a strong believer in the Selector approach. However i'd rather
    >> have implemented "completion" selects (as it is done in IOCP) because
    >> it makes MT programs easier to write.

    >
    > In other words, IOCP already provides a packaged equivalent to the
    > approach I describe above? Or is there something I missed that it
    > does and the above doesn't?


    IOCP is almost equivalent to the approach described above. The main
    difference is that the selector equivalent of IOCP notifies you when an IO
    operation completes. Therefore you call IO functions (which return
    immediately) before the selector gives its notification.

    With this model, accepts can be treated in the same manner as reads and
    writes:
    * create a "client" SOCKET
    * call accept("client"), which returns immediately
    * wait for the selector to notify completion (thread wakes up)
    * when it does, "client" is connected to the remote endpoint
    * then, say, send a "hello" message, which returns immediately
    * wait for the selector to notify completion (thread wakes up)
    * when it does, some bytes have been sent, or there has been an IO error.
    * etc.
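
    For comparison, a completion-style API of this kind later appears in Java
    itself as NIO.2 (Java 7, well after this thread). A rough sketch, with
    names and port of my own choosing:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.AsynchronousServerSocketChannel;
    import java.nio.channels.AsynchronousSocketChannel;
    import java.nio.channels.CompletionHandler;

    // Sketch of a completion-based accept/write, in the spirit of IOCP.
    public class CompletionStyleSketch {
        public static void main(String[] args) throws IOException, InterruptedException {
            final AsynchronousServerSocketChannel server =
                    AsynchronousServerSocketChannel.open()
                            .bind(new InetSocketAddress(1234));

            server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
                public void completed(AsynchronousSocketChannel client, Void att) {
                    server.accept(null, this);   // immediately ask for the next one
                    ByteBuffer hello = ByteBuffer.wrap("hello\n".getBytes());
                    client.write(hello, null, new CompletionHandler<Integer, Void>() {
                        public void completed(Integer bytesSent, Void a) {
                            // some bytes have been sent; continue the protocol here
                        }
                        public void failed(Throwable exc, Void a) {
                            // IO error on the write
                        }
                    });
                }
                public void failed(Throwable exc, Void att) {
                    // IO error on the accept
                }
            });

            Thread.sleep(Long.MAX_VALUE);   // keep the demo process alive
        }
    }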


    Another approach I'll want to try is using IOCP with JNI.


    > But if the registration would block on completion of the Selector's
    > current select() then you need to wakeup() the Selector first, and
    > make sure it doesn't go back into select() until the registration is
    > done.


    Got it.


    Thanks a lot.

    --
    _|_|_| CnS
    _|_| for(n=0;b;n++)
    _| b&=b-1; /*pp.47 K&R*/
    Cyrille \cns\ Szymanski, Jan 3, 2004
    #16
  17. Cyrille \cns\ Szymanski

    Douwe Guest

    > Thank you for that valuable link. The code is sometimes hard to follow
    > because of its cross-platform nature, but I managed to understand the key
    > parts. AFAIK this server could easily be implemented in Java.
    >
    > However the IO model has one small drawback: it doesn't take advantage of
    > multiprocessor machines since it isn't multithreaded. You seem to
    > believe that this isn't very important since the OS (or VM) spawns separate
    > threads to handle IO transfers.


    Absolutely true ... and I wouldn't recommend you use the same
    design/model (single-threaded) in your program. It is just a good
    example of how the Selector works, and also a good example of how the life
    of a programmer gets easier by decreasing the number of threads
    (preferably to one).

    > Is the Java Selector class limited to a certain number of registered
    > channels, like the native select() often is? How does the JVM address such
    > a problem? FYI thttpd limits the number of concurrent clients.



    I think the Java version is just a class wrapped around the native
    select() machinery (in the case of Linux). These are details of which I
    (unfortunately) have no knowledge.
    Douwe, Jan 5, 2004
    #17
  18. >> However the IO model has one small drawback: it doesn't take advantage
    >> of multiprocessor machines since it isn't multithreaded. You
    >> seem to believe that this isn't very important since the OS (or VM)
    >> spawns separate threads to handle IO transfers.

    >
    > Absolutely true ... and I wouldn't recommend you use the same
    > design/model (single-threaded) in your program. It is just a good
    > example of how the Selector works, and also a good example of how the life
    > of a programmer gets easier by decreasing the number of threads
    > (preferably to one).


    I've been reading the book "Java NIO" from O'Reilly and they implement such
    a multithreaded server. Chapter 4 from the book deals with this matter and
    happens to be available online (source code of the examples as well).

    http://www.oreilly.com/catalog/javanio/index.html

    The very last section is really interesting; however, it is a pity that the
    example code given fails to demonstrate what the author describes.

    --
    _|_|_| CnS
    _|_| for(n=0;b;n++)
    _| b&=b-1; /*pp.47 K&R*/
    Cyrille \cns\ Szymanski, Jan 5, 2004
    #18