Bug Database
Bug ID: 4469394
Votes 103
Synopsis (so) SocketChannels registered with OP_WRITE only release selector once (win)
Category java:classes_nio
Reported Against 1.4 , hopper , merlin-beta
Release Fixed 1.4.0_02, 1.4.1(hopper) (Bug ID:2044483)
State 10-Fix Delivered, Needs Verification, bug
Priority: 2-High
Related Bugs 4628289 , 4645302
Submit Date 13-JUN-2001
Description




java version "1.4.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta-b65)
Java HotSpot(TM) Client VM (build 1.4.0-beta-b65, mixed mode)

Non-blocking SocketChannels registered with a Selector and including the
SelectionKey.OP_WRITE operation code will not release Selector.select() more
than once. The first time selector.select() is called and the outgoing socket
is available for writing, the select method returns as expected, and the
selector.selectedKeys Set includes the SelectionKey for the SocketChannel with
readyOps including OP_WRITE. However, after removing the SelectionKey from the
selectedKeys Set, any later call to selector.select() will block indefinitely.

The following simple program demonstrates the problem. The program creates a
ServerSocketChannel in non-blocking mode using an InetSocketAddress built
from "localhost", port 1110. The program registers the ServerSocketChannel with
a Selector with operations OP_ACCEPT. selector.select() then blocks until a
client (e.g., a Telnet client) connects to localhost 1110.

Once a client is accepted, a SocketChannel for that client is configured in
non-blocking mode. The new SocketChannel is registered with the same Selector
with operation OP_WRITE. The next selector.select() call returns almost
immediately with a selectedKey set including the SelectionKey for the new
SocketChannel -- indicating the socket can be written to. The program writes a
simple, 7 byte buffer to the socket (using SocketChannel.write(ByteBuffer)),
and removes the SelectionKey from the selector's selectedKey Set. The next call
to selector.select() never returns, even though the outgoing client socket
should be again available for writing almost instantaneously.

Note that the erroneous behavior occurs whether or not the test program
actually writes any bytes to the SocketChannel. In the final block of the
main() method's loop below, simply comment out the call to SocketChannel.write()
(actual line is "sc.write(bb);"), and the same problem still occurs.

// Simple test program to exercise writing multiple times
// to a SocketChannel using non-blocking I/O in the java.nio.*
// packages.

import java.net.*;
import java.nio.*;
import java.nio.channels.*;
import java.util.*;

public class WriteTest
{
    public static void main(String[] args) throws Exception
    {
	//...contents of this byte buffer will be written to
	//   all connected clients repeatedly until they disconnect...
	ByteBuffer bb = ByteBuffer.wrap(new byte[] {
	    (byte)'B', (byte)'r', (byte)'i', (byte)'a', (byte)'n',
	    (byte)'\r', (byte)'\n'
	});

	//...ServerSocketChannel to receive new client connections on
	//   "localhost" interface, port 1110
	ServerSocketChannel ssc = ServerSocketChannel.open();
	ssc.configureBlocking(false);
	ServerSocket ss = ssc.socket();
	ss.bind(new InetSocketAddress(InetAddress.getByName("localhost"), 1110));

	//...single Selector used to receive non-blocking I/O events...
	Selector sel = Selector.open();

	//...register the ServerSocketChannel with the Selector...
	SelectionKey keyAccept = ssc.register(sel, ssc.validOps());

	while(true)
	{
	    if(sel.select() == 0)
		continue;

	    Set selectedKeys = sel.selectedKeys();

	    //...if a new client is trying to connect, accept the
	    //   connection then loop...
	    if(selectedKeys.contains(keyAccept))
	    {
		Socket sock = ssc.accept();
		SocketChannel sc = sock.getChannel();
		sc.configureBlocking(false);
		SelectionKey k = sc.register(sel, SelectionKey.OP_WRITE);
		k.attach(sc);
		selectedKeys.remove(keyAccept);
		continue;
	    }

	    //...no new client connections, meaning that at least one of
	    //   the currently connected clients can receive more bytes...
	    Iterator iter = selectedKeys.iterator();
	    while(iter.hasNext())
	    {
		SelectionKey k = (SelectionKey)iter.next();
		SocketChannel sc = (SocketChannel)k.attachment();
		sc.write(bb); // Problem persists even after
			      // commenting this line out.
		bb.rewind();
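		iter.remove(); // remove through the iterator; removing from the
			       // Set directly can throw
			       // ConcurrentModificationException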
	    }
	}
    }
}

The following modified version of the above test program attempts to work
around the problem by creating a new Selector after each write operation
and registering all client SocketChannels with the new Selector (instead of
re-using the same Selector in each iteration of the outer infinite loop). This
does not fix the problem, however, indicating to me that the problem is in the
native layer (comment elided for brevity). (Note that this version of the test
program does not work with multiple simultaneous clients -- but it still
demonstrates the problem which is the important part).

import java.net.*;
import java.nio.*;
import java.nio.channels.*;
import java.util.*;

public class WriteTest
{
    public static void main(String[] args) throws Exception
    {
	ByteBuffer bb = ByteBuffer.wrap(new byte[] {
	    (byte)'B', (byte)'r', (byte)'i', (byte)'a', (byte)'n',
	    (byte)'\r', (byte)'\n'
	});

	ServerSocketChannel ssc = ServerSocketChannel.open();
	ssc.configureBlocking(false);
	ServerSocket ss = ssc.socket();
	ss.bind(new InetSocketAddress(InetAddress.getByName("localhost"), 1110));

	Selector sel = Selector.open();

	SelectionKey keyAccept = ssc.register(sel, ssc.validOps());

	while(true)
	{
	    if(sel.select() == 0)
		continue;

	    Set selectedKeys = sel.selectedKeys();

	    if(selectedKeys.contains(keyAccept))
	    {
		Socket sock = ssc.accept();
		SocketChannel sc = sock.getChannel();
		sc.configureBlocking(false);
		SelectionKey k = sc.register(sel, SelectionKey.OP_WRITE);
		k.attach(sc);
		selectedKeys.remove(keyAccept);
		continue;
	    }

	    //...this section modified from the previous example to open
	    //   a new Selector and use the new one in the next iteration
	    //   through the output (infinite) while loop...
	    Iterator iter = selectedKeys.iterator();
	    Selector sel2 = Selector.open();
	    while(iter.hasNext())
	    {
		SelectionKey k = (SelectionKey)iter.next();
		SocketChannel sc = (SocketChannel)k.attachment();
		sc.write(bb);
		bb.rewind();
		iter.remove(); // remove through the iterator, not the Set

		k = sc.register(sel2, SelectionKey.OP_WRITE);
		k.attach(sc);
	    }
	    ssc.register(sel2, SelectionKey.OP_ACCEPT);
	    sel.close();
	    sel = sel2;
	}
    }
}


Brian Maso
(Review ID: 126007) 
======================================================================
Work Around




No known workaround that I can find other than not using non-blocking I/O for
socket writes.
======================================================================
Evaluation
This bug is due to the fact that the NIO specification requires level-triggered
readiness notifications (which is, conveniently, how they work in Solaris and
Linux) but Windows only provides edge-triggered notifications.  A fix is in
progress and is slated for the 1.4.1 release.

--   xxxxx@xxxxx   2002/3/23
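
The level-triggered behavior the evaluation refers to can be checked with a small probe. The following is only a sketch, not part of the original report or of the fix: the class and method names are hypothetical, and it uses Java 7+ conveniences (try-with-resources, ServerSocketChannel.bind) that did not exist in 1.4. It registers an idle, connected loopback socket for OP_WRITE and counts how many consecutive selects report it as writable. A level-triggered selector should report it every time; the behavior described in this bug would report it only once.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical probe class; the names are illustrative only.
class LevelTriggeredProbe {

    // Returns how many of `rounds` consecutive selects report OP_WRITE
    // for an idle, connected loopback socket.
    static int countWritableSelects(int rounds) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector selector = Selector.open()) {
            // Port 0 lets the OS pick a free port (the report used 1110).
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false);
                SelectionKey key = accepted.register(selector, SelectionKey.OP_WRITE);
                int writable = 0;
                for (int i = 0; i < rounds; i++) {
                    // Bounded wait, so the probe cannot hang the way the
                    // original test program did.
                    selector.select(1000);
                    // Remove the key from the selected set before the next select.
                    if (selector.selectedKeys().remove(key) && key.isWritable()) {
                        writable++;
                    }
                }
                return writable;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("writable selects: " + countWritableSelects(5) + " of 5");
    }
}
```

Nothing is ever written to the socket, which matches the report's observation that the problem does not depend on the write itself.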
Comments
  

Submitted On 17-AUG-2001
cstjohn
I tried the first test program, and it
didn't exhibit the bug (telnet'ed to 1110
on localhost, got 'Brian' over and over
and over). Both server and client jvm's
worked fine.  What platform was the bug
originally reported on?


java version "1.4.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build
1.4.0-beta-b65)
Java HotSpot(TM) Server VM (build 1.4.0-beta-b65, mixed mode)

kernel 2.4.5
GNU C Library stable release version 2.1.3
AMD Athlon
etc.


Submitted On 24-AUG-2001
Nivag
Please fix ASAP :-)

I'm trying to write a load generator using Java on Linux.  
I need to do a connect/write/read/write/read/close sequence 
of actions on a socket for every transaction.  Blocking 
socket I/O is limited to about 300 per second on my current 
system, as there is a high overhead creating a new Thread 
for each transaction.


Submitted On 18-OCT-2001
skaistis
Wow, 63 votes for something that I'm pretty sure isn't even 
a bug, just a misunderstanding.

I haven't tested this yet, but the behavior described above 
sounds correct.  According to the docs, the OP_WRITE code 
should be triggered under the following conditions:

- When the socket becomes writeable
- When the socket becomes unwriteable
- An error has occurred

The first case is obviously being triggered when the socket 
is connected.  I then wouldn't expect OP_WRITE to be 
selected again until the socket is shutdown for writing or 
closed.  I don't think that there would be any interim non-
error cases where the socket status would change to 
unwritable unless maybe some underlying buffer was full.

Remember, selectors should be thought of as events, not a 
status check.  Otherwise, we're just polling status instead 
of triggering based on a change.  OP_WRITE indicates that 
the write status has changed, not that it actually is 
writeable.  The isWritable() method does that.


Submitted On 18-OCT-2001
daney
I disagree with the previous poster.  Selector.select works
very much as select(2) or poll(2) in UNIX.  The key seems to
be to clear the selectedKeys prior to doing the select. 
Here is a slightly modified version of the program that
works for me on:
[daney@dl seltest]$ uname -a
Linux dl 2.4.12-1 #1 Thu Oct 11 11:10:19 PDT 2001 i686 unknown
[daney@dl seltest]$ java -version
java version "1.4.0-beta2"
Java(TM) 2 Runtime Environment, Standard Edition (build
1.4.0-beta2-b77)
Java HotSpot(TM) Client VM (build 1.4.0-beta2-b77, mixed mode)


Here is the program:
// Simple test program to exercise writing multiple times
// to a SocketChannel using non-blocking I/O in the java.nio.*
// packages.

import java.net.*;
import java.io.*;
import java.nio.*;
import java.nio.channels.*;
import java.util.*;

public class WriteTest
{
   public static void main(String[] args)
   {
      try {
         //...contents of this byte buffer will be written to
         //   all connected clients repeatedly until they disconnect...
         ByteBuffer bb = ByteBuffer.wrap(new byte[] {
            (byte)'B', (byte)'r', (byte)'i', (byte)'a', (byte)'n',
            (byte)'\r', (byte)'\n'
         });

         //...ServerSocketChannel to receive new client connections on
         //   "localhost" interface, port 1110
         ServerSocketChannel ssc = ServerSocketChannel.open();
         ssc.configureBlocking(false);

         ServerSocket ss = ssc.socket();
         ss.bind(new InetSocketAddress(InetAddress.getByName("localhost"), 1110));


         //...single Selector used to receive non-blocking I/O events...
         Selector sel = Selector.open();

         //...register the ServerSocketChannel with the Selector...
         SelectionKey keyAccept = ssc.register(sel,
            ssc.validOps() & SelectionKey.OP_ACCEPT);
      select_loop:
         while(true) {
            if(sel.select() == 0) {
               continue;
            }
            
            Set selectedKeys = sel.selectedKeys();
            
            //...if a new client is trying to connect, accept the
            //   connection then loop...
            if(selectedKeys.contains(keyAccept)) {
               Socket sock = ssc.accept();
               SocketChannel sc = sock.getChannel();
               sc.configureBlocking(false);
               SelectionKey k = sc.register(sel, SelectionKey.OP_WRITE);
               k.attach(sc);
               selectedKeys.remove(keyAccept);
               continue select_loop;
            }

            //...no new client connections, meaning that at least one of
            //   the currently connected clients can receive more bytes...
            Iterator iter = selectedKeys.iterator();
            while(iter.hasNext()) {
               SelectionKey k = (SelectionKey)iter.next();
               SocketChannel sc = (SocketChannel)k.attachment();
               try {
                  sc.write(bb); // Problem persists even after
                                // commenting this line out.
               } catch (IOException ioe) {
                  k.cancel();
               } // end of try-catch
               bb.rewind();
            }
            selectedKeys.clear();
         }
      }
      catch (Exception exc) {
         exc.printStackTrace();
      } // end of catch
   }
}


I am clearing my votes for this bug, as it seems that it is
not a bug.




Submitted On 19-OCT-2001
skaistis
Ok, this may or may not be a bug in the implementation.  
But it looks like an inconsistency in implementation 
between platforms.  I tried the previous poster's code with 
the Win32 version of 1.4b2 and got the only-once OP_WRITE 
select behavior described in the bug.  His code worked 
differently on Linux and OP_WRITE was selected multiple 
times.  I'm going to try it on Solaris when I can get a 1.4 
installation set up.

I found a clue in the Win32 docs for WSAAsyncSelect 
describing how Winsock checks for the write status 
(FD_WRITE) of a socket:

-------
The FD_WRITE network event is handled slightly differently. 
An FD_WRITE network event is recorded when a socket is 
first connected with connect/WSAConnect or accepted with 
accept/WSAAccept, and then after a send fails with 
WSAEWOULDBLOCK and buffer space becomes available. 
Therefore, an application can assume that sends are 
possible starting from the first FD_WRITE network event 
setting and lasting until a send returns WSAEWOULDBLOCK. 
After such a failure the application will find out that 
sends are again possible when an FD_WRITE network event is 
recorded and the associated event object is set.
-------

So, assuming the likely possibility that the Win32 
implementation calls WSAEventSelect, the observed behavior 
matches the underlying native API.  The only time that 
OP_WRITE would be selected again would be if the underlying 
buffer was full and Winsock returned WSAEWOULDBLOCK.  The 
solution would be to have the implementation call the plain 
Winsock select() call, but that can have some performance 
and scalability issues.

Since I've done quite a bit of Win32 socket programming, 
this behavior is familiar to me.  But since the goal of 
Java is to have platform-independent operation, this is 
probably a bug.


Submitted On 20-OCT-2001
skaistis
I checked the imports on the nio native code DLL, and it 
does import WSAEventSelect() and not select().  So, that 
does appear to be the issue.  Could be interesting to fix.


Submitted On 20-OCT-2001
daney
I guess I will vote for it again as I very much want to use
non-blocking io on windows as well as linux.



Submitted On 22-OCT-2001
skaistis
If you need good scalability under Win32, be sure to vote 
for bug 4503092 too.


Submitted On 04-DEC-2001
ggruschow
I ran into this problem independently.  Please notify me if this is going to be changed because I'm working on a system that relies on the Selector's behavior.

I think this is definitely a bug for at least 2 reasons: the documentation doesn't match the behavior of the Win32 JDK, and the behavior doesn't match across platforms.

The documentation for SelectionKey.OP_WRITE reads "If the selector detects that the corresponding channel is ready for writing, has been remotely shut down for further writing, or has an error pending, then it will add OP_WRITE to the key's ready set and add the key to its selected-key set."

Or summarized: "If the [...] channel is ready for writing then [the selector] will add OP_WRITE to the key's ready set and add the key to its selected-key set."  As of JDK1.4b3, the selector will _NOT_ necessarily add the key to its selected-key set.

If you assume the Win32 behavior, you can in fact write working software with this interface.  However, this makes it painfully easy to write programs that aren't portable across platforms, and I thought that was strongly discouraged.


Submitted On 09-JAN-2002
astid
I assume this behavior is correct. If it weren't this way, 
then a channel that is ready to be written but is not 
actually written to would cause select() to constantly 
return immediately, in effect causing a busy loop. I have 
the following strategy for writing to a non-blocking 
channel:
- maintain a FIFO pipe that can grow without limits (no 
such thing in the JDK, you have to write one) and which has 
a dedicated output channel that is also registered with a 
selector
- write output to the pipe, not the channel; the pipe's 
write methods should attempt to flush as much of the pipe's 
content to its channel as possible
- when selector selects the channel with OP_WRITE, also 
attempt to flush as much of the pipe's contents to the 
channel as possible

In short, gather the output in a growable FIFO buffer (a 
pipe), try to flush the FIFO buffer to channel both 
immediately after write operations to buffer and on 
OP_WRITE notifications from the selector.
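
The strategy above could be sketched roughly as follows. This is only an illustration under assumptions: the class name OutputPipe and its methods are hypothetical, the queue is unbounded as the comment describes, and registering the channel with a selector is left to the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical name: an unbounded FIFO of pending output in front of a channel.
class OutputPipe {
    private final Deque<ByteBuffer> pending = new ArrayDeque<>();
    private final WritableByteChannel channel;

    OutputPipe(WritableByteChannel channel) {
        this.channel = channel;
    }

    // Queues a copy of the data, then tries to push as much as possible
    // to the channel right away. Returns true if nothing is left queued.
    boolean write(byte[] data) throws IOException {
        pending.addLast(ByteBuffer.wrap(data.clone()));
        return flush();
    }

    // Called from write() and again whenever the selector reports OP_WRITE.
    // Returns true when the queue has been drained completely.
    boolean flush() throws IOException {
        while (!pending.isEmpty()) {
            ByteBuffer head = pending.peekFirst();
            channel.write(head);
            if (head.hasRemaining()) {
                return false; // channel took only part; retry on next OP_WRITE
            }
            pending.removeFirst();
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory channel stands in for a socket in this demo.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputPipe pipe = new OutputPipe(Channels.newChannel(sink));
        pipe.write("Brian\r\n".getBytes(StandardCharsets.US_ASCII));
        System.out.println(sink.size() + " bytes flushed");
    }
}
```

With a real non-blocking SocketChannel, flush() would return false whenever the socket buffer fills, and the caller would call it again on the next OP_WRITE notification.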


Submitted On 14-FEB-2002
NKaiser
The current implementation uses WSAAsyncSelect or 
WSAWaitForMultipleEvents and is therefore limited to 63 
connections... great scalability!!!


Submitted On 28-FEB-2002
jcvanvorst
Yes, you can find out more on the 63 key bug at 
http://developer.java.sun.com/developer/bugParade/bugs/4503092.html


Submitted On 28-FEB-2002
jcvanvorst
This is definitely a bug, but as astid states, checking 
only for an OP_WRITE condition would cause the select() 
method to immediately return.  IMHO it seems that there is 
something missing from the API.  I'll explain...

With an OP_ACCEPT key select() doesn't return when the 
ServerSocketChannel is acceptable, it returns when there is 
a request for a new connection.

Likewise, with an OP_READ key select() doesn't return when 
the SocketChannel is readable, it returns when there is 
something to read.

Given the above it would seem that OP_WRITE needs to not 
only check that the SocketChannel is writable, but must 
also reference some object to indicate that there is 
something to write. 

If I am wrong in my first two assumptions, or am grossly 
oversimplifying please let me know.  I think I'll post this 
to the forums....


Submitted On 05-MAR-2002
jegmorris1
I agree this is a bug. To write server code that runs on 
both linux and win32 would be very hard as the semantics 
are very different on the two systems. Either select one or 
the other! I can live with either, although the semantics 
on Linux are well known as they mimic the c select() call 
exactly.. ie select() continually returns with write ready, 
rather than the one shot behavior on win32.


Submitted On 06-MAR-2002
jegmorris1
Actually, now I understand how the Windows semantics work 
(see skaistis's comments). I prefer them: you only get one hit 
on isWritable(), so it is almost like a completion event 
rather than a ready event. It is much easier to write 
non-blocking writes that way. So my vote (which I know will not 
happen as it is a drastic change to the released version) 
is to make the Linux and other versions work the same way. 
To get the same behavior on Linux as things stand, I would 
have to turn my interest in the OP_WRITE event on when a 
write does not fully complete, and off again when it is 
done. According to the docs, using the interestOps(int) 
method to change that may block and is implementation 
dependent!
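
Toggling interest in OP_WRITE as described could look roughly like this. It is a sketch only: WriteInterest and its two methods are hypothetical names, and the caller is assumed to hold the channel's SelectionKey and a buffer of unsent bytes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.SocketChannel;

// Hypothetical helper; the names are illustrative only.
class WriteInterest {

    // Computes the new interest set: OP_WRITE is switched on while unsent
    // bytes remain and switched off otherwise. Other bits are left untouched.
    static int nextInterestOps(int currentOps, boolean bytesRemain) {
        return bytesRemain
                ? (currentOps | SelectionKey.OP_WRITE)
                : (currentOps & ~SelectionKey.OP_WRITE);
    }

    // Writes whatever the socket accepts now, then updates the key so the
    // selector reports OP_WRITE only while `pending` still holds data.
    static void writeAndUpdateInterest(SelectionKey key, ByteBuffer pending)
            throws IOException {
        SocketChannel channel = (SocketChannel) key.channel();
        channel.write(pending);
        key.interestOps(nextInterestOps(key.interestOps(), pending.hasRemaining()));
    }
}
```

A selector loop would call writeAndUpdateInterest both when new data is queued and when the key is selected for OP_WRITE, so an idle connection never keeps select() returning immediately.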


Submitted On 14-MAR-2002
rashid11
I don't understand how one can call Windows' behavior a model
or preferred behavior. I have a product that works just 
fine on Solaris/Linux and doesn't on Win NT/2K - because of 
this very same problem. I don't understand how one can even 
write a non-blocking server using Windows semantics through 
what Java exposes.

Select()/poll() has been on Unix since forever and is the 
most straightforward way to write non-blocking servers. You 
get list of sockets  (for OP_WRITE) that have spare buffer 
space, dump whatever you have to write into those buffers 
and let OS handle sending the data and to notify you when 
buffer space becomes available again.

Even if one is willing to modify code to work with Windows 
semantics - how would I even know that write would block ?
Is there a flag in SelectionKey that indicates that ? Would 
a write throw an IO Exception (WOULD_BLOCK or something 
like that) - I see that the only exception coming from 
SocketChannel's write is IOException. How do we figure out 
how much data can be written (say it'd block if one tries 
to send a Kilo worth of data, but 1023 bytes is fine ?)

What value does registering with a Selector offer in this 
case ? 

This is a mess and a major show stopper for lotsa 
applications. It HAS TO BE FIXED.
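
On the question of how to tell that a write would block: a SocketChannel in non-blocking mode does not throw for this case. Its write(ByteBuffer) returns the number of bytes it accepted, which may be less than requested or zero, and leaves the rest in the buffer. The probe below is a hedged illustration with hypothetical names, using Java 7+ APIs. It writes to a loopback peer that never reads until a write comes back short.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

// Hypothetical probe class; the names are illustrative only.
class PartialWriteProbe {
    // Safety cap so the loop ends even if the OS keeps accepting data.
    private static final long CAP = 256L * 1024 * 1024;

    // Returns the number of bytes accepted before a non-blocking write()
    // came back short (or zero), or -1 if the cap was reached first.
    static long bytesAcceptedBeforeFull() throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel sender = SocketChannel.open(server.getLocalAddress());
                 SocketChannel idlePeer = server.accept()) { // never reads
                sender.configureBlocking(false);
                ByteBuffer chunk = ByteBuffer.allocate(64 * 1024);
                long total = 0;
                while (total < CAP) {
                    chunk.clear();
                    int n = sender.write(chunk); // no exception when buffers are full
                    total += n;
                    if (n < chunk.capacity()) {
                        return total; // short write: socket buffers are full
                    }
                }
                return -1;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("accepted before full: " + bytesAcceptedBeforeFull());
    }
}
```

The byte count depends on the operating system's socket buffer sizes, so no particular value should be expected. After a short write, the unsent remainder stays in the buffer and can be retried once the selector reports OP_WRITE.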



Submitted On 01-APR-2002
rashid11
Just to beat the already dead cow: Sun knows about it, the 
fix is in progress, and it will arrive ... drum roll ... 
fall 2002!

Can we get it sooner? I understand that Itanium support 
can wait till then, but we need a fix for this bug right
now.


Submitted On 01-MAY-2002
rashid11
They now say it is fixed in 1.4.0_02+. The bug is closed.
Yet, one can not obtain these bug-fix releases and 
apparently will have to wait till fall for hopper release. 
It does not make sense. Please release 1.4.0_02 or _03 or 
whatever the latest tested  bug-fix release is ! 

Put a disclaimer there to the effect of interim release and 
let us handle the risks. Still it would be better than a 
non-working application, which breaks something MUCH MORE 
important, i.e. the write once, run everywhere Java paradigm!


Submitted On 12-JUL-2002
KarlUp
If I specify OP_WRITE, the key gets selected constantly. 
Trying to turn off OP_WRITE when I have nothing to write 
doesn't work. Why would I want to be notified constantly 
that there's room in the output buffer? What I want is a 
single notification that space in the buffer has once again 
become available. I find the new behavior unusable. I have 
to use blocking writes.


Submitted On 21-AUG-2002
jordanz
This fix appears to be broken. I've just submitted a new bug 
on this. If you register OP_WRITE, that's all that you will ever 
get from select() and you'll get an infinite amount of them.


Submitted On 09-SEP-2002
tfolks
Using 1.4.1-rc, the problem no longer occurs on
Windows ME or Windows 2000/XP.


Submitted On 09-SEP-2002
tfolks
Using 1.4.0_02, the problem still occurs on Windows ME but
no longer occurs on Windows 2000/XP.


Submitted On 17-SEP-2002
jordanz
This is badly broken in 1.4.1 FCS. On Win2K, if you register a 
socket with OP_WRITE, that's all you'll ever get from select(). 
All reads will be blocked. This can't be the desired behavior.


