Multithreading issues and races causing unhandled NullReferenceExceptions?

Apr 15, 2014 at 2:15 PM
Edited Apr 15, 2014 at 2:15 PM

is anyone else seeing occasional issues with NullReferenceExceptions?

It seems that the various socket callbacks happen on various background Worker Threads (because SocketIOClient uses WebSocket4Net.dll which uses SuperSocket.ClientEngine.AsyncTcpSession which uses the various System.Net.Sockets.SocketAsync* APIs).

But a lot of the onMessage and onClose methods in SocketIOClient.Client.cs do not appear to be particularly thread safe.

For example, both wsClient_Closed() and wsClient_MessageReceived() have code paths leading to this.Close(), but the this.Close() implementation and following methods (like closeOutboundQueue()) do not use any locks or mutexes to act on the various instance objects.

In particular, closeOutboundQueue() reads like this:
if (this.outboundQueue != null) {
  // <more code>
  this.outboundQueue = null;
This makes it extremely dangerous to call Close() from the main UI thread, for example. There is a risk that while the main thread enters the if-block, then at the same time some background worker thread enters the if-block. (For example, if the server happened to close the connection at the same time). There is now a race where the main UI thread ends up setting this.outboundQueue to null, and the second background thread attempts to call CompleteAdding() or Dispose() on a null object. That, in turn, causes a NullReferenceException, which causes an AppDomain.UnhandledException (with IsTerminating=true) which seems to take down the entire application.

(It's easier than it sounds to trigger a Close() from the main thread (or even a different worker thread) at the same time. For example, consider a server that sends a message to ask the client to change to a different server, and then immediately after sends a disconnect - the client app may handle the message to change server by calling Close() on the current socket before trying to create a different socket)

I'm wondering if the best way to fix this would be to modify SocketIOClient.Client to either use some kind of lock() mutex in the callbacks, and/or to dispatchAsync all the callbacks onto a single main thread?