Main /
WebSocketDeadlockAfter we fixed issues with notifyAll(), actor/lib/jjs/modules/webSocket/test/auto/FullDuplex.xml and FullDuplex2 are hanging when run using ReplicationIf you run $PTII/ptolemy/actor/lib/jjs/modules/webSocket/test/auto/FullDuplex.xml and then do a kill -3, you will see the deadlock message below. bash-3.2$ $PTII/bin/ptinvoke ptolemy.moml.MoMLSimpleApplication $PTII/ptolemy/actor/lib/jjs/modules/webSocket/test/auto/FullDuplex.xml >& /tmp/f00.out & [1] 92225 bash-3.2$ kill -3 %1 bash-3.2$ kill -9 %1 The log file contains ... WebSocketClient3: Connection failed. Will try again: Connection refused: localhost/127.0.0.1:8091 WebSocketServer: No open sockets. Discarding message: undefined WebSocketServer: Server: Listening for socket connection requests. WebSocketClient3: Sending to web socket: 0 WebSocketServer: No open sockets. Discarding message: undefined WebSocketServer: Server: new socket established with ID: 0 WebSocketClient3: Sending to web socket: -1 WebSocketServer: No open sockets. Discarding message: undefined WebSocketClient3: Sending to web socket: -2 WebSocketClient3: Status: Connection established WebSocketServer: Broadcasting to socket id 0 message: 3 WebSocketClient3: Sending to web socket: -3 WebSocketServer: Broadcasting to socket id 0 message: 4 WebSocketClient3: Sending to web socket: -4 WebSocketServer: Broadcasting to socket id 0 message: 5 Jan 05, 2016 9:13:26 AM io.vertx.core.impl.BlockedThreadChecker WARNING: Thread Thread[vert.x-eventloop-thread-3,5,main] has been blocked for 2116 ms, time limit is 2000 Jan 05, 2016 9:13:27 AM io.vertx.core.impl.BlockedThreadChecker WARNING: Thread Thread[vert.x-eventloop-thread-3,5,main] has been blocked for 3122 ms, time limit is 2000 Jan 05, 2016 9:13:28 AM io.vertx.core.impl.BlockedThreadChecker WARNING: Thread Thread[vert.x-eventloop-thread-3,5,main] has been blocked for 4126 ms, time limit is 2000 Jan 05, 2016 9:13:29 AM io.vertx.core.impl.BlockedThreadChecker WARNING: Thread Thread[vert.x-eventloop-thread-3,5,main] has been blocked for 5128 ms, time limit is 2000 io.vertx.core.VertxException: Thread blocked at ptolemy.actor.lib.jjs.JavaScript.invokeCallback(JavaScript.java:1265) at ptolemy.actor.lib.jjs.HelperBase._issueResponse(HelperBase.java:177) at ptolemy.actor.lib.jjs.modules.webSocket.WebSocketHelper.access$2700(WebSocketHelper.java:73) at ptolemy.actor.lib.jjs.modules.webSocket.WebSocketHelper$DataHandler.handle(WebSocketHelper.java:520) at ptolemy.actor.lib.jjs.modules.webSocket.WebSocketHelper$DataHandler.handle(WebSocketHelper.java:507) at io.vertx.core.http.impl.WebSocketImplBase.handleFrame(WebSocketImplBase.java:170) at io.vertx.core.http.impl.ServerConnection.handleWsFrame(ServerConnection.java:327) at io.vertx.core.http.impl.ServerConnection.processMessage(ServerConnection.java:412) at io.vertx.core.http.impl.ServerConnection.handleMessage(ServerConnection.java:134) at io.vertx.core.http.impl.HttpServerImpl$ServerHandler.doMessageReceived(HttpServerImpl.java:482) at io.vertx.core.http.impl.HttpServerImpl$ServerHandler.doMessageReceived(HttpServerImpl.java:420) at io.vertx.core.http.impl.VertxHttpHandler.lambda$channelRead$29(VertxHttpHandler.java:80) at io.vertx.core.http.impl.VertxHttpHandler$$Lambda$41/1055690093.run(Unknown Source) at io.vertx.core.impl.ContextImpl.lambda$wrapTask$61(ContextImpl.java:333) at io.vertx.core.impl.ContextImpl$$Lambda$14/625243671.run(Unknown Source) at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:225) at io.vertx.core.http.impl.VertxHttpHandler.channelRead(VertxHttpHandler.java:80) at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:124) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) at java.lang.Thread.run(Thread.java:745) After Found one Java-level deadlock: ============================= "vert.x-eventloop-thread-3": waiting to lock monitor 0x00007fe3b88f6218 (object 0x000000078016c178, a org.terraswarm.accessor.JSAccessor), which is held by "FullDuplex" "FullDuplex": waiting to lock monitor 0x00007fe3b88f5198 (object 0x00000007aafdcd18, a io.vertx.core.http.impl.ServerConnection), which is held by "vert.x-eventloop-thread-3" Java stack information for the threads listed above: =================================================== "vert.x-eventloop-thread-3": at ptolemy.actor.lib.jjs.JavaScript.invokeCallback(JavaScript.java:1265) - waiting to lock <0x000000078016c178> (a org.terraswarm.accessor.JSAccessor) at ptolemy.actor.lib.jjs.HelperBase._issueResponse(HelperBase.java:177) at ptolemy.actor.lib.jjs.modules.webSocket.WebSocketHelper.access$2700(WebSocketHelper.java:73) at ptolemy.actor.lib.jjs.modules.webSocket.WebSocketHelper$DataHandler.handle(WebSocketHelper.java:520) at ptolemy.actor.lib.jjs.modules.webSocket.WebSocketHelper$DataHandler.handle(WebSocketHelper.java:507) at io.vertx.core.http.impl.WebSocketImplBase.handleFrame(WebSocketImplBase.java:170) - locked <0x00000007aafdcd18> (a io.vertx.core.http.impl.ServerConnection) at io.vertx.core.http.impl.ServerConnection.handleWsFrame(ServerConnection.java:327) - locked <0x00000007aafdcd18> (a io.vertx.core.http.impl.ServerConnection) at io.vertx.core.http.impl.ServerConnection.processMessage(ServerConnection.java:412) at io.vertx.core.http.impl.ServerConnection.handleMessage(ServerConnection.java:134) - locked <0x00000007aafdcd18> (a io.vertx.core.http.impl.ServerConnection) at io.vertx.core.http.impl.HttpServerImpl$ServerHandler.doMessageReceived(HttpServerImpl.java:482) at io.vertx.core.http.impl.HttpServerImpl$ServerHandler.doMessageReceived(HttpServerImpl.java:420) at io.vertx.core.http.impl.VertxHttpHandler.lambda$channelRead$29(VertxHttpHandler.java:80) at io.vertx.core.http.impl.VertxHttpHandler$$Lambda$41/1055690093.run(Unknown Source) at io.vertx.core.impl.ContextImpl.lambda$wrapTask$61(ContextImpl.java:333) at io.vertx.core.impl.ContextImpl$$Lambda$14/625243671.run(Unknown Source) at io.vertx.core.impl.ContextImpl.executeFromIO(ContextImpl.java:225) at io.vertx.core.http.impl.VertxHttpHandler.channelRead(VertxHttpHandler.java:80) at io.vertx.core.net.impl.VertxHandler.channelRead(VertxHandler.java:124) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294) at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244) at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308) at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294) at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846) at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131) at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511) at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468) at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) at java.lang.Thread.run(Thread.java:745) "FullDuplex": at io.vertx.core.http.impl.ServerWebSocketImpl.writeQueueFull(ServerWebSocketImpl.java:193) - waiting to lock <0x00000007aafdcd18> (a io.vertx.core.http.impl.ServerConnection) at ptolemy.actor.lib.jjs.modules.webSocket.WebSocketHelper.send(WebSocketHelper.java:176) at java.lang.invoke.LambdaForm$DMH/1788408659.invokeVirtual_LL_V(LambdaForm$DMH) at java.lang.invoke.LambdaForm$BMH/1430290256.reinvoke(LambdaForm$BMH) at java.lang.invoke.LambdaForm$reinvoker/452121674.dontInline(LambdaForm$reinvoker) at java.lang.invoke.LambdaForm$MH/416841088.guard(LambdaForm$MH) at java.lang.invoke.LambdaForm$MH/1606286799.linkToCallSite(LambdaForm$MH) at jdk.nashorn.internal.scripts.Script$Recompilation$493$14956A$require.L:1$send-1(/Users/cxh/ptII/ptolemy/actor/lib/jjs/external/require.js#209:26<eval>:344) at java.lang.invoke.LambdaForm$DMH/589446616.invokeStatic_L3_L(LambdaForm$DMH) at java.lang.invoke.LambdaForm$BMH/1825738663.reinvoke(LambdaForm$BMH) at java.lang.invoke.LambdaForm$reinvoker/452121674.dontInline(LambdaForm$reinvoker) at java.lang.invoke.LambdaForm$MH/416841088.guard(LambdaForm$MH) at java.lang.invoke.LambdaForm$MH/1606286799.linkToCallSite(LambdaForm$MH) at jdk.nashorn.internal.scripts.Script$Recompilation$459$\^eval\_.initialize$handle(<eval>:116) at java.lang.invoke.LambdaForm$DMH/381707837.invokeStatic_LL_L(LambdaForm$DMH) at java.lang.invoke.LambdaForm$BMH/964201940.reinvoke(LambdaForm$BMH) at java.lang.invoke.LambdaForm$reinvoker/452121674.dontInline(LambdaForm$reinvoker) at java.lang.invoke.LambdaForm$MH/416841088.guard(LambdaForm$MH) at java.lang.invoke.LambdaForm$MH/1606286799.linkToCallSite(LambdaForm$MH) at jdk.nashorn.internal.scripts.Script$Recompilation$458$5086$\^eval\_.addInputHandler$callback-1(<eval>:120) at java.lang.invoke.LambdaForm$DMH/381707837.invokeStatic_LL_L(LambdaForm$DMH) at java.lang.invoke.LambdaForm$BMH/254820167.reinvoke(LambdaForm$BMH) at java.lang.invoke.LambdaForm$MH/283923877.invokeExact_MT(LambdaForm$MH) at jdk.nashorn.javaadapters.java.lang.Runnable.run(Unknown Source) at ptolemy.actor.lib.jjs.JavaScript$PortOrParameterProxy.invokeHandlers(JavaScript.java:2566) - locked <0x000000078016c178> (a org.terraswarm.accessor.JSAccessor) at ptolemy.actor.lib.jjs.JavaScript.fire(JavaScript.java:899) - locked <0x000000078016c178> (a org.terraswarm.accessor.JSAccessor) at ptolemy.domains.de.kernel.DEDirector._fire(DEDirector.java:1955) at ptolemy.domains.de.kernel.DEDirector.fire(DEDirector.java:501) at ptolemy.actor.CompositeActor.fire(CompositeActor.java:454) at ptolemy.actor.Manager.iterate(Manager.java:841) at ptolemy.actor.Manager.execute(Manager.java:367) at ptolemy.actor.Manager.run(Manager.java:1252) at ptolemy.actor.Manager$PtolemyRunThread.run(Manager.java:1903) Found 1 deadlock. Deadlock AnalysisinvokeCallback()Recall from above: "vert.x-eventloop-thread-3": at ptolemy.actor.lib.jjs.JavaScript.invokeCallback(JavaScript.java:1265) - waiting to lock <0x000000078016c178> (a org.terraswarm.accessor.JSAccessor) at ptolemy.actor.lib.jjs.HelperBase._issueResponse(HelperBase.java:177) Line 1265 of /** Invoke the specified function in the fire() method as soon as possible. * @param function The function to invoke. * @throws IllegalActionException If the director cannot respect the request. */ public void invokeCallback(final Runnable function) throws IllegalActionException { // Coverity Scan warned that we were missing a lock when reading _pendingCallbacks. synchronized (this) { ---> _pendingCallbacks.offer(function); getDirector().fireAtCurrentTime(this); } } invokeHandlers()Recall from above that the FullDuplex thread is holding the lock that the vert.x-eventloop-thread-3 thread wants: "FullDuplex": ... at ptolemy.actor.lib.jjs.JavaScript$PortOrParameterProxy.invokeHandlers(JavaScript.java:2566) - locked <0x000000078016c178> (a org.terraswarm.accessor.JSAccessor) at ptolemy.actor.lib.jjs.JavaScript.fire(JavaScript.java:899) - locked <0x000000078016c178> (a org.terraswarm.accessor.JSAccessor) at ptolemy.domains.de.kernel.DEDirector._fire(DEDirector.java:1955) Line 2566 of JavaScript.java is marked below: /** Invoke any input handlers that have been added if there * are new inputs. * @see #addInputHandler(Runnable) */ public void invokeHandlers() { synchronized(JavaScript.this) { if (_inputHandlers != null && _hasNewInput) { for (Runnable function : _inputHandlers) { if (function != null) { ----> function.run(); if (_debugging) { _debug("Invoked handler function for " + _port.getName()); } } } } } } // Invoke any timeout callbacks whose timeout matches current time, // followed by specific input handlers, followed by generic input handlers, // followed by the fire function. // Synchronize to ensure that this function invocation is atomic // w.r.t. to any callbacks. synchronized (this) { // Mark that we are in the fire() method, enabling outputs to be // sent immediately. _inFire = true; ... // Invoke input handlers. for (IOPort input : inputPortList()) { // Skip the scriptIn input if (input == script.getPort()) { continue; } try { ------> _proxies.get(input).invokeHandlers(); } catch (Throwable throwable) { // localFunctions.js has an undefined reference, then catch it here. throw new IllegalActionException(this, throwable, "Failed to invoke a handler on the input \"" + input.getName() + "\"."); } } ... } Rollback the lock of _pendingCallbacks()The code used to look like: public void invokeCallback(final Runnable function) throws IllegalActionException { _pendingCallbacks.offer(function); getDirector().fireAtCurrentTime(this); } The Coverity scan message was: "CID 1346508 (#1 of 1): Unguarded read (GUARDED_BY_VIOLATION)1. missing_lock: Accessing _pendingCallbacks without holding lock JavaScript.this. Elsewhere, "ptolemy.actor.lib.jjs.JavaScript._pendingCallbacks" is accessed with JavaScript.this held 3 out of 4 times."
If we rollback this change, then $PTII/bin/vergil $PTII/ptolemy/actor/lib/jjs/modules/webSocket/test/auto/FullDuplex.xml works! However $PTII/bin/ptinvoke ptolemy.moml.MoMLSimpleApplication $PTII/ptolemy/actor/lib/jjs/modules/webSocket/test/auto/FullDuplex.xml hangs :-(
bash-3.2$ 2016-01-05 10:14:47 Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.65-b01 mixed mode): "DestroyJavaVM" #39 prio=5 os_prio=31 tid=0x00007fdf40a21000 nid=0x1703 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "vert.x-eventloop-thread-4" #19 prio=5 os_prio=31 tid=0x00007fdf3dc5c000 nid=0x6f03 runnable [0x0000700001f9e000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) - locked <0x000000078019baa0> (a io.netty.channel.nio.SelectedSelectionKeySet) - locked <0x000000078019db20> (a java.util.Collections$UnmodifiableSet) - locked <0x000000078019ba00> (a sun.nio.ch.KQueueSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) at java.lang.Thread.run(Thread.java:745) "threadDeathWatcher-2-1" #37 daemon prio=1 os_prio=31 tid=0x00007fdf3bb81800 nid=0x6d03 waiting on condition [0x0000700001e9b000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at io.netty.util.ThreadDeathWatcher$Watcher.run(ThreadDeathWatcher.java:137) at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137) at java.lang.Thread.run(Thread.java:745) "vert.x-acceptor-thread-0" #31 prio=5 os_prio=31 tid=0x00007fdf3e17e800 nid=0x6b03 runnable [0x0000700001d98000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) - locked <0x00000007801b7d30> (a io.netty.channel.nio.SelectedSelectionKeySet) - locked <0x00000007801c9d48> (a java.util.Collections$UnmodifiableSet) - locked <0x00000007801b7c90> (a sun.nio.ch.KQueueSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) at java.lang.Thread.run(Thread.java:745) "vert.x-eventloop-thread-2" #17 prio=5 os_prio=31 tid=0x00007fdf3f0e8000 nid=0x6903 runnable [0x0000700001c95000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) - locked <0x00000007801a08b0> (a io.netty.channel.nio.SelectedSelectionKeySet) - locked <0x00000007801b2808> (a java.util.Collections$UnmodifiableSet) - locked <0x00000007801a0810> (a sun.nio.ch.KQueueSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) at java.lang.Thread.run(Thread.java:745) "vert.x-eventloop-thread-3" #18 prio=5 os_prio=31 tid=0x00007fdf3fdbd800 nid=0x6703 runnable [0x0000700001b92000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) - locked <0x000000078019e1a8> (a io.netty.channel.nio.SelectedSelectionKeySet) - locked <0x00000007801a0228> (a java.util.Collections$UnmodifiableSet) - locked <0x000000078019e108> (a sun.nio.ch.KQueueSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) at java.lang.Thread.run(Thread.java:745) "vert.x-eventloop-thread-0" #15 prio=5 os_prio=31 tid=0x00007fdf3c839000 nid=0x660b runnable [0x0000700001a8f000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) - locked <0x00000007801b5598> (a io.netty.channel.nio.SelectedSelectionKeySet) - locked <0x00000007801b7618> (a java.util.Collections$UnmodifiableSet) - locked <0x00000007801b54f8> (a sun.nio.ch.KQueueSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) at java.lang.Thread.run(Thread.java:745) "vert.x-eventloop-thread-1" #16 prio=5 os_prio=31 tid=0x00007fdf3fdf1000 nid=0x5f03 runnable [0x000070000198c000] java.lang.Thread.State: RUNNABLE at sun.nio.ch.KQueueArrayWrapper.kevent0(Native Method) at sun.nio.ch.KQueueArrayWrapper.poll(KQueueArrayWrapper.java:198) at sun.nio.ch.KQueueSelectorImpl.doSelect(KQueueSelectorImpl.java:103) at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86) - locked <0x00000007801b2e90> (a io.netty.channel.nio.SelectedSelectionKeySet) - locked <0x00000007801b4f10> (a java.util.Collections$UnmodifiableSet) - locked <0x00000007801b2df0> (a sun.nio.ch.KQueueSelectorImpl) at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97) at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:622) at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:310) at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) at java.lang.Thread.run(Thread.java:745) "vertx-blocked-thread-checker" #14 daemon prio=5 os_prio=31 tid=0x00007fdf3dc1a800 nid=0x5b07 in Object.wait() [0x0000700001683000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000780009f20> (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(Timer.java:552) - locked <0x0000000780009f20> (a java.util.TaskQueue) at java.util.TimerThread.run(Timer.java:505) "process reaper" #11 daemon prio=10 os_prio=31 tid=0x00007fdf3c35b000 nid=0x5703 waiting on condition [0x0000700001580000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x0000000780019b78> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1066) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1127) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) "Service Thread" #9 daemon prio=9 os_prio=31 tid=0x00007fdf3b810800 nid=0x5303 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C1 CompilerThread3" #8 daemon prio=9 os_prio=31 tid=0x00007fdf3b816800 nid=0x5103 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C2 CompilerThread2" #7 daemon prio=9 os_prio=31 tid=0x00007fdf3d80a800 nid=0x4f03 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C2 CompilerThread1" #6 daemon prio=9 os_prio=31 tid=0x00007fdf3b047000 nid=0x4d03 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C2 CompilerThread0" #5 daemon prio=9 os_prio=31 tid=0x00007fdf3c03f000 nid=0x4b03 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Signal Dispatcher" #4 daemon prio=9 os_prio=31 tid=0x00007fdf3c01d000 nid=0x3a0f waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Finalizer" #3 daemon prio=8 os_prio=31 tid=0x00007fdf3d800000 nid=0x3803 in Object.wait() [0x0000700000d3a000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000780019dd8> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143) - locked <0x0000000780019dd8> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209) "Reference Handler" #2 daemon prio=10 os_prio=31 tid=0x00007fdf3b046000 nid=0x3603 in Object.wait() [0x0000700000c37000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0x0000000780009fb0> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:157) - locked <0x0000000780009fb0> (a java.lang.ref.Reference$Lock) "VM Thread" os_prio=31 tid=0x00007fdf3b043000 nid=0x3403 runnable "GC task thread#0 (ParallelGC)" os_prio=31 tid=0x00007fdf3b00f000 nid=0x2403 runnable "GC task thread#1 (ParallelGC)" os_prio=31 tid=0x00007fdf3b00f800 nid=0x2603 runnable "GC task thread#2 (ParallelGC)" os_prio=31 tid=0x00007fdf3c00a000 nid=0x2803 runnable "GC task thread#3 (ParallelGC)" os_prio=31 tid=0x00007fdf3b800000 nid=0x2a03 runnable "GC task thread#4 (ParallelGC)" os_prio=31 tid=0x00007fdf3c00b000 nid=0x2c03 runnable "GC task thread#5 (ParallelGC)" os_prio=31 tid=0x00007fdf3c00b800 nid=0x2e03 runnable "GC task thread#6 (ParallelGC)" os_prio=31 tid=0x00007fdf3c00c000 nid=0x3003 runnable "GC task thread#7 (ParallelGC)" os_prio=31 tid=0x00007fdf3b010000 nid=0x3203 runnable "VM Periodic Task Thread" os_prio=31 tid=0x00007fdf3d803000 nid=0x5503 waiting on condition JNI global references: 4161 Heap PSYoungGen total 201216K, used 49184K [0x00000007aab00000, 0x00000007b9c80000, 0x00000007c0000000) eden space 188416K, 19% used [0x00000007aab00000,0x00000007ace8ea68,0x00000007b6300000) from space 12800K, 99% used [0x00000007b7480000,0x00000007b80f96c0,0x00000007b8100000) to space 17920K, 0% used [0x00000007b6300000,0x00000007b6300000,0x00000007b7480000) ParOldGen total 89088K, used 14697K [0x0000000780000000, 0x0000000785700000, 0x00000007aab00000) object space 89088K, 16% used [0x0000000780000000,0x0000000780e5a690,0x0000000785700000) Metaspace used 27226K, capacity 31834K, committed 32000K, reserved 1077248K class space used 3243K, capacity 4001K, committed 4096K, reserved 1048576K Helpfully, if we rollback this change, then bash-3.2$ cd $PTII/ptolemy/actor/lib/jjs/modules/socket/test/junit bash-3.2$ make works under Mac OS X. However, see Build 462 for how WebSocketClient.xml fails under JUnit on terra Caused by: ptolemy.kernel.util.IllegalActionException: Exceptions occurred during wrapup: ptolemy.kernel.util.IllegalActionException: The fire() method of this actor was never called. Usually, this is an error indicating that starvation is occurring. in .WebSocketClient.Test To replicate the build on terra: ssh terra Xvfb :2 -screen 0 1024x768x24 & export DISPLAY=localhost:2.0 cd $PTII; svn update; ./configure; ant cd $PTII/ptolemy/actor/lib/jjs/modules/socket/test/junit make Coverity Scan ChangesBelow are late December, 2015 changes prompted by Coverity Scan. December 22, Mismatched Notify.See FindBugs from December 22, 2015 - only one warning for domains.de.kernel: MWN: Mismatched notify() in ptolemy.domains.de.kernel.DEDirector.fireAt(Actor, Time, int)
This method calls Object.notify() or Object.notifyAll() without obviously holding a lock on the object. Calling notify() or notifyAll() without a lock held will result in an IllegalMonitorStateException being thrown.
See notifyAll() for details.
r74199 DEDirector.java2015-12-29 03:19 cxh * [r74199] /trunk/ptolemy/domains/de/kernel/DEDirector.java: Summary: preinitialize(): if _eventQueue is non-null, then synchronize on it. After this, there were a number of changes to DEDirector. Eventually, a separate lock was added. r74125 JavaScript.java2015-12-31 02:55 cxh * [r74215] /trunk/ptolemy/actor/lib/jjs/JavaScript.java: Summary: invokeCallback(): Coverity Scan warned that we were missing a lock when reading _pendingCallbacks.
bash-3.2$ svn diff -r 74214:74215 ptolemy/actor/lib/jjs/JavaScript.java Index: ptolemy/actor/lib/jjs/JavaScript.java =================================================================== --- ptolemy/actor/lib/jjs/JavaScript.java (revision 74214) +++ ptolemy/actor/lib/jjs/JavaScript.java (revision 74215) @@ -1260,8 +1260,11 @@ * @throws IllegalActionException If the director cannot respect the request. */ public void invokeCallback(final Runnable function) throws IllegalActionException { - _pendingCallbacks.offer(function); - getDirector().fireAtCurrentTime(this); + // Coverity Scan warned that we were missing a lock when reading _pendingCallback\ s. + synchronized (this) { + _pendingCallbacks.offer(function); + getDirector().fireAtCurrentTime(this); + } } /** Return true if the model is executing (between initialize() and bash-3.2$ r74126 VertxHelperBase.java2015-12-31 02:57 cxh[r74216] /trunk/ptolemy/actor/lib/jjs/VertxHelperBase.java: Summary: reset(): Coverity Scan pointed out that we were accessing _busy without holding the lock. HandlerInvocation: Coverity Scan suggested makgin this inner class a static inner class.
bash-3.2$ svn diff -r 74126 ptolemy/actor/lib/jjs/VertxHelperBase.java Index: ptolemy/actor/lib/jjs/VertxHelperBase.java =================================================================== --- ptolemy/actor/lib/jjs/VertxHelperBase.java (revision 74126) +++ ptolemy/actor/lib/jjs/VertxHelperBase.java (working copy) @@ -145,11 +145,14 @@ submit(new Runnable() { @Override public void run() { - _busy = false; - // Ensure that any future callbacks are ignored. - _nextResponse = 0L; - // If there are pending responses, discard them. - _deferredHandlers.clear(); + // Coverity Scan warned that we were accessing _busy without holding the \ lock. + synchronized (VertxHelperBase.this) { + _busy = false; + // Ensure that any future callbacks are ignored. + _nextResponse = 0L; + // If there are pending responses, discard them. + _deferredHandlers.clear(); + } } }); } @@ -315,8 +318,10 @@ byte[] imageBytes = stream.toByteArray(); buffer.appendBytes(imageBytes); } else { - _error("Expected image to send, but got " - + data.getClass().getName()); + _error("Expected an image to send, but instead got " + + (data == null + ? "a null object" + : data.getClass().getName())); } } else { _appendNumericToBuffer(buffer, data, type); @@ -638,7 +643,8 @@ } /** A structure for storing a deferred handler invocation. */ - private class HandlerInvocation { + private static class HandlerInvocation { + // Coverity Scan suggested making this inner class static. public Runnable response; public boolean done; } bash-3.2$ r74127 SocketHelper.java2015-12-31 04:12 cxh * [r74217] /trunk/ptolemy/actor/lib/jjs/modules/socket/SocketHelper.java, /trunk/ptolemy/actor/lib/jjs/modules/xbee/XBeeHelper.java: Avoid FindBugs LI: Unsynchronized Lazy Initialization (FB.LI_LAZY_INIT_UPDATE_STATIC) by using synchronization and checking again for the condition when returning the send and receive types. See http://stackoverflow.com/questions/6782660/incorrect-lazy-initialization. In XBeeHelper, also avoid a possible divide by zero bash-3.2$ svn diff -r 74127 ptolemy/actor/lib/jjs/modules/socket/SocketHelper.java Index: ptolemy/actor/lib/jjs/modules/socket/SocketHelper.java =================================================================== --- ptolemy/actor/lib/jjs/modules/socket/SocketHelper.java (revision 74127) +++ ptolemy/actor/lib/jjs/modules/socket/SocketHelper.java (working copy) @@ -293,37 +293,47 @@ /** Return an array of the types supported by the current host for * receiveType arguments. + * @return An array of types */ public static String[] supportedReceiveTypes() { - if (_receiveTypes != null) { - return _receiveTypes; + if (_receiveTypes == null) { + // Avoid FindBugs LI: Unsynchronized Lazy Initialization (FB.LI_LAZY_INIT_UPD\ ATE_STATIC) + synchronized (_receiveTypesMutex) { + if (_receiveTypes == null) { + int length = DATA_TYPE.values().length; + _receiveTypes = new String[length]; + int i = 0; + for (DATA_TYPE type : DATA_TYPE.values()) { + _receiveTypes[i++] = type.toString().toLowerCase(); + } + } + } } - int length = DATA_TYPE.values().length; - _receiveTypes = new String[length]; - int i = 0; - for (DATA_TYPE type : DATA_TYPE.values()) { - _receiveTypes[i++] = type.toString().toLowerCase(); - } return _receiveTypes; } /** Return an array of the types supported by the current host for * sendType arguments. + * @return An array of types */ public static String[] supportedSendTypes() { - if (_sendTypes != null) { - return _sendTypes; + if (_sendTypes == null) { + // Avoid FindBugs LI: Unsynchronized Lazy Initialization (FB.LI_LAZY_INIT_UPD\ ATE_STATIC) + synchronized (_sendTypesMutex) { + if (_sendTypes == null) { + int length = DATA_TYPE.values().length; + _sendImageTypes = _removeDuplicates(ImageIO.getWriterFormatNames()); + _sendTypes = new String[length + _sendImageTypes.size()]; + int i = 0; + for (DATA_TYPE type : DATA_TYPE.values()) { + _sendTypes[i++] = type.toString().toLowerCase(); + } + for (String imageType : _sendImageTypes) { + _sendTypes[i++] = imageType; + } + } + } } - int length = DATA_TYPE.values().length; - _sendImageTypes = _removeDuplicates(ImageIO.getWriterFormatNames()); - _sendTypes = new String[length + _sendImageTypes.size()]; - int i = 0; - for (DATA_TYPE type : DATA_TYPE.values()) { - _sendTypes[i++] = type.toString().toLowerCase(); - } - for (String imageType : _sendImageTypes) { - _sendTypes[i++] = imageType; - } return _sendTypes; } @@ -334,17 +344,23 @@ /** The array of receive type names. */ private static String[] _receiveTypes; + /** A mutex used when creating _receiveTypes. */ + private static Object _receiveTypesMutex = new Object(); + /** The set of informal image type names that can be sent. */ private static TreeSet<String> _sendImageTypes; /** The array of send type names. */ private static String[] _sendTypes; + /** A mutex used when creating _sendTypes. */ + private static Object _sendTypesMutex = new Object(); + /////////////////////////////////////////////////////////////////// //// public classes //// /** Input stream backed by a list of byte arrays. */ - public class ByteArrayBackedInputStream extends InputStream { + public static class ByteArrayBackedInputStream extends InputStream { private LinkedList<byte[]> _list = new LinkedList<byte[]>(); private int _position, _array; public ByteArrayBackedInputStream(byte[] buffer) { bash-3.2$ r74128 WebSocketHelper2015-12-31 04:18 cxh* [r74218] /trunk/ptolemy/actor/lib/jjs/modules/webSocket/WebSocketHelper.java: Summary: Coverity Scan warned: "wait_cond_improperly_checked: The wait condition prompting the wait upon WebSocketHelper.this is not checked correctly. This code can wait for a condition that has already been satisfied, which can cause a never-ending wait." The suggested fix: "Refactor the code to protect the call to wait with a loop that rechecks the wait condition inside the locked region." However, this code is a bit unusual because we want to stall the director thread here so as not to get ahead of the capability.
That change is: bash-3.2$ svn diff -r 74128 ptolemy/actor/lib/jjs/modules/webSocket/WebSocketHelper.java Index: ptolemy/actor/lib/jjs/modules/webSocket/WebSocketHelper.java =================================================================== --- ptolemy/actor/lib/jjs/modules/webSocket/WebSocketHelper.java (revision 74128) +++ ptolemy/actor/lib/jjs/modules/webSocket/WebSocketHelper.java (working copy) @@ -176,8 +176,22 @@ while(isOpen() && _webSocket.writeQueueFull()) { synchronized(this) { try { - _actor.log("WARNING: Send buffer is full. Stalling to allow it to\ drain."); - wait(); + // Coverity Scan warned: + // "wait_cond_improperly_checked: The wait + // condition prompting the wait upon + // WebSocketHelper.this is not checked + // correctly. This code can wait for a + // condition that has already been satisfied, + // which can cause a never-ending wait." + + // The suggested fix: "Refactor the code to + // protect the call to wait with a loop that + // rechecks the wait condition inside the + // locked region." + if (isOpen() && _webSocket.writeQueueFull()) { + _actor.log("WARNING: Send buffer is full. Stalling to allow i\ t to drain."); + wait(); + } } catch (InterruptedException e) { _error("Buffer is full, and wait for draining was interrupted"); } bash-3.2$ Starvation1/12/16 Periodically, when we run WebSocketServerClient.xml, the Test actor does not get fired. It fails periodically on the Mac when run in Vergil. I did have to run it almost 200 times using -run20x: Try $PTII/bin/vergil -run20x ptolemy/actor/lib/jjs/modules/webSocket/test/auto/WebSocketsServerClient.xml On the nightly build, it passed last night, but FullDuplex failed. I just ran WebSocketServerClient.xml on the nightly build machine successfully, so it is not the port. I think the reason that it fails more often in the nightly build is because of threading issues. In the nightly build, we are running junit, there is probably an extra thread. In the nightly build, we instantiate MoMLSimpleApplication, rerun, reload the accessors and rerun. The problem is that the client is not sending data to the Test actor. One thing I noticed is that when the model works, in the send() method of WebSocketHelper.java, we invoke the verticle, but when it fails, we do not. In WebSocketHelper.send(), I added some printlns: // Ready to send. // Defer this action to be executed in the associated verticle. final Runnable action = new Runnable() { public void run() { System.out.println(_actor.getName() + " " + _actor.getDirector().getModelTime() + " WebSocketHelper._connectWebsocket(): Executing associated verticle."); ... System.out.println(_actor.getName() + " " + _actor.getDirector().getModelTime() + " WebSocketHelper.send(). About to submit()"); submit(action); } I added some other printlns as well. When the model works, I see "Executing associated verticle" WebSocketServer 0.758 WebSocketHelper.send(). About to submit() WebSocketServer 0.758 WebSocketHelper._connectWebsocket(): Executing associated verticle. WebSocketServer 0.758 WebSocketHelper._sendMessageOverSocket() WebSocketClient 0.76 WebSocketHelper._connectWebsocket().handle()._issueResponse() When the model fails, I do not see it: WebSocketServer 0.731 WebSocketHelper.send(). About to submit() WebSocketClient 0.732 WebSocketHelper._connectWebsocket().handle()._issueResponse() WebSocketClient: Status: Connection established Edward suggested: "I do occasionally see an error on re-run, which I think occurs because the wrapup() function closes the server, but in Vert.x, this is an asynchronous operation, and I think the server is still open when the re-run starts." Unfortunately, build 472 failed the first time WebSocketClient.xml was run. In the run below, TCPSocketUnsignedShort.xml runs and then WebSocketClient.xml runs. Note that the run time is 16407 ms., which indicates that WebSocketClient.xml failed. Stdin and Stdout are not merged in the output below, so we only see stdout. [junit] ----------------- Reloaded Accessors and testing again /home/jenkins/workspace/ptII/org/terraswarm/accessor/test/auto/TCPSocketUnsignedShort.xml [junit] TCPSocketServer: Server: Listening for socket connection requests. [junit] TCPSocketClient: Status: Connection established [junit] TCPSocketClient: Status: Connection closed in wrapup. [junit] 3115 ms. Memory: 1388544K Free: 416217K (30%) [junit] ----------------- testing /home/jenkins/workspace/ptII/org/terraswarm/accessor/test/auto/WebSocketClient.xml [junit] JSAccessor: urlSpec is http://www.terraswarm.org/accessors/net/WebSocketClientIcon.xml, but /home/jenkins/workspace/ptII/org/terraswarm/accessor/accessors/web/net/WebSocketClientIcon.xml exists, so the latter is being read. [junit] JSAccessor: urlSpec is http://www.terraswarm.org/accessors/net/WebSocketClient.js, but /home/jenkins/workspace/ptII/org/terraswarm/accessor/accessors/web/net/WebSocketClient.js exists, so the latter is being read. [junit] JSAccessor: urlSpec is http://www.terraswarm.org/accessors/net/WebSocketServerIcon.xml, but /home/jenkins/workspace/ptII/org/terraswarm/accessor/accessors/web/net/WebSocketServerIcon.xml exists, so the latter is being read. [junit] JSAccessor: urlSpec is http://www.terraswarm.org/accessors/net/WebSocketServer.js, but /home/jenkins/workspace/ptII/org/terraswarm/accessor/accessors/web/net/WebSocketServer.js exists, so the latter is being read. [junit] WebSocketClient: Connection failed. Will try again: Connection refused: localhost/127.0.0.1:8087 [junit] WebSocketServer: Server: Listening for socket connection requests. [junit] WebSocketServer: Server: new socket established with ID: 0 [junit] WebSocketClient: Status: Connection established [junit] WebSocketClient: Connection closed because model is no longer running. [junit] WebSocketClient: Status: Connection closed in wrapup. [junit] 16405 ms. Memory: 1383936K Free: 415548K (30%) [junit] ----------------- testing /home/jenkins/workspace/ptII/org/terraswarm/accessor/test/auto/WebSocketClient2.xml See Also
|