Main /
MemoryLeaksThe nightly build opens about 660 demos, runs many of them and exports them all. The problem is that we are running out of memory. See Also
SummaryWe use Mac Users: To avoid leaks, you will want to make two edits to your Analysis of the nightly buildHere are the scripts used to get the memory stats from the export run in the build bash-4.1$ cat ~/src/memory #!/bin/sh awk '{ if (start == 1 && $2 ~ /######/) { printf("%s %s ", $3, $(NF-1)) } if (start == 1 && $4 ~ /Memory/ && $6 ~ /Free:/) { print $0 } if ($0 ~ / ####### 1 /) { print "saw start " $0 start=1 } if ($0 ~ / ####### 660 /) { start=0 print "" exit } }' $@bash-4.1$ bash-4.1$ cat ~/src/plotmemory #!/bin/sh awk '{ if ($0 ~ /.xml/) { memory = substr($7, 1, length($7)-1) + 0 free = substr($9, 1, length($9)-1) + 0 print $1, (memory-free) } }' $@ Here's how the commands were run bash-4.1$ ~/src/memory ~/jobs/ptII/builds/301/log >& /tmp/301m bash-4.1$ ~/src/plotmemory /tmp/301m > /tmp/301.plt bash-4.1$ $PTII/bin/ptplot /tmp/301.plt Here's the plot, the x axis is the model number, the y axis is the amount of memory. The red is from build 301, before the leak fixes, the blue is from build 317, after the fixes below. The spikes into negative territory are because the scripts are primitive and fooled by output of some of the runs. From the above, we can see that memory use in the 301 run (in the red) is increasing over time. The garbage collector helps, but there are leaks. My first idea was to make sure that opening a new model and closing it was not leaking memory. Tracking down leaks using JVisualVMBelow are some random notes about tracking down memory leaks using JVisualVM. The test case was
Non-zero number of ActorGraphFrame instancesThere are a number of different ways that the ActorGraphFrame might avoid being garbage collected. What's happening is that something has a reference to the object and the garbage collector is not collecting it. Details are below Listeners have references to the objectDealing with memory leak in Kepler covers this. Basically, the _findInLibraryyEntryBoxI added the following to ActionListener[] listeners = _findInLibraryEntryBox.getActionListeners(); if (listeners != null) { int count = listeners.length; for (ActionListener listener : listeners) { //System.out.println("BGF.dispose(): _findInLibraryEntryBox: Removing " + listener); _findInLibraryEntryBox.removeActionListener(listener); } } BasicGraphFrame$MoveToFrontActionTo remove the reference to BasicGraphFrame$MoveToFrontAction, I added the following to // Free up BasicGraphFrame$MoveToFrontAction. MemoryCleaner.removeActionListeners(_rightComponent); BasicGraphFrame$HierarchyTreeCellRendererBasicGraphFrame$HierarchyTreeCellRenderer is an inner class that had a reference to BasicGraphFrame. To fix this, I added the following to // Free up BasicGraphFrame$HierarchyTreeCellRenderer. MemoryCleaner.removeActionListeners(_treeView); _treeView.setCellRenderer(null); _treeView.setUI(null); _treeView = null; The MacOSXAdapterMacOSXAdapter and JPopupMenu were appearing in the list of references. It turns out that we are using some deprecated methods. My temporary workaround was to temporarily comment out Top._macInitializer(): /** Initialize the menus and key bindings for Mac OS X. */ private void _macInitializer() { // try { // // To set the about name, set the // // com.apple.mrj.application.apple.menu.about.name // // property in the main thread, not the event thread. See // // VergilApplication.java // MacOSXAdapter.setAboutMethod(this, // getClass().getMethod("about", (Class[]) null)); // MacOSXAdapter.setQuitMethod(this, // getClass().getMethod("exit", (Class[]) null)); // } catch (NoSuchMethodException ex) { // report("Mac OS X specific initializations failed.", ex); // } } We should implement a more permanent fix here. MacOSX CPlatformWindowUnder MacOSX, CPlatformWindow had references that prevented ActorGraphFrame from being GC'd. To replicate
It turns out that under MacOSX, there are leaks. http://stackoverflow.com/questions/19781877/mac-os-java-7-jdialog-dispose-memory-leak describes the problem with JDialogs. I adapted that code and added to ptolemy.gui.Top.dispose(): java.awt.peer.ComponentPeer peer = getPeer(); super.dispose(); if (peer != null) { try { Class<?> componentPeerClass = Class.forName("sun.lwawt.LWComponentPeer"); Field target = componentPeerClass.getDeclaredField("target"); target.setAccessible(true); target.set(peer, null); Field window = peer.getClass().getDeclaredField("platformWindow"); window.setAccessible(true); Object platformWindow = window.get(peer); target = platformWindow.getClass().getDeclaredField("target"); target.setAccessible(true); target.set(platformWindow, null); } catch (Throwable throwable) { if (_debugClosing) { throwable.printStackTrace(); } } } Unfortunately, the above code is Mac-specific and compiling it produces a warning: Top.java:382: warning: ComponentPeer is internal proprietary API and may be removed in a future release java.awt.peer.ComponentPeer peer = getPeer(); So, that code is left commented out. Linux BltSubRegionBufferStrategyUnfortunately, under Linux, we get a different set of leaks. If we use Searching for "i've resolved my problem. i found out that the keyboardfocusmanger's "currentfocuscycleroot" property held the rerefence to my jinternalframe after it was closed. so on dispose i made sure to reset that reference by calling: keyboardfocusmanager.getcurrentkeyboardfocusmanager().setglobalcurrentfocuscycleroot(null); this seems to work. at least that's what the memory debugger is telling me. thanks, eitan"
So, I modified // I'm not sure exactly why this works but it does! // I think it has to do with the KeyboardFocusManager // holding onto the last focused component, so clearing and // cycling seems to free up the reference to this window. KeyboardFocusManager focusManager = KeyboardFocusManager .getCurrentKeyboardFocusManager(); focusManager.clearGlobalFocusOwner(); focusManager.downFocusCycle(); // http://oracle.developer-works.com/article/5359339/How+to+realy+unregister+listeners+%28or+how+%22great%22+swing+leaks+memory%29 focusManager.setGlobalCurrentFocusCycleRoot(null); That seems to have done it. ManagerTo replicate:
(Note that this run was done with the contents of Manager._shutdownHooks() commented out, see below.) The Manager has a thread, which is an inner class. While in that thread, when we want to plot, we initialize an Effigy. We end up trying to get read access on the directory Entity, that then adds the Manager thread to the Workspace._readerRecords hashmap. Note that this is a different Workspace than the workspace of the model. The problem is that when we want to set Manager._thread to null, there is no easy way to find all of the Workspaces and then remove the reference from each _readerRecord. Edward suggested that the thing to do is to change Manager and shutdownHooksIt turns out that the shutdown hook code in Manager leaks. If we comment out the contents of Manager._registerShutdownHook(), and we have the above change to _readerRecords, then we have no leaks. To replicate:
If we edit Manager, comment out the body of So, what we need to do is remove the shutdown hook when we destroy the manager. The solution here was to change /** Register a shutdown hook to gracefully stop the execution of a model * if the JVM is shut down (by control-C, the user logging out, tc.). */ protected void _registerShutdownHook() { _shutdownThread = new Thread() { @Override public void run() { if (_state != IDLE) { System.out .println("# Manager._registerShutdownHook(): State is " + _state + ", which is not IDLE. Waiting for model to stop."); } finish(); if (_thread != null && _thread.isAlive()) { try { // This seems dangerous. Could prevent the process // from dying. We use a timeout here of 30 econds. _thread.join(SHUTDOWN_TIME); } catch (InterruptedException e) { // Failed to stop the hread. e.printStackTrace(); } } } }; try { Runtime.getRuntime().addShutdownHook(_shutdownThread); } catch (java.security.AccessControlException ex) { // This exception gets triggered by // http://ptolemy.eecs.berkeley.edu/ptolemyII/ptII10.0/ptII10.0.devel/ptolemy/vergilVergil.htm System.out.println("Warning: failed to add a shutdown hook." + "(Running a model in an applet always causes this)"); } } ... /** Dispose of the thread by removing it from the workspace and * setting it to null. Also, remove the shutdown thread. */ private void _disposeOfThreads() { // FIXME: Should we acquire the read lock on the Workspace and // then have a version of doneReading that cleans up? // FIXME: Should we check to see if the Manager is idle here? _thread = null; // FIXME: Should this be synchronized? if (_shutdownThread != null) { Runtime.getRuntime().removeShutdownHook(_shutdownThread); _shutdownThread = null; } } ... // The thread that is passed to Runtime.addShutdowHook(). private Thread _shutdownThread = null; Edward suggested: "The solution is that Manager should have a static code body and register exactly one shutdown hook.
"Then Manager._registerShutdownHook should create a static list of weak references to Managers that have been created."
"The one shutdown hook should check if the weak reference is valid and whether the thread is running, and if it is, call finish() and attempt a Thread.join()."
Checking for leaks in ExportModelJUnitTestThe overview is that we run The reason we use Details:
Below is the sample run of 8 models
The solution to this was modify MessageHandler so that bash-3.2$ svn diff ptolemy/util/MessageHandler.java Index: ptolemy/util/MessageHandler.java =================================================================== --- ptolemy/util/MessageHandler.java (revision 73269) +++ ptolemy/util/MessageHandler.java (working copy) @@ -29,6 +29,7 @@ import java.io.BufferedReader; import java.io.IOException; import java.io.InputStreamReader; +import java.lang.ref.WeakReference; import java.util.Locale; /////////////////////////////////////////////////////////////////// @@ -183,7 +184,7 @@ * @see #status(String) */ public static void setStatusHandler(StatusHandler handler) { - _statusHandler = handler; + _statusHandler = new WeakReference<StatusHandler>(handler); } /** Return a short description of the throwable. @@ -218,8 +219,8 @@ * @see #message(String) */ public static void status(String message) { - if (_statusHandler != null) { - _statusHandler.status(message); + if (_statusHandler != null && _statusHandler.get() != null) { + _statusHandler.get().status(message); } else { System.out.println(message); } @@ -477,5 +478,5 @@ private static MessageHandler _handler = new MessageHandler(); /** The status handlers, if any. */ - private static StatusHandler _statusHandler; + private static transient WeakReference<StatusHandler> _statusHandler; } bash-3.2$
Calling The fix for that is in ConfigurationApplication.closeModelWithoutSavingOrExiting // Avoid a leaking Configuration. Configuration configuration = (Configuration) Configuration .findEffigy(model[0].toplevel()).toplevel(); _configurations.remove(configuration);
Another thing is to add <jvmarg value="-XX:+HeapDumpOnOutOfMemoryError"/> <jvmarg value="-XX:HeapDumpPath=${env.HOME}/ptII.test.longest.hprof"/> Manager again: _treeViewModelAfter a few weeks, opening vergil, running the Noise model and closing the Noise model showed a leak again: _treeViewModel seems to be the culprit. Here, the fix was to add the following to _treeViewModel = null; _container in CompositeActorOne of the changes above was to have CompsiteActor._setContainer() call if (container == null) { setManager(null); } However, this means that running the Zigbee demo throws a NPE because the container is null: bash-3.2$ java -classpath $PTII ptolemy.moml.MoMLSimpleApplication ptolemy/domains/\ wireless/test/auto/Zigbee.xml 896 ms. Memory: 449024K Free: 240244K (54%) UNCAUGHT EXCEPTION: Execution failed java.lang.RuntimeException: Execution failed at ptolemy.moml.MoMLSimpleApplication$UnloadThread.run(MoMLSimpleApplication.java:36\ 1) Caused by: ptolemy.kernel.util.InternalErrorException: The container of the manager was null\ . Try calling composite.setManager(). in .MoMLSimpleApplication Because: The container of the manager .MoMLSimpleApplication was null. Try calling composite.setManag\ er(). at ptolemy.actor.Manager.execute(Manager.java:454) at ptolemy.actor.Manager.run(Manager.java:1229) at ptolemy.actor.Manager$PtolemyRunThread.run(Manager.java:1865) Caused by: java.lang.NullPointerException: The container of the manager .MoMLSimpleApplicati\ on was null. Try calling composite.setManager(). at ptolemy.actor.Manager._inferWidths(Manager.java:1673) at ptolemy.actor.Manager.iterate(Manager.java:764) at ptolemy.actor.Manager.execute(Manager.java:367) ... 2 more Command failed: ptolemy.kernel.util.InternalErrorException: The container of the manager was\ null. Try calling composite.setManager(). in .MoMLSimpleApplication Because: The container of the manager .MoMLSimpleApplication was null. Try calling composite.setManag\ er(). ptolemy.kernel.util.InternalErrorException: The container of the manager was null. Try calli\ ng composite.setManager(). in .MoMLSimpleApplication Because: The container of the manager .MoMLSimpleApplication was null. Try calling composite.setManag\ er(). at ptolemy.actor.Manager.execute(Manager.java:454) at ptolemy.actor.Manager.run(Manager.java:1229) at ptolemy.actor.Manager$PtolemyRunThread.run(Manager.java:1865) Caused by: java.lang.NullPointerException: The container of the manager .MoMLSimpleApplicati\ on was null. Try calling composite.setManager(). at ptolemy.actor.Manager._inferWidths(Manager.java:1673) at ptolemy.actor.Manager.iterate(Manager.java:764) at ptolemy.actor.Manager.execute(Manager.java:367) ... 2 more Caused by: java.lang.NullPointerException: The container of the manager .MoMLSimpleApplicati\ on was null. Try calling composite.setManager(). at ptolemy.actor.Manager._inferWidths(Manager.java:1673) at ptolemy.actor.Manager.iterate(Manager.java:764) at ptolemy.actor.Manager.execute(Manager.java:367) at ptolemy.actor.Manager.run(Manager.java:1229) at ptolemy.actor.Manager$PtolemyRunThread.run(Manager.java:1865) If we place a breakpoint, then what is happening is that try { // We allocate a new XmlParser each time so as to avoid leaks. _xmlParser = new XmlParser(); _xmlParser.setHandler(this); if (base == null) { _xmlParser.parse(systemID, null, buffered); } else { // If we have tmp.moml and tmp/tmp2.moml and tmp.moml // contains <entity name="tmp2" class="tmp.tmp2"> // then we want to be sure that we set _xmlFile properly // NOTE: I'm not sure if it is necessary to check to // see if _xmlFile is null before hand, but it seems // like it is safer to check before resetting it to null. boolean xmlFileWasNull = false; if (_xmlFile == null) { xmlFileWasNull = true; _setXmlFile(new URL(base.toExternalForm())); } try { _xmlParser.parse(base.toExternalForm(), null, buffered); } finally { if (xmlFileWasNull) { _setXmlFile(null); } } } } catch (CancelException ex) { // Parse operation cancelled. return null; } catch (Exception ex) { // If you change this code, try running // ptolemy.moml.test.MoMLParserLeak with the heap profiler // and look for leaks. if (_toplevel != null && _toplevel instanceof ComponentEntity) { try { ((ComponentEntity) _toplevel).setContainer(null); } catch (Throwable throwable2) { // Ignore. setContainer(null) might throw an exception // if there are deferrables, but we don't want to hide // the original exception. // This problem comes up with tests in // actor/gui/test/UserActorLibrary.tcl. } // Since the container is probably already null, then // the setContainer(null) call probably did not do anything. // so, we remove the object from the workspace so it // can get gc'd. // FIXME: perhaps we should do more of what // ComponentEntity.setContainer() does and remove the ports? try { _workspace.getWriteAccess(); _workspace.remove(_toplevel); } finally { _workspace.doneWriting(); } _toplevel = null; } The xml that is causing the error is In theory, we could fix that XML, but as this would happen elsewhere, so a fix is needed HierarchyListenersSee https://projects.ecoinformatics.org/ecoinfo/issues/7190 When a model is cloned, some NamedObjs in the new model have hierarchy listeners from the original model, which prevents the new model from being garbage collected.
In the usecase I'm debugging, the toplevel Composites in both the original and new models are added as hierarchy listeners to PortParameters, but only the toplevel in the new model should be added. The original toplevel is added since clone() and setContainer() are called for the PortParameter before its container actor has been moved to the new toplevel.
I don't know the best approach to fix this. Making the listeners weak references would fix the memory leaks, but wrong listeners would still receive hierarchy change events.
import ptolemy.actor.CompositeActor; import ptolemy.actor.lib.Ramp; import ptolemy.kernel.util.Workspace; public class leak { public static void main(String[] args) { leak l = new leak(); l.go(); } public void go() { try { CompositeActor master = new CompositeActor(); Ramp ramp = new Ramp(master, "ramp"); for(int i = 0; i < 100; i++) { master.clone(new Workspace()); } System.gc(); System.out.println("Sleeping..."); Thread.sleep(100000L); } catch(Throwable t) { System.err.println("caught: " + t); } } } I'm attaching a program that illustrates the leak. It creates a simple model containing Ramp, clones the model 100 times, then sleeps. A reference is kept to the original model, but not the clones, so they should be gc'd.
Once the program sleeps, you should see 101 Ramp instances in the heap: one in the original model and one for each clone. According to jvisualvm, the nearest gc root for all the Ramp instances is _hierarchyListeners in the same CompositeActor instance (the original model).
Below is the output of JVisualVM NamedObj defines /** Add a hierarchy listener. If the listener is already * added, do nothing. This will cause the object to also * be added as a hierarchy listener in the container of * this object, if there is one, and in its container, * up to the top of the hierarchy. * @param listener The listener to add. * @see #removeHierarchyListener(HierarchyListener) */ public void addHierarchyListener(HierarchyListener listener) { if (_hierarchyListeners == null) { _hierarchyListeners = new HashSet<HierarchyListener>(); } _hierarchyListeners.add(listener); // Add to the container. NamedObj container = getContainer(); if (container != null) { container.addHierarchyListener(listener); } } ... public Object clone(Workspace workspace) throws CloneNotSupported\ Exception { ... newObject._hierarchyListeners = null; ... } ... /** Remove a hierarchy listener. If the listener is already * removed, do nothing. This will cause the object to also * be removed as a hierarchy listener in the container of * this object, if there is one, and in its container, * up to the top of the hierarchy. * @param listener The listener to remove. * @see #addHierarchyListener(HierarchyListener) */ public void removeHierarchyListener(HierarchyListener listener) { if (_hierarchyListeners != null) { _hierarchyListeners.remove(listener); // Remove from the container. NamedObj container = getContainer(); if (container != null) { container.removeHierarchyListener(listener); } } } ... /** If any hierarchy listeners are registered, notify them * that a change has occurred in the hierarchy. * @exception IllegalActionException If the change to the * hierarchy is not acceptable to the listener. * @see #addHierarchyListener(HierarchyListener) * @see HierarchyListener */ protected void _notifyHierarchyListenersAfterChange() throws IllegalActionException { if (_hierarchyListeners != null) { // The hierarchy has changed. Add all hierarchy listeners // up the new hierarchy. This should be done before notif\ ication // because notification may result in exceptions. NamedObj container = getContainer(); if (container != null) { for (HierarchyListener listener : _hierarchyListeners\ ) { container.addHierarchyListener(listener); } } for (HierarchyListener listener : _hierarchyListeners) { listener.hierarchyChanged(); } } } /** If any hierarchy listeners are registered, notify them * that a change is about to occur in the hierarchy. * @exception IllegalActionException If changing the * hierarchy is not acceptable to the listener. * @see #addHierarchyListener(HierarchyListener) * @see HierarchyListener */ protected void _notifyHierarchyListenersBeforeChange() throws IllegalActionException { if (_hierarchyListeners != null) { for (HierarchyListener listener : _hierarchyListeners) { listener.hierarchyWillChange(); } // If changing the hierarchy is acceptable to all listeners, // then we get to here. At this point, we should remove all // listeners from containers above us in the hierarchy, to // re-add them after the change in the hierarchy is complete. NamedObj container = getContainer(); if (container != null) { for (HierarchyListener listener : _hierarchyListeners) { container.removeHierarchyListener(listener); } } } } ... /** List of hierarchy listeners, if any. */ private Set<HierarchyListener> _hierarchyListeners; It seems like when we clone, it is not sufficient to set _hierarchyListeners to null in NamedObj, we also need to look at the container and remove the listeners. setContainer() should do this for us. |