There are several ways of retrieving information from a Java object into Matlab. On the face of it, all these methods look similar. But it turns out that there are important differences between them in terms of memory leakage and performance.
The problem: “Matlab crashes” – now go figure…
A client of one of my Matlab programs recently complained that Matlab crashes after several hours of extensive use of the program. The problem looked like something that is memory related (messages such as Matlab’s out-of-memory error or Java’s heap-space error). Apparently this happens even on 64-bit systems having lots of memory, where memory should never be a problem.
Well, we know that this is only in theory, but in practice Matlab’s internal memory management has problems that occasionally lead to such crashes. This is one of the reasons, by the way, that recent Matlab releases have added the preference option of increasing the default Java heap space (the previous way to do this was a bit complex). Still, even with a high Java heap space setting and lots of RAM, Matlab crashed after using my program for several hours.
Not pleasant at all, even a bit of an embarrassment for me. I’m used to crashing Matlab, but only as a result of my playing around with the internals – I would hate it to happen to my clients.
Finding the leak
While we can do little about Matlab’s internal memory manager, I set out to find the exact location of the memory leak and then a way to overcome it. I’ll spare readers the description of the grueling task of finding exactly where the memory leak occurred in a program that has thousands of lines of code and where events get fired asynchronously on a constant basis. Matlab Profiler’s undocumented memory profiling option helped me quite a bit, as did lots of intuition and trial-and-error. Detecting memory leaks is never easy, and I consider myself somewhat lucky this time to have found both the leak source and a workaround.
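As a rough illustration of how a per-call leak might be estimated (this is not the Profiler-based procedure described above), the sketch below samples Matlab's memory usage before and after many invocations of a callback. callbackUnderTest is a hypothetical placeholder for the real callback body, and the sketch assumes that the built-in memory function is only available on Windows, hence the ispc guard.

```matlab
% Hypothetical stand-in for the real callback body being investigated
callbackUnderTest = @() numel(zeros(1, 100));

numCalls = 10000;
leakPerCallBytes = NaN;   % stays NaN when the memory function is unavailable

if ispc   % assumption: the memory function is only supported on Windows
    before = memory;
    for k = 1:numCalls
        callbackUnderTest();
    end
    after = memory;
    % Average growth of Matlab's memory footprint per invocation
    leakPerCallBytes = (after.MemUsedMATLAB - before.MemUsedMATLAB) / numCalls;
end

fprintf('Estimated growth: %.1f bytes per call\n', leakPerCallBytes);
```

A single measurement is noisy, so a steady growth across repeated runs is a much stronger hint of a leak than any one number.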
It turned out that the leakage happens in a callback that gets invoked multiple times per second by a Java object (see related articles here and here). Each time the Matlab callback function is invoked, it reads the event information from the supplied Java event-data (the callback’s second input parameter). Apparently, about 1KB of memory gets leaked whenever this event-data is read. This may appear to be a very small leak, but multiply it by some 50-100K callback invocations per hour and you get a leakage of 50-100MB/hour. Not a small leak at all; more of a flood you could say…
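The arithmetic behind these figures can be sketched as follows. The per-call leak and the invocation rates come from the text above; the memory headroom value is a hypothetical number chosen purely for illustration.

```matlab
leakPerCallKB = 1;              % approximate leak per callback invocation (from the text)
callsPerHour  = [50e3 100e3];   % low and high invocation rates (from the text)

% Using 1 MB = 1000 KB to match the round figures quoted above
leakPerHourMB = leakPerCallKB * callsPerHour / 1000;

% Hypothetical memory headroom, purely for illustration
headroomMB = 512;
hoursUntilExhausted = headroomMB ./ leakPerHourMB;

fprintf('Leak rate: %g-%g MB/hour\n', leakPerHourMB);
fprintf('A %g MB headroom lasts roughly %.1f-%.1f hours\n', ...
        headroomMB, fliplr(hoursUntilExhausted));
```

Under that assumed headroom, the memory would run out after roughly 5 to 10 hours, which is in line with a crash that only shows up after several hours of use.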
Using get()
The leakage culprit turned out to be the following code snippet:
    % 160 uSecs per call, with memory leak
    eventData = get(hEventData,{'EventName','ParamNames','EventData'});
    eventName = eventData{1};
    paramNames = eventData{2};
    paramData = eventData{3}.cell;
In this innocent-looking code, hEventData is a Java object that contains the EventName, ParamNames and EventData properties: EventName is a Java String that is automatically converted by Matlab’s get() function into a Matlab string (char array); ParamNames is a Java array of Strings that gets automatically converted into a Matlab cell array of strings; and EventData is a Java array of Objects that needs to be converted into a Matlab cell array using the built-in cell function, as described in one of my recent articles.
The code is indeed innocent, works really well and is actually extremely fast: each invocation of this code segment takes less than 0.2 millisecs. Unfortunately, because of the memory leak I needed to find a better alternative.
Using handle()
The first idea was to use the built-in handle() function, under the assumption that it would solve the memory leak, as reported here. In fact, MathWorks specifically advises using handle() rather than working with “naked” Java objects when setting Java object callbacks. The official documentation of the set function says:
Do not use the set function on Java objects as it will cause a memory leak.
It stands to reason then that a similar memory leak happens with get and that a similar use of handle would solve this problem:
    % 300 uSecs per call, with memory leak
    s = get(handle(hEventData));
    eventName = s.EventName;
    paramNames = s.ParamNames;
    paramData = cell(s.EventData);
Unfortunately, this variant, although it works correctly, still leaks memory, and is also almost twice as slow as the original version, taking some 0.3 milliseconds per invocation. Looks like this is a dead end.
Using Java accessor methods
The next attempt was to use the Java object’s internal accessor methods for the requested properties. These are public methods of the form getXXX(), isXXX(), setXXX() that enable Matlab’s get and set functions to treat XXX as a property. In our case, we need to use the getter methods, as follows:
    % 380 uSecs per call, no memory leak
    eventName = char(hEventData.getEventName);
    paramNames = cell(hEventData.getParamNames);
    paramData = cell(hEventData.getEventData);
Here, the method getEventName() returns a Java String, which we convert into a Matlab string using the char function. In our previous two variants, the get function did this conversion for us automatically, but when we use the Java method directly we need to convert the results ourselves. Similarly, when we call getParamNames(), we need to use the cell function to convert the Java String[] array into a Matlab cell array.
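These explicit conversions can be tried on standard Java classes, without the event-data object. The following is a minimal sketch: the strings are arbitrary, and the String[] array is produced with java.lang.String's split() method merely as a convenient stand-in for getParamNames().

```matlab
% A Java String returned by a method call must be converted explicitly
jStr = java.lang.String('EventName');
mStr = char(jStr);              % Matlab char array

% split() returns a Java String[] array, used here as a stand-in
jCsv   = java.lang.String('alpha,beta,gamma');
jNames = jCsv.split(',');
names  = cell(jNames);          % cell array with one char array per element

disp(class(jStr));              % Java class of the unconverted string
disp(class(mStr));              % Matlab class after char()
disp(names{2});                 % second element, now a plain char array
```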
This version at last doesn’t leak any memory. Unfortunately, it performs even worse: each invocation takes almost 0.4 milliseconds. The difference may seem insignificant. However, recall that this callback gets called dozens of times each second, so the total adds up quickly. It would be nice if there were a faster alternative that does not leak any memory.
Using struct()
Luckily, I found just such an alternative. At 0.24 millisecs per invocation, it is almost as fast as the leaky best-performance original get version. Best of all, it leaks no memory, at least none that I could detect.
The mechanism relies on the little-known fact that public fields of Java objects can be retrieved in Matlab using the built-in struct function. For example:
    >> fields = struct(java.awt.Rectangle)
    fields =
                 x: 0
                 y: 0
             width: 0
            height: 0
          OUT_LEFT: 1
           OUT_TOP: 2
         OUT_RIGHT: 4
        OUT_BOTTOM: 8

    >> fields = struct(java.awt.Dimension)
    fields =
         width: 0
        height: 0
Note that this useful mechanism is not mentioned in the main documentation page for accessing Java object fields, although it is indeed mentioned in another doc-page – I guess this is a documentation oversight.
In any case, I converted my Java object to use public (rather than private) fields, so that I could use this struct mechanism (Matlab can only access public fields). Yes I know that using private fields is a better programming practice and all that (I’ve programmed OOP for some 15 years…), but sometimes we need to do ugly things in the interest of performance. The latest version now looks like this:
    % 240 uSecs per call, no memory leak
    s = struct(hEventData);
    eventName = char(s.eventName);
    paramNames = cell(s.paramNames);
    paramData = cell(s.eventData);
This solved the memory leakage issue for my client. I felt fortunate that I was not only able to detect Matlab’s memory leak but also find a working workaround without sacrificing performance or functionality.
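To compare the read mechanisms on a different system, a minimal timing sketch along the following lines may help. It uses the standard java.awt.Dimension class, which has both a public width field and a getWidth() accessor, as a stand-in for the event-data object. The loop count is arbitrary, the sketch assumes a Matlab release in which get() accepts Java objects, and the absolute timings will differ from the figures quoted above.

```matlab
jDim = java.awt.Dimension(640, 480);   % stand-in for the real event-data object
numCalls = 10000;                      % arbitrary repetition count

% Variant 1: get() on the Java object (the leaky variant in the text)
tic
for k = 1:numCalls
    w1 = get(jDim, 'Width');
end
tGet = toc / numCalls;

% Variant 2: Java accessor method
tic
for k = 1:numCalls
    w2 = jDim.getWidth();
end
tAccessor = toc / numCalls;

% Variant 3: struct() snapshot of the public fields
tic
for k = 1:numCalls
    s = struct(jDim);
    w3 = s.width;
end
tStruct = toc / numCalls;

fprintf('get():      %6.1f uSecs per call\n', tGet * 1e6);
fprintf('accessor:   %6.1f uSecs per call\n', tAccessor * 1e6);
fprintf('struct():   %6.1f uSecs per call\n', tStruct * 1e6);
```

Timing alone says nothing about leakage, so any such comparison needs to be paired with a long-running memory check.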
In this particular case, I was lucky to have full control over my Java object, and so was able to make its fields public. Unfortunately, we do not always have similar control over the objects that we use, since they are often coded by a third party.
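For third-party objects, one possible pattern is to try the struct route first and fall back to the accessor method only when the requested public field does not exist. The sketch below is an assumption-laden illustration rather than code from the program described above: java.lang.String stands in for a third-party object, and the fieldName and getterName variables are hypothetical.

```matlab
jObject    = java.lang.String('abc');   % stand-in for a third-party Java object
fieldName  = 'length';                  % public field we would like to read
getterName = 'length';                  % accessor method to fall back on

publicFields = struct(jObject);         % only public fields are visible here
if isfield(publicFields, fieldName)
    value = publicFields.(fieldName);           % struct route
else
    value = javaMethod(getterName, jObject);    % accessor route
end

disp(value);
```

Since java.lang.String has no public length field, this particular run takes the accessor branch.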
By the way, Matlab itself uses this struct mechanism in its code-base. For example, Matlab timers are implemented using Java objects (com.mathworks.timer.TimerTask). The timer callback in Matlab code converts the Java timer event data into a Matlab struct using the struct function, in %matlabroot%/toolbox/matlab/iofun/@timer/timercb.m. The users of the timer callbacks then get passed a simple Matlab EventData struct without ever knowing that the original data came from a Java object.
As an interesting corollary, this same struct mechanism can be used to detect internal properties of Matlab class objects. For example, in the timers again, we can get the underlying timer’s Java object as follows (note the warning, which I find a bit ironic given the context):
    >> timerObj = timerfind

       Timer Object: timer-1

       Timer Settings
          ExecutionMode: singleShot
                 Period: 1
               BusyMode: drop
                Running: off

       Callbacks
               TimerFcn: @myTimerFcn
               ErrorFcn: ''
               StartFcn: ''
                StopFcn: ''

    >> timerFields = struct(timerObj)
    Warning: Calling STRUCT on an object prevents the object from hiding its implementation details and should thus be avoided.
    Use DISP or DISPLAY to see the visible public details of an object. See 'help struct' for more information.
    (Type "warning off MATLAB:structOnObject" to suppress this warning.)
    timerFields =
             ud: {}
        jobject: [1x1 javahandle.com.mathworks.timer.TimerTask]
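If the warning gets in the way, it can be silenced just for the duration of the call and then restored. The following is a minimal sketch using a throwaway timer; the timer name is arbitrary, and the internal field names that come back are likely to differ between Matlab releases.

```matlab
t = timer('Name', 'demo-timer');   % throwaway timer for the demonstration

% Temporarily silence the structOnObject warning, then restore the old state
oldState = warning('off', 'MATLAB:structOnObject');
internals = struct(t);
warning(oldState);

disp(fieldnames(internals));       % field names depend on the Matlab release
delete(t);                         % clean up the timer object
```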
Thanks Yair for this very useful and amazing tip…
3 questions :
1 – could you give some details about the way you track memory leaks on java objects created inside matlab ? With JConsole ?
2 – is there a way to retrieve all references of java objects which are referenced so that they won’t be destroyed by next garbage collection ?
3 – do you plan to post an article about how to estimate the memory allocated by a java object (deep referencing) ?
@Julien – In general, I explain about Java debugging and memory profiling in sections 1.6 and 8.7.1 of my book.
In this particular case the memory leak happened on the Matlab side, so the Matlab Profiler’s memory profiling functionality was enough to identify the leak.
For Java memory profiling you could use the standard JConsole, JMap and JHat utilities that are part of the JDK. I also find JVisualVM informative and easy to use. In addition, you could use JMP/TIJMP or other 3rd-party tools. MathWorks themselves have posted a short technical note about using JConsole in Matlab.
For deep referencing, I typically use classmexer. As I have noted in a discussion with MathWorks almost 2 years ago, while deep-scanning takes some time to execute and cannot therefore be placed in the Workspace table’s Cell-renderer, a simple asynchronous thread that would periodically deep-scan Java objects and refresh the table should be relatively easy to implement. Unfortunately, my suggestion was never developed/released by MathWorks, and I never got around to doing this as an external utility (maybe if I have some spare time in the future…). MathWorks is currently hiring a performance-tuning engineer for Natick, so perhaps this person could also work on the Java aspects, not just the core Matlab aspects as indicated by the job requirements.
I plan to post a series of articles about related issues in the upcoming months: debugging in different Java IDEs, profiling (both code and memory) etc. But I do not have any due date at the moment.
Thank you very much for this very complete answer.
I bought your book almost one month ago but I am still waiting for it… Should I be worried?
I really look forward to reading your coming articles!
@Julien – don’t be worried: the book ships to Europe from the US through the UK, and apparently the UK office had a 2-week vacation around Christmas/New-Year, causing a large backlog. In addition, the demand for this book has apparently exceeded the initial expectations, causing another delay. I’m actually in the same boat as you, as I get my copies via the UK. I’m told that all the back-orders are expected to be shipped in a week or two.
Yair
I wonder what effect clearing the variables or setting them empty before returning from the callback might have e.g.
etc. while using the original code.
Did you try that?
Malcolm
@Malcolm – This could have been a nice idea, “helping” the Garbage Collector free the memory. Unfortunately it does not solve the leak.
I thought that might be too easy. How about setting ‘Interruptible’ to ‘off’: no need then (?) for a MATLAB managed stack.
@Malcolm – the problem was not to make the callback fast enough to prevent other events from interfering, but to make it fast enough to process an enormous number of callbacks in real-time, without dropping a single event. This is a difficult performance-tuning challenge all by itself; the memory leak just made it more challenging.
Perhaps I misunderstand your suggestion? – if so then kindly elaborate.
@Yair
I was thinking aloud and wasn’t clear. I was not thinking about speed but about leaked references maybe occurring somewhere in the internals used to manage the call stack. With an uninterruptible callback (or BusyAction=cancel?), that task should be easier and the leaked references need not be added to the stack – and if they are not added they cannot leak. I was thinking of one of the pitfalls in Bloch’s Effective Java: with a resizable array used as a push/pull stack, shrinking the array without explicitly setting the now unused higher elements to null can leak the contents of those elements. More a thought than a suggestion.
@Malcolm – sorry for the belated answer. In theory you could be right. Unfortunately,
(1) in this particular case I had to process all events and could not afford to cancel even a single one. So, I had to use standard event queuing.
(2) the BusyAction and Interruptible properties are only honored by a few standard MATLAB callbacks: ButtonDownFcn, KeyPressFcn, KeyReleaseFcn, WindowButtonDownFcn, WindowButtonMotionFcn, WindowButtonUpFcn, WindowKeyPressFcn, WindowKeyReleaseFcn, and WindowScrollWheelFcn. Therefore, they cannot be used in the general case. I’m relying on the documentation here – if they are in fact honored as an undocumented feature, please do let me know.
@Yair – Actually, the documentation is in error. I was alarmed by what you posted that is from the documentation — the ‘Callback’ property is not listed as being governed by the ‘Interruptible’ property. I therefore contacted Mathworks to ask if the behavior of handle graphics has changed in the newer releases to exclude the ‘Callback’ property from ‘Interruptible’ consideration. Tech Support told me that the behavior has not changed, agreed that documentation was confusing, and would take action to fix it.
Yair, thanks for this detailed post, very appreciated. If you have a moment to consider please, do you think any of the techniques you mentioned in the post might be helpful in troubleshooting the following issue..?
I’ve been experiencing a recurring & frustrating situation in R2011b 64-bit. Occasionally, during intensive computation using PCT (Parallel Computing Toolbox), my Windows 7 x64 system will spontaneously shut down. No blue screen, just sudden stop. I don’t currently suspect hardware issues, as the shut-downs are only once a week or so & almost always only occur during PCT execution.
The PCT task never generates errors when run within the MATLAB desktop. One feature of the code is extensive O-O use. In the PCT context described above, hundreds of thousands of objects are being created & destroyed per worker thread per hour. I’m keeping an eye on Windows TaskMan & each worker thread typically has about 1000 system handles open during execution. I have 48 GB of physical memory. The shutdowns have occurred 20 minutes into execution, as well as 2 hours into execution.
Of course, I’m not certain that the shutdown is MATLAB-related. However, Windows Event Viewer / System shows no errors or warnings prior to shutdown. My Intel BIOS is also set to pause at boot-time on any hardware system errors & this is also not showing any events.
Any comments / suggestions appreciated,
Thanks,
Brad
@Brad – two ideas:
Yair, thanks for your suggestions.. I’m sorry though, I may not have explained my situation clearly enough: “spontaneous shutdown” means “instant death”. No crash reports written to disk (as for example happens in a blue screen crash), no graceful exit process within MATLAB. At time (t), all Windows processes are running with no signs of error, then at time (t + 1 microsecond) the machine is off & the system fans are winding down..
sounds like a serious problem – contact Mathworks Customer Support
If his computer is shutting off completely it is likely unrelated to Matlab, right?
Not necessarily – it could be that the PCT accesses a certain Windows control or a bad memory location or some GPU funkiness that causes this. I don’t think we have the tools to find out, but Mathworks probably does.
In fact, based on the description, it sounds like the intense computation is overwhelming the cooling system of the computer, and either your BIOS or some heat-related failure is causing the shutdown. Do you do other heavy computation tasks to determine whether this is correlated with computation in general, versus Matlab in particular?
Oh, I see. That is a reasonable suspicion. I didn’t see your reply before I submitted my second post.
Good luck @Brad 🙂
Colin, you’re right, my problem turned out to be heat-related. As I posted initially, this seemed like a low-probability explanation. I thought my tower case was really well-cooled..
I was aware though that my Intel board has a fairly good monitoring system (called BMC – Baseboard Management Controller) which shuts the system down when a component metric goes out of tolerance (e.g. over max. permitted temperature). So I turned to Intel just to “rule this out”..
In the course of doing that, I downloaded an Intel system utility for Windows that reads out all the BMC-logged events. I learned that all five of my “spontaneous” shutdown events (occurring over five months) had been triggered by the BMC due to overheating in what they call the “I/O Hub” (IOH), a key chip which handles CPU / peripheral I/O.
This chip is actually extremely hot to the touch even under light duty! Due to its location relative to the PCIe slots, there’s little vertical clearance for a CPU-style radiator + fan. (I have a RAID controller card which extends partially into the IOH’s air space). Intel has an oversized, relatively flat radiator mounted to the IOH.
I opened up the chassis today with my technician & explained that we needed to rig up a small fan on a flexible arm & move it in position to get good airflow directly onto that IOH radiator. We happened to have an 80mm low-noise Noctua fan around, which he suspended from a twisted-wire rigging. To test it out, I ran & successfully completed the same 1-hour, 8-core PCT job that shut the system down last week! 🙂
I’m wondering if PCT worker processes might in many cases all be writing their output to disk at roughly the same time. (for example, today’s stress-test job got written to C:\Users\Brad\AppData\Roaming\MathWorks\MATLAB\local_scheduler_data\R2011b\Job32)
But looking at those eight MAT output files in the Job32 folder, they’re “only” 14 MB each in size! Perhaps the baseline temperature of the IOH chip has been really close to the trip threshold all this time..? I guess my chassis wasn’t getting the overall minimum airflow rate that Intel designed-to. I had spec’ed low-noise, low-RPM Noctua fans in order to keep the sound level down.
There are a lot of trade-offs in system hardware configuration..
Hi. I have installed Matlab 7 on my Windows 7 OS and changed the compatibility mode to Windows 2000. The problem is that whenever I try to save a model or m-file, Matlab closes, so I can’t save any file. Please suggest what to do.
thank you.
@Shraboni – I suggest that you contact MathWorks technical support
Thanks for the post, I found it useful in debugging another Matlab memory leak, this time with the delete function. I was reading in a few hundred thousand binary files from a frame grabber and deleting them on the fly with delete('path\fname'). At around 760k files I got a Java heap space out-of-memory error. It seems delete has a memory leak. I solved the problem by switching to java.io.File('path\fname').delete(), which maintains the same speed but has no memory leak issues. I tried system('del path\fname') but it was 20x slower. Just thought I’d share in case anyone else has this issue.
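A minimal sketch of the Java-based deletion follows, using a temporary file created only for the demonstration. One detail worth keeping in mind is that java.io.File's delete() method reports failure through its boolean return value rather than by raising an error, so the result should be checked.

```matlab
% Create a small temporary file to delete
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, uint8(1:10));
fclose(fid);

% java.io.File.delete() returns true on success and false otherwise
jFile = java.io.File(fname);
wasDeleted = jFile.delete();

if ~wasDeleted
    warning('Could not delete %s', fname);
end
```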
@Byron – thanks for sharing
Hello;
I have an open-ended loop in my file, using the command for n=1:inf
I do some calculations inside the loop and I use the command clear all at the end of each iteration.
Unfortunately, the program runs for 1 hour and then crashes with an out-of-memory message.
It seems that there is a memory leak whenever we move from one iteration to the next.
Can anybody help or tell me how to clear the memory at the end of each iteration, so as to avoid the Matlab crash?
Thank you
@Maamar – without knowing what your loop looks like it is impossible to provide a specific answer.
One memory leak that has been fixed in R2013a relates to using drawnow within a loop – maybe this is the case that you have.
Hello,
Can I know which is better for Matlab performance: making the Java heap size bigger or smaller?
Thanks
In general, increasing the Java heap size should improve performance.
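Before changing the heap preference, it may help to check how much Java heap is currently configured and in use. The sketch below relies only on the standard java.lang.Runtime methods; the variable names are arbitrary, and the figures describe the Java heap only, not Matlab's own memory.

```matlab
rt = java.lang.Runtime.getRuntime();

bytesPerMB  = 2^20;
maxHeapMB   = rt.maxMemory()   / bytesPerMB;   % upper limit the JVM may grow to
totalHeapMB = rt.totalMemory() / bytesPerMB;   % heap currently reserved by the JVM
usedHeapMB  = (rt.totalMemory() - rt.freeMemory()) / bytesPerMB;

fprintf('Java heap: %.0f MB used, %.0f MB reserved, %.0f MB maximum\n', ...
        usedHeapMB, totalHeapMB, maxHeapMB);
```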