There are several ways of retrieving information from a Java object into Matlab. On the face of it, all these methods look similar. But it turns out that there are important differences between them in terms of memory leakage and performance.
The problem: “Matlab crashes” – now go figure…
A client of one of my Matlab programs recently complained that Matlab crashes after several hours of extensive use of the program. The problem looked like something that is memory related (messages such as Matlab’s out-of-memory error or Java’s heap-space error). Apparently this happens even on 64-bit systems having lots of memory, where memory should never be a problem.
Well, we know that this is only in theory, but in practice Matlab’s internal memory management has problems that occasionally lead to such crashes. This is one of the reasons, by the way, that recent Matlab releases have added the preference option of increasing the default Java heap space (the previous way to do this was a bit complex). Still, even with a high Java heap space setting and lots of RAM, Matlab crashed after using my program for several hours.
Not pleasant at all, even a bit of an embarrassment for me. I’m used to crashing Matlab, but only as a result of my playing around with the internals – I would hate it to happen to my clients.
Finding the leak
While we can do little with Matlab’s internal memory manager, I started searching for the exact location of the memory leak and then to find a way to overcome it. I’ll save readers the description about the grueling task of finding out exactly where the memory leak occurred in a program that has thousands of lines of code and where events get fired asynchronously on a constant basis. Matlab Profiler’s undocumented memory profiling option helped me quite a bit, as well as lots of intuition and trial-and-error. Detecting memory leak is never easy, and I consider myself somewhat lucky this time to have both detected the leak source and a workaround.
It turned out that the leakage happens in a callback that gets invoked multiple times per second by a Java object (see related articles here and here). Each time the Matlab callback function is invoked, it reads the event information from the supplied Java event-data (the callback’s second input parameter). Apparently, about 1KB of memory gets leaked whenever this event-data is being read. This may appear a very small leak, but multiply this by some 50-100K callback invocations per hour and you get a leakage of 50-100MB/hour. Not a small leak at all; more of a flood you could say…
Using get()
The leakage culprit turned out to be the following code snippet:
% 160 uSecs per call, with memory leak
eventData = get(hEventData,{'EventName','ParamNames','EventData'});
eventName = eventData{1};
paramNames = eventData{2};
paramData = eventData{3}.cell;In this innocent-looking code, hEventData is a Java object that contains the EventName, ParamNames, EventData properties: EventName is a Java String, that is automatically converted by Matlab’s get() function into a Matlab string (char array); ParamNames is a Java array of Strings, that gets automatically converted into a Matlab cell-array of string; and EventData is a Java array of Objects that needs to be converted into a Matlab cell array using the built-in cell function, as described in one of my recent articles.
The code is indeed innocent, works really well and is actually extremely fast: each invocation takes of this code segment takes less than 0.2 millisecs. Unfortunately, because of the memory leak I needed to find a better alternative.
Using handle()
The first idea was to use the built-in handle() function, under the assumption that it would solve the memory leak, as reported here. In fact, MathWorks specifically advises to use handle() rather than to work with “naked” Java objects, when setting Java object callbacks. The official documentation of the set function says:
Do not use the set function on Java objects as it will cause a memory leak.
It stands to reason then that a similar memory leak happens with get and that a similar use of handle would solve this problem:
% 300 uSecs per call, with memory leak
s = get(handle(hEventData));
eventName = s.EventName;
paramNames = s.ParamNames;
paramData = cell(s.EventData);
Unfortunately, this variant, although working correctly, still leaks memory, and also performs almost twice as worse than the original version, taking some 0.3 milliseconds to execute per invocation. Looks like this is a dead end.
Using Java accessor methods
The next attempt was to use the Java object’s internal accessor methods for the requested properties. These are public methods of the form getXXX(), isXXX(), setXXX() that enable Matlab to treat XXX as a property by its get and set functions. In our case, we need to use the getter methods, as follows:
% 380 uSecs per call, no memory leak
eventName = char(hEventData.getEventName);
paramNames = cell(hEventData.getParamNames);
paramData = cell(hEventData.getEventData);
Here, the method getEventName() returns a Java String, that we convert into a Matlab string using the char function. In our previous two variants, the get function did this conversion for us automatically, but when we use the Java method directly we need to convert the results ourselves. Similarly, when we call getParamNames(), we need to use the cell function to convert the Java String[] array into a Matlab cell array.
This version at last doesn’t leak any memory. Unfortunately, it has an even worse performance: each invocation takes almost 0.4 milliseconds. The difference may seem insignificant. However, recall that this callback gets called dozens of times each second, so the total adds up quickly. It would be nice if there were a faster alternative that does not leak any memory.
Using struct()
Luckily, I found just such an alternative. At 0.24 millisecs per invocation, it is almost as fast as the leaky best-performance original get version. Best of all, it leaks no memory, at least none that I could detect.
The mechanism relies on the little-known fact that public fields of Java objects can be retrieved in Matlab using the built-in struct function. For example:
>> fields = struct(java.awt.Rectangle)
fields =
x: 0
y: 0
width: 0
height: 0
OUT_LEFT: 1
OUT_TOP: 2
OUT_RIGHT: 4
OUT_BOTTOM: 8
>> fields = struct(java.awt.Dimension)
fields =
width: 0
height: 0Note that this useful mechanism is not mentioned in the main documentation page for accessing Java object fields, although it is indeed mentioned in another doc-page – I guess this is a documentation oversight.
In any case, I converted my Java object to use public (rather than private) fields, so that I could use this struct mechanism (Matlab can only access public fields). Yes I know that using private fields is a better programming practice and all that (I’ve programmed OOP for some 15 years…), but sometimes we need to do ugly things in the interest of performance. The latest version now looks like this:
% 240 uSecs per call, no memory leak
s = struct(hEventData);
eventName = char(s.eventName);
paramNames = cell(s.paramNames);
paramData = cell(s.eventData);
This solved the memory leakage issue for my client. I felt fortunate that I was not only able to detect Matlab’s memory leak but also find a working workaround without sacrificing performance or functionality.
In this particular case, I was lucky to have full control over my Java object, to be able to convert its fields to become public. Unfortunately, we do not always have similar control over the object that we use, because they were coded by a third party.
By the way, Matlab itself uses this struct mechanism in its code-base. For example, Matlab timers are implemented using Java objects (com.mathworks.timer.TimerTask). The timer callback in Matlab code converts the Java timer event data into a Matlab struct using the struct function, in %matlabroot%/toolbox/matlab/iofun/@timer/timercb.m. The users of the timer callbacks then get passed a simple Matlab EventData struct without ever knowing that the original data came from a Java object.
As an interesting corollary, this same struct mechanism can be used to detect internal properties of Matlab class objects. For example, in the timers again, we can get the underlying timer’s Java object as follows (note the highlighted warning, which I find a bit ironic given the context):
>> timerObj = timerfind
Timer Object: timer-1
Timer Settings
ExecutionMode: singleShot
Period: 1
BusyMode: drop
Running: off
Callbacks
TimerFcn: @myTimerFcn
ErrorFcn: ''
StartFcn: ''
StopFcn: ''
>> timerFields = struct(timerObj)
Warning: Calling STRUCT on an object prevents the object from hiding its implementation details and should thus be avoided.Use DISP or DISPLAY to see the visible public details of an object. See 'help struct' for more information.(Type "warning off MATLAB:structOnObject" to suppress this warning.)timerFields =
ud: {}
jobject: [1x1 javahandle.com.mathworks.timer.TimerTask]