Matlab-Java book

Undocumented Secrets of Matlab-Java Programming book
CRC promo code

Quick links:     Table of Contents     Book organization     FAQ     About the author     Errata list

I am extremely pleased to announce that after five years of research and hard work, my book on Undocumented Secrets of Matlab-Java Programming is finally published.

For a variety of reasons, the Matlab-Java interface was never fully documented. This is really quite unfortunate: Java is one of the most widely used programming languages, having many times the number of programmers and programming resources as Matlab. Also unfortunate is the popular claim that while Matlab is a fine programming platform for prototyping, it is not suitable for real-world, modern-looking applications.

Undocumented Secrets of Matlab-Java Programming (CRC Press, ISBN 9781439869031) aims to correct this.

This book shows how using Java can significantly improve Matlab program appearance and functionality. This can be done easily and even without any prior Java knowledge.

Readers are led step-by-step from simple to complex customizations. Within the book’s 700 pages, thousands of code snippets, hundreds of screenshots and ~1500 online references are provided to enable the utilization of this book as both a sequential tutorial and as a random-access reference suited for immediate use.

This book demonstrates how:

  • The Matlab programming environment relies on Java for numerous tasks, including networking, data-processing algorithms and graphical user-interface (GUI)
  • We can use Matlab for easy access to external Java functionality, either third-party or user-created
  • Using Java, we can extensively customize the Matlab environment and application GUI, enabling the creation of visually appealing and usable applications

No prior Java knowledge is required. All code snippets and examples are self-contained and can generally be used as-is. Advanced Java concepts are sometimes used, but understanding them is not required to run the code. Java-savvy readers will find it easy to tailor code samples for their particular needs; for Java newcomers, an introduction to Java and numerous online references are provided.

More info about the book, including detailed Table-of-Contents, book structure and FAQ, can be found on the book’s webpage.

Get your book copy now!
Use promo code MZK07 for a 20% discount and free worldwide shipping.

Categories: Java
Tags: ,
14 Comments

New training courses

I would like to take a short break(*) from the regular weekly article on undocumented features, in order to draw your attention to the fact that I am offering a new service, of professional training courses.

My training courses can help you and your team achieve a much higher level of proficiency and efficiency using Matlab. You will quickly learn how to produce higher quality, better looking, faster working, and more robust applications. Your effectiveness in writing Matlab programs will improve, saving you development time while improving the quality. And all this at the convenience of your offices, and at a very competitive cost.

Some of the structured training courses offered include:

The training course will be customized based on your specific needs, in terms of both the training level (introductory to advanced), the topics and the duration, giving you the best value for your training investment.

For more information, please refer to the new training page on this site, or click the “Training” tab on the website header.

The new training service complements my existing services for consulting and contract development.

For more information, please feel free to email me (altmany at gmail dot com) directly.

(*) the regular flow of technical articles will return next week

Categories: Uncategorized
Tags:
Leave a comment

Using Java Collections in Matlab

In this blog, I posted numerous articles showing how built-in Java functionality can greatly increase Matlab programming productivity and the resulting program power. While not really “undocumented” in the true sense (most of the material is actually documented separately in Matlab and Java), in practice, such Java gems are unfortunately often overlooked by Matlab programmers. Many of these tricks are GUI related, but Java has a lot to offer in non-GUI aspects. Today I will show how we can leverage Java’s extensive Collections classes in non-GUI Matlab programming.

Java Collections

Java contains a wide variety of predefined data structures (specifically Collections and Maps), which can easily be adapted to fit most programming tasks. It is unfortunate that the Matlab programming language does not contain similar predefined collection types, apart from its basic cell, array and struct elements. Matlab R2008b (7.7) added containers.Map, which is a much-scaled-down Matlab version of the java.util.Map interface, but is a step in the right direction. Some Matlab programmers prepared their own implementations of data structures, which can be found on the File Exchange.

Matlab’s limitation can easily be overcome with Java’s out-of-the-box set of predefined classes, as described below. Java collections have many advantages over hand-coded Matlab equivalents, in addition to the obvious time saving: Java’s classes are extensively debugged and performance-tuned, which is especially important when searching large collections. Also, these classes provide a consistent interface, are highly configurable and extendable, enable easy cross-type interoperability and generally give Matlab programmers the full power of Java’s collections without needing to program the nuts-and-bolts.

To start using Java collections, readers should first be familiar with them. These classes are part of the core Java language, and are explained in any standard Java programming textbook. The official Java website provides a detailed online tutorial about these classes, their usage and differences, in addition to a detailed reference of these classes.

Java Collections include interfaces and implementation classes. As of Matlab R2011b, Java interfaces cannot be used directly – only the implementation classes. Of the many Collection classes, the following are perhaps most useful (all classes belong to the java.util package, unless otherwise noted):

  • Set: an interface that is implemented by classes characterized by their prevention of duplicate elements. Some notable implementation classes: EnumSet, HashSet, LinkedHashSet, TreeSet.
  • List: an interface that is implemented by classes characterized by ordered elements (a.k.a. sequences), which may be duplicates of each other and accessed based on their exact position within the list. Specially optimized internal algorithms enable sorting, shuffling, reversing, rotating, and other modifications of the list. Some notable implementation classes: Vector, Stack, LinkedList.
  • Queue: an interface that is implemented by classes designed for holding elements prior to processing, in an ordered list accessible only at one (=head) or two (head and tail) positions. All classes include specialized insertion, extraction and inspection methods. Some notable implementation classes: LinkedList, ArrayDeque, PriorityQueue.
  • Map: an interface that is implemented by classes characterized by elements of unique keys paired with associated values. Early Java versions used the java.util.Dictionary abstract superclass, but this was subsequently replaced by the java.util.Map interface class. Maps contain specialized algorithms for fast retrieval based on a supplied key. Some of the notable implementation classes: EnumMap, HashMap, Hashtable, TreeMap, LinkedHashMap.

As noted, Matlab R2008b (7.7)’s new containers.Map class is a scaled-down Matlab version of the java.util.Map interface. It has the added benefit of seamless integration with all Matlab types (unlike Java Collections – see below), as well as the ability since Matlab 7.10 (R2010a) to specify data types. Serious Matlab implementations requiring key-value maps/dictionaries should still use Java’s Map classes to gain access to their larger functionality if not performance. Matlab versions earlier than R2008b have no real alternative in any case and must use the Java classes. The reader may also be interested to examine pure-Matlab object-oriented (class-based) Hashtable implementations, available on the File Exchange (example1, example2).

A potential limitation of using Java Collections is their inability to contain non-primitive Matlab types such as structs. To overcome this, either down-convert the types to some simpler type (using the struct2cell function or programmatically), or create a separate Java object that holds the information and store this object in the Collection.

Many additional Collection classes offer implementation of specialized needs. For example, java.util.concurrent.LinkedBlockingDeque implements a Queue which is also a LinkedList, is a double-ended queue (Deque) and is blocking (meaning that extraction operations will block until at least one element is extractable).

Programming interface

All the Java Collections have intentionally similar interfaces, with additional methods specific to each implementation class based on its use and intent. Most Collections implement the following common self-explanatory methods (simplified interface):

int size()
int hashCode()
boolean isEmpty()
boolean contains(Object element)
boolean containsAll(Collection c)
Iterator iterator()
boolean add(Object element)
boolean remove(Object element)
boolean addAll(Collection c)
boolean removeAll(Collection c)
boolean retainAll(Collection c)
void clear()
Object clone()
Object[] toArray()
String toString()

The full list of supported methods in a specific Collection class can, as any other Java object/class, be inspected using Matlab’s methods or methodsview functions (or my utilities, checkClass and uiinspect):

>> methods('java.util.Hashtable')
Methods for class java.util.Hashtable:
Hashtable	containsKey	  equals	isEmpty   notifyAll	size 
clear		containsValue	  get		keyset    put		toString
clone		elements	  getClass	keys      putAll	values
contains	entrySet	  hashCode	notify    remove	wait

Collections example: Hashtable

A detailed Matlab example that utilizes Hashtable (actually, java.util.Properties, which is a subclass of java.util.Hashtable) for a phone-book application is detailed in Matlab’s External Interface/Java section. The following code snippet complements that example by showing some common characteristics of Collections:

>> hash = java.util.Hashtable;
>> hash.put('key #1','myStr');
>> hash.put('2nd key',magic(3));
>> disp(hash)  % same as: hash.toString
{2nd key=[[D@59da0f, key #1=myStr}
 
>> disp(hash.containsKey('2nd key'))
     1                        % = true
 
>> disp(hash.size)
     2
 
>> disp(hash.get('key #2'))   % key not found
     []
 
>> disp(hash.get('key #1'))   % key found and value retrieved
myStr
 
>> disp(hash.entrySet) % java.util.Collections$SynchronizedSet object
[2nd key=[[D@192094b, key #1=myStr]
 
>> entries = hash.entrySet.toArray
entries =
java.lang.Object[]:
    [java.util.Hashtable$Entry]
    [java.util.Hashtable$Entry]
 
>> disp(entries(1))
2nd key=[[D@192094b
 
>> disp(entries(2))
key #1=myStr
 
>> hash.values  % a java.util.Collections$SynchronizedCollection
ans =
[[[D@59da0f, myStr]
 
>> vals = hash.values.toArray
vals =
java.lang.Object[]:
    [3x3 double]
    'myStr'
 
>> vals(1)
ans =
     8     1     6
     3     5     7
     4     9     2
 
>> vals(2)
ans =
myStr

Enumerators / Iterators

Java Iterators (aka Enumerators), such as those returned by the hash.keys() method, are temporary memory constructs. A common pitfall is to directly chain such constructs. While legal from a syntax viewpoint, this would produce results that are repetitive and probably unintended, as the following code snippet shows:

>> hash.keys
ans =
java.util.Hashtable$Enumerator@7b1d52    < = enumerator reference
>> hash.keys
ans =
java.util.Hashtable$Enumerator@127d1b4   < = new reference object
 
>> disp(hash.keys.nextElement)
2nd key   < = 1st key enumerated in the hash
>> disp(hash.keys.nextElement)
2nd key   < = same key returned, because of the new enumeration obj
 
>> % Wrong way: causes an endless loop since hash.keys regenerates
>> % ^^^^^^^^^  so hash.keys.hasMoreElements is always true
>> while hash.keys.hasMoreElements, doAbc();  end   % endless loop
 
>> % Correct way: store the enumerator in a temporary variable
>> hashKeys = hash.keys; 
>> while hashKeys.hasMoreElements,  doAbc();  end
 
>> hash.keys.methods
Methods for class java.util.Hashtable$Enumerator:
equals            hasNext    nextElement   remove
getClass          hashCode   notify        toString
hasMoreElements   next       notifyAll     wait
 
>> % And similarly for ArrayList iterators:
>> jList = java.util.ArrayList;
>> jList.add(pi); jList.add('text'); jList.add(magic(3)); disp(jList)
[3.141592653589793, text, [[D@1c8f959]
 
>> iterator = jList.iterator
iterator =
java.util.AbstractList$Itr@1ab3929
 
>> disp(iterator.next); fprintf('hasNext: %d\n',iterator.hasNext)
          3.14159265358979
hasNext: 1
 
>> disp(iterator.next); fprintf('hasNext: %d\n',iterator.hasNext)
text
hasNext: 1
 
>> disp(iterator.next); fprintf('hasNext: %d\n',iterator.hasNext)
     8     1     6
     3     5     7
     4     9     2
hasNext: 0

More on this topic and other non-GUI Java libraries that can be used in Matlab, can be found in Chapter 2 of my Matlab-Java Programming book.

Categories: Java, Low risk of breaking in future versions
Tags:
2 Comments

Matlab mex in-place editing

I would like to welcome Matlab Mex power-user Peter Li to a first in a short series of articles about undocumented aspects of Mex programing

Editing Matlab arrays in-place can be an important technique for optimizing calculations, especially when handling data that use large blocks of memory. The Matlab language itself has some limited support for in-place editing, but when we are really concerned with speed we often turn to writing C/C++ extensions using the Mex interface. Unfortunately, editing arrays in-place from Mex extensions is not officially supported in Matlab, and doing it incorrectly can cause data inconsistencies or can even cause Matlab to crash. In this article, I will introduce the problem and show a simple solution that exhibit the basic implementation details of Matlab’s internal copy-on-write mechanism.

Why edit in-place?

To demonstrate the techniques in this article, I use the fast_median function, which is part of my nth_element package on Matlab’s File Exchange. You can download the package and play with the code if you want. The examples are fairly self-explanatory, so if you do not want to try the code you should be okay just following along.

Let us try a few function calls to see how editing in-place can save time and memory:

>> A = rand(100000000, 1);
>> tic; median(A); toc    
Elapsed time is 4.122654 seconds.
 
>> tic; fast_median(A); toc
Elapsed time is 1.646448 seconds.
 
>> tic; fast_median_ip(A); toc
Elapsed time is 0.927898 seconds.

If you try running this, be careful not to make A too large; tune the example according to the memory available on your system. In terms of the execution time for the different functions, your mileage may vary depending on factors such as: your system, what Matlab version you are running, and whether your test data is arranged in a single vector or a multicolumn array.

This example illustrates a few general points: first, fast_median is significantly faster than Matlab’s native median function. This is because fast_median uses a more efficient algorithm; see the nth_element page for more details. Besides being a shameless plug, this demonstrates why we might want to write a Mex function in the first place: rewriting the median function in pure Matlab would be slow, but using C++ we can significantly improve on the status quo.

The second point is that the in-place version, fast_median_ip, yields an additional speed improvement. What is the difference? Let us look behind the scenes; here are the CPU and memory traces from my system monitor after running the above:

Memory and CPU usage for median() vs. fast_median_ip()

Memory and CPU usage for median vs. fast_median_ip

You can see four spikes in CPU use, and some associated changes in memory allocation:

The first spike in CPU is when we created the test data vector; memory use also steps up at that time.

The second CPU spike is the largest; that is Matlab’s median function. You can see that over that period memory use stepped up again, and then stepped back down; the median function makes a copy of the entire input data, and then throws its copy away when it is finished; this is expensive in terms of time and resources.

The fast_median function is the next CPU spike; it has a similar step up and down in memory use, but it is much faster.

Finally, in the case of fast_median_ip we see something different; there is a spike in CPU use, but memory use stays flat; the in-place version is faster and more memory efficient because it does not make a copy of the input data.

There is another important difference with the in-place version; it modifies its input array. This can be demonstrated simply:

>> A = randi(100, [10 1]);
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
 
>> median(A)
ans = 60
 
>> fast_median(A)
ans = 60
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
 
>> fast_median_ip(A)
ans = 60
>> A'
ans = 39     6    25    42    56    64    75    71    98    89

As you can see, all three methods get the same answer, but median and fast_median do not modify A in the workspace, whereas after running fast_median_ip, input array A has changed. This is how the in-place method is able to run without using new memory; it operates on the existing array rather than making a copy.

Pitfalls with in-place editing

Modifying a function’s input is common in many languages, but in Matlab there are only a few special conditions under which this is officially sanctioned. This is not necessarily a bad thing; many people feel that modifying input data is bad programming practice and makes code harder to maintain. But as we have shown, it can be an important capability to have if speed and memory use are critical to an application.

Given that in-place editing is not officially supported in Matlab Mex extensions, what do we have to do to make it work? Let us look at the normal, input-copying fast_median function as a starting point:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxArray *incopy = mxDuplicateArray(prhs[0]);
   plhs[0] = run_fast_median(incopy);
}

This is a pretty simple function (I have taken out a few lines of boiler plate input checking to keep things clean). It relies on helper function run_fast_median to do the actual calculation, so the only real logic here is copying the input array prhs[0]. Importantly, run_fast_median edits its inputs in-place, so the call to mxDuplicateArray ensures that the Mex function is overall well behaved, i.e. that it does not change its inputs.

Who wants to be well behaved though? Can we save time and memory just by taking out the input duplication step? Let us try it:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   plhs[0] = run_fast_median(const_cast<mxarray *>(prhs[0]));  // </mxarray>
}

Very bad behavior; note that we cast the original const mxArray* input to a mxArray* so that the compiler will let us mess with it; naughty.

But does this accomplish edit in-place for fast_median? Be sure to save any work you have open and then try it:

>> mex fast_median_tweaked.cpp
>> A = randi(100,[10 1]);
>> fast_median_tweaked(A)
ans = 43

Hmm, it looks like this worked fine. But in fact there are subtle problems:

>> A = randi(100,[10 1]);
>> A'
ans = 65    92    14    26    41     2    45    85    53     2
>> B = A;
>> B'
ans = 65    92    14    26    41     2    45    85    53     2
 
>> fast_median_tweaked(A)
ans = 43
>> A'
ans = 2     2    41    26    14    45    65    85    53    92
>> B'
ans = 2     2    41    26    14    45    65    85    53    92

Uhoh, spooky; we expected that running fast_median_tweaked would change input A, but somehow it has also changed B, even though B is supposed to be an independent copy. Not good. In fact, under some conditions this kind of uncontrolled editing in-place can crash the entire Matlab environment with a segfault. What is going on?

Matlab’s copy-on-write mechanism

The answer is that our simple attempt to edit in-place conflicts with Matlab’s internal copy-on-write mechanism. Copy-on-write is an optimization built into Matlab to help avoid expensive copying of variables in memory (actually similar to what we are trying to accomplish with edit in-place). We can see copy-on-write in action with some simple tests:

Matlab's Copy-on-Write memory and CPU usage

Matlab's Copy-on-Write memory and CPU usage

% Test #1: update, then copy
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.000008 seconds.
>> tic; B = A; toc   
Elapsed time is 0.000020 seconds.
 
% Test #2: copy, then update
>> clear
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; B = A; toc   
Elapsed time is 0.000020 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.678160 seconds.

In the first set of operations, time and memory are used to create A, but updating A and “copying” A into B take no memory and essentially no time. This may come as a surprise since supposedly we have made an independent copy of A in B; why does creating B take no time or memory when A is clearly a large, expensive block?

The second set of operations makes things more clear. In this case, we again create A and then copy it to B; again this operation is fast and cheap. But assigning into A at this point takes time and consumes a new block of memory, even though we are only assigning into a single index of A. This is copy-on-write: Matlab tries to save you from copying large blocks of memory unless you need to. So when you first assign B to equal A, nothing is copied; the variable B is simply set to point to the same memory location already used by A. Only after you try to change A (or B), does Matlab decide that you really need to have two copies of the large array.

There are some additional tricks Matlab does with copy-on-write. Here is another example:

>> clear
>> tic; A{1} = zeros(100000000, 1); toc
Elapsed time is 0.573240 seconds.
>> tic; A{2} = zeros(100000000, 1); toc
Elapsed time is 0.560369 seconds.
 
>> tic; B = A; toc                     
Elapsed time is 0.000016 seconds.
 
>> tic; A{1}(1) = 0; toc               
Elapsed time is 0.690690 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.695758 seconds.
 
>> tic; A{1}(1) = 0; toc
Elapsed time is 0.000011 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.000004 seconds.

This shows that for the purposes of copy-on-write, different elements of cell array A are treated independently. When we assign B equal to A, nothing is copied. Then when we change any part of A{1}, that whole element must be copied over. When we subsequently change A{2}, that whole element must also be copied over; it was not copied earlier. At this point, A and B are truly independent of each other, as both elements have experienced copy-on-write, so further assignments into either A or B are fast and require no additional memory.

Try playing with some struct arrays and you will find that copy-on-write also works independently for the elements of structs.

Reconciling edit in-place with copy-on-write: mxUnshareArray

Now it is clear why we cannot simply edit arrays in-place from Mex functions; not only is it naughty, it fundamentally conflicts with copy-on-write. Naively changing an array in-place can inadvertently change other variables that are still waiting for a copy-on-write, as we saw above when fast_median_tweaked inadvertently changed B in the workspace. This is, in the best case, an unmaintainable mess. Under more aggressive in-place editing, it can cause Matlab to crash with a segfault.

Luckily, there is a simple solution, although it requires calling internal, undocumented Matlab functions.

Essentially what we need is a Mex function that can be run on a Matlab array that will do the following:

  1. Check whether the current array is sharing data with any other arrays that are waiting for copy-on-write.
  2. If the array is shared, it must be unshared; the underlying memory must be copied and all the relevant pointers need to be fixed so that the array we want to work on is no longer accessible by anyone else.
  3. If the array is not currently shared, simply proceed; the whole point is to avoid copying memory if we do not need to, so that we can benefit from the efficiencies of edit in-place.

If you think about it, this is exactly the operation that Matlab needs to run internally when it is deciding whether an assignment operation requires a copy-on-write. So it should come as no surprise that such a Mex function already exists in the form of a Matlab internal called mxUnshareArray. Here is how you use it:

extern "C" bool mxUnshareArray(mxArray *array_ptr, bool noDeepCopy);
 
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxUnshareArray(const_cast<mxarray *>(prhs[0]), true);  //</mxarray>
   plhs[0] = run_fast_median(const_cast<mxarray *>(prhs[0]));  //</mxarray>
}

This is the method actually used by fast_median_ip to efficiently edit in-place without risking conflicts with copy-on-write. Of course, if the array turns out to need to be unshared, then you do not get the benefit of edit in-place because the memory ends up getting copied. But at least things are safe and you get the in-place benefit as long as the input array is not being shared.

Further topics

The method shown here should allow you to edit normal Matlab numerical or character arrays in-place from Mex functions safely. For a Mex function in C rather than C++, omit the “C” in the extern declaration and of course you will have to use C-style casting rather than const_cast. If you need to edit cell or struct arrays in-place, or especially if you need to edit subsections of shared cell or struct arrays safely and efficiently while leaving the rest of the data shared, then you will need a few more tricks. A good place to get started is this article by Benjamin Schubert.

Unfortunately, over the last few years Mathworks seems to have decided to make it more difficult for users to access these kinds of internal methods to make our code more efficient. So be aware of the risk that in some future version of Matlab this method will no longer work in its current form.

Ultimately much of what is known about mxUnshareArray as well as the internal implementation details of how Matlab keeps track of which arrays are shared goes back to the work of Peter Boettcher, particularly his headerdump.c utility. Unfortunately, it appears that HeaderDump fails with Matlab releases >=R2010a, as Mathworks have changed some of the internal memory formats – perhaps some smart reader can pick up the work and adapt HeaderDump to the new memory format.

In a future article, I hope to discuss headerdump.c and its relevance for copy-on-write and edit in-place, and some other related tools for the latest Matlab releases that do not support HeaderDump.

Categories: Guest bloggers, High risk of breaking in future versions, Memory, Mex, Stock Matlab function, Undocumented feature
Tags: , , , ,
3 Comments

MLintFailureFiles or: Why can’t I save my m-file?!

A short while ago, I was working at a client’s location on some project, when a very strange thing happened: when editing an m-file in the Matlab editor, the mlint process (the one that reports errors and warnings on the right-hand side of the editor) suddenly decided that it has had enough with this specific file. It labeled the entire file as red, meaning that it has errors that prevent it from being run, although as far as we could both see the file was valid and had no syntax errors and actually ran fine.

Here’s a screenshot of the message that mlint reported in the editor:

erroneous mlint error report

erroneous mlint error report

Mlint’s confusion would not have bothered us if it were not for a seemingly-related issue: When we tried to save the file, the Matlab editor would only allow us to save the file under a different name. When we attempted to restart Matlab, the same behavior repeated for this file, and we could not proceed with our actual work, modifying and saving our m-file.

Not easily intimidated, I tried saving the file under a different name in the editor, then renaming it back outside Matlab. No good, same problem.

MLintFailureFiles or: yet another use for prefdir

The clue that eventually led to the fix was that somehow Matlab remembered the “faulty” state of the m-file even after I restarted Matlab. Matlab likes to store persistent cross-session information in a single folder, which is the user’s prefdir folder. I have used this to my advantage only a few weeks ago, when I explained how to use the stored editor state file that is stored in that folder.

My hunch was that Matlab might store the mlint’s “faulty” m-file state somewhere in there. I didn’t have to look very far: one of the files most recently modified had the tell-tale name of MLintFailureFiles. This turns out to be an empty file that contains the full pathname of the “faulty” m-files, one file per line.

The fix turned out to be very easy: simply remove my file from the MLintFailureFiles file (or delete MLintFailureFiles entirely) and restart Matlab. Problem solved – everything back to normal. Resume breathing.

Undocumented? well, not exactly

I then went back to the suggestion in the mlint message and true enough I found related information in the documentation, at the bottom of the doc-page for using mlint. This is from R2008a’s doc page:

About M-Lint and Unexpected MATLAB® Termination

Under some circumstances, when you are editing an M-file, M-Lint can cause the MATLAB session to terminate unexpectedly. The next time you start MATLAB, it displays the following message and disables M-Lint for the M-file that was open in the editor when MATLAB terminated.

M-Lint caused your previous MATLAB session to terminate unexpectedly.
Please send this message and file name to The MathWorks.
See "About M-Lint and Unexpected MATLAB Termination" in the MATLAB documentation for details.

If you want, while waiting for a response from The MathWorks™, you can attempt to reenable M-Lint for the file by following these steps:

  1. In the Editor, reopen the file that you were editing when MATLAB terminated.
  2. Remove the lines of code that you believe M-Lint could not handle.
  3. In a text editor, open the MLintFailureFiles file in your preferences directory. (This is the directory that MATLAB returns when you run prefdir.)
  4. Save and reopen the M-file.

As you can see, this is not very helpful because it’s missing the step of actually editing (or deleting) the MLintFailureFiles file.

Additional documentation of this issue and workaround can be found in two separate Mathworks technical notes: here and here.

A Matlab user has even created a utility on the File Exchange that simply deletes this file, basing his work on the first note above.

The second note indicates that the problem (and workaround) occur in Matlab releases R2008a-R2010b, and indeed the above-mentioned issue (and workaround) can no longer be found in the latest docs (as far as I could see). However, even if this issue (and MLintFailureFiles) may no longer be relevant in the latest Matlab releases, we may still see MLintFailureFiles in our prefdir, because the prefdir contents are automatically copied from the previous release whenever we install a newer Matlab release.

Categories: Medium risk of breaking in future versions, Stock Matlab function
Tags: , ,
Leave a comment

Using spinners in Matlab GUI

One of the few standard Java Swing controls that does not have any Matlab uicontrol counterpart is JSpinner. JSpinner is basically an editbox with two tiny adjacent up/down buttons. Spinners are similar in functionality to a combo-box (a.k.a. drop-down or pop-up menu), where a user can switch between several pre-selected values. They are often used when the list of possible values is too large to display in a combo-box menu. Like combo-boxes, spinners too can be editable (meaning that the user can type a value in the editbox) or not (the user can only “spin” the value using the up/down buttons).

JSpinner uses an internal data model, similarly to JTree, JTable and other complex controls. The default model is SpinnerNumberModel, which defines a min/max value (unlimited=[] by default) and step-size (1 by default). Additional predefined models are SpinnerListModel (which accepts a cell array of possible string values) and SpinnerDateModel (which defines a date range and step unit).

Here’s a basic code snippet showing how to display a simple numeric spinner for numbers between 20 and 35, with an initial value of 24 and increments of 1:

jModel = javax.swing.SpinnerNumberModel(24,20,35,1);
jSpinner = javax.swing.JSpinner(jModel);
jhSpinner = javacomponent(jSpinner, [10,10,60,20], gcf);

The spinner value can be set using the edit-box or by clicking on one of the tiny arrow buttons, or programmatically by setting the Value property. The spinner object also has related read-only properties NextValue and PreviousValue. The spinner’s model object has the corresponding Value (settable), NextValue (read-only) and PreviousValue (read-only) properties. In addition, the model also has the settable Maximum, Minimum and StepSize properties.

To attach a data-change callback, set the spinner’s StateChangedCallback property.

I have created a small Matlab demo, SpinnerDemo, which demonstrates usage of JSpinner in Matlab figures. Each of the three predefined models (number, list, and date) is presented, and the spinner values are inter-connected via their callbacks. The Matlab code is modeled after the Java code that is used to document JSpinner in the official documentation. Readers are welcome to download this demo from the Matlab File Exchange and reuse its source code.

Java's SpinnerDemo

Java's SpinnerDemo

  
My Matlab SpinnerDemo

My Matlab SpinnerDemo

As can be seen from the screenshot, SpinnerDemo also demonstrates how to attach a label to a GUI control with an associated accelerator key (Alt-D in the screenshot example, which sets the focus to the Date control).

An internal component in Matlab, namely com.mathworks.mwswing.MJSpinner, extends javax.swing.JSpinner, but in this particular case I cannot see any big advantage of using the internal MJSpinner rather than the standard JSpinner. On the contrary, using JSpinner will likely improve forward compatibility – MathWorks may well change MJSpinner in the future, but it cannot do anything to the standard Swing JSpinner. In other cases, internal Matlab controls do offer significant advantages over the standard Swing controls, but not here it would seem. In any case, the SpinnerDemo utility uses MJSpinner, but you can safely use JSpinner instead (line #86).

The internal Matlab controls are discussed in detail in Chapter 5 of my Matlab-Java book, and MJSpinner is specifically discussed in section 5.2.1.

Another JSpinner derivative is JIDE’s com.jidesoft.grid.SpinnerCellEditor, which can be used as the cell-editor component in tables. An example of this was shown in the article about Advanced JIDE Property Grids (and section 5.7.5 in the book). You may also be interested in the com.jidesoft.combobox.DateSpinnerComboBox, which presents a control that includes both a date-selection combo-box and a spinner (section 5.7.2):

A property grid with spinner control

A property grid with spinner control

  
JIDE's DateSpinnerComboBox

JIDE's DateSpinnerComboBox

Categories: GUI, Java, Low risk of breaking in future versions
Tags: , ,
1 Comment

Matlab-Java memory leaks, performance

There are several ways of retrieving information from a Java object into Matlab. On the face of it, all these methods look similar. But it turns out that there are important differences between them in terms of memory leakage and performance.

The problem: “Matlab crashes” – now go figure…

A client of one of my Matlab programs recently complained that Matlab crashes after several hours of extensive use of the program. The problem looked like something that is memory related (messages such as Matlab’s out-of-memory error or Java’s heap-space error). Apparently this happens even on 64-bit systems having lots of memory, where memory should never be a problem.

Well, we know that this is only in theory, but in practice Matlab’s internal memory management has problems that occasionally lead to such crashes. This is one of the reasons, by the way, that recent Matlab releases have added the preference option of increasing the default Java heap space (the previous way to do this was a bit complex). Still, even with a high Java heap space setting and lots of RAM, Matlab crashed after using my program for several hours.

Not pleasant at all, even a bit of an embarrassment for me. I’m used to crashing Matlab, but only as a result of my playing around with the internals – I would hate it to happen to my clients.

Finding the leak

While we can do little with Matlab’s internal memory manager, I started searching for the exact location of the memory leak and then to find a way to overcome it. I’ll save readers the description about the grueling task of finding out exactly where the memory leak occurred in a program that has thousands of lines of code and where events get fired asynchronously on a constant basis. Matlab Profiler’s undocumented memory profiling option helped me quite a bit, as well as lots of intuition and trial-and-error. Detecting memory leak is never easy, and I consider myself somewhat lucky this time to have both detected the leak source and a workaround.

It turned out that the leakage happens in a callback that gets invoked multiple times per second by a Java object (see related articles here and here). Each time the Matlab callback function is invoked, it reads the event information from the supplied Java event-data (the callback’s second input parameter). Apparently, about 1KB of memory gets leaked whenever this event-data is being read. This may appear a very small leak, but multiply this by some 50-100K callback invocations per hour and you get a leakage of 50-100MB/hour. Not a small leak at all; more of a flood you could say…

Using get()

The leakage culprit turned out to be the following code snippet:

% 160 uSecs per call, with memory leak
eventData  = get(hEventData,{'EventName','ParamNames','EventData'});
eventName  = eventData{1};
paramNames = eventData{2};
paramData  = eventData{3}.cell;

In this innocent-looking code, hEventData is a Java object that contains the EventName, ParamNames, EventData properties: EventName is a Java String, that is automatically converted by Matlab’s get() function into a Matlab string (char array); ParamNames is a Java array of Strings, that gets automatically converted into a Matlab cell-array of string; and EventData is a Java array of Objects that needs to be converted into a Matlab cell array using the built-in cell function, as described in one of my recent articles.

The code is indeed innocent, works really well and is actually extremely fast: each invocation takes of this code segment takes less than 0.2 millisecs. Unfortunately, because of the memory leak I needed to find a better alternative.

Using handle()

The first idea was to use the built-in handle() function, under the assumption that it would solve the memory leak, as reported here. In fact, MathWorks specifically advises to use handle() rather than to work with “naked” Java objects, when setting Java object callbacks. The official documentation of the set function says:

Do not use the set function on Java objects as it will cause a memory leak.

It stands to reason then that a similar memory leak happens with get and that a similar use of handle would solve this problem:

% 300 uSecs per call, with memory leak
s = get(handle(hEventData));
eventName  = s.EventName;
paramNames = s.ParamNames;
paramData  = cell(s.EventData);

Unfortunately, this variant, although working correctly, still leaks memory, and also performs almost twice as worse than the original version, taking some 0.3 milliseconds to execute per invocation. Looks like this is a dead end.

Using Java accessor methods

The next attempt was to use the Java object’s internal accessor methods for the requested properties. These are public methods of the form getXXX(), isXXX(), setXXX() that enable Matlab to treat XXX as a property by its get and set functions. In our case, we need to use the getter methods, as follows:

% 380 uSecs per call, no memory leak
eventName  = char(hEventData.getEventName);
paramNames = cell(hEventData.getParamNames);
paramData  = cell(hEventData.getEventData);

Here, the method getEventName() returns a Java String, that we convert into a Matlab string using the char function. In our previous two variants, the get function did this conversion for us automatically, but when we use the Java method directly we need to convert the results ourselves. Similarly, when we call getParamNames(), we need to use the cell function to convert the Java String[] array into a Matlab cell array.

This version at last doesn’t leak any memory. Unfortunately, it has an even worse performance: each invocation takes almost 0.4 milliseconds. The difference may seem insignificant. However, recall that this callback gets called dozens of times each second, so the total adds up quickly. It would be nice if there were a faster alternative that does not leak any memory.

Using struct()

Luckily, I found just such an alternative. At 0.24 millisecs per invocation, it is almost as fast as the leaky best-performance original get version. Best of all, it leaks no memory, at least none that I could detect.

The mechanism relies on the little-known fact that public fields of Java objects can be retrieved in Matlab using the built-in struct function. For example:

>> fields = struct(java.awt.Rectangle)
fields = 
             x: 0
             y: 0
         width: 0
        height: 0
      OUT_LEFT: 1
       OUT_TOP: 2
     OUT_RIGHT: 4
    OUT_BOTTOM: 8
 
>> fields = struct(java.awt.Dimension)
fields = 
     width: 0
    height: 0

Note that this useful mechanism is not mentioned in the main documentation page for accessing Java object fields, although it is indeed mentioned in another doc-page – I guess this is a documentation oversight.

In any case, I converted my Java object to use public (rather than private) fields, so that I could use this struct mechanism (Matlab can only access public fields). Yes I know that using private fields is a better programming practice and all that (I’ve programmed OOP for some 15 years…), but sometimes we need to do ugly things in the interest of performance. The latest version now looks like this:

% 240 uSecs per call, no memory leak
s = struct(hEventData);
eventName  = char(s.eventName);
paramNames = cell(s.paramNames);
paramData  = cell(s.eventData);

This solved the memory leakage issue for my client. I felt fortunate that I was not only able to detect Matlab’s memory leak but also find a working workaround without sacrificing performance or functionality.

In this particular case, I was lucky to have full control over my Java object, to be able to convert its fields to become public. Unfortunately, we do not always have similar control over the object that we use, because they were coded by a third party.

By the way, Matlab itself uses this struct mechanism in its code-base. For example, Matlab timers are implemented using Java objects (com.mathworks.timer.TimerTask). The timer callback in Matlab code converts the Java timer event data into a Matlab struct using the struct function, in %matlabroot%/toolbox/matlab/iofun/@timer/timercb.m. The users of the timer callbacks then get passed a simple Matlab EventData struct without ever knowing that the original data came from a Java object.

As an interesting corollary, this same struct mechanism can be used to detect internal properties of Matlab class objects. For example, in the timers again, we can get the underlying timer’s Java object as follows (note the highlighted warning, which I find a bit ironic given the context):

>> timerObj = timerfind
 
   Timer Object: timer-1
 
   Timer Settings
      ExecutionMode: singleShot
             Period: 1
           BusyMode: drop
            Running: off
 
   Callbacks
           TimerFcn: @myTimerFcn
           ErrorFcn: ''
           StartFcn: ''
            StopFcn: ''
 
>> timerFields = struct(timerObj)
Warning: Calling STRUCT on an object prevents the object from hiding its implementation details and should thus be avoided.Use DISP or DISPLAY to see the visible public details of an object. See 'help struct' for more information.(Type "warning off MATLAB:structOnObject" to suppress this warning.)timerFields = 
         ud: {}
    jobject: [1x1 javahandle.com.mathworks.timer.TimerTask]
Categories: Java, Low risk of breaking in future versions, Memory, Semi-documented feature, Stock Matlab function
Tags: , , ,
18 Comments