Spy Easter egg take 2

Three years ago, I wrote a short post about Matlab’s built-in Easter egg in the spy function. When spy is run with no input arguments, it uses an undocumented default built-in sparse matrix that generates the white spy from the famous Spy vs. Spy comics series:

>> spy;

Matlab spy Easter egg

As was recently reported by Aurélien, the default built-in sparse matrix has changed in R2011a (not R2011b as in the original report):

Matlab spy Easter egg

If you ask me, the previous (white spy) image had more relevance to the spy function… I assume the new image was not chosen arbitrarily – if anyone has some insight as to why this image was chosen and its relevance to spy, please post a comment.

Addendum: The original spy image can still be generated using the following code snippet:

c = [';@3EA4:aei7]ced.CFHE;4\T>*Y>,dL0,HOQQMJLJE9PX[[Q.ZF.\JTCA1dd'
     '<a ;FB:;bfj8^df//DGIF<5]UF+ZH-eM>-IorRPNMPIE-Y\\R8[I8]SUDW2e+'
     '=4BGC;<cgk9_e00deojg =6^VG,[I.fN?5jpsSQPNQPF.Z,]S9`S9cTWVX:+,'
     ':5CHD<=4hlh`f11EFPKHA7&WH-\J/gOC?kqtTRRORQJ8--^TB+T=dWYWY;,_'
     ';6D3E=>7imiag2IFOQLID8''XI.]K0"PD@l32UZhP//P988_WC,U>+Z^Y\<2`'
     '<82BF>?8jnjbhLJGPRMJE9/YJ/`L1#QMC$;;V[iv09QE99,XD.YB,[_\]=3a'
     '>9;CG?@9kokc2MKHQSOKF:0ZL0aM2$RNG%AAW\jw9E.FEE-_G8aG.d`]_W5+'
     '?:CDH@A:lpld3NLIRTPLG=1[M1bN3%SOH4BBX]kx:J9LLL8`H9bJ/+d_dX6,'
     '@;DEIAB;mqmePOMJSUQMJ>2\N2cO4&TPP@HCY^lyDKEMMN9+I@+S8,+deY7^'
     '8@EFJBC<4rnfQPNPTVRNKB3]O3dP5''UQQCIDZ_mzEPFNNOE,RA,T9/,++\8_'
     '9A2G3CD=544gRQPQUWUOLE4^P4"Q6(VRRIJE[`n{KQKOOPK-SE.W:F/,,]Z+'
     ':BDH4DE>655hSRQRVXVPMF5_Q5#R>)eSSJKF\ao0L.L-WUL.VF8XCH001_[,'
     ';3EI<eo ?766iTSRSWYWQNG6$R6''S?*fTTlLQ]bp1M/P.XVP8[H9]DIDA=`\]'
     '?4D3=FP@877jUTSTXZXROK7%S7(TF+gUUmMR^cq:N9Q8YZQ9_I>cIJEB>d_^'
     '@5E@>GQA98b3VUTUY*YSPL8&T>)UI,hVhnNS_dr;PE.9Z[RCaR?+JTFC?e`+'
     '79FA?HRB:9c4WVUVZ+ZWQM=,WG*VJ-"gi4OT`es<ql9e [\TD+SA,SWUVW+d,'
     '8:3B@JSX;:dVXWVW[,[XRN>-XH+bK.#hj@PUvftDRMEF,]UH,UB.TYVWX,e\'
     '9;ECAKTY< ;eWYXWX\:)YSOE.YI,cL/$ikCqV1guE/PFL-^XI-YG/WZWXY1+]'
     ':AFDBLUZ=<fXZYXY,;*ZTPF/ZJ-dM0%j#Jrt2hxH0QKM8,YJ.ZI8[^YY\2,,'
     ';B3ECMV[>jgY[ZYZ-<7[XQG0[K.eN1&"$K2u:iyO9.PN9-_K8aJ9\_]\]82['
     '?CEFDNW\?khZ\[Z[==8\YRH1\M/!O2''#%m31Bw0PE/QXE8+R9bS;da^]_93\'
     '@2FGEOX]ali[]\[\>>9(ZSL2]N0"P3($&n;2Cx1QN9--L9,SA+T< +d__`:4,'
     'A3GHFPY^bmj\^]\]??:)[TM3^O1%Q4)%''oA:D0:0OE.8ME-TE,XB,+`da;5['
     '643IGQZ_cnk]_^]^@@;5\UN4_P2&R6*&(3B;E1<1PN99NL8WF.^C/,a+bY6,'
     '7:F3HR[`dol^`_^_AA<6]VO5`Q3''S>+'');CBF:=:QOEEOO9_G8aH6/d,cZ[Y'
     '8;G4IS\aep4_a`_-BD=7''XP6aR4(T?,(5@DCHCC;RPFLPPD`H9bJ70+0d\\Z'
     '9BH>JT^bf45`ba`.CE@8(YQ7#S5)UD-)?AEDIDDD/QKMVQJ+S?cSDF,1e]a,'
     ':C3?K4_cg5[acbaADFA92ZR8$T6*VE.*@JFEJEEE0.NNWTK,U@+TEG0?+_bX'
     ';2D@L9`dh6\bdcbBEGD:3[S=)U7+cK/+CKGFLIKI9/OWZUL-VA,WIHB@,`cY'];
i = double(c(:)-32);
j = cumsum(diff([0; i])<=0) + 1;
S = sparse(i,j,1)';
spy(S)
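
To see how the last three lines decode the character matrix, here is a minimal sketch on a toy input. The five-character string below is made up for illustration and is not part of spy's data. Each character encodes a row index (its character code minus 32), and a new column is started whenever the index fails to increase:

```matlab
% Toy input (hypothetical): character codes 33,35,34,36,33 encode the indices 1,3,2,4,1
c = char(32 + [1 3 2 4 1]);

i = double(c(:)) - 32;              % row indices: [1;3;2;4;1]
j = cumsum(diff([0; i]) <= 0) + 1;  % column counter, incremented whenever the index does not increase
S = sparse(i, j, 1)';               % 4x3 sparse matrix, transposed into 3x4

disp(full(S))
```

Here j evaluates to [1;1;2;2;3], so the first two indices land in column 1, the next two in column 2, and the last one in column 3 (before the transpose).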

Happy Easter / Passover everybody!

Categories: Low risk of breaking in future versions, Stock Matlab function, Undocumented feature

Extending a Java class with UDD

Once again I welcome Donn Shull, with another article about Matlab’s internal UDD mechanism.

During the series on UDD, we have mentioned the connection between UDD and Java. In UDD Events and Listeners we described how in Matlab, each Java object can have a UDD companion. In Hierarchical Systems with UDD we briefly noted that a UDD hierarchy may be passed to Java. In the numerous posts on handle graphics and callbacks, Yair has discussed the UDD packages javahandle and javahandle_withcallbacks. Based on this information, it seems reasonable to speculate that it may be possible to extend a Java class with UDD using UDD’s class inheritance mechanism.

This can be extremely useful in two cases:

  • You don’t know Java but found a Java class you would like to use in Matlab, and it needs only minor modifications for your specific needs
  • You do know Java, but don’t have access to the original source code, and choose to extend the Java class with Matlab code, rather than Java code

Today I will show how this can be done using a simple example. Our example will illustrate the following things:

  1. Subclassing a Java class with UDD
  2. Adding UDD properties to the subclass
  3. Overloading a Java method with Matlab code
  4. Directly accessing the superclass methods

The example will show extending Java socket classes to provide a simple method for communication between two Matlab sessions. The protocol has been kept purposely simple and is not robust. Additional work would need to be done to create a real-life socket-based communication between Matlab systems (see for example this FEX submission).

Today’s example consists of two subclasses: a subclass of java.net.ServerSocket and a subclass of java.net.Socket. The protocol will be sending strings back and forth between the two sessions. In each direction the information exchange will consist of two bytes containing the string length, followed by the actual string. The entire source code can be downloaded from here.
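
The message framing described above can be sketched in a few lines of plain Matlab, independently of any socket. The command string used here is just an illustrative value. Note that a two-byte length prefix limits each message to 65535 characters:

```matlab
% Encode: two length bytes (most significant first) followed by the string bytes
msg   = 'disp(pi)';
len   = numel(msg);
frame = uint8([floor(len/256), mod(len,256), double(msg)]);

% Decode: rebuild the length from the first two bytes, then extract the payload
numChar = 256 * double(frame(1)) + double(frame(2));
decoded = char(frame(3:2+numChar));
```

The length bytes are converted to double before the multiplication, since arithmetic on uint8 values saturates at 255.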

Creating the simple.ServerSocket class

As in the UDD series, we will use the simple package for our classes, and in this package create a ServerSocket class and a Socket class. Recall that the simple package definition is placed in a file named schema.m in a directory called @simple, located somewhere on the Matlab path. schema.m consists of:

function schema()
%SCHEMA  simple package definition function.
   schema.package('simple');
end

In our ServerSocket class we will add three UDD properties and overload two of the Java class methods. It is worth noting that our final class will have all of the parent Java class’s public properties and methods, and if necessary we can access the parent (super) class methods directly. As before, we create a subfolder of the @simple folder named @ServerSocket; in this folder we place four files:

  1. schema.m – the class definition file
  2. ServerSocket.m – the class constructor
  3. accept.m – one of the Java methods that we will overload
  4. bind.m – the other Java method that we will overload

At the beginning of our schema.m file, we will use the following code to subclass the Java class:

function schema
%SCHEMA  simple.ServerSocket class definition function.
 
    % parent schema.class definition
    javaPackage = findpackage('javahandle');
    javaClass = findclass(javaPackage, 'java.net.ServerSocket');
 
    % class package (schema.package)
    simplePackage = findpackage('simple');
 
    % class definition
    simpleClass = schema.class(simplePackage, 'ServerSocket', javaClass);

Here, we use findpackage and findclass to obtain the schema.class for the Java class that we are going to use as our parent. We then obtain a handle to the containing package, and finally use the three-argument form of schema.class, whose third argument is the parent class, to define our ServerSocket as a subclass of the Java parent’s schema.class.

Next, in the class definition file we place the code to define the signatures for the methods we are overloading:

    % accept.m overloads java accept method and adds communication protocol
    m = schema.method(simpleClass, 'accept');
    s = m.Signature;
    s.varargin    = 'off';
    s.InputTypes  = {'handle'};
    s.OutputTypes = {'string'};
 
    % bind.m overloads java bind method
    m = schema.method(simpleClass, 'bind');
    s = m.Signature;
    s.varargin    = 'off';
    s.InputTypes  = {'handle'};
    s.OutputTypes = {};

Finally, we add three UDD properties to the class: The first will be used to hold a string representation of the address of our ServerSocket; the second will store the communication port number; the third is a handle property that will hold the reference to the socket used by the actual communication.

    % holds remote address as a matlab string
    p = schema.prop(simpleClass, 'address', 'string');
    p.FactoryValue = 'localhost';
 
    % holds remote port as a matlab int
    p = schema.prop(simpleClass, 'port', 'int16');
    p.FactoryValue = 2222;
 
    % holds a handle reference to the socket created in the accept method
    p = schema.prop(simpleClass, 'socket', 'handle');
end

We now need to write our overloaded methods. The bind method is simple: it first creates a Java internet address using the new address and port properties; then it uses the standard Java class methods to call the superclass’s bind method with the specified internet address:

function bind(this)
    % use the object address and port properties to bind this instance
    % to an address by calling the superclass bind method
    inetAddress = java.net.InetSocketAddress(this.address, this.port);
    this.java.bind(inetAddress);
end

The overloaded accept method is a bit more complicated and crude: It starts by calling the superclass accept method to create a communication socket and stores the created socket in our class’s socket property. Then it goes into an infinite loop of waiting for incoming commands, uses evalc to execute them, and returns the captured result to the caller. The only way out of this loop is using Ctrl-C from the keyboard.

function accept(this)
 
    % use the superclass accept
    this.socket = handle(this.java.accept);
 
    % infinite loop; use Ctrl-C to exit
    while 1
        % wait for a command then execute it capturing output
        while this.socket.getInputStream.available < 2
        end
 
        msb = this.socket.getInputStream.read;
        lsb = this.socket.getInputStream.read;
 
        numChar = 256 * msb + lsb;
        cmd = uint8(zeros(1, numChar));
 
        for index = 1:numChar
            cmd(index) = this.socket.getInputStream.read;
        end
        result = evalc(char(cmd));
 
        % send the result back to the calling system
        len = numel(result);
        msb = uint8(floor(len/256));
        lsb = uint8(mod(len,256));
 
        this.socket.getOutputStream.write(uint8([msb, lsb, result]));
    end
end

Creating the simple.Socket class

The simple.Socket class is created like ServerSocket, this time in the @Socket folder under the @simple folder. In this subclass we add properties for the address and port, just as in ServerSocket. We overload the superclass’s connect method with our own variant, and add a new method to make the remote calls to the ServerSocket running in another Matlab instance. Beginning with the schema.m file we have:

function schema
%SCHEMA  simple.Socket class definition function.
 
    % package definition
    simplePackage = findpackage('simple');
    javaPackage = findpackage('javahandle');
    javaClass = findclass(javaPackage, 'java.net.Socket');
 
    % class definition
    simpleClass = schema.class(simplePackage, 'Socket', javaClass);
 
    % define class methods
    % connect.m overloads java connect method
    m = schema.method(simpleClass, 'connect');
    s = m.Signature;
    s.varargin    = 'off';
    s.InputTypes  = {'handle'};
    s.OutputTypes = {};
 
    % remoteEval.m matlab method for remote evaluation of Matlab commands
    m = schema.method(simpleClass, 'remoteEval');
    s = m.Signature;
    s.varargin    = 'off';
    s.InputTypes  = {'handle', 'string'};
    s.OutputTypes = {'string'};
 
    % add properties to this class
    % holds remote address as a Matlab string
    p = schema.prop(simpleClass, 'address', 'string');
    p.FactoryValue = 'localhost';
 
    % holds remote port as a Matlab int
    p = schema.prop(simpleClass, 'port', 'int16');
    p.FactoryValue = 2222;
end

The class constructor Socket.m is simply:

function skt = Socket
%SOCKET constructor for the simple.Socket class
    skt = simple.Socket;
end

The overloaded connect method is almost identical to the overloaded bind method we used for ServerSocket. We form a Java internet address from our new properties and then invoke the superclass's connect Java method:

function connect(this)
%CONNECT overload of the java.net.Socket connect method
    % use the object address and port properties to connect to the remote
    % session via the superclass connect method
    inetAddress = java.net.InetSocketAddress(this.address, this.port);
    this.java.connect(inetAddress);
end

Finally, our remoteEval method is very similar to the loop portion of the overloaded accept method we wrote for simple.ServerSocket. We take the command string input and convert it into a series of bytes prepended by the length of the string, send it to the other Matlab session and wait for a response:

function result = remoteEval(this, cmd)
%REMOTEEVAL evaluate a Matlab command on a remotely connected Matlab
 
    % The command string is sent as a series of bytes preceded by a pair of
    % bytes which represents the length of the string  
    cmd = uint8(cmd);
 
    len = numel(cmd);
    msb = uint8(floor(len/256));
    lsb = uint8(mod(len,256));
 
    this.getOutputStream.write([msb, lsb, cmd]);
 
    % We will expect the remote session to return a string in the same format
    % as the command
    while this.getInputStream.available < 2
    end
 
    msb = this.getInputStream.read;
    lsb = this.getInputStream.read;
 
    numChar = 256 * msb + lsb;
 
    result = uint8(zeros(1, numChar));
    for index = 1:numChar
        result(index) = this.getInputStream.read;
    end
    result = char(result);
end

Using simple.ServerSocket and simple.Socket to communicate between Matlab sessions

To use this example, add the zip contents to your Matlab path, then open an instance of Matlab and issue the following commands:

>> ss = simple.ServerSocket;
>> ss.bind;
>> ss.accept;

Then open another Matlab instance and issue these commands:

>> s = simple.Socket;
>> s.connect;

At this point you can send commands from this Matlab instance (the client) to the first instance (the server) using the remoteEval method. The command will then be transmitted to the server, executed, and the server will return the captured string result to the client:

>> remoteResult = s.remoteEval('pi')
remoteResult =
    3.1416

The defaults are for localhost and port 2222. These can be changed prior to using the server's bind method and the client's connect method. To keep things as simple as possible, error checking etc. has been left out, so this is just a demonstration and is far from robust.

There are some things to note about our new classes. If we type methods(s) or s.methods at the Matlab command prompt in our simple.Socket session we obtain:

>> s.methods
 
Methods for class simple.Socket:
 
Socket                     getOOBInline               isClosed                   setReuseAddress            
bind                       getOutputStream            isConnected                setSendBufferSize          
close                      getPort                    isInputShutdown            setSoLinger                
connect                    getReceiveBufferSize       isOutputShutdown           setSoTimeout               
equals                     getRemoteSocketAddress     java                       setSocketImplFactory       
getChannel                 getReuseAddress            notify                     setTcpNoDelay              
getClass                   getSendBufferSize          notifyAll                  setTrafficClass            
getInetAddress             getSoLinger                remoteEval                 shutdownInput              
getInputStream             getSoTimeout               sendUrgentData             shutdownOutput             
getKeepAlive               getTcpNoDelay              setKeepAlive               toString                   
getLocalAddress            getTrafficClass            setOOBInline               wait                       
getLocalPort               hashCode                   setPerformancePreferences  
getLocalSocketAddress      isBound                    setReceiveBufferSize

This shows that our simple.Socket class has all of the methods of the Java superclass, plus our added remoteEval method and the java method that was automatically added by Matlab. This means that all of the Java methods are methods of our class instance and the added java means that we can access the superclass methods from our class instance if the need arises. If we use the struct function which Yair has previously discussed, we obtain:

>>  struct(s)
ans = 
              OOBInline: 0
                  Bound: 1
                Channel: []
                  Class: [1x1 java.lang.Class]
                 Closed: 0
              Connected: 1
            InetAddress: [1x1 java.net.Inet4Address]
          InputShutdown: 0
            InputStream: [1x1 java.net.SocketInputStream]
              KeepAlive: 0
           LocalAddress: [1x1 java.net.Inet4Address]
              LocalPort: 51269
     LocalSocketAddress: [1x1 java.net.InetSocketAddress]
         OutputShutdown: 0
           OutputStream: [1x1 java.net.SocketOutputStream]
      ReceiveBufferSize: 8192
    RemoteSocketAddress: [1x1 java.net.InetSocketAddress]
           ReuseAddress: 0
         SendBufferSize: 8192
               SoLinger: -1
              SoTimeout: 0
             TcpNoDelay: 0
           TrafficClass: 0
                address: 'localhost'
                   port: 2222

We see that we have access to all of the public properties of the Java superclass, as well as the UDD properties that we have added.

Conclusion

At the beginning of this post I said that this would be a simple non-robust communications method. In order to make this anything more than that, a number of things would need to be implemented, for example:

  • Improve the accept method to exit after a timeout or when a connection has been made and then terminated
  • Add checksums and timeouts for communication to determine the reliability of the communication
  • Add a retry request protocol for instances of communication failure
  • Add support for any serializable Matlab type, not just strings
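
As an illustration of the checksum idea in the list above, here is a minimal sketch that appends a single checksum byte to the frame. The additive scheme (sum of the payload bytes modulo 256) is an assumption chosen for brevity and is not part of the downloadable code; a real implementation would more likely use a CRC:

```matlab
payload  = uint8('disp(pi)');
len      = numel(payload);
checksum = mod(sum(double(payload)), 256);    % simple additive checksum (assumed scheme)
frame    = uint8([floor(len/256), mod(len,256), double(payload), checksum]);

% Receiver side: recompute the checksum and compare it with the trailing byte
numChar  = 256 * double(frame(1)) + double(frame(2));
received = frame(3:2+numChar);
isValid  = mod(sum(double(received)), 256) == double(frame(end));

% A single flipped bit in the payload is detected
corrupted      = frame;
corrupted(3)   = bitxor(corrupted(3), uint8(1));
isValidCorrupt = mod(sum(double(corrupted(3:2+numChar))), 256) == double(corrupted(end));
```

On a mismatch, the receiver could ask the sender to retransmit, which ties in with the retry protocol mentioned in the list.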

The intent here was just to show that extending Java classes with Matlab is possible, relatively simple, and can be extremely useful. After all, with over 10 million Java developers out there, chances are that somebody somewhere has already posted a Java class that answers your exact need, or at least close enough that it can be used in Matlab with only some small modifications.

Categories: Guest bloggers, Java, Medium risk of breaking in future versions, Undocumented feature

Expanding urlread capabilities

I would like to welcome guest blogger Jim Hokanson. Today, Jim will explain some of the limitations that many of us have encountered at one time or another with Matlab’s built-in urlread function. More importantly, Jim has spent a lot of time creating an expanded-capabilities Matlab function, which he explains below. Note that while urlread’s internals are undocumented, the changes outlined below rely on pretty standard Java and HTTP, and should therefore be quite safe to use on multiple Matlab versions.

Abstract

I recently tried to implement the Mendeley API but quickly found that urlread was not going to be sufficient for my needs. The first indication of this was my inability to send the proper authorization information for POST requests. It became even more obvious with the need to perform DELETE and PUT methods, since urlread only supports GET and POST. My implementation of urlread, which I refer to as urlread2, addresses these and a couple of other issues and can be found on the Matlab File Exchange. Other developers have tackled urlread’s shortcomings (this example added timeout support, and this example added support for binary file upload) – today’s article will focus on my solution, but others are obviously possible.

Introduction

HTTP is the underlying computer networking protocol that enables us to read webpages on the Internet. It consists of a request made by the user to an Internet server (typically located via URL), and a response from that server. Importantly, the request and response consist of three main parts: a resource line (for requests) or status line (for responses), followed by headers, and optionally a message body.
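
To make the three-part structure concrete, here is a small sketch that assembles the text of a GET request and parses a response status line. The host name, path and headers are illustrative values only:

```matlab
CRLF = char([13 10]);                                           % HTTP lines end with CR+LF

requestLine = 'GET /index.html HTTP/1.1';                       % resource line
headers     = {'Host: www.example.com', 'Accept: text/html'};   % request headers (example values)
body        = '';                                               % optional message body, empty for GET

% Headers are followed by an empty line that separates them from the body
request = [requestLine CRLF sprintf('%s\r\n', headers{:}) CRLF body];

% A response starts with a status line: protocol version, numeric code, message
statusLine = 'HTTP/1.1 200 OK';
parts      = regexp(statusLine, '^(\S+) (\d+) (.*)$', 'tokens', 'once');
statusCode = str2double(parts{2});
statusMsg  = parts{3};
```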

Matlab’s built-in urlread function enables Matlab users to easily read the server’s response text into a Matlab string:

text = urlread('http://www.google.com');

This is done internally using Java code that connects to the specified URL and reads the information sent by the URL’s server (more on this).

urlread accepts optional additional inputs specifying the request type (‘get’ or ‘post’) and parameter values for the request.

Unfortunately, urlread has the following limitations:

  1. It does not allow specification of request headers
  2. It makes assumptions as to the request headers needed based on the input method
  3. It does not expose the response headers and status line
  4. It assumes the response body contains text, and not a binary payload
  5. It does not enable uploading binary contents to the server
  6. It does not enable specifying a timeout in case the server is not responding

urlread2

The urlread2 function addresses all of these problems. The overall design decision was to make the function more general: in some cases it requires more work up front, but in return it offers more flexibility.

For reference, the following is the calling format for urlread2 (which is reminiscent of urlread’s):

urlread2(url,*method,*body,*headersIn, varargin)

The asterisks (*) mark optional inputs. These inputs are positional, so their order must be preserved.

  • url – (string), url to request
  • method – (string, default GET) HTTP request method
  • body – (string, default ''), body of the request
  • headersIn – (structure, default []), see the following section
  • varargin – extra properties, specified as property/value pairs

Addressing Problem 1 – Request header

urlread internally uses a Java object called urlConnection that is generally an instance of the class sun.net.www.protocol.http.HttpURLConnection. The method setRequestProperty() can be used to set headers for the request. This method has two inputs, the header name and the value of that header. A simple example of this can be seen below:

urlConnection.setRequestProperty('Content-Type','application/x-www-form-urlencoded');

Here ‘Content-Type’ is the header name and the second input is the value of that property. My function requires passing in nearly all headers as a structure array, with fields for the name and value. The preceding header would be created using a helper function http_createHeader.m:

header = http_createHeader('Content-Type','application/x-www-form-urlencoded');

Multiple headers can be passed in to the function by concatenating header structures into a structure array.
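
As a rough sketch of what such a structure array looks like, the following uses createHeader, a hypothetical stand-in for http_createHeader. The field names name and value are assumed from the description above and may differ from those in the actual File Exchange code:

```matlab
% Hypothetical stand-in for http_createHeader: a struct with 'name' and 'value' fields
createHeader = @(name, value) struct('name', name, 'value', value);

% Concatenating the individual structs yields a 1x2 structure array
headers = [createHeader('Content-Type', 'application/x-www-form-urlencoded'), ...
           createHeader('Accept', 'application/json')];

% Each element corresponds to one setRequestProperty(name, value) call
for k = 1:numel(headers)
    fprintf('%s: %s\n', headers(k).name, headers(k).value);
end
```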

Addressing Problem 2 – Request parameters

When making a POST request, parameters are generally specified in the message body using the following format:

[property]=[value]&[property]=[value]

The properties and values are also encoded in a particular way, generally termed urlencoded (encoding and decoding can be done using Matlab’s built-in urlencode and urldecode functions). For GET requests this string is appended to the url with the “?” symbol. Since urlencoding methods can vary, and in the spirit of reducing assumptions, I use separate functions to generate these strings outside of urlread2, and then pass the result in either as the url (for GET) or as the body input (for POST). As an example, I might search the Mathworks website using the upper right search bar on its site for “undocumented matlab” under file exchange (hmmm… pretty cute stuff there!). Doing this performs a GET request with the following property/value pairs:

params = {'search_submit','fileexchange', 'term','undocumented matlab', 'query','undocumented matlab'};

These property/value pairs are somewhat obvious from looking at the URL, but could also be determined by using programs such as Fiddler, Firebug, or HttpWatch.

After urlencoding and concatenating, we would form the following string:

search_submit=fileexchange&term=undocumented+matlab&query=undocumented+matlab
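
For illustration, here is a minimal sketch of how such a string might be assembled using Matlab's urlencode. This is not the actual implementation of http_paramsToString, just the general idea:

```matlab
params = {'search_submit','fileexchange', 'term','undocumented matlab', 'query','undocumented matlab'};

queryString = '';
for k = 1:2:numel(params)
    % encode the property name and its value separately, then join them with '='
    pair = [urlencode(params{k}) '=' urlencode(params{k+1})];
    if isempty(queryString)
        queryString = pair;
    else
        queryString = [queryString '&' pair]; %#ok<AGROW>
    end
end
disp(queryString)
```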

This functionality is normally accomplished internally in urlread, but I use a function http_paramsToString to produce that result. That function also returns the required header for POST requests. The following is an example of both GET and POST requests:

[queryString,header] = http_paramsToString(params,1);
 
% For GET:
url = [url '?' queryString];
urlread2(url)
 
% For POST:
urlread2(url,'POST',queryString,header)

Addressing Problem 3 – Response header

According to the HTTP protocol, each server response starts with a simple header that indicates a numeric response status. The following Matlab code provides access to the status line using the urlConnection object:

status = struct('value',urlConnection.getResponseCode(), 'msg',char(urlConnection.getResponseMessage))
status = 
    value: 200
      msg: 'OK'

urlConnection’s getHeaderField() and getHeaderFieldKey() methods enable reading the specific parts of the response header:

headerValue = char(urlConnection.getHeaderField(headerIndex));
headerName  = char(urlConnection.getHeaderFieldKey(headerIndex));

headerIndex starts at 0 and increases by 1 until both headerValue and headerName return empty.

It is important to note that header keys (names) can be repeated for different values. Sometimes this is desired, such as if there are multiple cookies being sent to the user. To generically handle this case, two header structures are returned. In both cases the header names are the field names in the structure, after replacing hyphens with underscores. In one case, allHeaders, the values are cell arrays of strings containing all values presented with the particular key. The other structure, firstHeaders, contains only the first instance of the header as a string to avoid needing to dereference a cell array.
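
The following sketch shows one way of building the two structures from a list of header names and values. The sample headers are made up, and the loop is only an approximation of the approach described above, not the actual urlread2 code:

```matlab
% Sample response headers (made up); note the repeated Set-Cookie key
names  = {'Content-Type', 'Set-Cookie', 'Set-Cookie'};
values = {'text/html',    'a=1',        'b=2'};

allHeaders   = struct();
firstHeaders = struct();
for k = 1:numel(names)
    field = strrep(names{k}, '-', '_');        % hyphens are not valid in field names
    if isfield(allHeaders, field)
        allHeaders.(field){end+1} = values{k}; % repeated key: append to the cell array
    else
        allHeaders.(field)   = values(k);      % first occurrence: 1x1 cell array
        firstHeaders.(field) = values{k};      % plain string, no cell dereferencing needed
    end
end
```

With this input, allHeaders.Set_Cookie holds both cookie strings, while firstHeaders.Set_Cookie holds only 'a=1'.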

Addressing Problem 4 – Response body

urlread assumes text output. This is fine for most webpages, which use HTML and are therefore text-based. However, urlread fails when trying to download any non-text resource such as an image, a ZIP file, or a PDF document. I have added a flag in urlread2 called CAST_OUTPUT, which defaults to true, i.e. text response, just as urlread assumes. Using varargin, this flag can be set to false ({'CAST_OUTPUT',false}) to indicate a binary response.

Summary

urlread2’s functionality has been expanded to also address other limitations of urlread: It enables binary inputs, better character-set handling of the output, redirection following, and read timeouts.

The modifications described above provide direct access to the key components of the HTTP request and response messages. Its more generic nature lets urlread2 focus on HTTP transmission, and leaves request formation and response interpretation up to the user. I think ultimately this approach is better than providing one-off modifications of the original urlread function to suit a particular need. urlread2 and supporting files can be found on the Matlab File Exchange.

Categories: Guest bloggers, Java, Low risk of breaking in future versions, Stock Matlab function

Matlab’s internal memory representation

Once again I’d like to welcome guest blogger Peter Li. Peter wrote about Matlab Mex in-place editing last month. Today, Peter pokes around in Matlab’s internal memory representation to the greater good and glory of Matlab Mex programming.

Disclaimer: The information in this article is provided for informational purposes only. Be aware that poking into Matlab’s internals is not condoned or supported by MathWorks, and is not recommended for any regular usage. Poking into memory has the potential to crash your computer so save your data! Moreover, be advised (as the text below will show) that the information is highly prone to change without any advance notice in future Matlab releases, which could lead to very adverse effects on any program that relies on it. On the scale of undocumented Matlab topics, this practically breaks the scale, so be EXTREMELY careful when using this.

A few weeks ago I discussed Matlab’s copy-on-write mechanism as part of my discussion of editing Matlab arrays in-place. Today I want to explore some behind-the-scenes details of how the copy-on-write mechanism is implemented. In the process we will learn a little about Matlab’s internal array representation. I will also introduce some simple tools you can use to explore more of Matlab’s internals. I will only cover basic information, so there are plenty more details left to be filled in by others who are interested.

Brief review of copy-on-write and mxArray

Copy-on-write is Matlab’s mechanism for avoiding unnecessary duplication of data in memory. To implement this, Matlab needs to keep track internally of which sets of variables are copies of each other. As described in MathWorks’s article, “the Matlab language works with a single object type: the Matlab array. All Matlab variables (including scalars, vectors, matrices, strings, cell arrays, structures, and objects) are stored as Matlab arrays. In C/C++, the Matlab array is declared to be of type mxArray“. This means that mxArray defines how Matlab lays out all the information about an array (its Matlab data type, its size, its data, etc.) in memory. So understanding Matlab’s internal array representation basically boils down to understanding mxArray.

Unfortunately, MathWorks also tells us that “mxArray is a C language opaque type“. This means that MathWorks does not expose the organization of mxArray to users (i.e. Matlab or Mex programmers). Instead, MathWorks defines mxArray internally, and allows users to interact with it only through an API, a set of functions that know how to handle mxArray in their back end. So, for example, a Mex programmer does not get the dimensions of an mxArray by directly accessing the relevant field in memory. Instead, the Mex programmer only has a pointer to the mxArray, and passes this pointer into an API function that knows where in memory to find the requested information and then passes the result back to the programmer.

This is generally a good thing: the API provides an abstraction layer between the programmer and the memory structures so that if MathWorks needs to change the back end organization (to add a new feature for example), we programmers do not need to modify our code; instead MathWorks just updates the API to reflect the new internal organization. On the other hand, being able to look into the internal structure of mxArray on occasion can help us understand how Matlab works, and can help us write more efficient code if we are careful, as in the example of editing arrays in-place.

So how do we get a glimpse inside mxArray? The first step is simply to find the region of memory where the mxArray lives: its beginning and end. Finding where in memory the mxArray begins is pretty easy: it is given by its pointer value. Here is a simple Mex function that takes a Matlab array as input and prints its memory address:

/* printaddr.cpp */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   if (nrhs < 1) mexErrMsgTxt("One input required.");
   printf("%p\n", prhs[0]);
}

This function is nice as it prints the address in a standard hexadecimal format. The same information can also be received directly in Matlab (i.e., without needing printaddr), using the undocumented format debug command (here’s another reference):

>> format debug
 
>> A = 1:10
A =
Structure address = 7fc3b8869ae0
m = 1
n = 10
pr = 7fc44922c890
pi = 0
     1     2     3     4     5     6     7     8     9    10
 
>> printaddr(A)
7fc3b8869ae0

To play with this further from within Matlab however, it's nice to have the address returned to us as a 64-bit unsigned integer; here's a Mex function that does that:

/* getaddr.cpp */
#include <stdint.h>
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   if (nrhs < 1) mexErrMsgTxt("One input required.");
   plhs[0] = mxCreateNumericMatrix(1, 1, mxUINT64_CLASS, mxREAL);
   /* use a fixed-width 64-bit type: unsigned long is only 32 bits on 64-bit Windows */
   uint64_t *out = static_cast<uint64_t *>(mxGetData(plhs[0]));
   out[0] = reinterpret_cast<uintptr_t>(prhs[0]);
}

Here's getaddr in action:

>> getaddr(A)
ans = 
           139870853618400
 
% And using pure Matlab:
>> hex2dec('7f36388b5ae0')  % output of printaddr or format debug
ans =
           139870853618400

So now we know where to find our array in memory. With this information we can already learn a lot. To make our exploration a little cleaner though, it would be nice to know where the array ends in memory too; in other words, we would like to know the size of the mxArray.

Finding the structure of mxArray

The first thing to understand is that the amount of memory taken by an mxArray does not have anything to do with the dimensions of the array in Matlab. So a 1x1 Matlab array and a 100x100 Matlab array have the same size mxArray representation in memory. As you will know if you have experience programming in Mex, this is simply because the Matlab array's data contents are not stored directly within mxArray. Instead, mxArray only stores a pointer to another memory location where the actual data reside. This is fine; the internal information we want to poke into is all still in mxArray, and it is easy to get the pointer to the array's data contents using the API functions mxGetData or mxGetPr.

So we are still left with trying to figure out the size of mxArray. There are a couple of paths forward. First I want to talk about a historical tool that used to make a lot of this internal information easily available. This was a function called headerdump, by Peter Boettcher (described here and here). headerdump was created for exactly the goal we are currently working towards: to understand Matlab's copy-on-write mechanism. Unfortunately, as Matlab has evolved, newer versions have incrementally broken this useful tool. So our goal here is to create a replacement. Still, we can learn a lot from the earlier work.

One of the things that helped people figure out Matlab's internals in the past is that in older versions of Matlab mxArray is not a completely opaque type. Even in recent versions up through at least R2010a, if you look into $MATLAB/extern/include/matrix.h you can find a definition of mxArray_tag that looks something like this:

/* R2010a */
struct mxArray_tag {
   void  *reserved;
   int    reserved1[2];
   void  *reserved2;
   size_t  number_of_dims;
   unsigned int reserved3;
   struct {
       unsigned int  flag0  : 1;
       unsigned int  flag1  : 1;
       unsigned int  flag2  : 1;
       unsigned int  flag3  : 1;
       unsigned int  flag4  : 1;
       unsigned int  flag5  : 1;
       unsigned int  flag6  : 1;
       unsigned int  flag7  : 1;
       unsigned int  flag7a : 1;
       unsigned int  flag8  : 1;
       unsigned int  flag9  : 1;
       unsigned int  flag10 : 1;
       unsigned int  flag11 : 4;
       unsigned int  flag12 : 8;
       unsigned int  flag13 : 8;
   }   flags;
   size_t reserved4[2];
   union {
       struct {
           void  *pdata;
           void  *pimag_data;
           void  *reserved5;
           size_t reserved6[3];
       }   number_array;
   }   data;
};

This is what you could call murky or obfuscated, but not completely opaque. The fields mostly have unhelpful names like "reserved", but on the other hand we at least have a sense for what fields there are and their layout.

A more informative (yet unofficial) definition was provided by James Tursa and Peter Boettcher:

#include "mex.h"
/* Definition of structure mxArray_tag for debugging purposes. Might not be fully correct 
 * for Matlab 2006b or 2007a, but the important things are. Thanks to Peter Boettcher.
 */
struct mxArray_tag {
  const char *name;
  mxClassID class_id;
  int vartype;
  mxArray    *crosslink;
  int      number_of_dims;
  int      refcount;
  struct {
    unsigned int    scalar_flag : 1;
    unsigned int    flag1 : 1;
    unsigned int    flag2 : 1;
    unsigned int    flag3 : 1;
    unsigned int    flag4 : 1;
    unsigned int    flag5 : 1;
    unsigned int    flag6 : 1;
    unsigned int    flag7 : 1;
    unsigned int    private_data_flag : 1;
    unsigned int    flag8 : 1;
    unsigned int    flag9 : 1;
    unsigned int    flag10 : 1;
    unsigned int    flag11 : 4;
    unsigned int    flag12 : 8;
    unsigned int    flag13 : 8;
  }   flags;
  int  rowdim;
  int  coldim;
  union {
    struct {
      double  *pdata;       // original: void*
      double  *pimag_data;  // original: void*
      void *irptr;
      void  *jcptr;
      int   nelements;
      int   nfields;
    }   number_array;
    struct {
      mxArray **pdata;
      char  *field_names;
      void  *dummy1;
      void  *dummy2;
      int   dummy3;
      int   nfields;
    }   struct_array;
    struct {
      void *pdata;  /*mxGetInfo*/
      char *field_names;
      char *name;
      int checksum;
      int  nelements;
      int  reserved;
    }  object_array;
  }   data;
};

For comparison, here is another definition from an earlier version of Matlab.

/* R11 aka Matlab 5.3 (1999) */
struct mxArray_tag {
  char name[mxMAXNAM];
  int  class_id;
  int  vartype;
  mxArray *crosslink;
  int  number_of_dims;
  int  nelements_allocated;
  int  dataflags;
  int  rowdim;
  int  coldim;
  union {
    struct {
      void *pdata;
      void *pimag_data;
      void *irptr;
      void *jcptr;
      int   reserved;
      int   nfields;
    }   number_array;
  }   data;
};

I took this R11 definition from the source code to headerdump (specifically, from mxinternals.h, which also has mxArray_tag definitions for R12 (Matlab 6.0) and R13 (Matlab 6.5)), and you can see that it is much more informative, because many fields have been given useful names thanks to the work of Peter Boettcher and others. Note also that the definition from this old version of Matlab is quite different from the version from R2010a.

At this point, if you are running a much earlier version of Matlab like R11 or R13, you can break off from the current article and start playing around with headerdump directly to try to understand Matlab's internals. For more recent versions of Matlab, we have more work to do. Getting back to our original goal, if we take the mxArray_tag definition from R2010a and run sizeof, we get an answer for the amount of memory taken up by an mxArray in R2010a: 104 bytes.

Determining the size of mxArray

It was nice to derive the size of mxArray from actual MathWorks code, but unfortunately this information is no longer available as of R2011a. Somewhere between R2010a and R2011a, MathWorks stepped up their efforts to make mxArray completely opaque. So we should find another way to get the size of mxArray for current and future Matlab versions.

One ugly trick that works is to create many new arrays quickly and see where their starting points end up in memory:

>> A = num2cell(1:100)';
>> addrs = sort(cellfun(@getaddr, A));

What we did here is create 100 new arrays, and then get all their memory addresses in sorted order. Now we can take a look at how far apart these new arrays ended up in memory:

>> semilogy(diff(addrs));

The resulting plot will look different each time you run this; it is not really predictable where Matlab will put new arrays into memory. Here is an example from my system:

Plot of memory addresses

Your results may look different, and you might have to increase the number of new arrays from 100 to 1000 to get the qualitative result, but the important feature of this plot is that there is a minimum distance between new arrays of about 10^2 bytes. In fact, if we just go straight for this minimum distance:

>> min(diff(addrs))
ans = 
            104

we find that although mxArray has gone completely opaque from R2010a to R2011a, the full size of mxArray in memory has stayed the same: 104 bytes.

Dumping mxArray from memory

We now have all the information we need to start looking into Matlab's array representation. There are many tools available that allow you to browse memory locations or dump memory contents to disk. For our purposes though, it is nice to be able to do everything from within Matlab. Therefore I introduce a simple tool that prints memory locations into the Matlab console:

/* printmem.cpp */
#include <cstdint>
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
  if (nrhs < 1 || !mxIsUint64(prhs[0]) || mxIsEmpty(prhs[0]))
    mexErrMsgTxt("First argument must be a uint64 memory address");
  // uint64_t is 64 bits wide on every platform (unsigned long is only 32 bits on 64-bit Windows)
  uint64_t *addr = static_cast<uint64_t *>(mxGetData(prhs[0]));
  unsigned char *mem = reinterpret_cast<unsigned char *>(static_cast<uintptr_t>(addr[0]));
 
  if (nrhs < 2 || !mxIsDouble(prhs[1]) || mxIsEmpty(prhs[1]))
    mexErrMsgTxt("Second argument must be a double-type integer byte size.");
  unsigned int nbytes = static_cast<unsigned int>(mxGetScalar(prhs[1]));
 
  // Print each byte as two hexadecimal digits, 16 bytes per row
  for (unsigned int i = 0; i < nbytes; i++) {
    printf("%.2x ", mem[i]);
    if ((i+1) % 16 == 0) printf("\n");
  }
  printf("\n");
}

Here is how you use it in Matlab:

>> A = 0;
>> printmem(getaddr(A), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 70 fa 33 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

And there you have it: the inner guts of mxArray laid bare. I have printed each byte as a two character hexadecimal value, as is standard, so there are 16 bytes printed per row.

What does it mean?

So now we have 104 bytes of Matlab internals to dig into. We can start playing with this with a few simple examples:

>> A = 0; B = 1;
>> printmem(getaddr(A), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 c0 b0 27 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

>> printmem(getaddr(B), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 70 fa 33 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

In this and subsequent examples, I will point out bytes that are different or that are of interest. What we can see from this example is that although arrays A and B have different content, almost nothing is different between their mxArray representations. What is different is the memory address stored in the last 8 bytes of the fourth row (bytes 56-63). This confirms our earlier assertion that mxArray does not store the array contents, but only a pointer to the content location.

Now let us try to figure out some of the other fields:

>> A = 1:3; B = 1:10; C = (1:10)';
>> printmem(getaddr(A), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
03 00 00 00 00 00 00 00 60 80 22 df 6f 7f 00 00

>> printmem(getaddr(B), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 80 83 29 df 6f 7f 00 00

>> printmem(getaddr(C), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 0a 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 80 83 29 df 6f 7f 00 00

(Note that this time I only printed the first four lines of each array as this is where the interesting differences are for this example.)

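The bytes of interest here are those that give each array's number of rows (the last 8 bytes of the third row) and columns (the first 8 bytes of the fourth row); note that hexadecimal 0a is 10 in decimal. The value "02" also appears in two places (the ninth byte of the second row and the sixth byte of the third row), either of which could be the location for storing the number of dimensions. Let us look into this more: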

>> A = rand([3 3 3]);
>> printmem(getaddr(A), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 30 4a 3f df 6f 7f 00 00
09 00 00 00 00 00 00 00 b0 d3 24 df 6f 7f 00 00

Two interesting results here: the ninth byte of the second row changed from 02 to 03, so this must be the place where mxArray indicates a 3-dimensional array rather than 2D. Another important thing also changed though: in the last 8 bytes of the third row there is a new memory address stored where we used to find the number of rows. And in the first 8 bytes of the fourth row we now have the number 09 instead of the number of columns.

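Clearly, Matlab has a different way of representing a 2D matrix versus arrays of higher dimension such as 3D. In the 2D case, mxArray simply holds the nrows and ncols directly, but for a higher-dimension case it holds the number of dimensions (03), a pointer to another memory location (0x7f6fdf3f4a30) which holds the array of sizes for each dimension, and the value 09. Note that 09 cannot be the total number of elements, since a 3x3x3 array has 27; it matches the product of the second and third dimensions (3x3), which is also what the documented mxGetN function returns for N-dimensional arrays.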

The copy-on-write mechanism

Finally, we are in a position to understand how Matlab internally implements copy-on-write:

>> A = 1:10;
>> printmem(getaddr(A), 64);
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00

>> B = A;
>> printaddr(B);
0x7f6f4c7b6810

>> printmem(getaddr(A), 64);
10 68 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
10 68 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00

What we see is that by setting B = A, we change the internal representation of A itself. Two new memory address pointers are added to the mxArray for A. As it turns out, both of these point to the address for array B, which makes sense; this is how Matlab internally keeps track of arrays that are copies of each other. Note that because byte order is little-endian, the memory addresses from printmem are byte-wise, i.e. every two characters, reversed relative to the address from printaddr.

We can also look into array B:

>> printmem(getaddr(B), 64);
f0 41 7a 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
f0 41 7a 4c 6f 7f 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00

>> printaddr(A);
0x7f6f4c7a41f0

There are two interesting points here. First, the first 8 bytes of rows one and two show that array B has pointers back to array A. Second, the last 8 bytes of the fourth row show that the Matlab data for array B actually just points back to the same memory as the data for array A (the values 1:10).

Finally, we would like to understand why there are two pointers added. Let us see what happens if we add a third linked variable:

>> C = B;
>> printaddr(A); printaddr(B); printaddr(C);
0x7f6f4c7a41f0
0x7f6f4c7b6810
0x7f6f4c7b69b0

>> printmem(getaddr(A), 32)
b0 69 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
10 68 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00

>> printmem(getaddr(B), 32)
f0 41 7a 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
b0 69 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00

>> printmem(getaddr(C), 32)
10 68 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
f0 41 7a 4c 6f 7f 00 00 02 00 00 00 00 00 00 00

So it turns out that Matlab keeps track of a set of linked variables with a kind of cyclical, doubly-linked list structure; array A is linked to B in the forward direction and is also linked to C in the reverse direction by looping back around, etc. The cyclical nature of this makes sense, as we need to be able to start from any of A, B, or C and find all the linked arrays. But it is still not entirely clear why the list needs to be cyclical AND linked in both directions. In fact, in earlier versions of Matlab this cyclical list was only singly-linked.

Conclusions

Obviously there is a lot more to mxArray and Matlab internals than what we have delved into here. Still, with this basic introduction I hope to have whetted your appetite for understanding more about Matlab internals, and provided some simple tools to help you explore. I want to reiterate that in general MathWorks's approach of an opaque mxArray type with access abstracted through an API layer is a good policy. The last thing you would want to do is take the information here and write a bunch of code that relies on the structure of mxArray to work; next time MathWorks needs to add a new feature and change mxArray, all your code will break. So in general we are all better off playing within the API that MathWorks provides. And remember: poking into memory can crash your computer, so save your data!

On the other hand, occasionally there are cases, like in-place editing, where it is useful to push the capabilities of Matlab a little beyond what MathWorks envisioned. In these cases, having an understanding of Matlab's internals can be critical, for example in understanding how to avoid conflicting with copy-on-write. Therefore I hope the information presented here will prove useful. Ideally, someone will be motivated to take this starting point and repair some of the tools like headerdump that made Matlab's internal workings more transparent in the past. I believe that having more of this information out in the Matlab community is good for the community as a whole.

Categories: Guest bloggers, High risk of breaking in future versions, Memory, Mex, Undocumented feature

Java stack traces in Matlab

When debugging Java events in Matlab callbacks, it is sometimes useful to check the stack trace of the originating Java code. Matlab’s built-in dbstack function only reports the stack-trace of Matlab code, and any prior Java code in the stack trace is not reported. Knowing this information is also extremely important when debugging Java components that are used in Matlab, especially when using the Java-to-Matlab Interface (JMI).

Let’s use a specific example to illustrate: Matlab’s uitable passes information back and forth between its underlying Java code and Matlab, via the arrayviewfunc Matlab function (%matlabroot%/toolbox/matlab/codetools/arrayviewfunc.m). If we place a breakpoint in arrayviewfunc and then update the table’s data, dbstack will only report the Matlab stack:

% Prepare an empty uitable
>> hTable = uitable('ColumnName',{'a','b','c'});
 
% Place a breakpoint in arrayviewfunc
>> dbstop in arrayviewfunc at reportValuesCallback
>> dbstatus
Breakpoint for arrayviewfunc>reportValuesCallback is on line 588.
 
% Update the table data and wait for the breakpoint to trigger
>> set(hTable,'Data',magic(3));
 
% Check the Matlab stack trace – no sign of the invoking Java code
K>> dbstack
> In arrayviewfunc>reportValuesCallback at 588
  In arrayviewfunc at 42

To see the originating Java stack trace, we can use java.lang.Thread’s static dumpStack() method, which spills the Java stack trace onto the stderr console (it appears in red in Matlab’s Command Window):

K>> java.lang.Thread.dumpStack
java.lang.Exception: Stack trace
   at java.lang.Thread.dumpStack(Unknown Source)
   at com.mathworks.jmi.NativeMatlab.SendMatlabMessage(Native Method)
   at com.mathworks.jmi.NativeMatlab.sendMatlabMessage(NativeMatlab.java:219)
   at com.mathworks.jmi.MatlabLooper.sendMatlabMessage(MatlabLooper.java:121)
   at com.mathworks.jmi.Matlab.mtFeval(Matlab.java:1550)
   at com.mathworks.hg.peer.ui.table.DefaultUIStyleTableModel$UITableValueTableModel$1.runOnMatlabThread(DefaultUIStyleTableModel.java:467)
   at com.mathworks.jmi.MatlabWorker$2.run(MatlabWorker.java:79)
   at com.mathworks.jmi.NativeMatlab.dispatchMTRequests(NativeMatlab.java:364)

To access and possibly parse the originating Java stack trace, we can use the following trick:

K>> st = java.lang.Thread.currentThread.getStackTrace;
K>> for idx = 2 : length(st), disp(st(idx)); end
com.mathworks.jmi.NativeMatlab.SendMatlabMessage(Native Method)
com.mathworks.jmi.NativeMatlab.sendMatlabMessage(NativeMatlab.java:219)
com.mathworks.jmi.MatlabLooper.sendMatlabMessage(MatlabLooper.java:121)
com.mathworks.jmi.Matlab.mtFeval(Matlab.java:1550)
com.mathworks.hg.peer.ui.table.DefaultUIStyleTableModel$UITableValueTableModel$1.runOnMatlabThread(DefaultUIStyleTableModel.java:467)
com.mathworks.jmi.MatlabWorker$2.run(MatlabWorker.java:79)
com.mathworks.jmi.NativeMatlab.dispatchMTRequests(NativeMatlab.java:364)

Each of the stack trace elements can be inspected, to get specific properties:

K>> st(1).getFileName
ans =
     []		% empty = unknown
 
K>> st(2).get
	Class = [ (1 by 1) java.lang.Class array]
	ClassName = com.mathworks.jmi.NativeMatlab
	FileName = NativeMatlab.java
	LineNumber = [-2]
	MethodName = SendMatlabMessage
	NativeMethod = on
 
K>> st(2).isNativeMethod
ans =
     1		% 1 = true
 
K>> char(st(2).getFileName)  % cast java.lang.String to a Matlab char
ans =
NativeMatlab.java
 
K>> st(2).getLineNumber
ans =
    -2		% negative = unknown
 
K>> st(5).get
	Class = [ (1 by 1) java.lang.Class array]
	ClassName = com.mathworks.jmi.Matlab
	FileName = Matlab.java
	LineNumber = [1550]
	MethodName = mtFeval
	NativeMethod = off

This works well in JVM 1.5 (i.e., Matlab 7.0.4 or R14 SP2) and higher. In older Matlab releases you can use a slight modification:

K>> t = java.lang.Throwable; st=t.getStackTrace;
K>> for idx = 1 : length(st), disp(st(idx)); end
com.mathworks.jmi.NativeMatlab.SendMatlabMessage(Native Method)
... (etc.)
Categories: Java, Low risk of breaking in future versions

Profiling Matlab memory usage

Anyone who has had experience with real-life applications knows that memory usage can have a significant impact on the application’s usability, in aspects such as performance, interactivity, and even (on some lousy memory-management Operating Systems) crashes/hangs.

In Matlab releases of the past few years, this has been addressed by expanding the information reported by the built-in memory function. In addition, an undocumented feature was added to the Matlab Profiler that enables monitoring memory usage.


Profile report with memory & JIT info

In Matlab release R2008a (but not on newer releases) we could also use a nifty parameter of the undocumented feature function:

>> feature mtic; a=ones(100); feature mtoc
ans = 
      TotalAllocated: 84216
          TotalFreed: 2584
    LargestAllocated: 80000
           NumAllocs: 56
            NumFrees: 43
                Peak: 81640

As can easily be seen in this example, allocating 100^2 doubles (8 bytes each) requires 80000 bytes of allocation, plus some 4KB of other memory that was allocated (and 2KB freed) within the function ones. Running the same code line again gives a very similar result, but now there are 80000 more bytes freed when the matrix a is overwritten:

>> feature mtic; a=ones(100); feature mtoc
ans = 
      TotalAllocated: 84120
          TotalFreed: 82760
    LargestAllocated: 80000
           NumAllocs: 54
            NumFrees: 49
                Peak: 81328

This is pretty informative and very handy for debugging memory bottlenecks. Unfortunately, starting in R2008b, features mtic and mtoc are no longer supported “under the current memory manager”. Sometime around 2010 the mtic and mtoc features were completely removed. Users of R2008b and newer releases therefore need to use the internal structs returned by the memory function, and/or use the profiler’s memory-monitoring feature. If you ask me, using mtic/mtoc was much simpler and easier. I for one miss these features.

In a related matter, if we wish to monitor Java’s memory used within Matlab, we are in a bind, because there are no built-in tools to help us. There are, however, several JVM switches that can be turned on in the java.opts file: -Xrunhprof[:help]|[:option=value,...], -Xprof, -Xrunprof, -XX:+PrintClassHistogram and so on. There are also several memory-monitoring (so-called “heap-walking”) tools: the standard JDK jconsole, jmap, jhat and jvisualvm (with its useful plugins) provide good basic coverage. MathWorks has posted a tutorial on using jconsole with Matlab. There are a number of other third-party tools such as JMP (for JVMs 1.5 and earlier) or TIJMP (for JVM 1.6). Within Matlab, we can use utilities such as Classmexer to estimate a particular object’s size (both shallow and deep referencing), or use java.lang.Runtime.getRuntime()’s methods (maxMemory(), freeMemory() and totalMemory()) to monitor overall Java memory (sample usage).

Specifically in R2011b (but in no other release), we can also use a built-in Java memory monitor. Unfortunately, this simple and yet useful memory monitor was removed in R2012a (or maybe it was just moved to another package and I haven’t found out where… yet…):

com.mathworks.xwidgets.JavaMemoryMonitor.invoke

Matlab R2011b's Java memory monitor

As I have already noted quite often, using undocumented Matlab features and functions carries the risk that they will not be supported in some future Matlab release. Today’s article is a case in point.

Categories: High risk of breaking in future versions, Memory, Stock Matlab function, Undocumented function