Peter Li – Undocumented Matlab

Matlab's internal memory representation

Yair Altman — Thu, 15 Mar 2012 18:11:23 +0000

Once again I’d like to welcome guest blogger Peter Li. Peter wrote about Matlab Mex in-place editing last month. Today, Peter pokes around in Matlab’s internal memory representation to the greater good and glory of Matlab Mex programming.
Disclaimer: The information in this article is provided for informational purposes only. Be aware that poking into Matlab’s internals is not condoned or supported by MathWorks, and is not recommended for any regular usage. Poking into memory has the potential to crash your computer so save your data! Moreover, be advised (as the text below will show) that the information is highly prone to change without any advance notice in future Matlab releases, which could lead to very adverse effects on any program that relies on it. On the scale of undocumented Matlab topics, this practically breaks the scale, so be EXTREMELY careful when using this.
A few weeks ago I discussed Matlab’s copy-on-write mechanism as part of my discussion of editing Matlab arrays in-place. Today I want to explore some behind-the-scenes details of how the copy-on-write mechanism is implemented. In the process we will learn a little about Matlab’s internal array representation. I will also introduce some simple tools you can use to explore more of Matlab’s internals. I will only cover basic information, so there are plenty more details left to be filled in by others who are interested.

Brief review of copy-on-write and mxArray

Copy-on-write is Matlab’s mechanism for avoiding unnecessary duplication of data in memory. To implement this, Matlab needs to keep track internally of which sets of variables are copies of each other. As described in MathWorks’s article, “the Matlab language works with a single object type: the Matlab array. All Matlab variables (including scalars, vectors, matrices, strings, cell arrays, structures, and objects) are stored as Matlab arrays. In C/C++, the Matlab array is declared to be of type mxArray“. This means that mxArray defines how Matlab lays out all the information about an array (its Matlab data type, its size, its data, etc.) in memory. So understanding Matlab’s internal array representation basically boils down to understanding mxArray.
Unfortunately, MathWorks also tells us that “mxArray is a C language opaque type“. This means that MathWorks does not expose the organization of mxArray to users (i.e. Matlab or Mex programmers). Instead, MathWorks defines mxArray internally, and allows users to interact with it only through an API, a set of functions that know how to handle mxArray in their back end. So, for example, a Mex programmer does not get the dimensions of an mxArray by directly accessing the relevant field in memory. Instead, the Mex programmer only has a pointer to the mxArray, and passes this pointer into an API function that knows where in memory to find the requested information and then passes the result back to the programmer.
This is generally a good thing: the API provides an abstraction layer between the programmer and the memory structures so that if MathWorks needs to change the back end organization (to add a new feature for example), we programmers do not need to modify our code; instead MathWorks just updates the API to reflect the new internal organization. On the other hand, being able to look into the internal structure of mxArray on occasion can help us understand how Matlab works, and can help us write more efficient code if we are careful as in the example of editing arrays in-place.
So how do we get a glimpse inside mxArray? The first step is simply to find the region of memory where the mxArray lives: its beginning and end. Finding where in memory the mxArray begins is pretty easy: it is given by its pointer value. Here is a simple Mex function that takes a Matlab array as input and prints its memory address:

/* printaddr.cpp */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   if (nrhs < 1) mexErrMsgTxt("One input required.");
   printf("%p\n", prhs[0]);
}

This function is nice as it prints the address in a standard hexadecimal format. The same information can also be received directly in Matlab (i.e., without needing printaddr), using the undocumented format debug command (here's another reference):

>> format debug
>> A = 1:10
A =
Structure address = 7fc3b8869ae0
m = 1
n = 10
pr = 7fc44922c890
pi = 0
     1     2     3     4     5     6     7     8     9    10
>> printaddr(A)
7fc3b8869ae0

To play with this further from within Matlab however, it’s nice to have the address returned to us as a 64-bit unsigned integer; here’s a Mex function that does that:

/* getaddr.cpp */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   if (nrhs < 1) mexErrMsgTxt("One input required.");
   plhs[0] = mxCreateNumericMatrix(1, 1, mxUINT64_CLASS, mxREAL);
   unsigned long *out = static_cast(mxGetData(plhs[0]));
   out[0] = (unsigned long) prhs[0];
}

Here’s getaddr in action:

>> getaddr(A)
ans =
           139870853618400
% And using pure Matlab:
>> hex2dec('7f36388b5ae0')  % output of printaddr or format debug
ans =
           139870853618400

So now we know where to find our array in memory. With this information we can already learn a lot. To make our exploration a little cleaner though, it would be nice to know where the array ends in memory too, in other words we would like to know the size of the mxArray.

Finding the structure of mxArray

The first thing to understand is that the amount of memory taken by an mxArray does not have anything to do with the dimensions of the array in Matlab. So a 1×1 Matlab array and a 100×100 Matlab array have the same size mxArray representation in memory. As you will know if you have experience programming in Mex, this is simply because the Matlab array’s data contents are not stored directly within mxArray. Instead, mxArray only stores a pointer to another memory location where the actual data reside. This is fine; the internal information we want to poke into is all still in mxArray, and it is easy to get the pointer to the array’s data contents using the API functions mxGetData or mxGetPr.
So we are still left with trying to figure out the size of mxArray. There are a couple paths forward. First I want to talk about a historical tool that used to make a lot of this internal information easily available. This was a function called headerdump, by Peter Boetcher (described here and here). headerdump was created for exactly the goal we are currently working towards: to understand Matlab’s copy-on-write mechanism. Unfortunately, as Matlab has evolved, newer versions have incrementally broken this useful tool. So our goal here is to create a replacement. Still, we can learn a lot from the earlier work.
One of the things that helped people figure out Matlab’s internals in the past is that in older versions of Matlab mxArray is not a completely opaque type. Even in recent versions up through at least R2010a, if you look into $MATLAB/extern/include/matrix.h you can find a definition of mxArray_tag that looks something like this:

/* R2010a */
struct mxArray_tag {
   void  *reserved;
   int    reserved1[2];
   void  *reserved2;
   size_t  number_of_dims;
   unsigned int reserved3;
   struct {
       unsigned int  flag0  : 1;
       unsigned int  flag1  : 1;
       unsigned int  flag2  : 1;
       unsigned int  flag3  : 1;
       unsigned int  flag4  : 1;
       unsigned int  flag5  : 1;
       unsigned int  flag6  : 1;
       unsigned int  flag7  : 1;
       unsigned int  flag7a : 1;
       unsigned int  flag8  : 1;
       unsigned int  flag9  : 1;
       unsigned int  flag10 : 1;
       unsigned int  flag11 : 4;
       unsigned int  flag12 : 8;
       unsigned int  flag13 : 8;
   }   flags;
   size_t reserved4[2];
   union {
       struct {
           void  *pdata;
           void  *pimag_data;
           void  *reserved5;
           size_t reserved6[3];
       }   number_array;
   }   data;
};

This is what you could call murky or obfuscated, but not completely opaque. The fields mostly have unhelpful names like “reserved”, but on the other hand we at least have a sense for what fields there are and their layout.
A more informative (yet unofficial) definition was provided by James Tursa and Peter Boetcher:

#include "mex.h"
/* Definition of structure mxArray_tag for debugging purposes. Might not be fully correct
 * for Matlab 2006b or 2007a, but the important things are. Thanks to Peter Boettcher.
 */
struct mxArray_tag {
  const char *name;
  mxClassID class_id;
  int vartype;
  mxArray    *crosslink;
  int      number_of_dims;
  int      refcount;
  struct {
    unsigned int    scalar_flag : 1;
    unsigned int    flag1 : 1;
    unsigned int    flag2 : 1;
    unsigned int    flag3 : 1;
    unsigned int    flag4 : 1;
    unsigned int    flag5 : 1;
    unsigned int    flag6 : 1;
    unsigned int    flag7 : 1;
    unsigned int    private_data_flag : 1;
    unsigned int    flag8 : 1;
    unsigned int    flag9 : 1;
    unsigned int    flag10 : 1;
    unsigned int    flag11 : 4;
    unsigned int    flag12 : 8;
    unsigned int    flag13 : 8;
  }   flags;
  int  rowdim;
  int  coldim;
  union {
    struct {
      double  *pdata;       // original: void*
      double  *pimag_data;  // original: void*
      void    *irptr;
      void    *jcptr;
      int     nelements;
      int     nfields;
    }   number_array;
    struct {
      mxArray **pdata;
      char    *field_names;
      void    *dummy1;
      void    *dummy2;
      int     dummy3;
      int     nfields;
    }   struct_array;
    struct {
      void  *pdata;  /*mxGetInfo*/
      char  *field_names;
      char  *name;
      int   checksum;
      int   nelements;
      int   reserved;
    }  object_array;
  }   data;
};

For comparison, here is another definition from an earlier version of Matlab.

/* R11 aka Matlab 5.0 (1999) */
struct mxArray_tag {
  char name[mxMAXNAM];
  int  class_id;
  int  vartype;
  mxArray *crosslink;
  int  number_of_dims;
  int  nelements_allocated;
  int  dataflags;
  int  rowdim;
  int  coldim;
  union {
    struct {
      void *pdata;
      void *pimag_data;
      void *irptr;
      void *jcptr;
      int   reserved;
      int   nfields;
    }   number_array;
  }   data;
};

I took this R11 definition from the source code to headerdump (specifically, from mxinternals.h, which also has mxArray_tag definitions for R12 (Matlab 6.0) and R13 (Matlab 6.5)), and you can see that it is much more informative, because many fields have been given useful names thanks to the work of Peter Boetcher and others. Note also that the definition from this old version of Matlab is quite different from the version from R2010a.
At this point, if you are running a much earlier version of Matlab like R11 or R13, you can break off from the current article and start playing around with headerdump directly to try to understand Matlab’s internals. For more recent versions of Matlab, we have more work to do. Getting back to our original goal, if we take the mxArray_tag definition from R2010a and run sizeof, we get an answer for the amount of memory taken up by an mxArray in R2010a: 104 bytes.

Determining the size of mxArray

It was nice to derive the size of mxArray from actual MathWorks code, but unfortunately this information is no longer available as of R2011a. Somewhere between R2010a and R2011a, MathWorks stepped up their efforts to make mxArray completely opaque. So we should find another way to get the size of mxArray for current and future Matlab versions.
One ugly trick that works is to create many new arrays quickly and see where their starting points end up in memory:

>> A = num2cell(1:100)';
>> addrs = sort(cellfun(@getaddr, A));

What we did here is create 100 new arrays, and then get all their memory addresses in sorted order. Now we can take a look at how far apart these new arrays ended up in memory:

>> semilogy(diff(addrs));

The resulting plot will look different each time you run this; it is not really predictable where Matlab will put new arrays into memory. Here is an example from my system:

Plot of memory addresses

Your results may look different, and you might have to increase the number of new arrays from 100 to 1000 to get the qualitative result, but the important feature of this plot is that there is a minimum distance between new arrays of about 10². In fact, if we just go straight for this minimum distance:

>> min(diff(addrs))
ans =
            104

we find that although mxArray has gone completely opaque from R2010a to R2011a, the full size of mxArray in memory has stayed the same: 104 bytes.

Dumping mxArray from memory

We now have all the information we need to start looking into Matlab’s array representation. There are many tools available that allow you to browse memory locations or dump memory contents to disk. For our purposes though, it is nice to be able to do everything from within Matlab. Therefore I introduce a simple tool that prints memory locations into the Matlab console:

/* printmem.cpp */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
  if (nrhs < 1 || !mxIsUint64(prhs[0]) || mxIsEmpty(prhs[0]))
    mexErrMsgTxt("First argument must be a uint64 memory address");
  unsigned long *addr = static_cast(mxGetData(prhs[0]));
  unsigned char *mem = (unsigned char *) addr[0];
  if (nrhs < 2 || !mxIsDouble(prhs[1]) || mxIsEmpty(prhs[1]))
    mexErrMsgTxt("Second argument must be a double-type integer byte size.");
  unsigned int nbytes = static_cast(mxGetScalar(prhs[1]));
  for (int i = 0; i < nbytes; i++) {
    printf("%.2x ", mem[i]);
    if ((i+1) % 16 == 0) printf("\n");
 }
 printf("\n");
}

Here is how you use it in Matlab:

>> A = 0;
>> printmem(getaddr(A), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 70 fa 33 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

And there you have it: the inner guts of mxArray laid bare. I have printed each byte as a two character hexadecimal value, as is standard, so there are 16 bytes printed per row.

What does it mean?

So now we have 104 bytes of Matlab internals to dig into. We can start playing with this with a few simple examples:

>> A = 0; B = 1;
>> printmem(getaddr(A), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 c0 b0 27 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
>> printmem(getaddr(B), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 70 fa 33 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

In this and subsequent examples, I will highlight bytes that are different or that are of interest. What we can see from this example is that although arrays A and B have different content, almost nothing is different between their mxArray representations. What is different, is the memory address stored in the highlighted bytes. This confirms our earlier assertion that mxArray does not store the array contents, but only a pointer to the content location.
Now let us try to figure out some of the other fields:

>> A = 1:3; B = 1:10; C = (1:10)';
>> printmem(getaddr(A), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
03 00 00 00 00 00 00 00 60 80 22 df 6f 7f 00 00
>> printmem(getaddr(B), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 80 83 29 df 6f 7f 00 00
>> printmem(getaddr(C), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 0a 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 80 83 29 df 6f 7f 00 00

(Note that this time I only printed the first four lines of each array as this is where the interesting differences are for this example.)
In red I highlighted the bytes in each array that give its number of rows and columns (note that hexadecimal 0a is 10 in decimal). In blue I highlighted areas that store the value “02”, which could be the location for storing the number of dimensions. Let us look into this more:

>> A = rand([3 3 3]);
>> printmem(getaddr(A), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 30 4a 3f df 6f 7f 00 00
09 00 00 00 00 00 00 00 b0 d3 24 df 6f 7f 00 00

Two interesting results here: The first highlighted region changed from 02 to 03, so this must be the place where mxArray indicates a 3-dimensional array rather than 2D. Another important thing also changed though: we can see in the second highlighted region that there is a new memory address stored where we used to find the number of rows. And in the third highlighted region we now have the number 09 instead of the number of columns.
Clearly, Matlab has a different way of representing a 2D matrix versus arrays of higher dimension such as 3D. In the 2D case, mxArray simply holds the nrows and ncols directly, but for a higher dimension case we hold only the number of dimensions (03), the total number of elements (09), and a pointer to another memory location (0x7f6fdf3f4a30) which holds the array of sizes for each dimension.

The copy-on-write mechanism

Finally, we are in a position to understand how Matlab internally implements copy-on-write:

>> A = 1:10;
>> printmem(getaddr(A), 64);
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00
>> B = A;
>> printaddr(B);
0x7f6f4c7b6810
>> printmem(getaddr(A), 64);
10 68 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
10 68 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00

What we see is that by setting B = A, we change the internal representation of A itself. Two new memory address pointers are added to the mxArray for A. As it turns out, both of these point to the address for array B, which makes sense; this is how Matlab internally keeps track of arrays that are copies of each other. Note that because byte order is little-endian, the memory addresses from printmem are byte-wise, i.e. every two characters, reversed relative to the address from printaddr.
We can also look into array B:

>> printmem(getaddr(B), 64);
f0 41 7a 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
f0 41 7a 4c 6f 7f 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00
>> printaddr(A);
0x7f6f4c7a41f0

As I have highlighted, there are two interesting points here. First the red highlights show that array B has pointers back to array A. Second the blue highlight shows that the Matlab data for array B actually just points back to the same memory as the data for array A (the values 1:10).
Finally, we would like to understand why there are two pointers added. Let us see what happens if we add a third linked variable:

>> C = B;
>> printaddr(A); printaddr(B); printaddr(C);
0x7f6f4c7a41f0
0x7f6f4c7b6810
0x7f6f4c7b69b0
>> printmem(getaddr(A), 32)
b0 69 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
10 68 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00
>> printmem(getaddr(B), 32)
f0 41 7a 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
b0 69 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00
>> printmem(getaddr(C), 32)
10 68 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
f0 41 7a 4c 6f 7f 00 00 02 00 00 00 00 00 00 00

So it turns out that Matlab keeps track of a set of linked variables with a kind of cyclical, doubly-linked list structure; array A is linked to B in the forward direction and is also linked to C in the reverse direction by looping back around, etc. The cyclical nature of this makes sense, as we need to be able to start from any of A, B, or C and find all the linked arrays. But it is still not entirely clear why the list needs to be cyclical AND linked in both directions. In fact, in earlier versions of Matlab this cyclical list was only singly-linked.

Conclusions

Obviously there is a lot more to mxArray and Matlab internals than what we have delved into here. Still, with this basic introduction I hope to have whet your appetite for understanding more about Matlab internals, and provided some simple tools to help you explore. I want to reiterate that in general MathWorks’s approach of an opaque mxArray type with access abstracted through an API layer is a good policy. The last thing you would want to do is take the information here and write a bunch of code that relies on the structure of mxArray to work; next time MathWorks needs to add a new feature and change mxArray, all your code will break. So in general we are all better off playing within the API that MathWorks provides. And remember: poking into memory can crash your computer, so save your data!
On the other hand, occasionally there are cases, like in-place editing, where it is useful to push the capabilities of Matlab a little beyond what MathWorks envisioned. In these cases, having an understanding of Matlab’s internals can be critical, for example in understanding how to avoid conflicting with copy-on-write. Therefore I hope the information presented here will prove useful. Ideally, someone will be motivated to take this starting point and repair some of the tools like headerdump that made Matlab’s internal workings more transparent in the past. I believe that having more of this information out in the Matlab community is good for the community as a whole.

The post Matlab's internal memory representation appeared first on Undocumented Matlab.

Matlab mex in-place editing

Yair Altman — Wed, 08 Feb 2012 17:00:25 +0000

I would like to welcome Matlab Mex power-user Peter Li to a first in a short series of articles about undocumented aspects of Mex programing
Editing Matlab arrays in-place can be an important technique for optimizing calculations, especially when handling data that use large blocks of memory. The Matlab language itself has some limited support for in-place editing, but when we are really concerned with speed we often turn to writing C/C++ extensions using the Mex interface. Unfortunately, editing arrays in-place from Mex extensions is not officially supported in Matlab, and doing it incorrectly can cause data inconsistencies or can even cause Matlab to crash. In this article, I will introduce the problem and show a simple solution that exhibit the basic implementation details of Matlab’s internal copy-on-write mechanism.

Why edit in-place?

To demonstrate the techniques in this article, I use the fast_median function, which is part of my nth_element package on Matlab’s File Exchange. You can download the package and play with the code if you want. The examples are fairly self-explanatory, so if you do not want to try the code you should be okay just following along.
Let us try a few function calls to see how editing in-place can save time and memory:

>> A = rand(100000000, 1);
>> tic; median(A); toc
Elapsed time is 4.122654 seconds.
>> tic; fast_median(A); toc
Elapsed time is 1.646448 seconds.
>> tic; fast_median_ip(A); toc
Elapsed time is 0.927898 seconds.

If you try running this, be careful not to make A too large; tune the example according to the memory available on your system. In terms of the execution time for the different functions, your mileage may vary depending on factors such as: your system, what Matlab version you are running, and whether your test data is arranged in a single vector or a multicolumn array.
This example illustrates a few general points: first, fast_median is significantly faster than Matlab’s native median function. This is because fast_median uses a more efficient algorithm; see the nth_element page for more details. Besides being a shameless plug, this demonstrates why we might want to write a Mex function in the first place: rewriting the median function in pure Matlab would be slow, but using C++ we can significantly improve on the status quo.
The second point is that the in-place version, fast_median_ip, yields an additional speed improvement. What is the difference? Let us look behind the scenes; here are the CPU and memory traces from my system monitor after running the above:

Memory and CPU usage for median vs. fast_median_ip

You can see four spikes in CPU use, and some associated changes in memory allocation:
The first spike in CPU is when we created the test data vector; memory use also steps up at that time.
The second CPU spike is the largest; that is Matlab’s median function. You can see that over that period memory use stepped up again, and then stepped back down; the median function makes a copy of the entire input data, and then throws its copy away when it is finished; this is expensive in terms of time and resources.
The fast_median function is the next CPU spike; it has a similar step up and down in memory use, but it is much faster.
Finally, in the case of fast_median_ip we see something different; there is a spike in CPU use, but memory use stays flat; the in-place version is faster and more memory efficient because it does not make a copy of the input data.

There is another important difference with the in-place version; it modifies its input array. This can be demonstrated simply:

>> A = randi(100, [10 1]);
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
>> median(A)
ans = 60
>> fast_median(A)
ans = 60
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
>> fast_median_ip(A)
ans = 60
>> A'
ans = 39     6    25    42    56    64    75    71    98    89

As you can see, all three methods get the same answer, but median and fast_median do not modify A in the workspace, whereas after running fast_median_ip, input array A has changed. This is how the in-place method is able to run without using new memory; it operates on the existing array rather than making a copy.

Pitfalls with in-place editing

Modifying a function’s input is common in many languages, but in Matlab there are only a few special conditions under which this is officially sanctioned. This is not necessarily a bad thing; many people feel that modifying input data is bad programming practice and makes code harder to maintain. But as we have shown, it can be an important capability to have if speed and memory use are critical to an application.
Given that in-place editing is not officially supported in Matlab Mex extensions, what do we have to do to make it work? Let us look at the normal, input-copying fast_median function as a starting point:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxArray *incopy = mxDuplicateArray(prhs[0]);
   plhs[0] = run_fast_median(incopy);
}

This is a pretty simple function (I have taken out a few lines of boiler plate input checking to keep things clean). It relies on helper function run_fast_median to do the actual calculation, so the only real logic here is copying the input array prhs[0]. Importantly, run_fast_median edits its inputs in-place, so the call to mxDuplicateArray ensures that the Mex function is overall well behaved, i.e. that it does not change its inputs.
Who wants to be well behaved though? Can we save time and memory just by taking out the input duplication step? Let us try it:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   plhs[0] = run_fast_median(const_cast(prhs[0]));  // 
}

Very bad behavior; note that we cast the original const mxArray* input to a mxArray* so that the compiler will let us mess with it; naughty.
But does this accomplish edit in-place for fast_median? Be sure to save any work you have open and then try it:

>> mex fast_median_tweaked.cpp
>> A = randi(100,[10 1]);
>> fast_median_tweaked(A)
ans = 43

Hmm, it looks like this worked fine. But in fact there are subtle problems:

>> A = randi(100,[10 1]);
>> A'
ans = 65    92    14    26    41     2    45    85    53     2
>> B = A;
>> B'
ans = 65    92    14    26    41     2    45    85    53     2
>> fast_median_tweaked(A)
ans = 43
>> A'
ans = 2     2    41    26    14    45    65    85    53    92
>> B'
ans = 2     2    41    26    14    45    65    85    53    92

Uhoh, spooky; we expected that running fast_median_tweaked would change input A, but somehow it has also changed B, even though B is supposed to be an independent copy. Not good. In fact, under some conditions this kind of uncontrolled editing in-place can crash the entire Matlab environment with a segfault. What is going on?

Matlab’s copy-on-write mechanism

The answer is that our simple attempt to edit in-place conflicts with Matlab’s internal copy-on-write mechanism. Copy-on-write is an optimization built into Matlab to help avoid expensive copying of variables in memory (actually similar to what we are trying to accomplish with edit in-place). We can see copy-on-write in action with some simple tests:

Matlab's Copy-on-Write memory and CPU usage

% Test #1: update, then copy
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.000008 seconds.
>> tic; B = A; toc
Elapsed time is 0.000020 seconds.
% Test #2: copy, then update
>> clear
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; B = A; toc
Elapsed time is 0.000020 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.678160 seconds.

In the first set of operations, time and memory are used to create A, but updating A and “copying” A into B take no memory and essentially no time. This may come as a surprise since supposedly we have made an independent copy of A in B; why does creating B take no time or memory when A is clearly a large, expensive block?
The second set of operations makes things more clear. In this case, we again create A and then copy it to B; again this operation is fast and cheap. But assigning into A at this point takes time and consumes a new block of memory, even though we are only assigning into a single index of A. This is copy-on-write: Matlab tries to save you from copying large blocks of memory unless you need to. So when you first assign B to equal A, nothing is copied; the variable B is simply set to point to the same memory location already used by A. Only after you try to change A (or B), does Matlab decide that you really need to have two copies of the large array.
There are some additional tricks Matlab does with copy-on-write. Here is another example:

>> clear
>> tic; A{1} = zeros(100000000, 1); toc
Elapsed time is 0.573240 seconds.
>> tic; A{2} = zeros(100000000, 1); toc
Elapsed time is 0.560369 seconds.
>> tic; B = A; toc
Elapsed time is 0.000016 seconds.
>> tic; A{1}(1) = 0; toc
Elapsed time is 0.690690 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.695758 seconds.
>> tic; A{1}(1) = 0; toc
Elapsed time is 0.000011 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.000004 seconds.

This shows that for the purposes of copy-on-write, different elements of cell array A are treated independently. When we assign B equal to A, nothing is copied. Then when we change any part of A{1}, that whole element must be copied over. When we subsequently change A{2}, that whole element must also be copied over; it was not copied earlier. At this point, A and B are truly independent of each other, as both elements have experienced copy-on-write, so further assignments into either A or B are fast and require no additional memory.
Try playing with some struct arrays and you will find that copy-on-write also works independently for the elements of structs.

Reconciling edit in-place with copy-on-write: mxUnshareArray

Now it is clear why we cannot simply edit arrays in-place from Mex functions; not only is it naughty, it fundamentally conflicts with copy-on-write. Naively changing an array in-place can inadvertently change other variables that are still waiting for a copy-on-write, as we saw above when fast_median_tweaked inadvertently changed B in the workspace. This is, in the best case, an unmaintainable mess. Under more aggressive in-place editing, it can cause Matlab to crash with a segfault.
Luckily, there is a simple solution, although it requires calling internal, undocumented Matlab functions.
Essentially what we need is a Mex function that can be run on a Matlab array that will do the following:

Check whether the current array is sharing data with any other arrays that are waiting for copy-on-write.
If the array is shared, it must be unshared; the underlying memory must be copied and all the relevant pointers need to be fixed so that the array we want to work on is no longer accessible by anyone else.
If the array is not currently shared, simply proceed; the whole point is to avoid copying memory if we do not need to, so that we can benefit from the efficiencies of edit in-place.

If you think about it, this is exactly the operation that Matlab needs to run internally when it is deciding whether an assignment operation requires a copy-on-write. So it should come as no surprise that such a Mex function already exists in the form of a Matlab internal called mxUnshareArray. Here is how you use it:

extern "C" bool mxUnshareArray(mxArray *array_ptr, bool noDeepCopy);
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxUnshareArray(const_cast(prhs[0]), true);  //
   plhs[0] = run_fast_median(const_cast(prhs[0]));  //
}

This is the method actually used by fast_median_ip to efficiently edit in-place without risking conflicts with copy-on-write. Of course, if the array turns out to need to be unshared, then you do not get the benefit of edit in-place because the memory ends up getting copied. But at least things are safe and you get the in-place benefit as long as the input array is not being shared.

Further topics

The method shown here should allow you to edit normal Matlab numerical or character arrays in-place from Mex functions safely. For a Mex function in C rather than C++, omit the “C” in the extern declaration and of course you will have to use C-style casting rather than const_cast. If you need to edit cell or struct arrays in-place, or especially if you need to edit subsections of shared cell or struct arrays safely and efficiently while leaving the rest of the data shared, then you will need a few more tricks. A good place to get started is this article by Benjamin Schubert.
Unfortunately, over the last few years Mathworks seems to have decided to make it more difficult for users to access these kinds of internal methods to make our code more efficient. So be aware of the risk that in some future version of Matlab this method will no longer work in its current form.
Ultimately much of what is known about mxUnshareArray as well as the internal implementation details of how Matlab keeps track of which arrays are shared goes back to the work of Peter Boettcher, particularly his headerdump.c utility. Unfortunately, it appears that HeaderDump fails with Matlab releases >=R2010a, as Mathworks have changed some of the internal memory formats – perhaps some smart reader can pick up the work and adapt HeaderDump to the new memory format.
In a future article, I hope to discuss headerdump.c and its relevance for copy-on-write and edit in-place, and some other related tools for the latest Matlab releases that do not support HeaderDump.

The post Matlab mex in-place editing appeared first on Undocumented Matlab.