*I would like to welcome Matlab Mex power-user Peter Li to a first in a short series of articles about undocumented aspects of Mex programing*

Editing Matlab arrays in-place can be an important technique for optimizing calculations, especially when handling data that use large blocks of memory. The Matlab language itself has some limited support for in-place editing, but when we are really concerned with speed we often turn to writing C/C++ extensions using the Mex interface. Unfortunately, editing arrays in-place from Mex extensions is not officially supported in Matlab, and doing it incorrectly can cause data inconsistencies or can even cause Matlab to crash. In this article, I will introduce the problem and show a simple solution that exhibit the basic implementation details of Matlab’s internal copy-on-write mechanism.

### Why edit in-place?

To demonstrate the techniques in this article, I use the *fast_median* function, which is part of my nth_element package on Matlab’s File Exchange. You can download the package and play with the code if you want. The examples are fairly self-explanatory, so if you do not want to try the code you should be okay just following along.

Let us try a few function calls to see how editing in-place can save time and memory:

>> A = rand(100000000, 1); >> tic; median(A); toc Elapsed time is 4.122654 seconds. >> tic; fast_median(A); toc Elapsed time is 1.646448 seconds. >> tic; fast_median_ip(A); toc Elapsed time is 0.927898 seconds.

If you try running this, be careful not to make A too large; tune the example according to the memory available on your system. In terms of the execution time for the different functions, your mileage may vary depending on factors such as: your system, what Matlab version you are running, and whether your test data is arranged in a single vector or a multicolumn array.

This example illustrates a few general points: first, *fast_median* is significantly faster than Matlab’s * native* median function. This is because

*fast_median*uses a more efficient algorithm; see the nth_element page for more details. Besides being a shameless plug, this demonstrates why we might want to write a Mex function in the first place: rewriting the median function in pure Matlab would be slow, but using C++ we can significantly improve on the status quo.

The second point is that the in-place version, *fast_median_ip*, yields an additional speed improvement. What is the difference? Let us look behind the scenes; here are the CPU and memory traces from my system monitor after running the above:

You can see four spikes in CPU use, and some associated changes in memory allocation:

The first spike in CPU is when we created the test data vector; memory use also steps up at that time.

The second CPU spike is the largest; that is Matlab’s median function. You can see that over that period memory use stepped up again, and then stepped back down; the median function makes a copy of the entire input data, and then throws its copy away when it is finished; this is expensive in terms of time and resources.

The *fast_median* function is the next CPU spike; it has a similar step up and down in memory use, but it is much faster.

Finally, in the case of *fast_median_ip* we see something different; there is a spike in CPU use, but memory use stays flat; the in-place version is faster and more memory efficient because it does not make a copy of the input data.

There is another important difference with the in-place version; it modifies its input array. This can be demonstrated simply:

>> A = randi(100, [10 1]); >> A' ans = 39 42 98 25 64 75 6 56 71 89 >> median(A) ans = 60 >> fast_median(A) ans = 60 >> A' ans = 39 42 98 25 64 75 6 56 71 89 >> fast_median_ip(A) ans = 60 >> A' ans = 39 6 25 42 56 64 75 71 98 89

As you can see, all three methods get the same answer, but * median* and

*fast_median*do not modify A in the workspace, whereas after running

*fast_median_ip*, input array A has changed. This is how the in-place method is able to run without using new memory; it operates on the existing array rather than making a copy.

### Pitfalls with in-place editing

Modifying a function’s input is common in many languages, but in Matlab there are only a few special conditions under which this is officially sanctioned. This is not necessarily a bad thing; many people feel that modifying input data is bad programming practice and makes code harder to maintain. But as we have shown, it can be an important capability to have if speed and memory use are critical to an application.

Given that in-place editing is not officially supported in Matlab Mex extensions, what do we have to do to make it work? Let us look at the normal, input-copying *fast_median* function as a starting point:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) { mxArray *incopy = mxDuplicateArray(prhs[0]); plhs[0] = run_fast_median(incopy); }

This is a pretty simple function (I have taken out a few lines of boiler plate input checking to keep things clean). It relies on helper function *run_fast_median* to do the actual calculation, so the only real logic here is copying the input array `prhs[0]`

. Importantly, *run_fast_median* edits its inputs in-place, so the call to *mxDuplicateArray* ensures that the Mex function is overall well behaved, i.e. that it does not change its inputs.

Who wants to be well behaved though? Can we save time and memory just by taking out the input duplication step? Let us try it:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) { plhs[0] = run_fast_median(const_cast<mxarray *>(prhs[0])); // </mxarray> }

Very bad behavior; note that we cast the original `const mxArray*`

input to a `mxArray*`

so that the compiler will let us mess with it; naughty.

But does this accomplish edit in-place for *fast_median*? Be sure to save any work you have open and then try it:

>> mex fast_median_tweaked.cpp >> A = randi(100,[10 1]); >> fast_median_tweaked(A) ans = 43

Hmm, it looks like this worked fine. But in fact there are subtle problems:

>> A = randi(100,[10 1]); >> A' ans = 65 92 14 26 41 2 45 85 53 2 >> B = A; >> B' ans = 65 92 14 26 41 2 45 85 53 2 >> fast_median_tweaked(A) ans = 43 >> A' ans = 2 2 41 26 14 45 65 85 53 92 >> B' ans = 2 2 41 26 14 45 65 85 53 92

Uhoh, spooky; we expected that running *fast_median_tweaked* would change input A, but somehow it has also changed B, even though B is supposed to be an independent copy. Not good. In fact, under some conditions this kind of uncontrolled editing in-place can crash the entire Matlab environment with a segfault. What is going on?

### Matlab’s copy-on-write mechanism

The answer is that our simple attempt to edit in-place conflicts with Matlab’s internal copy-on-write mechanism. Copy-on-write is an optimization built into Matlab to help avoid expensive copying of variables in memory (actually similar to what we are trying to accomplish with edit in-place). We can see copy-on-write in action with some simple tests:

% Test #1: update, then copy >> tic; A = zeros(100000000, 1); toc Elapsed time is 0.588937 seconds. >> tic; A(1) = 0; toc Elapsed time is 0.000008 seconds. >> tic; B = A; toc Elapsed time is 0.000020 seconds. % Test #2: copy, then update >> clear >> tic; A = zeros(100000000, 1); toc Elapsed time is 0.588937 seconds. >> tic; B = A; toc Elapsed time is 0.000020 seconds. >> tic; A(1) = 0; toc Elapsed time is 0.678160 seconds.

In the first set of operations, time and memory are used to create A, but updating A and “copying” A into B take no memory and essentially no time. This may come as a surprise since supposedly we have made an independent copy of A in B; why does creating B take no time or memory when A is clearly a large, expensive block?

The second set of operations makes things more clear. In this case, we again create A and then copy it to B; again this operation is fast and cheap. But assigning into A at this point takes time and consumes a new block of memory, even though we are only assigning into a single index of A. This is copy-on-write: Matlab tries to save you from copying large blocks of memory unless you need to. So when you first assign B to equal A, nothing is copied; the variable B is simply set to point to the same memory location already used by A. Only after you try to change A (or B), does Matlab decide that you really need to have two copies of the large array.

There are some additional tricks Matlab does with copy-on-write. Here is another example:

>> clear >> tic; A{1} = zeros(100000000, 1); toc Elapsed time is 0.573240 seconds. >> tic; A{2} = zeros(100000000, 1); toc Elapsed time is 0.560369 seconds. >> tic; B = A; toc Elapsed time is 0.000016 seconds. >> tic; A{1}(1) = 0; toc Elapsed time is 0.690690 seconds. >> tic; A{2}(1) = 0; toc Elapsed time is 0.695758 seconds. >> tic; A{1}(1) = 0; toc Elapsed time is 0.000011 seconds. >> tic; A{2}(1) = 0; toc Elapsed time is 0.000004 seconds.

This shows that for the purposes of copy-on-write, different elements of cell array A are treated independently. When we assign B equal to A, nothing is copied. Then when we change any part of A{1}, that whole element must be copied over. When we subsequently change A{2}, that whole element must also be copied over; it was not copied earlier. At this point, A and B are truly independent of each other, as both elements have experienced copy-on-write, so further assignments into either A or B are fast and require no additional memory.

Try playing with some struct arrays and you will find that copy-on-write also works independently for the elements of structs.

### Reconciling edit in-place with copy-on-write: mxUnshareArray

Now it is clear why we cannot simply edit arrays in-place from Mex functions; not only is it naughty, it fundamentally conflicts with copy-on-write. Naively changing an array in-place can inadvertently change other variables that are still waiting for a copy-on-write, as we saw above when *fast_median_tweaked* inadvertently changed B in the workspace. This is, in the best case, an unmaintainable mess. Under more aggressive in-place editing, it can cause Matlab to crash with a segfault.

Luckily, there is a simple solution, although it requires calling internal, undocumented Matlab functions.

Essentially what we need is a Mex function that can be run on a Matlab array that will do the following:

- Check whether the current array is sharing data with any other arrays that are waiting for copy-on-write.
- If the array is shared, it must be unshared; the underlying memory must be copied and all the relevant pointers need to be fixed so that the array we want to work on is no longer accessible by anyone else.
- If the array is not currently shared, simply proceed; the whole point is to avoid copying memory if we do not need to, so that we can benefit from the efficiencies of edit in-place.

If you think about it, this is exactly the operation that Matlab needs to run internally when it is deciding whether an assignment operation requires a copy-on-write. So it should come as no surprise that such a Mex function already exists in the form of a Matlab internal called *mxUnshareArray*. Here is how you use it:

extern "C" bool mxUnshareArray(mxArray *array_ptr, bool noDeepCopy); void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) { mxUnshareArray(const_cast<mxarray *>(prhs[0]), true); //</mxarray> plhs[0] = run_fast_median(const_cast<mxarray *>(prhs[0])); //</mxarray> }

This is the method actually used by *fast_median_ip* to efficiently edit in-place without risking conflicts with copy-on-write. Of course, if the array turns out to need to be unshared, then you do not get the benefit of edit in-place because the memory ends up getting copied. But at least things are safe and you get the in-place benefit as long as the input array is not being shared.

### Further topics

The method shown here should allow you to edit normal Matlab numerical or character arrays in-place from Mex functions safely. For a Mex function in C rather than C++, omit the “C” in the `extern`

declaration and of course you will have to use C-style casting rather than `const_cast`

. If you need to edit cell or struct arrays in-place, or especially if you need to edit subsections of shared cell or struct arrays safely and efficiently while leaving the rest of the data shared, then you will need a few more tricks. A good place to get started is this article by Benjamin Schubert.

Unfortunately, over the last few years Mathworks seems to have decided to make it more difficult for users to access these kinds of internal methods to make our code more efficient. So be aware of the risk that in some future version of Matlab this method will no longer work in its current form.

Ultimately much of what is known about *mxUnshareArray* as well as the internal implementation details of how Matlab keeps track of which arrays are shared goes back to the work of Peter Boettcher, particularly his headerdump.c utility. Unfortunately, it appears that HeaderDump fails with Matlab releases >=R2010a, as Mathworks have changed some of the internal memory formats – perhaps some smart reader can pick up the work and adapt HeaderDump to the new memory format.

In a future article, I hope to discuss headerdump.c and its relevance for copy-on-write and edit in-place, and some other related tools for the latest Matlab releases that do not support HeaderDump.

Related posts:

- Matlab’s internal memory representation Matlab's internal memory structure is explored and discussed. ...
- Internal Matlab memory optimizations Copy-on-write and in-place data manipulations are very useful Matlab performance improvement techniques. ...
- Undocumented Matlab MEX API Matlab's MEX API contains numerous undocumented functions, that can be extremely useful. ...
- Explicit multi-threading in Matlab part 3 Matlab performance can be improved by employing POSIX threads in C/C++ code. ...
- Profiling Matlab memory usage mtic and mtoc were a couple of undocumented features that enabled users of past Matlab releases to easily profile memory usage. ...
- Accessing private object properties Private properties of Matlab class objects can be accessed (read and write) using some undocumented techniques. ...

Hello

Thank you for sharing. I’ve been struggling with this issue for a good few years now and was never aware of mxUnshareArray. I did however find a much simpler solution which so far seems to work fine.

basically, it goes like this:

I tend to modify the variables in place using openmp for loops with c mex functions. Will mxUnshareArray work when using openmp? Also, is there any benefit to using mxUnshareArray instead of my approach?

However I must stress that mathworks is strongly opposed to changing input variables within mex functions.

Please see email extract from MATLAB support:

Response from MathWorks – 19 Apr, 2010 02:13:48 PM :

Response from MathWorks – 23 Apr, 2010 08:21:50 AM :

Response from MathWorks – 31 Jan, 2011 09:10:10 AM :

@Jayveer – thanks for sharing

@Jayveer, that is cool you’ve found another way to keep variables unshared. I think there is still a clear advantage to using mxUnshareArray; mxUnshareArray uses Matlab’s internal knowledge about what things are currently shared so you don’t have to keep track of anything yourself. This makes bugs much less likely. mxUnshareArray may also be slightly more efficient than your method, but I haven’t tested.

It is definitely right to point out that this is not sanctioned by Mathworks and could crash Matlab or cause other unexpected behavior. I can say that I’ve tested this extensively for the situation of unsharing purely numeric arrays on Matlab versions 2010-2011 without difficulty. My understanding of the history of this method is that it has also been used without difficulty on versions going back to around Matlab 5. So it seems to be a pretty successful method. Finally, as I point out in the article, mxUnshareArray is an internal method Mathworks uses for essentially the same situation that we are trying to handle, so it makes sense that it should be safe unless Mathworks willfully breaks it. Nevertheless, it should always be tested carefully.

@Yair, would be nice to be able to use the “tt” HTML tag in comments.

@Peter – unfortunately, this is the way comments are handled in my WordPress variant, and I have little control over it except to drill into the PHP code and modify it, which I normally do only for critical things because it creates a hell of a mess and lots of workhours when I upgrade the WP version… Instead of

`<tt>`

, you can use the`<code>`

tag, as I have just demonstrated in this line.Yair, pretty cool entry. I’m very much interested in this as I’m currently working on high-speed video analysis.

Re: JAudio, please spare me a little while longer. I’m behind on several essential things now, forcing me to push others back a little.

Pingback: Matlab’s internal memory representation | Undocumented Matlab

Pingback: Internal Matlab memory optimizations | Undocumented Matlab

Hi all,

I wonder, whether anyone yet tried to figure out how java-objects are represented in the mxArray class?

I’m writing matlab code to work with a database, whose API is java.

So I often have to handle relatively large arrays, e.g. of simple Java-Strings, whose conversion to matlab-char takes quite some time.

I only very briefly played around with this, but getting the classname of the java object from the mxArray, I didn’t manage anything further.

I didn’t yet use java objects in C/C++, but knowing that it’s possible, I wonder whether this might be able to speed things up a little?

cheers,

sebastian

Thanks for the info … it helps alot!!!

Is there an analogous mxShareArray function that could be used to return arguments with necessarily replicating them?

Dan

Sorry after reading closer, I see the question is already answered in the schubert article you’re referencing.

Pingback: Undocumented Matlab | datainfer

Hello, does anybody know whether mxUnshareArray can still work in matlab versions >= 2013a ? I have a problem with a mex function which worked properly with an older matlab version, and it does not in the 2013a version: currently the input to the max function which is modified and unshared inside the mex function, remains unchanged after the mex function call. With the older version the input changed after the mex function call. Thanks