Matlab mex in-place editing

I would like to welcome Matlab Mex power-user Peter Li to a first in a short series of articles about undocumented aspects of Mex programing

Editing Matlab arrays in-place can be an important technique for optimizing calculations, especially when handling data that use large blocks of memory. The Matlab language itself has some limited support for in-place editing, but when we are really concerned with speed we often turn to writing C/C++ extensions using the Mex interface. Unfortunately, editing arrays in-place from Mex extensions is not officially supported in Matlab, and doing it incorrectly can cause data inconsistencies or can even cause Matlab to crash. In this article, I will introduce the problem and show a simple solution that exhibit the basic implementation details of Matlab’s internal copy-on-write mechanism.

Why edit in-place?

To demonstrate the techniques in this article, I use the fast_median function, which is part of my nth_element package on Matlab’s File Exchange. You can download the package and play with the code if you want. The examples are fairly self-explanatory, so if you do not want to try the code you should be okay just following along.

Let us try a few function calls to see how editing in-place can save time and memory:

>> A = rand(100000000, 1);
>> tic; median(A); toc    
Elapsed time is 4.122654 seconds.
 
>> tic; fast_median(A); toc
Elapsed time is 1.646448 seconds.
 
>> tic; fast_median_ip(A); toc
Elapsed time is 0.927898 seconds.

If you try running this, be careful not to make A too large; tune the example according to the memory available on your system. In terms of the execution time for the different functions, your mileage may vary depending on factors such as: your system, what Matlab version you are running, and whether your test data is arranged in a single vector or a multicolumn array.

This example illustrates a few general points: first, fast_median is significantly faster than Matlab’s native median function. This is because fast_median uses a more efficient algorithm; see the nth_element page for more details. Besides being a shameless plug, this demonstrates why we might want to write a Mex function in the first place: rewriting the median function in pure Matlab would be slow, but using C++ we can significantly improve on the status quo.

The second point is that the in-place version, fast_median_ip, yields an additional speed improvement. What is the difference? Let us look behind the scenes; here are the CPU and memory traces from my system monitor after running the above:

Memory and CPU usage for median() vs. fast_median_ip()

Memory and CPU usage for median vs. fast_median_ip

You can see four spikes in CPU use, and some associated changes in memory allocation:

The first spike in CPU is when we created the test data vector; memory use also steps up at that time.

The second CPU spike is the largest; that is Matlab’s median function. You can see that over that period memory use stepped up again, and then stepped back down; the median function makes a copy of the entire input data, and then throws its copy away when it is finished; this is expensive in terms of time and resources.

The fast_median function is the next CPU spike; it has a similar step up and down in memory use, but it is much faster.

Finally, in the case of fast_median_ip we see something different; there is a spike in CPU use, but memory use stays flat; the in-place version is faster and more memory efficient because it does not make a copy of the input data.

There is another important difference with the in-place version; it modifies its input array. This can be demonstrated simply:

>> A = randi(100, [10 1]);
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
 
>> median(A)
ans = 60
 
>> fast_median(A)
ans = 60
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
 
>> fast_median_ip(A)
ans = 60
>> A'
ans = 39     6    25    42    56    64    75    71    98    89

As you can see, all three methods get the same answer, but median and fast_median do not modify A in the workspace, whereas after running fast_median_ip, input array A has changed. This is how the in-place method is able to run without using new memory; it operates on the existing array rather than making a copy.

Pitfalls with in-place editing

Modifying a function’s input is common in many languages, but in Matlab there are only a few special conditions under which this is officially sanctioned. This is not necessarily a bad thing; many people feel that modifying input data is bad programming practice and makes code harder to maintain. But as we have shown, it can be an important capability to have if speed and memory use are critical to an application.

Given that in-place editing is not officially supported in Matlab Mex extensions, what do we have to do to make it work? Let us look at the normal, input-copying fast_median function as a starting point:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxArray *incopy = mxDuplicateArray(prhs[0]);
   plhs[0] = run_fast_median(incopy);
}

This is a pretty simple function (I have taken out a few lines of boiler plate input checking to keep things clean). It relies on helper function run_fast_median to do the actual calculation, so the only real logic here is copying the input array prhs[0]. Importantly, run_fast_median edits its inputs in-place, so the call to mxDuplicateArray ensures that the Mex function is overall well behaved, i.e. that it does not change its inputs.

Who wants to be well behaved though? Can we save time and memory just by taking out the input duplication step? Let us try it:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   plhs[0] = run_fast_median(const_cast<mxarray *>(prhs[0]));  // </mxarray>
}

Very bad behavior; note that we cast the original const mxArray* input to a mxArray* so that the compiler will let us mess with it; naughty.

But does this accomplish edit in-place for fast_median? Be sure to save any work you have open and then try it:

>> mex fast_median_tweaked.cpp
>> A = randi(100,[10 1]);
>> fast_median_tweaked(A)
ans = 43

Hmm, it looks like this worked fine. But in fact there are subtle problems:

>> A = randi(100,[10 1]);
>> A'
ans = 65    92    14    26    41     2    45    85    53     2
>> B = A;
>> B'
ans = 65    92    14    26    41     2    45    85    53     2
 
>> fast_median_tweaked(A)
ans = 43
>> A'
ans = 2     2    41    26    14    45    65    85    53    92
>> B'
ans = 2     2    41    26    14    45    65    85    53    92

Uhoh, spooky; we expected that running fast_median_tweaked would change input A, but somehow it has also changed B, even though B is supposed to be an independent copy. Not good. In fact, under some conditions this kind of uncontrolled editing in-place can crash the entire Matlab environment with a segfault. What is going on?

Matlab’s copy-on-write mechanism

The answer is that our simple attempt to edit in-place conflicts with Matlab’s internal copy-on-write mechanism. Copy-on-write is an optimization built into Matlab to help avoid expensive copying of variables in memory (actually similar to what we are trying to accomplish with edit in-place). We can see copy-on-write in action with some simple tests:

Matlab's Copy-on-Write memory and CPU usage

Matlab's Copy-on-Write memory and CPU usage

% Test #1: update, then copy
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.000008 seconds.
>> tic; B = A; toc   
Elapsed time is 0.000020 seconds.
 
% Test #2: copy, then update
>> clear
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; B = A; toc   
Elapsed time is 0.000020 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.678160 seconds.

In the first set of operations, time and memory are used to create A, but updating A and “copying” A into B take no memory and essentially no time. This may come as a surprise since supposedly we have made an independent copy of A in B; why does creating B take no time or memory when A is clearly a large, expensive block?

The second set of operations makes things more clear. In this case, we again create A and then copy it to B; again this operation is fast and cheap. But assigning into A at this point takes time and consumes a new block of memory, even though we are only assigning into a single index of A. This is copy-on-write: Matlab tries to save you from copying large blocks of memory unless you need to. So when you first assign B to equal A, nothing is copied; the variable B is simply set to point to the same memory location already used by A. Only after you try to change A (or B), does Matlab decide that you really need to have two copies of the large array.

There are some additional tricks Matlab does with copy-on-write. Here is another example:

>> clear
>> tic; A{1} = zeros(100000000, 1); toc
Elapsed time is 0.573240 seconds.
>> tic; A{2} = zeros(100000000, 1); toc
Elapsed time is 0.560369 seconds.
 
>> tic; B = A; toc                     
Elapsed time is 0.000016 seconds.
 
>> tic; A{1}(1) = 0; toc               
Elapsed time is 0.690690 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.695758 seconds.
 
>> tic; A{1}(1) = 0; toc
Elapsed time is 0.000011 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.000004 seconds.

This shows that for the purposes of copy-on-write, different elements of cell array A are treated independently. When we assign B equal to A, nothing is copied. Then when we change any part of A{1}, that whole element must be copied over. When we subsequently change A{2}, that whole element must also be copied over; it was not copied earlier. At this point, A and B are truly independent of each other, as both elements have experienced copy-on-write, so further assignments into either A or B are fast and require no additional memory.

Try playing with some struct arrays and you will find that copy-on-write also works independently for the elements of structs.

Reconciling edit in-place with copy-on-write: mxUnshareArray

Now it is clear why we cannot simply edit arrays in-place from Mex functions; not only is it naughty, it fundamentally conflicts with copy-on-write. Naively changing an array in-place can inadvertently change other variables that are still waiting for a copy-on-write, as we saw above when fast_median_tweaked inadvertently changed B in the workspace. This is, in the best case, an unmaintainable mess. Under more aggressive in-place editing, it can cause Matlab to crash with a segfault.

Luckily, there is a simple solution, although it requires calling internal, undocumented Matlab functions.

Essentially what we need is a Mex function that can be run on a Matlab array that will do the following:

  1. Check whether the current array is sharing data with any other arrays that are waiting for copy-on-write.
  2. If the array is shared, it must be unshared; the underlying memory must be copied and all the relevant pointers need to be fixed so that the array we want to work on is no longer accessible by anyone else.
  3. If the array is not currently shared, simply proceed; the whole point is to avoid copying memory if we do not need to, so that we can benefit from the efficiencies of edit in-place.

If you think about it, this is exactly the operation that Matlab needs to run internally when it is deciding whether an assignment operation requires a copy-on-write. So it should come as no surprise that such a Mex function already exists in the form of a Matlab internal called mxUnshareArray. Here is how you use it:

extern "C" bool mxUnshareArray(mxArray *array_ptr, bool noDeepCopy);
 
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxUnshareArray(const_cast<mxarray *>(prhs[0]), true);  //</mxarray>
   plhs[0] = run_fast_median(const_cast<mxarray *>(prhs[0]));  //</mxarray>
}

This is the method actually used by fast_median_ip to efficiently edit in-place without risking conflicts with copy-on-write. Of course, if the array turns out to need to be unshared, then you do not get the benefit of edit in-place because the memory ends up getting copied. But at least things are safe and you get the in-place benefit as long as the input array is not being shared.

Further topics

The method shown here should allow you to edit normal Matlab numerical or character arrays in-place from Mex functions safely. For a Mex function in C rather than C++, omit the “C” in the extern declaration and of course you will have to use C-style casting rather than const_cast. If you need to edit cell or struct arrays in-place, or especially if you need to edit subsections of shared cell or struct arrays safely and efficiently while leaving the rest of the data shared, then you will need a few more tricks. A good place to get started is this article by Benjamin Schubert.

Unfortunately, over the last few years Mathworks seems to have decided to make it more difficult for users to access these kinds of internal methods to make our code more efficient. So be aware of the risk that in some future version of Matlab this method will no longer work in its current form.

Ultimately much of what is known about mxUnshareArray as well as the internal implementation details of how Matlab keeps track of which arrays are shared goes back to the work of Peter Boettcher, particularly his headerdump.c utility. Unfortunately, it appears that HeaderDump fails with Matlab releases >=R2010a, as Mathworks have changed some of the internal memory formats – perhaps some smart reader can pick up the work and adapt HeaderDump to the new memory format.

In a future article, I hope to discuss headerdump.c and its relevance for copy-on-write and edit in-place, and some other related tools for the latest Matlab releases that do not support HeaderDump.

Related posts:

  1. Matlab’s internal memory representation Matlab's internal memory structure is explored and discussed. ...
  2. Internal Matlab memory optimizations Copy-on-write and in-place data manipulations are very useful Matlab performance improvement techniques. ...
  3. Undocumented Matlab MEX API Matlab's MEX API contains numerous undocumented functions, that can be extremely useful. ...
  4. Explicit multi-threading in Matlab part 3 Matlab performance can be improved by employing POSIX threads in C/C++ code. ...
  5. Profiling Matlab memory usage mtic and mtoc were a couple of undocumented features that enabled users of past Matlab releases to easily profile memory usage. ...
  6. Accessing private object properties Private properties of Matlab class objects can be accessed (read and write) using some undocumented techniques. ...

Categories: Guest bloggers, High risk of breaking in future versions, Memory, Mex, Stock Matlab function, Undocumented feature

Tags: , , , ,

Bookmark and SharePrint Print

12 Responses to Matlab mex in-place editing

  1. Hello

    Thank you for sharing. I’ve been struggling with this issue for a good few years now and was never aware of mxUnshareArray. I did however find a much simpler solution which so far seems to work fine.

    basically, it goes like this:

    A=rand(2,10);
    B=rand(2,10);
    B(:,:)=A(:,:);  % instead of B=A;
    abc(A); % mex function modifying A
    isequal(A,B) % they should be different

    I tend to modify the variables in place using openmp for loops with c mex functions. Will mxUnshareArray work when using openmp? Also, is there any benefit to using mxUnshareArray instead of my approach?

    However I must stress that mathworks is strongly opposed to changing input variables within mex functions.

    Please see email extract from MATLAB support:

    Hello

    I apologize for the late reply.
    Thank you for the detailed explanation.
    I understand your point regarding not changing input variables to mex functions. However, I’m not sure if you are aware of this, but there is a way around the problem. I’ve tested it.
    Consider the following

    strain =zeros(6,1e5);
    strainH=zeros(6,1e5);
    for nstep=1:steps
       strainH(:,:)=strain(:,:);
       cmexfunction(strain)   % the cmex function changes strain
       isequal(strainH,strain)
    end

    adding (:,:) on both sides prevents matlab from pointing strainH to strain. This way strainH is stored physically on RAM and hence not changes by the mex function when it’s changing strain. Also note that in the above scenario, neither strain doesn’t even have to be global variable!

    When dealing with very large variables, this capability is highly desirable. Using cmex functions in such a manner brought my simulation time from 4hrs to 3 mins on a dual core machine. Parfor from parallel toolbox only speeds it up 1.6 times.

    Is there anything wrong with this approach? Are there any other unexpected behaviours to expect? ”

    Response from MathWorks – 19 Apr, 2010 02:13:48 PM :

    Dear Jayveer,

    I am writing in reference to your Service Request # 1-CLRADX regarding ‘Global variables and mex files’.

    Please note that in general you should NEVER try to change the values of input variables in your MEX-file, no matter whether they are global or not. Why? Because seen from the outside a MEX-function should act in the same way as a MATLAB function and in MATLAB functions it is also not possible to change the values of input variables.

    If you do want to be able to work with global variables in your MEX-file though, you should use mexGetVariable first to get a copy of the variable, then make changes and put the variable back using mexPutVariable.

    Response from MathWorks – 23 Apr, 2010 08:21:50 AM :

    Dear Jayveer,

    What I tried to say is not that it is technically impossible to change the values of input variables, I tried saying that you should not do this. The reason for this is that MATLAB (and especially its accelerator) does not expect functions to do this and thus it may lead to unexpected behavior in certain situations. Although I have not tested whether something unexpected really happens in the following particular example, I can imagine the following:

    A = 3;    % MATLAB initializes a variable A and sets its value to 3
    B = A;    % MATLAB does not actually allocate a variable B, it just "remembers" that B has the same value as A
    mymex(A); % A is an input so MATLAB assumes A stays the same
    A = B;    % MATLAB assumed that A was not changed, further it already knew B = A so also A = B, so it could decide simply not the execute this line
    C = 3 * A % If mymex changed the value of A then this may not result in the expected answer 9

    So please be careful with changing values of input variables from a MEX-file.

    Response from MathWorks – 31 Jan, 2011 09:10:10 AM :

    Dear Jayveer,

    I am writing in reference to your Service Request # 1-EA7U0X regarding ‘Optimize memory efficiency, MEX, change input variables’.

    The problem with your approach here is that you now think you have found something safe to do based on some experiments. We at MathWorks however, will never guarantee this is the correct approach. MATLAB’s JIT and accelerator were developed under (amongst others) the assumption that variables cannot be changed in place. A workaround to fool them may work in one MATLAB version but may lead to unexpected behavior it the other.

    If you want such fine grained control about allocation and reuse, perhaps you should rethink your whole approach? I see your license includes MATLAB Compiler, so instead of sometimes calling C from MATLAB (MEX-file) you should do it the other way around sometimes call MATLAB from C (use MATLAB Compiler to create a shared library). Perhaps this offers the low level control which you want?

    • @Jayveer – thanks for sharing

    • Peter says:

      @Jayveer, that is cool you’ve found another way to keep variables unshared. I think there is still a clear advantage to using mxUnshareArray; mxUnshareArray uses Matlab’s internal knowledge about what things are currently shared so you don’t have to keep track of anything yourself. This makes bugs much less likely. mxUnshareArray may also be slightly more efficient than your method, but I haven’t tested.

      It is definitely right to point out that this is not sanctioned by Mathworks and could crash Matlab or cause other unexpected behavior. I can say that I’ve tested this extensively for the situation of unsharing purely numeric arrays on Matlab versions 2010-2011 without difficulty. My understanding of the history of this method is that it has also been used without difficulty on versions going back to around Matlab 5. So it seems to be a pretty successful method. Finally, as I point out in the article, mxUnshareArray is an internal method Mathworks uses for essentially the same situation that we are trying to handle, so it makes sense that it should be safe unless Mathworks willfully breaks it. Nevertheless, it should always be tested carefully.

    • Peter says:

      @Yair, would be nice to be able to use the “tt” HTML tag in comments.

    • @Peter – unfortunately, this is the way comments are handled in my WordPress variant, and I have little control over it except to drill into the PHP code and modify it, which I normally do only for critical things because it creates a hell of a mess and lots of workhours when I upgrade the WP version… Instead of <tt>, you can use the <code> tag, as I have just demonstrated in this line.

  2. Kesh says:

    Yair, pretty cool entry. I’m very much interested in this as I’m currently working on high-speed video analysis.

    Re: JAudio, please spare me a little while longer. I’m behind on several essential things now, forcing me to push others back a little.

  3. Pingback: Matlab’s internal memory representation | Undocumented Matlab

  4. Pingback: Internal Matlab memory optimizations | Undocumented Matlab

  5. sebbo says:

    Hi all,

    I wonder, whether anyone yet tried to figure out how java-objects are represented in the mxArray class?

    I’m writing matlab code to work with a database, whose API is java.
    So I often have to handle relatively large arrays, e.g. of simple Java-Strings, whose conversion to matlab-char takes quite some time.
    I only very briefly played around with this, but getting the classname of the java object from the mxArray, I didn’t manage anything further.

    I didn’t yet use java objects in C/C++, but knowing that it’s possible, I wonder whether this might be able to speed things up a little?

    cheers,
    sebastian

  6. DJP says:

    Thanks for the info … it helps alot!!!
    Is there an analogous mxShareArray function that could be used to return arguments with necessarily replicating them?
    Dan

    • DJP says:

      Sorry after reading closer, I see the question is already answered in the schubert article you’re referencing.

  7. Pingback: Undocumented Matlab | datainfer

Leave a Reply

Your email address will not be published. Required fields are marked *

*

<pre lang="matlab">
a = magic(3);
sum(a)
</pre>