Mex – Undocumented Matlab

Multi-threaded Mex

Yair Altman — Wed, 18 Jul 2018 14:56:44 +0000

I was recently asked by a consulting client to help speed up a Matlab process. Quite often there are various ways to improve the run-time, and in this particular case it turned out that the best option was to convert the core Matlab processing loop into a multi-threaded Mex function, while keeping the rest (vast majority of program code) in easy-to-maintain Matlab. This resulted in a 160x speedup (25 secs => 0.16 secs). Some of this speedup is attributed to C-code being faster in general than Matlab, another part is due to the multi-threading, and another due to in-place data manipulations that avoid costly memory access and re-allocations.

In today’s post I will share some of the insights relating to this MEX conversion, which could be adapted for many other similar use-cases. Additional Matlab speed-up techniques can be found in other performance-related posts on this website, as well as in my book Accelerating MATLAB Performance.

There are quite a few online resources about creating Mex files, so I will not focus on this aspect. I’ll assume that the reader is already familiar with the concept of using Mex functions, which are simply dynamically-linked libraries that have a predefined entry-function syntax and predefined platform-specific extension. Instead, I’ll focus on how to create and debug a multi-threaded Mex function, so that it runs in parallel on all CPU cores.

The benefit of multi-threading is that threads are very light-weight objects that have minimal performance and memory overheads. This contrasts with multi-tasking, which is what the Parallel Computing Toolbox currently does: it launches duplicate copies of the entire Matlab engine process (“headless workers”) and then manages and coordinates the tasks to split up the processing work. Multi-tasking should be avoided wherever we can employ light-weight multi-threading instead. Unfortunately, Matlab does not currently have the ability to explicitly multi-thread Matlab code. But we can still use explicit multi-threading by invoking code in other languages, as I’ve already shown for Java, C# (and .NET in general), and C/C++. Today’s article will expand on the latter post (the one about C/C++ multi-threading), by showing a general framework for making a multi-threaded C-based Mex function.

There are several alternative implementations of threads. On non-Windows machines, POSIX threads (“pthreads”) are a de-facto standard; on Windows, while pthreads can still be used, they generally use native Windows threads under the hood, and we can use these native threads directly.

I have uploaded a file called max_in_place to the Matlab File Exchange. This function serves as an example showing a generic multi-threaded Mex function. A compiled version of this Mex file for Win64 is also included in the File Exchange entry, and you can run it directly on a Win64 machine.

The usage in Matlab is as follows (note how matrix1 is updated in-place):

>> matrix1 = rand(4)
matrix1 =
      0.89092      0.14929      0.81428      0.19664
      0.95929      0.25751      0.24352      0.25108
      0.54722      0.84072      0.92926      0.61604
      0.13862      0.25428      0.34998      0.47329
 
>> matrix2 = rand(4)
matrix2 =
      0.35166      0.91719      0.38045      0.53081
      0.83083      0.28584      0.56782      0.77917
      0.58526       0.7572     0.075854      0.93401
      0.54972      0.75373      0.05395      0.12991
 
>> max_in_place(matrix1, matrix2)
 
>> matrix1
matrix1 =
      0.89092      0.91719      0.81428      0.53081
      0.95929      0.28584      0.56782      0.77917
      0.58526      0.84072      0.92926      0.93401
      0.54972      0.75373      0.34998      0.47329

Source code and discussion

The pseudo-code for the MEX function is as follows:

mexFunction():
   validate the input/output args
   quick bail-out if empty inputs
   get the number of threads N from Matlab's maxNumCompThreads function
   if N == 1
       run main_loop directly
   else
       split input matrix #1 into N index blocks
       assign start/end index for each thread
       create and launch N new threads that run main_loop
       wait for all N threads to complete
       free the allocated memory resources
   end

Here’s the core source-code of this function, which was adapted from original work by Dirk-Jan Kroon:

/*====================================================================
 *
 * max_in_place.c  updates a data matrix in-place with the max value
 *                 of the matrix and a 2nd matrix of the same size
 *
 * The calling syntax is:
 *
 *		max_in_place(matrix1, matrix2)
 *
 * matrix1 will be updated with the maximal values from corresponding
 * indices of the 2 matrices
 *
 * Both inputs must be double 2D real non-sparse matrices of same size
 *
 * Yair Altman 2018-07-18
 * https://undocumentedmatlab.com/blog/multi-threaded-mex
 *
 * Adapted from original work by Dirk-Jan Kroon
 * http://mathworks.com/matlabcentral/profile/authors/1097878-dirk-jan-kroon
 *
 *==================================================================*/
 
#include <math.h>    /* ceil */
#include <stdlib.h>  /* malloc, free */
#include "mex.h"
 
/* undef needed for LCC compiler */
#undef EXTERN_C
#ifdef _WIN32
    #include <windows.h>
    #include <process.h>
#else
    #include <pthread.h>
#endif
 
/* Input Arguments */
#define	hMatrix1	prhs[0]
#define	hMatrix2	prhs[1]
 
/* Macros */
#if !defined(MIN)
#define	MIN(A, B)	((A) < (B) ? (A) : (B))
#endif
 
/* Main processing loop function */
void main_loop(const mxArray *prhs[], int startIdx, int endIdx)
{
    /* Assign pointers to the various parameters */
    double *p1 = mxGetPr(hMatrix1);
    double *p2 = mxGetPr(hMatrix2);
 
    /* Loop through all matrix coordinates */
    for (int idx=startIdx; idx<=endIdx; idx++)
    {
        /* Update hMatrix1 with the maximal value of hMatrix1,hMatrix2 */
        if (p1[idx] < p2[idx]) {
            p1[idx] = p2[idx];
        }
    }
}
 
/* Computation function in threads */
#ifdef _WIN32
  unsigned __stdcall thread_func(void *ThreadArgs_) {
#else
  void *thread_func(void *ThreadArgs_) {
#endif
    double **ThreadArgs = ThreadArgs_;  /* void* => double** */
    const mxArray** prhs = (const mxArray**) ThreadArgs[0];
 
    int ThreadID = (int) ThreadArgs[1][0];
    int startIdx = (int) ThreadArgs[2][0];
    int endIdx   = (int) ThreadArgs[3][0];
    /*mexPrintf("Starting thread #%d: idx=%d:%d\n", ThreadID, startIdx, endIdx); */
 
    /* Run the main processing function */
    main_loop(prhs, startIdx, endIdx);
 
    /* Explicit end thread, helps to ensure proper recovery of resources allocated for the thread */
    #ifdef _WIN32
        _endthreadex( 0 );
        return 0;
    #else
        pthread_exit(NULL);
    #endif
}
 
/* validateInputs function here... */
 
/* Main entry function */
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
 
{
    /* Validate the inputs */
    validateInputs(nlhs, plhs, nrhs, prhs);
 
    /* Quick bail-out in the trivial case of empty inputs */
    if (mxIsEmpty(hMatrix1))  return;
 
    /* Get the number of threads from the Matlab engine (maxNumCompThreads) */
    mxArray *matlabCallOut[1] = {0};
    mxArray *matlabCallIn[1]  = {0};
    mexCallMATLAB(1, matlabCallOut, 0, matlabCallIn, "maxNumCompThreads");
    double *Nthreadsd = mxGetPr(matlabCallOut[0]);
    int Nthreads = (int) Nthreadsd[0];
 
    /* Get the number of elements to process */
    size_t n1 = mxGetNumberOfElements(hMatrix1);
 
    if (Nthreads == 1) {
 
        /* Process the inputs directly (not via a thread) */
        main_loop(prhs, 0, n1-1);
 
    } else {  /* multi-threaded */
 
        /* Allocate memory for handles of worker threads */
        #ifdef _WIN32
            HANDLE    *ThreadList = (HANDLE*)   malloc(Nthreads*sizeof(HANDLE));
        #else
            pthread_t *ThreadList = (pthread_t*)malloc(Nthreads*sizeof(pthread_t));
        #endif
 
        /* Allocate memory for the thread arguments (attributes) */
        double **ThreadID, **ThreadStartIdx, **ThreadEndIdx, ***ThreadArgs;
        double *ThreadID1, *ThreadStartIdx1, *ThreadEndIdx1, **ThreadArgs1;
 
        ThreadID       = (double **) malloc( Nthreads* sizeof(double *) );
        ThreadStartIdx = (double **) malloc( Nthreads* sizeof(double *) );
        ThreadEndIdx   = (double **) malloc( Nthreads* sizeof(double *) );
        ThreadArgs     = (double ***)malloc( Nthreads* sizeof(double **) );
 
        /* Launch the requested number of threads */
        int i;
        int threadBlockSize = ceil( ((double)n1) / Nthreads );
        for (i=0; i<Nthreads; i++)
        {
            /* Create thread ID */
            ThreadID1 = (double *)malloc( 1* sizeof(double) );
            ThreadID1[0] = i;
            ThreadID[i] = ThreadID1;
 
            /* Compute start/end indexes for this thread */
            ThreadStartIdx1 = (double *) malloc( sizeof(double) );
            ThreadStartIdx1[0] = i * threadBlockSize;
            ThreadStartIdx[i] = ThreadStartIdx1;
 
            ThreadEndIdx1 = (double *) malloc( sizeof(double) );
            ThreadEndIdx1[0] = MIN((i+1)*threadBlockSize, n1) - 1;
            ThreadEndIdx[i] = ThreadEndIdx1;
 
            /* Prepare thread input args */
            ThreadArgs1 = (double **) malloc( 4* sizeof(double*) );
            ThreadArgs1[0] = (double *) prhs;
            ThreadArgs1[1] = ThreadID[i];
            ThreadArgs1[2] = ThreadStartIdx[i];
            ThreadArgs1[3] = ThreadEndIdx[i];
 
            ThreadArgs[i] = ThreadArgs1;
 
            /* Launch the thread with its associated args */
            #ifdef _WIN32
                ThreadList[i] = (HANDLE)_beginthreadex(NULL, 0, &thread_func, ThreadArgs[i], 0, NULL);
            #else
                pthread_create ((pthread_t*)&ThreadList[i], NULL, (void *) &thread_func, ThreadArgs[i]);
            #endif
        }
 
        /* Wait for all the threads to finish working */
        #ifdef _WIN32
            for (i=0; i<Nthreads; i++) { WaitForSingleObject(ThreadList[i], INFINITE); }
            for (i=0; i<Nthreads; i++) { CloseHandle( ThreadList[i] ); }
        #else
            for (i=0; i<Nthreads; i++) { pthread_join(ThreadList[i],NULL); }
        #endif
 
        /* Free the memory resources allocated for the threads */
        for (i=0; i<Nthreads; i++)
        {
            free(ThreadArgs[i]);
            free(ThreadID[i]);
            free(ThreadStartIdx[i]);
            free(ThreadEndIdx[i]);
        }
 
        free(ThreadArgs);
        free(ThreadID );
        free(ThreadStartIdx);
        free(ThreadEndIdx);
        free(ThreadList);
    }
 
    return;
}

This file also includes a validateInputs function. I did not include it in the code snippet above for brevity; you can read it directly in the FEX entry (max_in_place.c). This function checks that there are exactly 0 output and 2 input args, that the input args are real non-sparse matrices and that they have the same number of elements.

Note that the threads run a generic thread_func function, which in turn runs the main_loop function with the thread’s specified startIdx, endIdx values. When this function completes, the thread ends itself explicitly, to ensure resource cleanup.

Also note how the thread code is using pthreads on non-Windows (!defined(_WIN32)) machines, and native Windows threads otherwise. This means that the same MEX source-code could be used on all Matlab platforms.

The important thing to note about this framework is that we no longer need to worry about the thread plumbing. If we wish to adapt this code for any other processing, we just need to modify the main_loop function with the new processing logic. In addition, you may wish to modify the validateInputs function based on your new setup of input/output args.
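For readers who only target POSIX platforms, the plumbing can be condensed considerably. The following is a minimal standalone sketch, not the File Exchange code: it works on plain C arrays rather than mxArray inputs, and the names MaxJob, max_worker and max_in_place_threads are hypothetical. It passes each thread a small struct instead of an array of double pointers, and it assumes that pthreads is available (on some toolchains this requires the -lpthread link flag).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Per-thread work description (hypothetical replacement for the
 * array-of-pointers argument passing used in the listing above) */
typedef struct {
    double       *dst;    /* updated in place */
    const double *src;    /* compared against dst */
    size_t        start;  /* first index of this thread's block */
    size_t        count;  /* number of elements in the block */
} MaxJob;

/* Thread entry point: dst[i] = max(dst[i], src[i]) over the block */
static void *max_worker(void *arg)
{
    const MaxJob *job = (const MaxJob *) arg;
    for (size_t i = job->start; i < job->start + job->count; i++) {
        if (job->dst[i] < job->src[i]) job->dst[i] = job->src[i];
    }
    return NULL;
}

/* Run the in-place max over n elements on nThreads POSIX threads.
 * Returns 0 on success, -1 on allocation or thread-creation failure
 * (in which case some blocks may be left unprocessed). */
int max_in_place_threads(double *dst, const double *src, size_t n, size_t nThreads)
{
    assert(nThreads > 0);

    pthread_t *threads = malloc(nThreads * sizeof *threads);
    MaxJob    *jobs    = malloc(nThreads * sizeof *jobs);
    if (threads == NULL || jobs == NULL) { free(threads); free(jobs); return -1; }

    size_t blockSize = (n + nThreads - 1) / nThreads;  /* integer ceil */
    size_t launched = 0;
    int status = 0;
    for (size_t t = 0; t < nThreads; t++) {
        size_t start = t * blockSize;
        if (start > n) start = n;                      /* empty trailing block */
        size_t end = start + blockSize;
        if (end > n) end = n;
        jobs[t].dst = dst;
        jobs[t].src = src;
        jobs[t].start = start;
        jobs[t].count = end - start;
        if (pthread_create(&threads[t], NULL, max_worker, &jobs[t]) != 0) {
            status = -1;
            break;
        }
        launched++;
    }

    /* Wait for every thread that was actually started */
    for (size_t t = 0; t < launched; t++) pthread_join(threads[t], NULL);

    free(jobs);
    free(threads);
    return status;
}
```

Because the job structs live in one array that outlives the threads, there is a single allocation to free, rather than one per thread argument.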

A few caveats:

On Windows machines with R2017b or older, we simply compile using mex max_in_place.c; on non-Windows we might need to add the -lpthread flag to link the pthreads library, depending on your specific compiler.
On R2018a or newer on all platforms, due to MEX’s new interleaved-complex memory format, we would need to compile with the -R2017b flag if we wish to use mxGetPr, as in the sample code above (in R2018a’s new data model, the corresponding function is mxGetDoubles). Note that updating data in-place becomes more difficult with the new MEX API, so if you want to preserve the performance boost that in-place data manipulation provides, it may be better to stick with the legacy data memory model.
The sample code above splits the data between the threads based on the first input matrix’s size. Instead, you may consider sending to the MEX function the loop indexes as extra input args, and then splitting those up between the threads.
In this specific implementation of max_in_place, I have updated the data locations directly. This is generally discouraged and risky, because it conflicts with Matlab’s standard Copy-on-Write mechanism. For example, if we assign the input to any other Matlab variable(s) before calling max_in_place, then that other variable(s) will also get their data updated. If we do not want this side-effect, we should mxUnshareArray the input matrix1, and return the resulting matrix as an output of the MEX function (plhs[0]).
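Regarding the caveat about how data is split between threads: whichever index range is chosen, the per-thread arithmetic is the same. Here is a small standalone sketch of it. The BlockRange type and block_range function are hypothetical names, and the sketch uses a start/count pair in size_t rather than the inclusive int start/end pair of the sample code, so that an empty block needs no special representation.

```c
#include <assert.h>
#include <stddef.h>

/* Contiguous index range assigned to one thread; empty when count == 0 */
typedef struct {
    size_t start;   /* first index handled by the thread */
    size_t count;   /* number of elements handled by the thread */
} BlockRange;

/* Split n elements into nThreads contiguous blocks of ceil(n/nThreads)
 * elements each; trailing threads may receive a shorter or empty block. */
BlockRange block_range(size_t n, size_t nThreads, size_t threadIdx)
{
    assert(nThreads > 0 && threadIdx < nThreads);

    size_t blockSize = (n + nThreads - 1) / nThreads;  /* integer ceil */
    size_t start = threadIdx * blockSize;
    BlockRange r = {0, 0};
    if (start < n) {
        size_t end = start + blockSize;                /* exclusive end */
        if (end > n) end = n;
        r.start = start;
        r.count = end - start;
    }
    return r;
}
```

For example, 5 elements split over 4 threads yield blocks of 2, 2, 1 and 0 elements, so the last thread simply has nothing to do.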

Speed-up tips

The core logic in the specific case that I was asked to optimize was something similar to this:

main_process:
    initialize output matrix
    loop z over all slices in a 3D data matrix
        temp_data = data(:,:,z);
        temp_data = process(temp_data);
        output = max(output, temp_data);
    end z loop

The initial speed-up attempt was naturally focused on the process and max functions. Converting them to a MEX function improved the speed by a factor of ~8, but the resulting run-time (4-5 secs) was still too slow for real-time use. I believe that there were several reasons why we did not see a larger speed-up:

temp_data was small enough such that the overheads associated with creating and then disposing separate threads were significant compared to the processing time of each thread.
temp_data was small enough such that each thread processed a relatively small portion of the memory, in contrast to single-threaded processing that accesses memory in larger blocks, more efficiently.
In each iteration of the z loop, the overheads associated with calling the MEX function, handling input variables and validation, creating/disposing threads, and allocating/deallocating memory for temp_data, were repeatedly paid.

So, while the profiling result showed that 98% of the time was spent in the MEX function (which would seem to indicate that not much additional speedup can be achieved), in fact the MEX function was under-performing because of the inefficiencies involved in repeatedly creating threads to process small data chunks. It turned out that running in single-thread mode was actually somewhat faster than multi-threaded mode.

I then moved the entire z loop (entire main_process) into the MEX function, where the threads were split to process separate adjacent blocks of z slices (i.e. different blocks of the z loop). This way, the MEX function was called, the inputs validated, and threads created/disposed only once for the entire process, making this overhead negligible compared to each thread’s processing time. Moreover, each thread now processed the entire temp_data belonging to its z slice, so memory access was more efficient, reducing the memory I/O wait time and improving the overall processing time. Additional benefits were due to the fact that some variables could be reused within each thread across loop iterations, minimizing memory allocations and deallocations. The overall effect was to reduce the total run-time down to ~0.16 secs, a 160x speedup compared to the original (25 secs). As my client said: “You sped up [the application] from practically useless to clinically useful.”

The lesson: try to minimize MEX invocation and thread creation/disposal overheads, and let the threads process as large adjacent memory blocks as possible.
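The client’s code is not shown here, but the restructuring can be illustrated with a standalone sketch under a few assumptions: the 3D data is stored column-major (as Matlab does), so each z slice is one contiguous run of rows*cols doubles; process_value is a hypothetical stand-in for the real processing; and each thread accumulates into its own private buffer, which is merged at the end (one possible way to avoid threads writing to the same output elements, not necessarily what was done in the actual project).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-element processing step */
static double process_value(double v)
{
    return v * v;
}

/* Fold slices zStart..zEnd-1 of a column-major volume into partialMax,
 * which holds sliceLen = rows*cols values pre-initialized by the caller.
 * A thread that owns adjacent slices walks one contiguous memory block. */
void max_over_slices(const double *data, size_t sliceLen,
                     size_t zStart, size_t zEnd, double *partialMax)
{
    assert(zStart <= zEnd);
    for (size_t z = zStart; z < zEnd; z++) {
        const double *slice = data + z * sliceLen;
        for (size_t i = 0; i < sliceLen; i++) {
            double v = process_value(slice[i]);
            if (partialMax[i] < v) partialMax[i] = v;
        }
    }
}

/* Combine one thread's partial result into the final output */
void merge_partial_max(double *output, const double *partialMax, size_t sliceLen)
{
    for (size_t i = 0; i < sliceLen; i++) {
        if (output[i] < partialMax[i]) output[i] = partialMax[i];
    }
}
```

Each thread would call max_over_slices once for its whole block of slices, so thread start-up and buffer allocation are paid once per thread rather than once per slice.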

Debugging MEX files

When debugging multi-threaded MEX functions, I find that it’s often easier to run the function in single-threaded mode to debug the core logic, and once this is ready we can switch back to multi-threading. This can easily be done by setting the number of threads outside the MEX function using Matlab’s builtin maxNumCompThreads function:

Nthreads = maxNumCompThreads(1);  % temporarily use only 1 thread for debugging
max_in_place(matrix1, matrix2);
maxNumCompThreads(Nthreads);      % restore previous value
%maxNumCompThreads('automatic');  % alternative

Once debugging is done and the MEX function works properly, we should remove the maxNumCompThreads calls, so that the MEX function will use the regular number of Matlab computational threads, which should be the same as the number of cores: feature('numCores').

I typically like to use Eclipse as my IDE for non-Matlab code (Java, C/C++ etc.). Unfortunately, there’s a problem attaching Eclipse to Matlab processes (which is necessary for interactive MEX debugging) if you’re using any recent (post-2015) version of MinGW and Eclipse. This problem is due to a known Eclipse bug, as user Lithe pointed out. The workaround is to install an old version of MinGW, *in addition* to your existing MinGW version. Reportedly, only versions 4.9.1 or older of MinGW include gdb 7.8 (which is still supported by Eclipse), whereas newer versions of MinGW include a newer gdb that is not supported. Download and install such an old MinGW version in a separate folder from your more-modern compiler. Don’t update your MEX to use the older MinGW – just tell Eclipse to use the version of gdb in the old MinGW bin/ folder when you set up a debug configuration for debugging your MEX files.

Once you have a compatible gdb, and ask Eclipse to attach to a process, the processes list will finally appear (it won’t with an incompatible gdb). Use feature('getPID') to get your Matlab process ID, which can then be used to attach to the relevant process in the Eclipse Attach-to-process window. For example, if your Matlab’s PID is 4321, then the Matlab process will be called “Program – 4321” in Eclipse’s processes list.

I wish that MathWorks would update their official Answer and their MinGW support package on File Exchange to include this information, because without it debugging on Eclipse becomes impossible. Eclipse is so powerful, easy-to-use and ubiquitous that it’s a shame for most users not to be able to work with it just because the workarounds above are not readily explained.

N.B. If you don’t like Eclipse, you can also use Visual Studio Code (VS Code), as Andy Campbell recently explained in the MathWorks Developers’ blog.

Consulting

Do you have any Matlab code that could use a bit (or a lot) of speeding-up? If so, please contact me for a private consulting offer. I can’t promise to speed up your code by a similar factor of 160x, but you never know…

Faster csvwrite/dlmwrite

Yair Altman — Tue, 03 Oct 2017 15:00:05 +0000

Matlab’s builtin functions for exporting (saving) data to output files are quite sub-optimal (as in slowwwwww…). I wrote a few posts about this in the past (how to improve fwrite performance, and save performance). Today I extend the series by showing how we can improve the performance of delimited text output, for example comma-separated (CSV) or tab-separated (TSV/TXT) files.

The basic problem is that Matlab’s dlmwrite function, which can either be used directly, or via the csvwrite function which calls it internally, is extremely inefficient: It processes each input data value separately, in a non-vectorized loop. In the general (completely non-vectorized) case, each data value is separately converted into a string, and is separately sent to disk (using fprintf). In the specific case of real data values with simple delimiters and formatting, row values are vectorized, but in any case the rows are processed in a non-vectorized loop: A newline character is separately exported at the end of each row, using a separate fprintf call, and this has the effect of flushing the I/O to disk each and every row separately, which is of course disastrous for performance. The output file is indeed originally opened in buffered mode (as I explained in my fprintf performance post), but this only helps for outputs done within the row – the newline output at the end of each row forces an I/O flush regardless of how the file was opened. In general, when you read the short source-code of dlmwrite.m you’ll get the distinct feeling that it was written for correctness and maintainability, with some focus on performance (e.g., the vectorization edge-case), but it seems that much more could be done for performance.

This is where Alex Nazarovsky comes to the rescue.

Alex was so bothered by the slow performance of csvwrite and dlmwrite that he created a C++ (MEX) version that runs enormously faster (30 times faster on my system). He explains the idea in his blog, and posted it as an open-source utility (mex-writematrix) on GitHub.

Usage of Alex’s utility is very easy:

mex_WriteMatrix(filename, dataMatrix, textFormat, delimiter, writeMode);

where the input arguments are:

filename – full path name for file to export
dataMatrix – matrix of numeric values to be exported
textFormat – format of output text (sprintf format), e.g. '%10.6f'
delimiter – delimiter, for example ',' or ';' or char(9) (=tab)
writeMode – 'w+' for rewriting file; 'a+' for appending (note the lowercase: uppercase will crash Matlab!)

Here is a sample run on my system, writing a simple CSV file containing 1K-by-1K data values (1M elements, ~12MB text files):

>> data = rand(1000, 1000);  % 1M data values, 8MB in memory, ~12MB on disk
 
>> tic, dlmwrite('temp1.csv', data, 'delimiter',',', 'precision','%10.10f'); toc
Elapsed time is 28.724937 seconds.
 
>> tic, mex_WriteMatrix('temp2.csv', data, '%10.10f', ',', 'w+'); toc   % 30 times faster!
Elapsed time is 0.957256 seconds.

Alex’s mex_WriteMatrix function is faster even in the edge case of simple formatting where dlmwrite uses vectorized mode (in that case, the file is exported in ~1.2 secs by dlmwrite and ~0.9 secs by mex_WriteMatrix, on my system).
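To illustrate why native buffered output avoids the per-row flush penalty, here is a minimal standalone C sketch. It is not Alex’s implementation: the function name write_matrix_text is hypothetical, it takes a plain column-major array rather than an mxArray, and the 1MB buffer size is an arbitrary choice. It mirrors the same five arguments, and rejects any mode string other than 'w+' or 'a+' up front.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a column-major rows-by-cols matrix as delimited text.
 * fmt is a printf format for a single double (e.g. "%10.6f"); mode must
 * be "w+" or "a+". Returns 0 on success, -1 on bad mode or I/O error. */
int write_matrix_text(const char *filename, const double *data,
                      size_t rows, size_t cols,
                      const char *fmt, char delimiter, const char *mode)
{
    assert(filename != NULL && fmt != NULL && mode != NULL);

    /* Reject unexpected mode strings instead of passing them to fopen */
    if (strcmp(mode, "w+") != 0 && strcmp(mode, "a+") != 0) return -1;

    FILE *fp = fopen(filename, mode);
    if (fp == NULL) return -1;

    /* Request full buffering, so that a newline does not force a flush */
    setvbuf(fp, NULL, _IOFBF, 1 << 20);

    int ok = 1;
    for (size_t r = 0; r < rows && ok; r++) {
        for (size_t c = 0; c < cols && ok; c++) {
            if (c > 0 && fputc(delimiter, fp) == EOF) ok = 0;
            /* Matlab stores matrices column by column */
            if (ok && fprintf(fp, fmt, data[r + c * rows]) < 0) ok = 0;
        }
        if (ok && fputc('\n', fp) == EOF) ok = 0;
    }

    if (fclose(fp) != 0) ok = 0;   /* fclose flushes the remaining data */
    return ok ? 0 : -1;
}
```

All values and newlines go through the same buffered stream, so the operating system sees a few large writes instead of one write per row.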

Trapping Ctrl-C interrupts

Alex’s mex_WriteMatrix code includes another undocumented trick that could help anyone else who uses a long-running MEX function, namely the ability to stop the MEX execution using Ctrl-C. Using Ctrl-C is normally ignored in MEX code, but Wotao Yin showed how we can use the undocumented utIsInterruptPending() MEX function to monitor for user interrupts using Ctrl-C. For easy reference, here is a copy of Wotao Yin’s usage example (read his webpage for additional details):

/* A demo of Ctrl-C detection in mex-file by Wotao Yin. Jan 29, 2010. */
 
#include "mex.h"
 
#if defined (_WIN32)
    #include <windows.h>
#elif defined (__linux__)
    #include <unistd.h>
#endif
 
#ifdef __cplusplus 
    extern "C" bool utIsInterruptPending();
#else
    extern bool utIsInterruptPending();
#endif
 
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
    int count = 0;    
    while(1) {
        #if defined(_WIN32)
            Sleep(1000);        /* Sleep one second */
        #elif defined(__linux__)
            usleep(1000*1000);  /* Sleep one second */
        #endif
 
        mexPrintf("Count = %d\n", count++);  /* print count and increase it by 1 */
        mexEvalString("drawnow;");           /* flush screen output */
 
        if (utIsInterruptPending()) {        /* check for a Ctrl-C event */
            mexPrintf("Ctrl-C Detected. END\n\n");
            return;
        }
        if (count == 10) {
            mexPrintf("Count Reached 10. END\n\n");
            return;
        }
    }
}

Matlab performance webinars

I am offering a couple of webinars about various ways to improve Matlab’s run-time performance:

Matlab performance tuning part 1 (3:39 hours, syllabus) – $195 (buy)
Matlab performance tuning part 2 (3:43 hours, syllabus) – $195 (buy)
==> or buy both Matlab performance tuning webinars for only $345 (buy)

Both the webinar videos and their corresponding slide-decks are available for download. The webinars content is based on onsite training courses that I presented at multiple client locations (details).

Additional Matlab performance tips can be found under the Performance tag in this blog, as well as in my book “Accelerating MATLAB Performance“.

Email me if you would like additional information on the webinars, or an onsite training course, or about my consulting.

MEX ctrl-c interrupt

Yair Altman — Wed, 15 Jun 2016 17:00:40 +0000

I recently became aware of a very nice hack by Wotao Yin (while at Rice in 2010; currently teaching at UCLA). The core problem is that unlike m-files that can be interrupted in mid-run using ctrl-c, MEX functions cannot be interrupted in the same way. Well, not officially, that is.

Interrupts are very important for long-running user-facing operations. They can even benefit performance by avoiding the need to periodically poll some external state. Interrupts are registered asynchronously, and the program can query the interrupt buffer at its convenience, in special locations of its code, and/or at specific times depending on the required responsiveness.
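The register-asynchronously, poll-at-convenience pattern can be shown in plain C, outside of Matlab, using standard signal handling. This is only an analogy: inside a MEX function Matlab owns the Ctrl-C handling, so we should not install our own SIGINT handler there. The function name run_interruptible and its simulateAt parameter are hypothetical and exist only to make the sketch self-contained.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set asynchronously by the signal handler, polled by the worker loop */
static volatile sig_atomic_t interrupt_pending = 0;

static void on_sigint(int signum)
{
    (void) signum;
    interrupt_pending = 1;
}

/* Run up to nSteps units of work, checking for an interrupt after each
 * one. simulateAt is a demo-only knob: at that step the function raises
 * SIGINT on itself to mimic Ctrl-C (pass a value >= nSteps to disable).
 * Returns the number of completed steps, i.e. the usable partial result. */
size_t run_interruptible(size_t nSteps, size_t simulateAt)
{
    void (*previous)(int) = signal(SIGINT, on_sigint);
    assert(previous != SIG_ERR);
    interrupt_pending = 0;

    size_t done = 0;
    while (done < nSteps) {
        if (done == simulateAt) raise(SIGINT);   /* demo only */
        /* ... one unit of real work would go here ... */
        done++;
        if (interrupt_pending) {                 /* poll at a safe point */
            interrupt_pending = 0;               /* consume the event */
            break;
        }
    }

    signal(SIGINT, previous);                    /* restore old handler */
    return done;
}
```

The handler does nothing except set a flag; the loop decides when it is safe to stop, which is what makes it possible to return a consistent partial result.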

Yin reported that the libut library that ships with Matlab contains a large set of undocumented functions, including utIsInterruptPending() that can be used to detect ctrl-c interrupt events. The original report of this feature seems to be by Matlab old hand Peter Boettcher back in 2002 (with a Fortran wrapper reported in 2013). The importance of Yin’s post is that he clearly explained the use of this feature, with detailed coding and compilation instructions. Except for Peter’s original report, Yin’s post and the Fortran wrapper, precious few mentions can be found online (oddly enough, yours truly mentioned it in the very same CSSM newsletter post in which I outed this blog back in 2009). Apparently, this feature was supposed to have been documented in R12.1, but for some reason it was not and people just moved on and forgot about it.

The relevant functions seem to be:

// Most important functions (C):
bool utIsInterruptEnabled(void)
bool utIsInterruptPending(void)
bool utWasInterruptHandled(void)
 
bool utSetInterruptHandled(bool)
bool utSetInterruptEnabled(bool)
bool utSetInterruptPending(bool)
 
// Related functions (C, signature unknown):
? utHandlePendingInterrupt(?)
? utRestoreInterruptEnabled(?)
? utLongjmpIfInterruptPending(?)
 
// utInterruptMode class (C++):
utInterruptMode::utInterruptMode(enum utInterruptMode::Mode)  // constructor
utInterruptMode::~utInterruptMode(void)  // destructor
bool utInterruptMode::isInterruptEnabled(void)
enum utInterruptMode::Mode utInterruptMode::CurrentMode
enum utInterruptMode::Mode utInterruptMode::GetCurrentMode(void)
enum utInterruptMode::Mode utInterruptMode::GetOriginalMode(void)
enum utInterruptMode::Mode utInterruptMode::SetMode(enum utInterruptMode::Mode)
 
// utInterruptState class (C++):
class utInterruptState::AtomicPendingFlags utInterruptState::flags_pending
void utInterruptState::HandlePeekMsgPending(void)
bool utInterruptState::HandlePendingInterrupt(void)
bool utInterruptState::interrupt_handled
bool utInterruptState::IsInterruptPending(void)
bool utInterruptState::IsPauseMsgPending(void)
class utInterruptState & utInterruptState::operator=(class utInterruptState const &)
void utInterruptState::PeekMessageIfPending(void)
bool utInterruptState::SetInterruptHandled(bool)
bool utInterruptState::SetInterruptPending(bool)
bool utInterruptState::SetIqmInterruptPending(bool)
bool utInterruptState::SetPauseMsgPending(bool)
bool utInterruptState::SetPeekMsgPending(bool)
void utInterruptState::ThrowIfInterruptPending(void)
bool utInterruptState::WasInterruptHandled(void)
unsigned int const utInterruptState::FLAG_PENDING_CTRLC
unsigned int const utInterruptState::FLAG_PENDING_INTERRUPT_MASK
unsigned int const utInterruptState::FLAG_PENDING_IQM_INTERRUPT
unsigned int const utInterruptState::FLAG_PENDING_PAUSE
unsigned int const utInterruptState::FLAG_PENDING_PEEKMSG

Of all these functions, we can make do with just utIsInterruptPending, as shown by Yin (complete with compilation instructions):

/* A demo of Ctrl-C detection in mex-file by Wotao Yin. Jan 29, 2010. */
 
#include "mex.h"
 
#if defined (_WIN32)
    #include <windows.h>
#elif defined (__linux__)
    #include <unistd.h>
#endif
 
#ifdef __cplusplus 
    extern "C" bool utIsInterruptPending();
#else
    extern bool utIsInterruptPending();
#endif
 
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
    int count = 0;    
    while(1) {
        #if defined(_WIN32)
            Sleep(1000);        /* Sleep one second */
        #elif defined(__linux__)
            usleep(1000*1000);  /* Sleep one second */
        #endif
 
        mexPrintf("Count = %d\n", count++);  /* print count and increase it by 1 */
        mexEvalString("drawnow;");           /* flush screen output */
 
        if (utIsInterruptPending()) {        /* check for a Ctrl-C event */
            mexPrintf("Ctrl-C Detected. END\n\n");
            return;
        }
        if (count == 10) {
            mexPrintf("Count Reached 10. END\n\n");
            return;
        }
    }
}

After returning to Matlab, the Ctrl-C event will stop the execution of the caller function(s). However, sometimes we would like to keep the partial calculation, for example if the calculation can later be resumed from the point of interruption. It’s not possible to save a partial result using only the utIsInterruptPending() function. However, it is possible to reset the interrupt state so that Matlab will not stop the execution after returning control from the MEX function, and the caller function can get and use the partial calculation. The magic is done using another undocumented libut function named utSetInterruptPending(). A short example is included below (provided by Zvi Devir):

// Import libut functions
#pragma comment(lib, "libut.lib")
extern "C" bool utIsInterruptPending();
extern "C" bool utSetInterruptPending(bool); 
// Run calculation divided into steps
int n;
for (n = 0; n < count; n++) {
	// some expensive calculation
	a_long_calculation(..., n);
 
	if (utIsInterruptPending()) {		// check for a Ctrl-C event
		mexPrintf("Ctrl-C detected [%d/%d].\n\n", n, count);
		utSetInterruptPending(false);	// Got it... consume event
		break;
	}
}
 
// Write back partial or full calculation
...

An elaboration of the idea of Ctrl-C detection was created by Ramon Casero (Oxford) for the Gerardus project. Ramon wrapped Yin’s code in C/C++ #define to create an easy-to-use pre-processor function ctrlcCheckPoint(fileName,lineNumber):

...
ctrlcCheckPoint(__FILE__, __LINE__);  // exit if user pressed Ctrl+C
...
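The call above passes __FILE__ and __LINE__ explicitly; a macro can also supply them itself. Here is a minimal self-contained sketch of that idea. It is not the Gerardus code: interrupt_is_pending is a hypothetical stand-in for utIsInterruptPending() (which is only available when linking against Matlab’s libut), and instead of raising a Matlab error the sketch merely records where the interrupt was noticed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for utIsInterruptPending() */
static bool fake_interrupt_flag = false;
static bool interrupt_is_pending(void) { return fake_interrupt_flag; }

/* Record where the interrupt was noticed; kept out of the macro so that
 * the macro itself stays a single cheap test */
static char checkpoint_message[256];
static void report_checkpoint(const char *file, int line)
{
    assert(file != NULL);
    snprintf(checkpoint_message, sizeof checkpoint_message,
             "interrupted at %s:%d", file, line);
}

/* Evaluates to true if an interrupt is pending; __FILE__ and __LINE__
 * are supplied by the macro, so call sites need no arguments */
#define CHECKPOINT() \
    (interrupt_is_pending() ? (report_checkpoint(__FILE__, __LINE__), true) : false)

/* Count up to limit, stopping early at the first checkpoint hit.
 * interruptAt is a demo-only knob that sets the fake flag at that step. */
int count_until_interrupted(int limit, int interruptAt)
{
    int i;
    for (i = 0; i < limit; i++) {
        if (i == interruptAt) fake_interrupt_flag = true;  /* demo only */
        if (CHECKPOINT()) {
            fake_interrupt_flag = false;                   /* consume the event */
            break;
        }
    }
    return i;
}
```

In the common no-interrupt case the macro costs one flag test per iteration; the message formatting only runs once the interrupt has actually been detected.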

Here’s the code for the preprocessor header file (GerardusCommon.h) that #defines ctrlcCheckPoint() (naturally, the __FILE__ and __LINE__ parts could also be made part of the #define, for even simpler usage):

 /*
  * Author: Ramon Casero 
  * Copyright © 2011-2013 University of Oxford
  * Version: 0.10.2
  *
  * University of Oxford means the Chancellor, Masters and Scholars of
  * the University of Oxford, having an administrative office at
  * Wellington Square, Oxford OX1 2JD, UK. 
  *
  * This file is part of Gerardus.
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
  * the Free Software Foundation, either version 3 of the License, or
  * (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
  * but WITHOUT ANY WARRANTY; without even the implied warranty of
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details. The offer of this
  * program under the terms of the License is subject to the License
  * being interpreted in accordance with English Law and subject to any
  * action against the University of Oxford being under the jurisdiction
  * of the English Courts.
  *
  * You should have received a copy of the GNU General Public License
  * along with this program.  If not, see
  * .
  */
 
#ifndef GERARDUSCOMMON_H
#define GERARDUSCOMMON_H
 
/* mex headers */
#include <mex.h>
 
/* C++ headers (those required by the code shown here) */
#include <iostream>
#include <string>
 
/* ITK headers */
#include "itkOffset.h"
 
/*
 * utIsInterruptPending(): "undocumented MATLAB API implemented in
 * libut.so, libut.dll, and included in the import library
 * libut.lib. To use utIsInterruptPending in a mex-file, one must
 * manually declare bool utIsInterruptPending() because this function
 * is not included in any header files shipped with MATLAB. Since
 * libut.lib, by default, is not linked by mex, one must explicitly
 * tell mex to use libut.lib." -- Wotao Yin, 
 * http://www.caam.rice.edu/~wy1/links/mex_ctrl_c_trick/
 *
 */
#ifdef __cplusplus 
    extern "C" bool utIsInterruptPending();
#else
    extern bool utIsInterruptPending();
#endif
 
/*
 * ctrlcCheckPoint(): function to check whether the user has pressed
 * Ctrl+C, and if so, terminate execution returning an error message
 * with a hyperlink to the offending function's help, and a hyperlink
 * to the line in the source code file this function was called from
 *
 * It is implemented as a C++ macro to check for the CTRL+C flag, and
 * a call to function ctrlcErrMsgTxt() inside, to throw the error. The
 * reason is that if ctrlcCheckPoint() were a function instead of a
 * macro, this would introduce a function call at every iteration of
 * the loop, which is very expensive. But then we don't want to put
 * the whole error message part inside a macro, it's bug-prone and bad
 * programming practice. And once the CTRL+C has been detected,
 * whether the error message is generated a bit faster or not is not
 * important.
 *
 * In practice, to use this function put a call like this e.g. inside
 * loops that may take for a very long time:
 *
 *    // exit if user pressed Ctrl+C
 *    ctrlcCheckPoint(__FILE__, __LINE__);
 *
 * sourceFile: full path and name of the C++ file that calls this
 *             function. This should usually be the preprocessor
 *             directive __FILE__
 *
 * lineNumber: line number where this function is called from. This
 *             should usually be the preprocessor directive __LINE__
 *
 */
inline
void ctrlcErrMsgTxt(std::string sourceFile, int lineNumber) {
 
  // run from here the following code in the Matlab side:
  //
  // >> path = mfilename('fullpath')
  //
  // this provides the full path and function name of the function
  // that called ctrlcCheckPoint()
  int nlhs = 1; // number of output arguments we expect
  mxArray *plhs[1]; // to store the output argument
  int nrhs = 1; // number of input arguments we are going to pass
  mxArray *prhs[1]; // to store the input argument we are going to pass
  prhs[0] = mxCreateString("fullpath"); // input argument to pass
  if (mexCallMATLAB(nlhs, plhs, nrhs, prhs, "mfilename")) { // run mfilename('fullpath')
    mexErrMsgTxt("ctrlcCheckPoint(): mfilename('fullpath') returned error");
  }
  if (plhs == NULL) {
    mexErrMsgTxt("ctrlcCheckPoint(): mfilename('fullpath') returned NULL array of outputs");
  }
  if (plhs[0] == NULL) {
    mexErrMsgTxt("ctrlcCheckPoint(): mfilename('fullpath') returned NULL output instead of valid path");
  }
 
  // get full path to current function, including function's name
  // (without the file extension)
  char *pathAndName = mxArrayToString(plhs[0]);
  if (pathAndName == NULL) {
    mexErrMsgTxt("ctrlcCheckPoint(): mfilename('fullpath') output cannot be converted to string");
  }
 
  // for some reason, using mexErrMsgTxt() to give this output
  // doesn't work. Instead, we have to give the output to the
  // standard error, and then call mexErrMsgTxt() to terminate
  // execution of the program
  std::cerr << "Operation terminated by user during "
	    << "\"matlab:helpUtils.errorDocCallback('"
	    << mexFunctionName()
	    << "', '" << pathAndName << ".m', " << lineNumber << ")\">"
	    << mexFunctionName()
	    << " (\"matlab:opentoline('"
	    << sourceFile
	    << "'," << lineNumber << ",0)\">line " << lineNumber
	    << ")"
	    << std::endl;
  mexErrMsgTxt("");
}
 
#define ctrlcCheckPoint(sourceFile, lineNumber)		\
  if (utIsInterruptPending()) {				\
    ctrlcErrMsgTxt(sourceFile, lineNumber);		\
  }
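
As noted above, the __FILE__ and __LINE__ arguments can be folded into the macro itself. The following plain-C sketch shows one way this might look. It is not part of Gerardus: interrupt_pending() and report_interrupt() are hypothetical stand-ins for utIsInterruptPending() and ctrlcErrMsgTxt() (so that the snippet compiles without libut), and CTRLC_CHECKPOINT and run_steps are illustrative names.

```c
#include <stdio.h>

/* Hypothetical stand-in for utIsInterruptPending(): pretends that Ctrl-C
 * was pressed once `polls_before_interrupt` polls have been made.
 * A negative value means "never interrupted". */
int polls_before_interrupt = -1;

int interrupt_pending(void)
{
    if (polls_before_interrupt < 0)
        return 0;
    if (polls_before_interrupt == 0)
        return 1;
    polls_before_interrupt--;
    return 0;
}

/* Hypothetical stand-in for ctrlcErrMsgTxt(): only reports the location */
void report_interrupt(const char *sourceFile, int lineNumber)
{
    fprintf(stderr, "Interrupted at %s, line %d\n", sourceFile, lineNumber);
}

/* The call site no longer passes __FILE__/__LINE__: they are expanded where
 * the macro is used. The macro evaluates to 1 if an interrupt was detected
 * and to 0 otherwise. Being a single expression, it is also safe to use
 * inside an unbraced if/else. */
#define CTRLC_CHECKPOINT() \
    (interrupt_pending() ? (report_interrupt(__FILE__, __LINE__), 1) : 0)

/* Runs up to `count` steps; returns how many completed before an interrupt */
int run_steps(int count)
{
    int n;
    for (n = 0; n < count; n++) {
        if (CTRLC_CHECKPOINT())
            break;      /* stop early, keeping the partial result */
        /* ... one step of an expensive calculation would go here ... */
    }
    return n;
}
```

In a real MEX file the two stand-ins would be replaced by utIsInterruptPending() and the error-reporting function, and the check itself remains a cheap inline test on every iteration.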

This feature has remained as-is since at least 2002 (when Peter first reported it), and apparently works to this day. Why then did I categorize this as “High risk for breaking in future Matlab versions”? The reason is that internal undocumented MEX functions are prone to break in new Matlab releases (example). Hopefully my report today will prompt MathWorks to make this feature documented, rather than to remove it from a future release.

By the way, if anyone knows any use for the other interrupt-related functions in libut that I listed above, and/or the missing signatures, please leave a note below and I will update here accordingly.

Addendum July 2, 2016: Pavel Holoborodko just posted an asynchronous version of this mechanism, which is an improvement of the synchronous code above. Pavel uses a separate Windows thread to check the Ctrl-C interrupt state. Readers can extend this idea to use threads for other asynchronous (multi-threaded) computations or I/O. In chapter 7 of my book “Accelerating MATLAB Performance” I explain how we can use POSIX threads (pthreads) or OpenMP threads for similar multithreading in MEX (unlike Windows threads, pthreads and OpenMP are cross-platform). Users can also use other multithreading solutions, such as the open-source Boost library (bundled with Matlab) or Intel’s commercial TBB.

Better MEX error messages

Yair Altman — Wed, 24 Feb 2016 18:00:25 +0000

I would like to introduce guest blogger Pavel Holoborodko, the developer of the Advanpix Multiprecision Computing Toolbox for MATLAB. Pavel has already posted here in the past as a guest blogger about undocumented Matlab MEX functions. Today he will discuss another little-known aspect of advanced MEX programming with Matlab.

The MEX API provides two functions for proper handling of erroneous situations: the legacy mexErrMsgTxt and the newer mexErrMsgIdAndTxt. Both show an error message, interrupt execution of the MEX module, and return control to Matlab immediately.

Under the hood, these functions are implemented through C++ exceptions. The reason for this design choice is unclear. Throwing C++ exceptions across a module boundary (e.g. from a dynamic library to the host process) is unsafe and generally considered bad practice in software design. Exceptions are part of the C++ run-time library, and different versions of it might have incompatible implementations.

This restricts MEX to use only the same version of C++ run-time and GCC which were used to build that particular version of Matlab itself. This is one of the reasons why TMW distributes its own version of GCC/libstdc++ along with every release of Matlab and pushes developers to use it for MEX compilation.

This unfortunate design decision has two unpleasant consequences:

developers must use some old version of GCC (forget all the fancy stuff from C++11, C++14, etc.)
compiled MEX modules most likely will not work on older/newer versions of MATLAB (re-compilation is required).

The good news is that both issues are solvable and I will write more about this in the future (it is actually possible to create a single binary MEX module, which will work on any GNU/Linux flavor regardless of the versions of libc/libstdc++/gcc installed in the system or used in Matlab).

Here I propose just the first step towards freedom – avoid direct usage of the mexErrMsg** functions. Use a simple wrapper instead:

void mxShowCriticalErrorMessage(const char *msg)
{
    mxArray *arg;
    arg = mxCreateString(msg);
    mexCallMATLAB(0,0,1,&arg,"error");
}

The mxShowCriticalErrorMessage function calls Matlab’s built-in error function via the interpreter, with the error message as input parameter.

In addition to being safe, this approach potentially gives us better control over what additional information is shown together with error messages. Instead of a string, we can use an errorStruct as input argument to Matlab’s error function, with its fields tuned to our requirements (not shown here as I want to keep the example simple).

Even without tuning, the output of mxShowCriticalErrorMessage is much more informative and user-friendly:

Error message from Matlab’s built-in functionality:
```
>> A = magic(3);
>> A(0)
Subscript indices must either be real positive integers or logicals.
```
Nice one-line message without any distracting information.
Error message from MEX using mexErrMsgTxt/mexErrMsgIdAndTxt:
```
>> A = mp(magic(3));   % convert matrix to arbitrary precision type, provided by our toolbox
>> A(0)                % subsref is called from toolbox, it is implemented in mpimpl.mex
Error using mpimpl
Subscript indices must either be real positive integers or logicals.
Error in mp/subsref (line 860)
        [varargout{1:nargout}] = mpimpl(170, varargin{:});
```
Intimidating four lines with the actual error message lost in the middle. All the additional information is meaningless for the end-user and actually misleading.
The worst thing is that such an error message is very different from what users are used to (see the one-liner above), which leads to confusion about whether the MEX plugin is working properly at all.
Error message from MEX using mxShowCriticalErrorMessage:
```
>> A = mp(magic(3));
>> A(0)
Error using mp/subsref (line 860)
Subscript indices must either be real positive integers or logicals.
```
Now the message is clear and short, with the error description in the last line, where the user’s focus is.

Undocumented feature list

Yair Altman — Wed, 19 Mar 2014 18:00:30 +0000

Three years ago I posted an article on Matlab’s undocumented feature function. feature is a Matlab function that enables access to undocumented internal Matlab functionality. Most of this functionality is not very useful, but in some cases it could indeed be very interesting. As sometimes happens, this innocent-enough article generated a lot of interest, both online and offline. Perhaps the reason was that in this article I listed the known supported features, with a short explanation and references where available. At the time, this was the only comprehensive such listing, which I manually collected from numerous sources. For this reason I was delighted to receive Yves Piguet’s tip about the availability of a programmatic interface for a full listing of features:

>> list = feature('list')   % 260 features in R2013b
list = 
1x260 struct array with fields:
    name
    value
    has_callback
    has_builtin
    call_count

Which can be listed as follows:

for i = 1 : length(list)
   fprintf('%35s has_cb=%d has_bi=%d calls=%d val=%g\n', ...
      list(i).name, list(i).has_callback, list(i).has_builtin, list(i).call_count, list(i).value);
end
 
                    100 has_cb=0 has_bi=1 calls=0 val=1
                    102 has_cb=0 has_bi=1 calls=0 val=1
                     12 has_cb=0 has_bi=1 calls=0 val=1
                     14 has_cb=0 has_bi=1 calls=0 val=1
                     25 has_cb=0 has_bi=1 calls=0 val=1
                    300 has_cb=0 has_bi=0 calls=0 val=1
                    301 has_cb=0 has_bi=0 calls=0 val=1
                     44 has_cb=0 has_bi=1 calls=0 val=1
                     45 has_cb=0 has_bi=1 calls=0 val=1
                      7 has_cb=0 has_bi=0 calls=0 val=1
                      8 has_cb=0 has_bi=1 calls=0 val=1
                      9 has_cb=0 has_bi=0 calls=0 val=1
                  accel has_cb=0 has_bi=1 calls=0 val=0
         AccelBlockSize has_cb=0 has_bi=1 calls=0 val=0
          AccelMaxTemps has_cb=0 has_bi=1 calls=0 val=0
    AccelThreadBlockMin has_cb=0 has_bi=1 calls=0 val=0
              allCycles has_cb=0 has_bi=1 calls=0 val=0
 AllWarningsCanBeErrors has_cb=1 has_bi=0 calls=0 val=0
           ArrayEditing has_cb=0 has_bi=0 calls=0 val=1
       AutomationServer has_cb=0 has_bi=1 calls=0 val=0
              CachePath has_cb=0 has_bi=0 calls=0 val=1
     CaptureScreenCount has_cb=0 has_bi=0 calls=0 val=0
       CheckMallocClear has_cb=0 has_bi=0 calls=0 val=1
                    ... (etc. etc.)

Unfortunately, in the latest Matlab R2014a, which was released last week, this nice feature has been removed:

>> list = feature('list')
Error using feature
Feature list not found

Luckily, the list can still be retrieved programmatically, using an undocumented MEX library function. Place the following in a file called feature_list.cpp:

#include "mex.h"
void svListFeatures(int, mxArray_tag** const, int, mxArray_tag** const);
 
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    svListFeatures(1,plhs,0,NULL);
}

Now compile this MEX file:

if isunix
   mex('feature_list.cpp',['-L',matlabroot,'/bin/',computer('arch')],'-lmwservices');
else
   mex('feature_list.cpp','-llibmwservices');
end

As you can see, all this MEX function does is just call the svListFeatures() function from the libmwservices dynamic/shared library.

We can now run this MEX function directly in Matlab:

>> list = feature_list
list = 
1x273 struct array with fields:
    name
    value
    has_callback
    has_builtin
    call_count

Running both the new feature_list and the previous feature('list') on previous Matlab releases produces exactly the same result, showing that under the hood, feature('list') was basically just calling libmwservices's svListFeatures() function.

There are 273 undocumented features in R2014a: 20 were added and 7 were removed compared to R2013b. For those interested, the modified features in the past two releases are:

R2013b:
1. ConvertAllDoublesToHandles (added)
2. DrawnowNowaitFeature (added)
3. getsimd (added)
4. CoreDump (removed)
R2014a:
1. GPUAllowPartialSharedDataCopies (added)
2. GPUAllowSharedDataCopies (added)
3. GetOpenGLData (added)
4. GetOpenGLInfo (added)
5. HGUpdateErrorChecking (added)
6. JVMShutdown (added)
7. LoadHG1FigFileWithHG1Defaults (added)
8. OpenGLBitmapZbufferBug (added)
9. OpenGLClippedImageBug (added)
10. OpenGLDockingBug (added)
11. OpenGLEraseModeBug (added)
12. OpenGLLineSmoothingBug (added)
13. OpenGLPolygonOffsetBiasBug (added)
14. OpenGLVerbose (added)
15. OpenGLWobbleTesselatorBug (added)
16. SaveFigFileWithHG1Defaults (added)
17. ShutdownReportLevel (added)
18. UseGenericOpenGL (added)
19. crash_mode (added)
20. isLightweightEval (added)
21. EnableBangErrors (removed)
22. EnableJavaAsGObject (removed)
23. List (removed) – this is actually the reason for today’s article!
24. hwtic (removed)
25. hwtoc (removed)
26. oldtictoc (removed)
27. timing (removed)

Happy digging!

p.s. – Packt publishing, which publishes IT books, has a great deal on ebooks until March 26 – buy 1 get 1 free. Some of their books are Matlab-related. Check it out!

Addendum July 18, 2016: Since R2015a, feature('list') returns an empty array. A reader of this blog, Mikhail P, reported the following workaround, by temporarily setting the MWE_INSTALL environment variable (that should then be reset, since it is used in other locations too):

% The following was run in R2016b
>> setenv('MWE_INSTALL','1'); list = feature('list'), setenv('MWE_INSTALL');
list = 
  1×266 struct array with fields:
    name
    value
    has_callback
    has_builtin
    call_count

Explicit multi-threading in Matlab part 3

Yair Altman — Wed, 05 Mar 2014 18:00:26 +0000

In the past weeks, I explained how we can start asynchronous threads that run in parallel to the main Matlab processing, using Java and .NET. Today I continue by examining C/C++ threads. This series will conclude next week, by discussing timer objects and process-spawning.

The alternatives that can be used to enable Matlab multithreading with C/C++ include standard POSIX threads, native OS threads, OpenMP, MPI (Message Passing Interface), TBB (Threading Building Blocks), Cilk, OpenACC, OpenCL or Boost. We can also use libraries targeting specific platforms/architectures: Intel MKL, C++ AMP, Bolt etc. Note that the Boost library is included in every relatively-modern Matlab release, so we can either use the built-in library (easier to deploy, consistency with Matlab), or download and install the latest version and use it separately. On Windows, we can also use .Net’s Thread class, as explained in last week’s article. This is a very wide range of alternatives, and it’s already been covered extensively elsewhere from the C/C++ side.

Today I will only discuss the POSIX alternative. The benefit of POSIX is that it is more-or-less cross-platform, enabling the same code to work on all MATLAB platforms, as well as any other POSIX-supported platform.

POSIX threads (Pthreads) is a standard API for multi-threaded programming implemented natively on many Unix-like systems, and also supported on Windows. Pthreads includes functionality for creating and managing threads, and provides a set of synchronization primitives such as mutexes, condition variables, semaphores, read/write locks, and barriers. POSIX has extensive offline and online documentation.

Note that POSIX is natively supported on Macs & Linux, but requires a separate installation on Windows. Two of the leading alternatives are Pthreads_Win32 (also works on Win64, despite its name…), and winpthreads (part of the extensive MinGW open-source project).

When creating a C/C++ -based function, we can either compile/link it into a dynamic/shared library (loadable into Matlab using the loadlibrary & calllib functions), or into a MEX file that can be called directly from M-code. The code looks the same, except that a MEX file has a gateway function named mexFunction that has a predefined interface. Today I’ll show the MEX variant using C; the adaptation to C++ is easy. To create multi-threaded MEX, all it takes is to connect the thread-enabled C/C++ code into our mexFunction(), provide the relevant threading library to the mex linker and we’re done.

The example code below continues the I/O example used throughout this series, of asynchronously saving a vector of data to disk file. Place the following in a file called myPosixThread.c:

#include "mex.h"
#include "pthread.h"
#include "stdio.h"
 
char *filename;
double *data;
size_t numElementsExpected;
size_t numElementsWritten;
 
/* thread compute function */ 
void *thread_run(void *p)
{
    /* Open the file for binary output */
    FILE *fp = fopen(filename, "wb");
    if (fp == NULL)
        mexErrMsgIdAndTxt("YMA:MexIO:errorOpeningFile", "Could not open file %s", filename);
 
    /* Write the data to file */
    numElementsWritten = (size_t) fwrite(data, sizeof(double), numElementsExpected, fp);
    fclose(fp);
 
    /* Ensure that the data was correctly written */
    if (numElementsWritten != numElementsExpected)
        mexErrMsgIdAndTxt("YMA:MexIO:errorWritingFile",
                "Error writing data to %s: wrote %lu, expected %lu\n", 
                filename, (unsigned long) numElementsWritten,
                (unsigned long) numElementsExpected);
 
    /* Cleanup */
    pthread_exit(NULL);
}
 
/* The MEX gateway function */
void mexFunction(int nlhs,       mxArray *plhs[],
                 int nrhs, const mxArray *prhs[])
{
    pthread_t thread;
 
    /* Check for proper number of input and output arguments */
    if (nrhs != 2)
        mexErrMsgIdAndTxt("YMA:MexIO:invalidNumInputs", "2 input args required: filename, data");
    if (nlhs > 0)
        mexErrMsgIdAndTxt("YMA:MexIO:maxlhs", "Too many output arguments");
    if (!mxIsChar(prhs[0]))
        mexErrMsgIdAndTxt("YMA:MexIO:invalidInput", "Input filename must be of type string");
    if (!mxIsDouble(prhs[1]))
        mexErrMsgIdAndTxt("YMA:MexIO:invalidInput", "Input data must be of type double");
 
    /* Get the inputs: filename & data */
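    filename = mxArrayToString(prhs[0]);
    mexMakeMemoryPersistent(filename);  /* the I/O thread may still need it after mexFunction() returns */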
    data = mxGetPr(prhs[1]);  
    numElementsExpected = mxGetNumberOfElements(prhs[1]);
 
    /* Launch a new I/O thread using default attributes */
    if (pthread_create(&thread, NULL, thread_run, NULL))
        mexErrMsgIdAndTxt("YMA:MexIO:threadFailed", "Thread creation failed");
}

This source file can be compiled as follows on Macs/Linux:

mex myPosixThread.c -lpthread

Or on Windows, assuming we installed Pthreads-Win32, we need to set-up the environment:

% prepare the environment (we could also use the -I, -L flags)
pthreadsInstallFolder = 'C:\Program Files\Pthreads_Win32\';  % change this as needed
setenv('PATH',   [getenv('PATH')    ';' pthreadsInstallFolder 'dll\x64']);
setenv('LIB',    [getenv('LIB')     ';' pthreadsInstallFolder 'lib\x64']);
setenv('INCLUDE',[getenv('INCLUDE') ';' pthreadsInstallFolder 'include']);
 
% create a 64-bit MEX that uses the pthreads DLL
mex myPosixThread.c -lpthreadVC2
 
% copy the pthreadVC2.dll file to be accessible to the MEX file, otherwise it will not run
copyfile([pthreadsInstallFolder 'dll\x64\pthreadVC2.dll'], '.')

To run the MEX file from MATLAB, we use the following code snippet (note the similarity with our Java/.Net examples earlier in this series):

addpath('C:\Yair\Code\');  % location of our myPosixThread MEX file
data = rand(5e6,1);  % pre-processing (5M elements, ~40MB)
myPosixThread('F:\test.data',data);  % start running in parallel
data2 = fft(data);  % post-processing (pthread I/O runs in parallel)

Note that we cannot directly modify the data (data=fft(data)) while it is being accessed by the I/O thread. This would cause the data to be reallocated elsewhere in memory, causing the I/O thread to access invalid (stale) memory – this would cause a segmentation violation causing Matlab to crash. Read-only access (data2=fft(data)) is ok, just ensure not to update the data. This was not a problem with our earlier Java/.Net threads, since they received their data by value, but mexFunction() receives its data by reference (which is quicker and saves memory, but also has its limitations). Alternatively, we can memcpy() the Matlab data to a newly-allocated memory block within our thread and only use the memcpy‘ed data from then on. This will ensure that if any updates to the original data occur, the parallel thread will not be affected and no SEGV will occur.
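
The memcpy() alternative can be sketched in plain C, independently of the MEX API. This is only an illustration under a few assumptions: WriteJob, write_job_run and start_async_write are hypothetical names, and the copy is made by the launching thread before the worker starts (rather than inside the worker), so that the worker never touches the caller's buffer at all.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Everything the worker needs; owned by the worker once it is launched */
typedef struct {
    char   *filename;     /* private copy of the file name */
    double *data;         /* private copy of the caller's data */
    size_t  numElements;
} WriteJob;

void free_job(WriteJob *job)
{
    if (job != NULL) {
        free(job->filename);
        free(job->data);
        free(job);
    }
}

/* Thread function: writes the private copy to disk, then releases it */
void *write_job_run(void *p)
{
    WriteJob *job = (WriteJob *) p;
    FILE *fp = fopen(job->filename, "wb");
    if (fp != NULL) {
        fwrite(job->data, sizeof(double), job->numElements, fp);
        fclose(fp);
    }
    free_job(job);
    return NULL;
}

/* Copies the inputs, then launches the worker. Returns 0 on success.
 * After this returns, the caller may freely modify or free `data`. */
int start_async_write(const char *filename, const double *data,
                      size_t numElements, pthread_t *thread)
{
    WriteJob *job = (WriteJob *) calloc(1, sizeof(WriteJob));
    if (job == NULL)
        return -1;

    job->numElements = numElements;
    job->filename = (char *) malloc(strlen(filename) + 1);
    job->data = (double *) malloc(numElements * sizeof(double));
    if (job->filename == NULL || job->data == NULL) {
        free_job(job);
        return -1;
    }
    strcpy(job->filename, filename);
    memcpy(job->data, data, numElements * sizeof(double));

    if (pthread_create(thread, NULL, write_job_run, job) != 0) {
        free_job(job);
        return -1;
    }
    return 0;
}
```

The price of this safety is a temporary doubling of the memory used by the data, which is the same trade-off as passing the data by value to the Java/.Net threads.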

Also note that we call a few MEX functions from within the parallel portion of our code (the thread_run() function). This works without problems on recent Matlab releases since some MEX API functions have been made thread-safe, however it might not work in earlier (or future) Matlab versions. Therefore, to make our code portable, it is recommended to not interact with Matlab at all during parallel blocks, or to protect MEX API calls by critical sections. Alternatively, only use MEX API calls in the main thread (which is actually MT), defined as those parts of code that run in the same thread as mexFunction(). Synchronization with other threads can be done using POSIX mechanisms such as pthread_join().
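
A minimal plain-C sketch of this recommendation follows. The worker reports its outcome through a struct instead of calling any MEX API function, and the launching thread reads that outcome only after pthread_join() returns. The names WriteTask, write_task_run and write_and_wait are illustrative assumptions; in a real MEX file, the launching thread would be the one to call mexErrMsgIdAndTxt() based on the returned status.

```c
#include <pthread.h>
#include <stdio.h>

/* Inputs and outputs of the worker. The worker reports problems through
 * `status` instead of calling any MEX API function. */
typedef struct {
    const char   *filename;
    const double *data;
    size_t        numElementsExpected;
    size_t        numElementsWritten;  /* set by the worker */
    int           status;  /* 0 = ok, 1 = open failed, 2 = short write */
} WriteTask;

/* Thread function: performs the I/O and records what happened */
void *write_task_run(void *p)
{
    WriteTask *task = (WriteTask *) p;
    FILE *fp = fopen(task->filename, "wb");
    task->numElementsWritten = 0;
    if (fp == NULL) {
        task->status = 1;
        return NULL;
    }
    task->numElementsWritten =
        fwrite(task->data, sizeof(double), task->numElementsExpected, fp);
    fclose(fp);
    task->status =
        (task->numElementsWritten == task->numElementsExpected) ? 0 : 2;
    return NULL;
}

/* Runs in the launching thread. Returns the worker's status, or -1 if
 * the thread could not be created or joined. */
int write_and_wait(WriteTask *task)
{
    pthread_t thread;
    if (pthread_create(&thread, NULL, write_task_run, task) != 0)
        return -1;

    /* ... other work could overlap with the I/O here ... */

    if (pthread_join(thread, NULL) != 0)
        return -1;
    return task->status;  /* safe to read: the worker has finished */
}
```

Joining inside the same call makes the whole operation synchronous from Matlab's point of view, so a real implementation would more likely keep the thread handle and join on a later call. The sketch only shows the hand-over of results between the threads.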

To complete the picture, it is also possible to use native threads (rather than POSIX) on Windows: The MEX file should #include <process.h> and call _beginthread(). In fact, since Microsoft for some reason decided to reinvent the wheel with its own native threads and not to support POSIX, all the POSIX implementations on Windows are basically a wrapper for the native Windows threads. Using these native threads directly often proves to be the fastest alternative. Unfortunately, code using native threads is not portable to Macs/Linux, unlike POSIX-based code.

Yuval Tassa’s excellent mmx utility (which deserves a detailed review in its own right!) employs both Pthreads (Mac/Linux) and Windows threads in its MEX file. Readers are encouraged to review mmx’s code to see the specifics.

Another related utility on the Matlab File Exchange is Thomas Weibel’s MexThread, which uses C++11's std::thread.

Addendum 2018-07-18: A followup post about creating and debugging multi-threaded C-based MEX functions was just posted.

Serializing/deserializing Matlab data

Yair Altman — Wed, 22 Jan 2014 19:57:49 +0000

Last year I wrote an article on improving the performance of the save function. The article discussed various ways by which we can store Matlab data on disk. However, in many cases we are interested in a byte-stream serialization, in order to transmit information to external processes.

The request to get a serialized byte-stream of Matlab data has been around for many years (example), but MathWorks has never released a documented way of serializing and unserializing data, except by storing onto a disk file and later loading it from file. Naturally, using a disk file significantly degrades performance. We could always use a RAM-disk or flash memory for improved performance, but in any case this seems like major overkill for such a simple requirement.

In last year’s article, I presented a File Exchange utility for such generic serialization/deserialization. However, that utility is limited in the types of data that it supports, and while it is relatively fast, there is a much better, more generic and faster solution.

The solution appears to use the undocumented built-in functions getByteStreamFromArray and getArrayFromByteStream, which are apparently used internally by the save and load functions. The usage is very simple:

byteStream = getByteStreamFromArray(anyData);  % 1xN uint8 array
anyData = getArrayFromByteStream(byteStream);

Many Matlab functions, documented and undocumented alike, are defined in XML files within the %matlabroot%/bin/registry/ folder; our specific functions can be found in %matlabroot%/bin/registry/hgbuiltins.xml. While other functions include information about their location and number of input/output args, these functions do not. Their only XML attribute is type = ":all:", which seems to indicate that they accept all data types as input. Despite the fact that the functions are defined in hgbuiltins.xml, they are not limited to HG objects – we can serialize basically any Matlab data: structs, class objects, numeric/cell arrays, sparse data, Java handles, timers, etc. For example:

% Simple Matlab data
>> byteStream = getByteStreamFromArray(pi)  % 1x72 uint8 array
byteStream =
  Columns 1 through 19
    0    1   73   77    0    0    0    0   14    0    0    0   56    0    0    0    6    0    0
  Columns 20 through 38
    0    8    0    0    0    6    0    0    0    0    0    0    0    5    0    0    0    8    0
  Columns 39 through 57
    0    0    1    0    0    0    1    0    0    0    1    0    0    0    0    0    0    0    9
  Columns 58 through 72
    0    0    0    8    0    0    0   24   45   68   84  251   33    9   64
 
>> getArrayFromByteStream(byteStream)
ans =
          3.14159265358979
 
% A cell array of several data types
>> byteStream = getByteStreamFromArray({pi, 'abc', struct('a',5)});  % 1x312 uint8 array
>> getArrayFromByteStream(byteStream)
ans = 
    [3.14159265358979]    'abc'    [1x1 struct]
 
% A Java object
>> byteStream = getByteStreamFromArray(java.awt.Color.red);  % 1x408 uint8 array
>> getArrayFromByteStream(byteStream)
ans =
java.awt.Color[r=255,g=0,b=0]
 
% A Matlab timer
>> byteStream = getByteStreamFromArray(timer);  % 1x2160 uint8 array
>> getArrayFromByteStream(byteStream)
 
   Timer Object: timer-2
 
   Timer Settings
      ExecutionMode: singleShot
             Period: 1
           BusyMode: drop
            Running: off
 
   Callbacks
           TimerFcn: ''
           ErrorFcn: ''
           StartFcn: ''
            StopFcn: ''
 
% A Matlab class object
>> byteStream = getByteStreamFromArray(matlab.System);  % 1x1760 uint8 array
>> getArrayFromByteStream(byteStream)
ans = 
  System: matlab.System
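
One detail of the pi example above can be checked by hand: the last eight bytes of the 72-byte stream (24 45 68 84 251 33 9 64) are the IEEE-754 double-precision bit pattern of pi, 0x400921FB54442D18, stored least-significant byte first. The plain-C sketch below decodes them. The helper name decode_le_double is illustrative, and the snippet makes no claim about the other 64 bytes, whose layout is undocumented.

```c
#include <stdint.h>
#include <string.h>

/* Reassembles 8 little-endian bytes into a double, independent of the
 * host's byte order (assumes the host uses IEEE-754 doubles). */
double decode_le_double(const unsigned char bytes[8])
{
    uint64_t bits = 0;
    double value;
    int i;
    for (i = 7; i >= 0; i--)
        bits = (bits << 8) | bytes[i];
    memcpy(&value, &bits, sizeof(value));  /* reinterpret the bit pattern */
    return value;
}

/* The final 8 bytes of the byte stream shown above for pi */
const unsigned char piTail[8] = { 24, 45, 68, 84, 251, 33, 9, 64 };
```

Calling decode_le_double(piTail) recovers the same value of pi that getArrayFromByteStream returned above.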

Serializing HG objects

Of course, we can also serialize/deserialize HG controls, plots/axes and even entire figures. When doing so, it is important to serialize the handle object (e.g., handle(hFig)), rather than its numeric handle, since we are interested in serializing the graphic object, not the scalar numeric value of the handle:

% Serializing a simple figure with toolbar and menubar takes almost 0.5 MB !
>> hFig = handle(figure);  % a new default Matlab figure
>> length(getByteStreamFromArray(hFig))
ans =
      479128
 
% Removing the menubar and toolbar removes much of this amount:
>> set(hFig, 'menuBar','none', 'toolbar','none')
>> length(getByteStreamFromArray(hFig))
ans =
       11848   %!!!
 
% Plot lines are not nearly as "expensive" as the toolbar/menubar
>> x=0:.01:5; hp=plot(x,sin(x));
>> byteStream = getByteStreamFromArray(hFig);
>> length(byteStream)
ans =
       33088
 
>> delete(hFig);
>> hFig2 = getArrayFromByteStream(byteStream)
hFig2 =
	figure

The interesting thing here is that when we deserialize a byte-stream of an HG object, it is automatically rendered onscreen. This could be very useful for persistence mechanisms of GUI applications. For example, we can save the serialized figure in a file so that if the application crashes and relaunches, it simply loads the file and we get exactly the same GUI state, complete with graphs and what-not, just as before the crash. Although the figure was deleted in the last example, deserializing the data caused the figure to reappear.

We do not need to serialize the entire figure. Instead, we could choose to serialize only a specific plot line or axes. For example:

>> x=0:0.01:5; hp=plot(x,sin(x));
>> byteStream = getByteStreamFromArray(handle(hp));  % 1x13080 uint8 array
>> hLine = getArrayFromByteStream(byteStream)
hLine =
	graph2d.lineseries

This could also be used to easily clone (copy) any figure or other HG object, by simply calling getArrayFromByteStream (note the corresponding copyobj function, which I bet uses the same underlying mechanism).

Also note that unlike HG objects, deserialized timers are NOT automatically restarted; perhaps the Running property is labeled transient or dependent. Properties defined with these attributes are apparently not serialized.

Performance aspects

Using the builtin getByteStreamFromArray and getArrayFromByteStream functions can provide significant performance speedups when caching Matlab data. In fact, it could be used to store otherwise unsupported objects using the save -v6 or savefast alternatives, which I discussed in my save performance article. Robin Ince has shown how this can be used to reduce the combined caching/uncaching run-time from 115 secs with plain-vanilla save, to just 11 secs using savefast. Robin hasn’t tested this in his post, but since the serialized data is a simple uint8 array, it is intrinsically supported by the save -v6 option, which is the fastest alternative of all:

>> byteStream = getByteStreamFromArray(hFig);
>> tic, save('test.mat','-v6','byteStream'); toc
Elapsed time is 0.001924 seconds.
 
>> load('test.mat')
>> data = load('test.mat')
data = 
    byteStream: [1x33256 uint8]
>> getArrayFromByteStream(data.byteStream)
ans =
	figure

Moreover, we can now use java.util.Hashtable to store a cache map of any Matlab data, rather than use the much slower and more limited containers.Map class provided in Matlab.

Finally, note that as built-in functions, these functions could change without prior notice on any future Matlab release.

MEX interface – mxSerialize/mxDeserialize

To complete the picture, MEX includes a couple of undocumented functions mxSerialize and mxDeserialize, which correspond to the above functions. getByteStreamFromArray and getArrayFromByteStream apparently call them internally, since they provide the same results. Back in 2007, Brad Phelan wrote a MEX wrapper that could be used directly in Matlab (mxSerialize.c, mxDeserialize.c). The C interface was very simple, and so was the usage:

#include "mex.h"
 
EXTERN_C mxArray* mxSerialize(mxArray const *);
EXTERN_C mxArray* mxDeserialize(const void *, size_t);
 
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if (nlhs && nrhs) {
          plhs[0] = (mxArray *) mxSerialize(prhs[0]);
        //plhs[0] = (mxArray *) mxDeserialize(mxGetData(prhs[0]), mxGetNumberOfElements(prhs[0]));
    }
}

Unfortunately, MathWorks has removed the C interface for these functions from libmx in R2014a, keeping only their C++ interfaces:

mxArray* matrix::detail::noninlined::mx_array_api::mxSerialize(mxArray const *anyData)
mxArray* matrix::detail::noninlined::mx_array_api::mxDeserialize(void const *byteStream, unsigned __int64 numberOfBytes)
mxArray* matrix::detail::noninlined::mx_array_api::mxDeserializeWithTag(void const *byteStream, unsigned __int64 numberOfBytes, char const* *tagName)

These are not the only MEX functions that were removed from libmx in R2014a. Hundreds of other C functions were also removed with them, some of them quite important (e.g., mxCreateSharedDataCopy). A few hundred new C++ functions were added in their place, but I fear that these are not accessible to MEX users without a code change (see below). libmx has always changed between Matlab releases, but not so drastically for many years. If you rely on any undocumented MEX functions in your code, now would be a good time to recheck it, before R2014a is officially released.

Thanks to Bastian Ebeling, we can still use these interfaces in our MEX code by simply renaming the MEX file from .c to .cpp and modifying the code as follows:

#include "mex.h"
 
// MX_API_VER has unfortunately not changed between R2013b and R2014a,
// so we use the new MATRIX_DLL_EXPORT_SYM as an ugly hack instead
#if defined(__cplusplus) && defined(MATRIX_DLL_EXPORT_SYM)
    #define EXTERN_C extern
    namespace matrix{ namespace detail{ namespace noninlined{ namespace mx_array_api{
#endif
 
EXTERN_C mxArray* mxSerialize(mxArray const *);
EXTERN_C mxArray* mxDeserialize(const void *, size_t);
// and so on, for any other MEX C functions that migrated to C++ in R2014a
 
#if defined(__cplusplus) && defined(MATRIX_DLL_EXPORT_SYM)
    }}}}
    using namespace matrix::detail::noninlined::mx_array_api;
#endif
 
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    if (nlhs && nrhs) {
        plhs[0] = (mxArray *) mxSerialize(prhs[0]);
      //plhs[0] = (mxArray *) mxDeserialize(mxGetData(prhs[0]), mxGetNumberOfElements(prhs[0]));
    }
}

Unfortunately, pre-R2014a code cannot coexist with R2014a code (since libmx is different), so separate MEX files need to be used depending on the Matlab version being used. This highlights the risk of using such unsupported functions.

The roundabout alternative is of course to use mexCallMATLAB to invoke getByteStreamFromArray and getArrayFromByteStream. This is actually rather silly, but it works…

p.s. – Happy 30th anniversary, MathWorks!

Addendum March 9, 2014

Now that the official R2014a has been released, I am happy to report that most of the important MEX functions that were removed in the pre-release have been restored in the official release. These include mxCreateSharedDataCopy, mxFastZeros, mxCreateUninitDoubleMatrix, mxCreateUninitNumericArray, mxCreateUninitNumericMatrix and mxGetPropertyShared. Unfortunately, mxSerialize and mxDeserialize remain among the functions that were left out, which is a real pity considering their usefulness, but we can use one of the workarounds mentioned above. At least those functions that were critical for in-place data manipulation and improved MATLAB performance have been restored, perhaps in some part due to lobbying by yours truly and by others.

MathWorks should be commended for their meaningful dialog with users and for making the fixes in such a short turn-around before the official release, despite the fact that they belong to the undocumented netherworld. MathWorks may appear superficially to be like any other corporate monolith, but when you scratch the surface you discover that there are people there who really care about users, not just the corporate bottom line. I must say that I really like this aspect of their corporate culture.

Accessing private object properties

Yair Altman — Wed, 18 Dec 2013 18:00:00 +0000

Some time ago, I needed to modify a property value of a class object. The problem was that this property was declared as private and for some reason my client could not modify the originating classdef to make this property accessible.

Problem definition

We start with a very simple class definition:

Inaccessible private property (or is it?)

classdef MyClass
    properties (SetAccess=private)  %GetAccess = public
        y
    end
    properties (Access=private)  %GetAccess = SetAccess = private
        x
    end
 
    methods
        function obj = MyClass(x,y)  % constructor
            if nargin>0, obj.x = x; end
            if nargin>1, obj.y = y; end
        end
 
        function result = isPropEqualTo(obj,propName,value)
            result = (obj.(propName)==value);
        end
    end
end

The problem is simple: we need to both get and set the value of inaccessible private properties x,y. But following object construction, MyClass enables direct read access only to property y, and write access to neither of its properties:

>> obj = MyClass(3,4)
obj = 
  MyClass with properties:
 
    y: 4
 
>> obj.x
You cannot get the 'x' property of MyClass. 
 
>> obj.x=5
You cannot set the 'x' property of MyClass. 
 
>> obj.y=5
You cannot set the read-only property 'y' of MyClass.

A dead end, would you say? – Well, it never stopped us before, has it? After all, is it not the raison-d’être of this blog?

Reading private properties

Getting the value of x is simple enough when we recall that calling Matlab’s struct function on a class object reveals all its hidden treasures. I wrote about this a couple of years ago, and I’m not sure how many people realize the power of this undocumented feature:

>> s = struct(obj)
Warning: Calling STRUCT on an object prevents the object from hiding its implementation details and should thus be avoided.
Use DISP or DISPLAY to see the visible public details of an object. See 'help struct' for more information.
(Type "warning off MATLAB:structOnObject" to suppress this warning.)
 
s = 
    y: 4
    x: 3

As the warning mentions, we should not do this often (bad, bad boy!). If we must (I promise I had a good reason, ma!), then we can simply turn off the nagging warning:

warning off MATLAB:structOnObject

We can now read all the private internal properties of the object. Yummy!

Setting private properties

The natural attempt would now be to update the struct’s fields with new values. Unfortunately, this does not affect the original class properties, since our struct is merely a copy of the original. Even if our original object is a handle class, the struct would still be a shallow copy and not a real reference to the object data.

Mex’s standard mxGetProperty cannot be used on the original object, because mxGetProperty returns a copy of the property (not the original reference – probably to prevent exactly what I’m describing here…), and in any case it can’t access private properties. mxSetProperty is a dead-end for similar reasons.

The core idea behind the solution is Matlab’s Copy-on-Write mechanism (COW). This basically means that when our struct is created, the field values actually hold references (pointers) to the original object properties. It is only when trying to modify the struct fields that COW kicks in and a real copy is made. This is done automatically and we do not have any control over it. However, we can use this information to our advantage by retrieving the field references (pointers) before COW has a chance to ruin them. Once we have the reference to the private data, we can modify the data in-place using a bit of Mex.
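To make the copy-on-write idea concrete, here is a minimal sketch in plain C. It is not Matlab’s actual implementation: the SharedArray type and the sa_* functions are hypothetical names invented for this illustration, and error checking is omitted for brevity. It only shows the general technique of sharing a data buffer until the first write, and how writing through the shared pointer bypasses the copy:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration only; this is NOT Matlab's real mxArray layout. */
typedef struct {
    double *data;      /* payload, possibly shared with other variables */
    size_t  n;         /* number of elements */
    int    *refcount;  /* how many variables currently share `data` */
} SharedArray;

/* Create a new array of n zeros, owned by a single variable. */
SharedArray sa_create(size_t n) {
    SharedArray a;
    a.data = calloc(n, sizeof(double));
    a.n = n;
    a.refcount = malloc(sizeof(int));
    *a.refcount = 1;
    return a;
}

/* "b = a": no data is copied, both variables point at the same payload. */
SharedArray sa_assign(SharedArray a) {
    ++*a.refcount;
    return a;
}

/* "b(i) = v": a private copy is made first if the payload is shared. */
void sa_write(SharedArray *a, size_t i, double v) {
    if (*a->refcount > 1) {
        double *copy = malloc(a->n * sizeof(double));
        memcpy(copy, a->data, a->n * sizeof(double));
        --*a->refcount;                 /* detach from the shared payload */
        a->data = copy;
        a->refcount = malloc(sizeof(int));
        *a->refcount = 1;
    }
    a->data[i] = v;
}

/* In-place write: goes through the shared pointer and skips the copy. */
void sa_write_inplace(SharedArray *a, size_t i, double v) {
    a->data[i] = v;
}

/* Drop one reference; free the payload when the last owner goes away. */
void sa_release(SharedArray *a) {
    if (--*a->refcount == 0) {
        free(a->data);
        free(a->refcount);
    }
    a->data = NULL;
    a->refcount = NULL;
}
```

With a = sa_create(3) and b = sa_assign(a), a call to sa_write(&b, ...) leaves a untouched, whereas sa_write_inplace(&b, ...) also changes the data seen through a. The latter effect is what the Mex trick relies on.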

So the trick is to get the reference address (pointer) of s.x and s.y. How do we do that?

We can use another trick here, which is a corollary to the COW mechanism: when we pass s.x into a function, it is not a data copy that is being passed (by value), but rather its pointer (by reference). So we can simply get this pointer in our Mex function and use it to modify the original property value. Easy enough, right?

Not so fast. Don’t forget that s.x is merely a reference copy of the original property data. If we modify s.x’s reference we’re just killing the so-called cross-link of the shared-array. What we need to do (more or less) is to traverse this cross-link back to its source, to get the real reference to the data.

Sounds complicated? Well, it is a bit. Luckily, Mex guru James (Jim) Tursa comes to the rescue with his mxGetPropertyPtr function on the File Exchange, which does all that for us. Once we have it compiled (the utility automatically Mex-compiles itself on first use), we can use it as follows (note the line that calls mxGetPropertyPtr):

/* Place in mxMySetProperty.c and mex-compile*/
#include "mex.h"
#include "mxGetPropertyPtr.c"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    mxArray *x;
    register double *xpr;
    int buflen;
    char *propName = "x";
    double newValue = -3.14159265358979;
 
    if ( nrhs > 1 ) {
        /* Get the specified property name (or "x" if not specified) */
        buflen = mxGetNumberOfElements(prhs[1]) + 1;
        propName = mxCalloc(buflen, sizeof(char));
        mxGetString(prhs[1], propName, buflen);
    }
 
    /* Get the pointer to the original property data */
    x = mxGetPropertyPtr(prhs[0],0,propName);
    if ( !x ) {
        mexErrMsgTxt("Failed to get pointer to property.");
    }
 
    /* Display the existing property value */
    xpr = mxGetPr(x);
    mexPrintf("existing value of property %s = %f\n", propName, *xpr);
 
    /* Update the property with the new value */
    if ( nrhs > 2 ) {
        /* Get the specified new value (or -pi if not specified) */
        double *pr = mxGetPr(prhs[2]);
        newValue = *pr;
    }
    mexPrintf("setting value of property %s to %f\n", propName, newValue);
    *xpr = newValue;
}

Naturally, this simplistic Mex function should also be made to accept non-scalar values. This is left as an exercise to the reader.

The usage in Matlab of this mxMySetProperty function is super simple:

% Update obj.x from 3 => pi/2
>> mxMySetProperty(s,'x',pi/2);
existing value of property x = 3.000000
setting value of property x to 1.570796
 
% Update obj.y from 4 => -5
>> mxMySetProperty(s,'y',-5);  % here we can also use obj instead of s since obj.y is accessible
existing value of property y = 4.000000
setting value of property y to -5.000000
 
% Check that the struct copy has been updated correctly
>> s
s = 
    y: -5
    x: 1.5707963267949
 
% Check that the original object's private properties have been updated correctly
>> obj
obj = 
  MyClass with properties:
 
    y: -5
 
>> obj.isPropEqualTo('x',pi/2)
ans =
    1     % ==true

Jim Tursa has promised to supply an mxSetPropertyPtr variant of his mxGetPropertyPtr for the past two years. It will surely be more robust than my simplistic mxMySetProperty function, so I look forward to finally seeing it on FEX!

Conclusion

With some dirty tricks and undocumented hacks, we can both get and set private-access object properties. Please don’t do this unless you have a really good reason (such as a customer breathing down your neck, who doesn’t give a fig that his properties were declared private…).

The mechanism shown above can also be used to improve performance when updating public object properties, since it updates the data in-place rather than create a copy. This could be significant when the property size is very large (multi-MB), since it avoids unnecessary memory allocation and deallocation. You might think that with public properties we could use the standard mxGetProperty for this, but as I said above this function apparently returns a copy of the data, not a direct reference. Also note that last month I discussed additional performance aspects of accessing object properties.

This blog will now take a short break for the holidays. I hope you had a good ride this year, see you again on the other side of 2013.

Merry Christmas and happy New-Year everybody!

Undocumented Matlab MEX API

Yair Altman — Wed, 24 Jul 2013 18:00:30 +0000

I would like to welcome guest blogger Pavel Holoborodko of Advanpix, maker of the Multiprecision Computing Toolbox. Today, Pavel discusses undocumented MEX functions that he encountered in his work to improve the toolbox performance.

It is not a secret that Matlab has many undocumented (or deliberately hidden) features and commands. After all, Yair’s website & book are devoted specifically to this topic.

However, most of these findings relate to the Matlab language itself, and investigations of the undocumented MEX API seem to be scarce.

During the development of our toolbox we found many hidden functions that can help in creating speed-efficient Matlab extensions in native languages.

Here we want to explain some of them in detail and provide a complete list of undocumented MEX API functions.

Please note that there is a risk in using these functions – MathWorks can change or remove some of them in future versions. It is an additional burden for developers to stay tuned and update their toolboxes on time.

Reduced OOP capabilities of Matlab

Starting with R2008a, Matlab allows users to define custom classes using the classdef keyword. Matlab was late in adding object-oriented features – I can only imagine how hard it was for developers at MathWorks to add OOP constructs to a purely procedural language, which follows an entirely different philosophy. (Yes, objects could be created using structs and a special folder structure in earlier versions of Matlab – but that was just ugly design, and MathWorks will not support it in the future).

They still don’t have full support for OOP features though. The most important missing features are:

It is prohibited to have a destructor for custom non-handle classes
It is not possible to overload assignment-without-subscripts operator (e.g. A = B)

I don’t know the reasons why these fundamental OOP paradigms are not implemented in Matlab – but they surely prevent creating powerful virtual machine-type of toolboxes.

With these features, Matlab objects could have just one property field – ‘id’, an identifier of the variable stored in the MEX module, which would act as a virtual machine (e.g. a pointer to a C/C++ object). The MEX module would only need to know the ‘id’ of the objects and which operation to perform on them (+, -, *, etc.) – all processing would be done in MEX. Heavy data exchange between Matlab and MEX libraries would be completely unnecessary. Matlab would act as just an interpreter in such a scenario. Moreover, the MEX API could be simplified to only a few functions.

Access to object properties from MEX

Unfortunately we are restricted to the current architecture, in which all the data are allocated and stored on the Matlab side and must be transferred from Matlab to the MEX library in order to work with them. The official MEX API provides two functions to access object properties from within a MEX library: mxGetProperty and mxSetProperty.

Both functions share the same problem – they create a deep copy of the data!

Imagine a situation in which your object is a huge matrix with high-precision elements that occupies 800MB of RAM. If we want to access it in a MEX library (e.g. to transpose it), we would call mxGetProperty, which makes an ENTIRE COPY of the object’s property – wasting another 800MB!

Obviously this is unacceptable, not to mention the degraded performance (copying such an amount of data also takes a while).

In search of a remedy we found similar, but undocumented, functions that we can use to get shared access to object properties (32-bit):

extern mxArray* mxGetPropertyShared(const mxArray* pa, 
                                    unsigned int index, 
                                    const char * propname);
 
extern void mxSetPropertyShared(mxArray* pa, 
                                unsigned int index, 
                                const char * propname, 
                                const mxArray* value);

These functions can be used as one-to-one replacements for the official functions: mxGetPropertyShared just returns a pointer to the existing property without any copying; mxDestroyArray can still be called on the returned pointer (thanks to James Tursa for the correction).

Full list of MEX API functions

We have extracted the full list of usable MEX functions from libmx.dll and libmex.dll (Matlab R2012b Windows 32-bit) – the two main MEX API dynamic libraries. It includes functions from the official API as well as undocumented ones (which are in fact the majority):

libmx (856 exported functions)
libmex (44 exported functions)

The distinctive feature of the list – it provides de-mangled C++ names of functions, with type of arguments and return value. This makes usage of undocumented functions much easier. You can also easily see this list using tools such as DLL Export Viewer or the well-known Dependency Walker. This list was not previously published to the knowledge of this author; a much shorter unannotated list was posted by Yair in 2011.

Take a look at the function list – there are a lot of interesting ones, like mxEye, mxIsInf, mxFastZeros, mxGetReferenceCount and many others.

Moreover, it is possible to see the high-level C++ classes that MathWorks developers use in their work. Peter Li has posted an article about mxArray_tag’s evolution and internal structure in 2012. Now it is clear that the fundamental data type mxArray_tag is not a plain-old-struct anymore, it has member-functions and behaves more like a class. It even has custom new/delete heap management functions and overloaded assignment operator=. Reverse-engineering of these functions might reveal the exact & complete data structure of mxArray_tag, but perhaps this is not allowed by the Matlab license agreement.

Anyway, with some effort, the internal mxArray_tag class from MathWorks might be used in third-party MEX files. How much easier this would be than the clumsy mx**** functions!

Please feel free to leave your requests or comments below.

Note: This article was originally posted on the Advanpix blog; reposted here with kind permission and a few changes.

Parsing mlint (Code Analyzer) output

Yair Altman — Wed, 10 Apr 2013 18:00:28 +0000

Mlint, Matlab’s static code-analysis parser, was written by Stephen Johnson (the original developer of the enormously successful lint parser for C/C++ back in 1977), when he was lured by MathWorks in 2002 to develop a similar tool for Matlab. Since its development (in R14 I believe), and especially since its incorporation in Matlab’s Editor in R2006a (Matlab 7.2), mlint has become a very important tool for reporting potential problems in m-files.

Unfortunately, to this day (R2013a), there is no documented manner of programmatically separating mlint warnings and errors, nor for accessing any of the multitude of features that are readily available in mlint. Naturally, there is (and has always been) an undocumented back door.

From its earliest beginnings, mlint has relied on C code (presumably modeled after lint). For many years mlint relied on a mex file (%matlabroot%/toolbox/matlab/codetools/mlintmex.mex*), which is basically just a wrapper for mlint.dll where the core algorithm resides. In recent releases, mlintmex, just like many other core mex files, was ported into a core Matlab library (libmwbuiltins.dll on Windows). However, the name and interface of the mlintmex function have remained unchanged over the years. Wrapping the core mlintmex function is the mlint m-function (%matlabroot%/toolbox/matlab/codetools/mlint.m) that calls mlintmex internally. In R2011b (Matlab 7.13) its official function name has changed to checkcode, although this was never documented in the release notes for some reason. However, using mlint still works even today. Wrapping all that is the mlintrpt function, which calls mlint/checkcode internally.

The core function mlintmex returns a long string with embedded newlines to separate the messages. For example:

>> str = mlintmex('perfTest.m')
str = 
L 3 (C 1): The value assigned to variable 'A' might be unused.
L 4 (C 1): The value assigned to variable 'B' might be unused.
L 5 (C 1-3): Variable 'ops', apparently a structure, is changed but the value seems to be unused.
L 12 (C 9): This statement (and possibly following ones) cannot be reached.
L 53 (C 19-25): The function 'subFunc' might be unused.
L 53 (C 27-35): Input argument 'iteration' might be unused. If this is OK, consider replacing it by ~.

We can parse this long string ourselves, but there is no need since mlint/checkcode do this for us, returning a struct array:

>> results = mlint('perfTest.m')
results = 
6x1 struct array with fields:
    message
    line
    column
    fix
>> results(5)
ans = 
    message: 'The function 'subFunc' might be unused.'
       line: 53
     column: [19 25]
        fix: 0

As can be seen, the message severity (warning/error) does not appear. This severity is obviously available since it is integrated in the Editor and the Code Analyzer report – orange for warnings, red for errors.
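Returning to the raw mlintmex string shown earlier: if it ever needs to be parsed by hand (for example, in C code that receives it), each line can be split with sscanf. The following is only a sketch: parse_mlint_line is a hypothetical helper, and it assumes that every message line follows the "L <line> (C <col>[-<col>]): <text>" layout seen in the sample output, which is not a documented format:

```c
#include <stdio.h>

/* Hypothetical helper, not part of any Matlab API.
 * Parses one message line such as
 *     "L 5 (C 1-3): Variable 'ops' ..."   (column range)
 *     "L 12 (C 9): This statement ..."    (single column)
 * On success returns 1 and fills in the line number, the column range and a
 * pointer to the start of the message text inside s; otherwise returns 0. */
int parse_mlint_line(const char *s, int *line, int *col_start, int *col_end,
                     const char **message)
{
    int consumed = 0;

    /* First try the "C start-end" form; %n records how many chars matched. */
    if (sscanf(s, "L %d (C %d-%d):%n", line, col_start, col_end, &consumed) != 3
        || consumed == 0) {
        /* Fall back to the single-column form. */
        consumed = 0;
        if (sscanf(s, "L %d (C %d):%n", line, col_start, &consumed) != 2
            || consumed == 0) {
            return 0;
        }
        *col_end = *col_start;
    }

    /* Skip the blanks that separate the colon from the message text. */
    s += consumed;
    while (*s == ' ') {
        s++;
    }
    *message = s;
    return 1;
}
```

The full string would first be split at its newline characters, with each piece passed to this function. Note that the parsed result still carries no severity information.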

In one of my projects I needed to enable the user to dynamically create executable Matlab code that would then be run interactively. This enabled users to create dynamic data-analysis functions without actually needing to know Matlab or to code all the nuts-and-bolts of a regular Matlab function. For this I needed to display warnings and errors on-the-fly (the dynamic cell tooltips used a custom table cell-renderer). Here’s the end-result:

Analysis definition panel

Dynamic analysis alert tooltips

My solution was to use mlintmex, as follows:

% Get the relevant message strings
errMsgs = mlintmex('-m2', srcFileName);
allMsgs = mlintmex('-m0', srcFileName);
 
% Parse the strings to find newline characters
numErrors = length(strfind(regexprep(errMsgs,'\*\*\*.*',''),char(10)));
numAllMsg = length(strfind(regexprep(allMsgs,'\*\*\*.*',''),char(10)));
numWarns = numAllMsg - numErrors;

(and from the messages themselves [errMsgs,allMsgs] I extracted the actual error/warning location)

Alternatively, I could have used mlint directly, as I have recently explained:

% Note that mlint returns struct arrays, so the following are all structs, not strings
errMsgs = mlint('-m2',srcFileNames); % m2 = errors only
m1Msgs  = mlint('-m1',srcFileNames); % m1 = errors and severe warnings only
allMsgs = mlint('-m0',srcFileNames); % m0 = all errors and warnings

The original information about mlintmex and the undocumented -m0/m1/m2 options came from Urs (us) Schwarz, whose contributions are an endless source of such gems. Urs also provided a list of other undocumented mlint options (the comment annotations are mostly mine):

'-all'        % ???
'-allmsg'     % display the full list of possible mlint messages and their codes
'-amb'        % display all possibly-ambiguous identifiers (variable/function)
'-body'       % ???
'-callops'    % display the internal call tree, with nesting levels and function types
'-calls'      % (looks similar to -callops, not sure what the difference is)
'-com'        % ???
'-cyc'        % display McCabe complexity value of all functions in the analyzed file
% '-db'       % == -set + -ud + -tab
'-dty'        % debug info for the mlint parsing tree
'-edit'       % display all encountered identifiers and their assumed types
'-en'         % messages in English
'-id'         % display the mlint code associated with each message
'-ja'         % messages in Japanese
'-lex'        % display the LEX parse-tree for the analyzed file
'-m0'         % + other opt
'-m1'         % + other opt
'-m2'         % + other opt
'-m3'         % + other opt
'-mess'       % debug info for mlint message-reporting (start/end locations etc.)
'-msg'        % (looks similar to -allmsg above, not sure what the difference is)
'-notok'      % disregard %#ok directives and report messages on lines having them
'-pf'         % ???
'-set'        % debug info for the mlint parsing tree
'-spmd'       % ??? (presumably display SPMD-related messages)
'-stmt'       % display the number of statements in each function within the analyzed file
'-tab'        % set-by/used-by table for all identifiers (see -edit)
'-tmtree'     % not valid anymore
'-tmw'        % not valid anymore
'-toks'       % ???
'-tree'       % debug info for the mlint parsing tree
'-ty'         % display the line numbers where each of the file's identifiers are used
'-ud'         % debug info for the mlint parsing tree
'-yacc'       % ONLY: !mlint FILE -yacc -...

To these were added in recent years ‘-eml’, ‘-codegen’ etc. – see the checkcode doc page. Also note that not all Matlab releases support all options. For example, ‘-tmw’ is ignored in R2013a, returning the same data as ‘-all’ plus a warning about the ignored option.

Urs prepared a short utility called doli that accepts an m-file name and returns a struct whose fields are the respective outputs of mlint for each of the corresponding options:

>> results = doli('perfTest.m')
MLINT >   C:\Yair\Books\MATLAB Performance Tuning\Code\perfTest.m
OPTION>   -all       6
OPTION>   -allmsg    501
OPTION>   -amb       17
OPTION>   -body      6
OPTION>   -callops   15
OPTION>   -calls     15
OPTION>   -com       6
OPTION>   -cyc       8
OPTION>   -dty       162
OPTION>   -edit      92
OPTION>   -en        7
...

Some of these options are used by Urs’ farg and fdep utilities. Their use of mlint, rather than direct m-code parsing, is part of the reason that these functions are so lightning-fast.

For example, we can use the ‘-calls’ option to parse an m-file and get the names, types, and code locations of its contained functions:

>> mlint('-calls','perfTest.m')
M0 1 10 perfTest
E0 51 3 perfTest
U1 3 5 randi
U1 4 5 num2cell
U1 4 14 randn
U1 6 1 whos
U1 7 1 tic
U1 7 6 save
U1 7 45 toc
U1 9 6 savefast
S0 53 19 subFunc
E0 60 3 subFunc
U1 55 8 isempty
U1 56 20 load
U1 57 29 sin

With so many useful features, I really cannot understand why they were never exposed to the public in a documented manner. After all, they have remained pretty-much unchanged for many years and can provide enormous benefits for developers of unit-tests and interactive analysis frameworks (as I have shown above).

As a side-note, in R2010a (Matlab 7.10), mlint was renamed “Code Analyzer”, but this was really just a name change – its core functionality has changed little in the past decade. Some might argue that new checks were added and the Editor interface has improved by allowing auto-fixes and message suppression. But for a tool that is over a decade old (much more, if you count lint’s development), I contend that these are not much. Don’t get me wrong – I have the utmost respect for Steve. Serious unix C/C++ development relies on his lint and yacc tools on a regular basis. I think they show astonishing ingenuity and intelligence. It’s just that I had expected more after a decade of mlint development (I bet it’s not due to Steve suddenly losing the touch).

Addendum: A little birdie tells me that Steve left MathWorks a few years ago, which does explain things… I apologize to Steve for any misguided snide remarks on my part. As I said above, I have nothing but the utmost respect for his work. The question of why MathWorks left his mlint work hanging without serious continuation remains open.

Addendum 2: Additional and much more detailed information about the nature of functions can be found using the semi-documented mtree function (or rather, Matlab class: %matlabroot%/toolbox/matlab/codetools/@mtree/mtree.m). This is a huge class-file (3200+ lines of code) that is well worth a dedicated future article, so stay tuned…

Matlab’s internal memory representation

Yair Altman — Thu, 15 Mar 2012 18:11:23 +0000

Once again I’d like to welcome guest blogger Peter Li. Peter wrote about Matlab Mex in-place editing last month. Today, Peter pokes around in Matlab’s internal memory representation to the greater good and glory of Matlab Mex programming.

Disclaimer: The information in this article is provided for informational purposes only. Be aware that poking into Matlab’s internals is not condoned or supported by MathWorks, and is not recommended for any regular usage. Poking into memory has the potential to crash your computer so save your data! Moreover, be advised (as the text below will show) that the information is highly prone to change without any advance notice in future Matlab releases, which could lead to very adverse effects on any program that relies on it. On the scale of undocumented Matlab topics, this practically breaks the scale, so be EXTREMELY careful when using this.

A few weeks ago I discussed Matlab’s copy-on-write mechanism as part of my discussion of editing Matlab arrays in-place. Today I want to explore some behind-the-scenes details of how the copy-on-write mechanism is implemented. In the process we will learn a little about Matlab’s internal array representation. I will also introduce some simple tools you can use to explore more of Matlab’s internals. I will only cover basic information, so there are plenty more details left to be filled in by others who are interested.

Brief review of copy-on-write and mxArray

Copy-on-write is Matlab’s mechanism for avoiding unnecessary duplication of data in memory. To implement this, Matlab needs to keep track internally of which sets of variables are copies of each other. As described in MathWorks’s article, “the Matlab language works with a single object type: the Matlab array. All Matlab variables (including scalars, vectors, matrices, strings, cell arrays, structures, and objects) are stored as Matlab arrays. In C/C++, the Matlab array is declared to be of type mxArray”. This means that mxArray defines how Matlab lays out all the information about an array (its Matlab data type, its size, its data, etc.) in memory. So understanding Matlab’s internal array representation basically boils down to understanding mxArray.

Unfortunately, MathWorks also tells us that “mxArray is a C language opaque type”. This means that MathWorks does not expose the organization of mxArray to users (i.e. Matlab or Mex programmers). Instead, MathWorks defines mxArray internally, and allows users to interact with it only through an API, a set of functions that know how to handle mxArray in their back end. So, for example, a Mex programmer does not get the dimensions of an mxArray by directly accessing the relevant field in memory. Instead, the Mex programmer only has a pointer to the mxArray, and passes this pointer into an API function that knows where in memory to find the requested information and then passes the result back to the programmer.

This is generally a good thing: the API provides an abstraction layer between the programmer and the memory structures so that if MathWorks needs to change the back end organization (to add a new feature for example), we programmers do not need to modify our code; instead MathWorks just updates the API to reflect the new internal organization. On the other hand, being able to look into the internal structure of mxArray on occasion can help us understand how Matlab works, and can help us write more efficient code if we are careful as in the example of editing arrays in-place.
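The opaque-type idea can be illustrated with a few lines of plain C. This is only a sketch of the general pattern, not MathWorks’ code: toy_array and the toy_* functions are hypothetical names, loosely modeled on the way mxGetM and mxGetN hide the layout of mxArray. Callers see only a forward declaration and accessor functions, so the struct layout can change without breaking them:

```c
#include <stdlib.h>

/* ---- Public part: what a header such as mex.h would expose ---- */
typedef struct toy_array toy_array;   /* incomplete type: the layout is hidden */
toy_array *toy_create(size_t m, size_t n);
size_t     toy_get_m(const toy_array *a);
size_t     toy_get_n(const toy_array *a);
void       toy_destroy(toy_array *a);

/* ---- Private part: known only to the library implementation ---- */
struct toy_array {
    size_t  m, n;     /* dimensions */
    double *data;     /* m*n elements, stored separately from this header */
};

toy_array *toy_create(size_t m, size_t n) {
    toy_array *a = malloc(sizeof *a);
    if (a == NULL) return NULL;
    a->m = m;
    a->n = n;
    a->data = calloc(m * n, sizeof(double));
    return a;
}

/* Accessors: the only way for callers to read the dimensions. */
size_t toy_get_m(const toy_array *a) { return a->m; }
size_t toy_get_n(const toy_array *a) { return a->n; }

void toy_destroy(toy_array *a) {
    if (a != NULL) {
        free(a->data);
        free(a);
    }
}
```

In a real library the private part would live in a separate source file, so that user code could not even name the fields m, n and data.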

So how do we get a glimpse inside mxArray? The first step is simply to find the region of memory where the mxArray lives: its beginning and end. Finding where in memory the mxArray begins is pretty easy: it is given by its pointer value. Here is a simple Mex function that takes a Matlab array as input and prints its memory address:

/* printaddr.cpp */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   if (nrhs < 1) mexErrMsgTxt("One input required.");
   mexPrintf("%p\n", static_cast<const void *>(prhs[0]));  // mexPrintf writes to the Matlab console
}

This function is nice as it prints the address in a standard hexadecimal format. The same information can also be received directly in Matlab (i.e., without needing printaddr), using the undocumented format debug command (here’s another reference):

>> format debug
 
>> A = 1:10
A =
Structure address = 7fc3b8869ae0
m = 1
n = 10
pr = 7fc44922c890
pi = 0
     1     2     3     4     5     6     7     8     9    10
 
>> printaddr(A)
7fc3b8869ae0

To play with this further from within Matlab however, it’s nice to have the address returned to us as a 64-bit unsigned integer; here’s a Mex function that does that:

/* getaddr.cpp */
#include <cstdint>
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   if (nrhs < 1) mexErrMsgTxt("One input required.");
   plhs[0] = mxCreateNumericMatrix(1, 1, mxUINT64_CLASS, mxREAL);
   // uint64_t is 64 bits everywhere; unsigned long is only 32 bits on 64-bit Windows
   std::uint64_t *out = static_cast<std::uint64_t *>(mxGetData(plhs[0]));
   out[0] = reinterpret_cast<std::uintptr_t>(prhs[0]);
}

Here’s getaddr in action:

>> getaddr(A)
ans = 
           139870853618400
 
% And using pure Matlab:
>> hex2dec('7f36388b5ae0')  % output of printaddr or format debug
ans =
           139870853618400
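
The same hexadecimal/decimal round trip can also be checked outside Matlab. The following is only a small standalone C++ sketch (no mex.h needed); the helper names addr_to_hex and hex_to_addr are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Format an address as lowercase hex without a prefix, like format debug shows it.
std::string addr_to_hex(std::uint64_t addr) {
    char buf[17];  // 16 hex digits plus the terminating NUL
    std::snprintf(buf, sizeof buf, "%llx", static_cast<unsigned long long>(addr));
    return buf;
}

// Parse such a hex string back into the integer that getaddr returns.
std::uint64_t hex_to_addr(const std::string& hex) {
    assert(!hex.empty());
    return std::stoull(hex, nullptr, 16);
}
```

With the address from the session above, hex_to_addr("7f36388b5ae0") gives 139870853618400, and addr_to_hex converts it back.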

So now we know where to find our array in memory. With this information we can already learn a lot. To make our exploration a little cleaner though, it would be nice to know where the array ends in memory too, in other words we would like to know the size of the mxArray.

Finding the structure of mxArray

The first thing to understand is that the amount of memory taken by an mxArray does not have anything to do with the dimensions of the array in Matlab. So a 1×1 Matlab array and a 100×100 Matlab array have the same size mxArray representation in memory. As you will know if you have experience programming in Mex, this is simply because the Matlab array’s data contents are not stored directly within mxArray. Instead, mxArray only stores a pointer to another memory location where the actual data reside. This is fine; the internal information we want to poke into is all still in mxArray, and it is easy to get the pointer to the array’s data contents using the API functions mxGetData or mxGetPr.

So we are still left with trying to figure out the size of mxArray. There are a couple of paths forward. First I want to talk about a historical tool that used to make a lot of this internal information easily available. This was a function called headerdump, by Peter Boettcher (described here and here). headerdump was created for exactly the goal we are currently working towards: to understand Matlab’s copy-on-write mechanism. Unfortunately, as Matlab has evolved, newer versions have incrementally broken this useful tool. So our goal here is to create a replacement. Still, we can learn a lot from the earlier work.

One of the things that helped people figure out Matlab’s internals in the past is that in older versions of Matlab mxArray is not a completely opaque type. Even in recent versions up through at least R2010a, if you look into $MATLAB/extern/include/matrix.h you can find a definition of mxArray_tag that looks something like this:

/* R2010a */
struct mxArray_tag {
   void  *reserved;
   int    reserved1[2];
   void  *reserved2;
   size_t  number_of_dims;
   unsigned int reserved3;
   struct {
       unsigned int  flag0  : 1;
       unsigned int  flag1  : 1;
       unsigned int  flag2  : 1;
       unsigned int  flag3  : 1;
       unsigned int  flag4  : 1;
       unsigned int  flag5  : 1;
       unsigned int  flag6  : 1;
       unsigned int  flag7  : 1;
       unsigned int  flag7a : 1;
       unsigned int  flag8  : 1;
       unsigned int  flag9  : 1;
       unsigned int  flag10 : 1;
       unsigned int  flag11 : 4;
       unsigned int  flag12 : 8;
       unsigned int  flag13 : 8;
   }   flags;
   size_t reserved4[2];
   union {
       struct {
           void  *pdata;
           void  *pimag_data;
           void  *reserved5;
           size_t reserved6[3];
       }   number_array;
   }   data;
};

This is what you could call murky or obfuscated, but not completely opaque. The fields mostly have unhelpful names like “reserved”, but on the other hand we at least have a sense for what fields there are and their layout.

A more informative (yet unofficial) definition was provided by James Tursa and Peter Boettcher:

#include "mex.h"
/* Definition of structure mxArray_tag for debugging purposes. Might not be fully correct 
 * for Matlab 2006b or 2007a, but the important things are. Thanks to Peter Boettcher.
 */
struct mxArray_tag {
  const char *name;
  mxClassID class_id;
  int vartype;
  mxArray    *crosslink;
  int      number_of_dims;
  int      refcount;
  struct {
    unsigned int    scalar_flag : 1;
    unsigned int    flag1 : 1;
    unsigned int    flag2 : 1;
    unsigned int    flag3 : 1;
    unsigned int    flag4 : 1;
    unsigned int    flag5 : 1;
    unsigned int    flag6 : 1;
    unsigned int    flag7 : 1;
    unsigned int    private_data_flag : 1;
    unsigned int    flag8 : 1;
    unsigned int    flag9 : 1;
    unsigned int    flag10 : 1;
    unsigned int    flag11 : 4;
    unsigned int    flag12 : 8;
    unsigned int    flag13 : 8;
  }   flags;
  int  rowdim;
  int  coldim;
  union {
    struct {
      double  *pdata;       // original: void*
      double  *pimag_data;  // original: void*
      void    *irptr;
      void    *jcptr;
      int     nelements;
      int     nfields;
    }   number_array;
    struct {
      mxArray **pdata;
      char    *field_names;
      void    *dummy1;
      void    *dummy2;
      int     dummy3;
      int     nfields;
    }   struct_array;
    struct {
      void  *pdata;  /*mxGetInfo*/
      char  *field_names;
      char  *name;
      int   checksum;
      int   nelements;
      int   reserved;
    }  object_array;
  }   data;
};

For comparison, here is another definition from an earlier version of Matlab.

/* R11 aka Matlab 5.3 (1999) */
struct mxArray_tag {
  char name[mxMAXNAM];
  int  class_id;
  int  vartype;
  mxArray *crosslink;
  int  number_of_dims;
  int  nelements_allocated;
  int  dataflags;
  int  rowdim;
  int  coldim;
  union {
    struct {
      void *pdata;
      void *pimag_data;
      void *irptr;
      void *jcptr;
      int   reserved;
      int   nfields;
    }   number_array;
  }   data;
};

I took this R11 definition from the source code to headerdump (specifically, from mxinternals.h, which also has mxArray_tag definitions for R12 (Matlab 6.0) and R13 (Matlab 6.5)), and you can see that it is much more informative, because many fields have been given useful names thanks to the work of Peter Boettcher and others. Note also that the definition from this old version of Matlab is quite different from the version from R2010a.

At this point, if you are running a much earlier version of Matlab like R11 or R13, you can break off from the current article and start playing around with headerdump directly to try to understand Matlab’s internals. For more recent versions of Matlab, we have more work to do. Getting back to our original goal, if we take the mxArray_tag definition from R2010a and run sizeof, we get an answer for the amount of memory taken up by an mxArray in R2010a: 104 bytes.
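
To see where the 104 bytes come from, the R2010a listing can be compiled on its own and queried with sizeof and offsetof. This is a sketch under two assumptions: a 64-bit build (8-byte pointers and size_t), and a struct renamed to mxArray_R2010a (a made-up name) so that it cannot clash with the real header. The function report_layout is likewise only illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Copy of the R2010a listing above, renamed so it cannot clash with the real mxArray_tag.
struct mxArray_R2010a {
    void        *reserved;
    int          reserved1[2];
    void        *reserved2;
    std::size_t  number_of_dims;
    unsigned int reserved3;
    struct {
        unsigned int flag0  : 1;
        unsigned int flag1  : 1;
        unsigned int flag2  : 1;
        unsigned int flag3  : 1;
        unsigned int flag4  : 1;
        unsigned int flag5  : 1;
        unsigned int flag6  : 1;
        unsigned int flag7  : 1;
        unsigned int flag7a : 1;
        unsigned int flag8  : 1;
        unsigned int flag9  : 1;
        unsigned int flag10 : 1;
        unsigned int flag11 : 4;
        unsigned int flag12 : 8;
        unsigned int flag13 : 8;
    } flags;                       // 32 bits of flags in total
    std::size_t reserved4[2];
    union {
        struct {
            void       *pdata;
            void       *pimag_data;
            void       *reserved5;
            std::size_t reserved6[3];
        } number_array;
    } data;
};

// The numbers below only hold where pointers and size_t are 8 bytes wide.
static_assert(sizeof(void *) == 8 && sizeof(std::size_t) == 8, "64-bit build assumed");

// Check and print the total size and the offsets of a few fields.
void report_layout() {
    assert(sizeof(mxArray_R2010a) == 104);
    std::printf("sizeof         = %zu\n", sizeof(mxArray_R2010a));
    std::printf("number_of_dims @ %zu\n", offsetof(mxArray_R2010a, number_of_dims));
    std::printf("flags          @ %zu\n", offsetof(mxArray_R2010a, flags));
    std::printf("reserved4      @ %zu\n", offsetof(mxArray_R2010a, reserved4));
    std::printf("data (pdata)   @ %zu\n", offsetof(mxArray_R2010a, data));
}
```

On such a build the fields add up as 8 + 8 + 8 + 8 + 4 + 4 + 16 + 48 = 104 bytes, with number_of_dims at offset 24, flags at 36, reserved4 at 40 and the data union at 56.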

Determining the size of mxArray

It was nice to derive the size of mxArray from actual MathWorks code, but unfortunately this information is no longer available as of R2011a. Somewhere between R2010a and R2011a, MathWorks stepped up their efforts to make mxArray completely opaque. So we should find another way to get the size of mxArray for current and future Matlab versions.

One ugly trick that works is to create many new arrays quickly and see where their starting points end up in memory:

>> A = num2cell(1:100)';
>> addrs = sort(cellfun(@getaddr, A));

What we did here is create 100 new arrays, and then get all their memory addresses in sorted order. Now we can take a look at how far apart these new arrays ended up in memory:

>> semilogy(diff(addrs));

The resulting plot will look different each time you run this; it is not really predictable where Matlab will put new arrays into memory. Here is an example from my system:

Plot of memory addresses

Your results may look different, and you might have to increase the number of new arrays from 100 to 1000 to get the qualitative result, but the important feature of this plot is that there is a minimum distance between new arrays of about 10². In fact, if we just go straight for this minimum distance:

>> min(diff(addrs))
ans = 
            104

we find that although mxArray has gone completely opaque from R2010a to R2011a, the full size of mxArray in memory has stayed the same: 104 bytes.
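
The reasoning behind this trick is that two headers can never overlap, so the smallest gap between any two header addresses is an upper bound on the header size, and it equals the header size as soon as two headers happen to be allocated back to back. The sort/diff/min pipeline can be sketched in standalone C++ as follows; the function name min_gap is made up, and any addresses fed to it in a test are synthetic rather than taken from a real Matlab session.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Smallest distance between neighbouring addresses: sort, then scan adjacent pairs.
// Equivalent to min(diff(sort(addrs))) in Matlab.
std::uint64_t min_gap(std::vector<std::uint64_t> addrs) {
    assert(addrs.size() >= 2);
    std::sort(addrs.begin(), addrs.end());
    std::uint64_t best = addrs[1] - addrs[0];
    for (std::size_t i = 2; i < addrs.size(); ++i)
        best = std::min(best, addrs[i] - addrs[i - 1]);
    return best;
}
```

For example, the made-up addresses 0x1000, 0x2000, 0x1068 and 0x10d0 contain two back-to-back pairs that are 0x68 = 104 bytes apart, so min_gap returns 104 for them.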

Dumping mxArray from memory

We now have all the information we need to start looking into Matlab’s array representation. There are many tools available that allow you to browse memory locations or dump memory contents to disk. For our purposes though, it is nice to be able to do everything from within Matlab. Therefore I introduce a simple tool that prints memory locations into the Matlab console:

/* printmem.cpp */
#include <cstdint>
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
  if (nrhs < 1 || !mxIsUint64(prhs[0]) || mxIsEmpty(prhs[0]))
    mexErrMsgTxt("First argument must be a uint64 memory address");
  // Read the address as a true 64-bit value (unsigned long is only 32 bits on 64-bit Windows)
  std::uint64_t *addr = static_cast<std::uint64_t *>(mxGetData(prhs[0]));
  unsigned char *mem = reinterpret_cast<unsigned char *>(static_cast<std::uintptr_t>(addr[0]));
 
  if (nrhs < 2 || !mxIsDouble(prhs[1]) || mxIsEmpty(prhs[1]))
    mexErrMsgTxt("Second argument must be a double-type integer byte size.");
  unsigned int nbytes = static_cast<unsigned int>(mxGetScalar(prhs[1]));
 
  // Print 16 bytes per row, each byte as two hex digits
  for (unsigned int i = 0; i < nbytes; i++) {
    mexPrintf("%.2x ", mem[i]);
    if ((i+1) % 16 == 0) mexPrintf("\n");
  }
  mexPrintf("\n");
}

Here is how you use it in Matlab:

>> A = 0;
>> printmem(getaddr(A), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 70 fa 33 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

And there you have it: the inner guts of mxArray laid bare. I have printed each byte as a two character hexadecimal value, as is standard, so there are 16 bytes printed per row.

What does it mean?

So now we have 104 bytes of Matlab internals to dig into. We can start playing with this with a few simple examples:

>> A = 0; B = 1;
>> printmem(getaddr(A), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 c0 b0 27 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

>> printmem(getaddr(B), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 70 fa 33 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

In this and subsequent examples, the bytes of interest are the ones that differ between dumps. What we can see from this example is that although arrays A and B have different content, almost nothing is different between their mxArray representations. The only difference is the memory address stored in the last eight bytes of the fourth row (offsets 56 to 63): c0 b0 27 df 6f 7f 00 00 for A versus 70 fa 33 df 6f 7f 00 00 for B. This confirms our earlier assertion that mxArray does not store the array contents, but only a pointer to the content location.

Now let us try to figure out some of the other fields:

>> A = 1:3; B = 1:10; C = (1:10)';
>> printmem(getaddr(A), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
03 00 00 00 00 00 00 00 60 80 22 df 6f 7f 00 00

>> printmem(getaddr(B), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 80 83 29 df 6f 7f 00 00

>> printmem(getaddr(C), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 0a 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 80 83 29 df 6f 7f 00 00

(Note that this time I only printed the first four lines of each array as this is where the interesting differences are for this example.)

The second half of the third row (offset 40) and the first half of the fourth row (offset 48) give each array’s number of rows and columns respectively (note that hexadecimal 0a is 10 in decimal). The value “02” appears in two places, at offset 24 (second row) and at offset 37 (third row), either of which could be the location for storing the number of dimensions. Let us look into this more:

>> A = rand([3 3 3]);
>> printmem(getaddr(A), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 03 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 30 4a 3f df 6f 7f 00 00
09 00 00 00 00 00 00 00 b0 d3 24 df 6f 7f 00 00

Two interesting results here: The value at offset 24 (second row) changed from 02 to 03, so this must be the place where mxArray indicates a 3-dimensional array rather than 2D. Another important thing also changed though: at offset 40, where we used to find the number of rows, there is now a memory address (30 4a 3f df 6f 7f 00 00). And at offset 48 we now have the number 09 instead of the number of columns.

Clearly, Matlab has a different way of representing a 2D matrix versus arrays of higher dimension such as 3D. In the 2D case, mxArray simply holds the nrows and ncols directly, but for a higher dimension case we hold only the number of dimensions (03), the product of the second and higher dimensions (09 = 3×3, not the total of 27 elements; this is the same value that mxGetN returns for an N-D array), and a pointer to another memory location (0x7f6fdf3f4a30) which holds the array of sizes for each dimension.
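
The field positions inferred so far can be written down as a small decoder that works on the bytes printed by printmem. This is a sketch in standalone C++; the offsets 24, 40, 48 and 56 are simply read off the dumps in this article (one 64-bit session) and are not a documented layout, and the names HeaderView, read_u64 and decode_header are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Offsets observed in the dumps above; assumptions, not an official layout.
constexpr std::size_t kNdimsOffset = 24;   // number of dimensions
constexpr std::size_t kRowsOffset  = 40;   // nrows (2D) or pointer to the dims array (N-D)
constexpr std::size_t kColsOffset  = 48;   // ncols (2D) or product of dims 2..N (N-D)
constexpr std::size_t kDataOffset  = 56;   // pointer to the real data

// Read one 8-byte little-endian field starting at the given offset.
std::uint64_t read_u64(const unsigned char* header, std::size_t offset) {
    std::uint64_t v = 0;
    for (int i = 7; i >= 0; --i) v = (v << 8) | header[offset + i];
    return v;
}

struct HeaderView {
    std::uint64_t ndims;
    bool          dims_inline;       // true for 2D arrays: rows/cols stored directly
    std::uint64_t rows_or_dims_ptr;  // row count, or address of the dims array
    std::uint64_t cols;              // column count, or product of dims 2..N
    std::uint64_t data_ptr;
};

// Interpret the first 64 bytes of a dumped header.
HeaderView decode_header(const unsigned char* header) {
    assert(header != nullptr);
    HeaderView h;
    h.ndims            = read_u64(header, kNdimsOffset);
    h.dims_inline      = (h.ndims == 2);
    h.rows_or_dims_ptr = read_u64(header, kRowsOffset);
    h.cols             = read_u64(header, kColsOffset);
    h.data_ptr         = read_u64(header, kDataOffset);
    return h;
}
```

Fed with the 64 bytes printed for B = 1:10 above, this yields ndims = 2, one row, ten columns and a data pointer of 0x7f6fdf298380.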

The copy-on-write mechanism

Finally, we are in a position to understand how Matlab internally implements copy-on-write:

>> A = 1:10;
>> printmem(getaddr(A), 64);
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00

>> B = A;
>> printaddr(B);
0x7f6f4c7b6810

>> printmem(getaddr(A), 64);
10 68 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
10 68 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00

What we see is that by setting B = A, we change the internal representation of A itself. Two new memory address pointers are added to the mxArray for A. As it turns out, both of these point to the address for array B, which makes sense; this is how Matlab internally keeps track of arrays that are copies of each other. Note that because byte order is little-endian, the memory addresses from printmem are byte-wise, i.e. every two characters, reversed relative to the address from printaddr.
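
The byte reversal is easy to get wrong by eye, so here is a small standalone C++ sketch that turns eight bytes, typed exactly as printmem shows them, into the address they encode. The function name address_from_dump is made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Convert eight space-separated hex bytes (printmem order, least significant first)
// into the 64-bit address they represent.
std::uint64_t address_from_dump(const std::string& dump_bytes) {
    std::istringstream in(dump_bytes);
    std::uint64_t addr = 0;
    unsigned int byte = 0;
    int shift = 0;
    while (shift < 64 && in >> std::hex >> byte) {
        addr |= static_cast<std::uint64_t>(byte & 0xffu) << shift;  // first byte is the lowest
        shift += 8;
    }
    assert(shift == 64);  // expect exactly eight bytes
    return addr;
}
```

Applied to "10 68 7b 4c 6f 7f 00 00" from the dump of A, it returns 0x7f6f4c7b6810, the address that printaddr reported for B.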

We can also look into array B:

>> printmem(getaddr(B), 64);
f0 41 7a 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
f0 41 7a 4c 6f 7f 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00

>> printaddr(A);
0x7f6f4c7a41f0

There are two interesting points here. First, the eight-byte fields at offsets 0 and 16 (f0 41 7a 4c 6f 7f 00 00) show that array B has pointers back to array A. Second, the data pointer at offset 56 (90 f3 24 df 6f 7f 00 00) is identical to the one in A, so the Matlab data for array B actually just points back to the same memory as the data for array A (the values 1:10).

Finally, we would like to understand why there are two pointers added. Let us see what happens if we add a third linked variable:

>> C = B;
>> printaddr(A); printaddr(B); printaddr(C);
0x7f6f4c7a41f0
0x7f6f4c7b6810
0x7f6f4c7b69b0

>> printmem(getaddr(A), 32)
b0 69 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
10 68 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00

>> printmem(getaddr(B), 32)
f0 41 7a 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
b0 69 7b 4c 6f 7f 00 00 02 00 00 00 00 00 00 00

>> printmem(getaddr(C), 32)
10 68 7b 4c 6f 7f 00 00 06 00 00 00 00 00 00 00
f0 41 7a 4c 6f 7f 00 00 02 00 00 00 00 00 00 00

So it turns out that Matlab keeps track of a set of linked variables with a kind of cyclical, doubly-linked list structure; array A is linked to B in the forward direction and is also linked to C in the reverse direction by looping back around, etc. The cyclical nature of this makes sense, as we need to be able to start from any of A, B, or C and find all the linked arrays. But it is still not entirely clear why the list needs to be cyclical AND linked in both directions. In fact, in earlier versions of Matlab this cyclical list was only singly-linked.
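
The link pattern in these dumps can be reproduced with an ordinary circular doubly-linked list. The following standalone C++ sketch is only a model of what the dumps suggest: the type Header, the functions share and ring_size, and the mapping of prev/next to offsets 0 and 16 are all assumptions made for illustration, not Matlab's actual code.

```cpp
#include <cassert>

// Hypothetical stand-in for the two link fields seen in the dumps.
struct Header {
    Header* prev = nullptr;  // offset 0 in the dumps; null while the array is unshared
    Header* next = nullptr;  // offset 16 in the dumps; null while the array is unshared
};

// Link `copy` into the ring directly after `source`, as `copy = source` appears to do.
void share(Header& source, Header& copy) {
    assert(copy.prev == nullptr && copy.next == nullptr);  // copy must start unshared
    if (source.next == nullptr)          // source was unshared: start a ring of one
        source.prev = source.next = &source;
    copy.prev = &source;
    copy.next = source.next;
    source.next->prev = &copy;
    source.next = &copy;
}

// Starting from any member, walk forward until we are back where we began.
int ring_size(const Header& h) {
    if (h.next == nullptr) return 1;
    int n = 1;
    for (const Header* p = h.next; p != &h; p = p->next) ++n;
    return n;
}
```

Calling share(A, B) and then share(B, C) leaves A with prev = C and next = B, B with prev = A and next = C, and C with prev = B and next = A, which is the same pattern as in the three dumps above.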

Conclusions

Obviously there is a lot more to mxArray and Matlab internals than what we have delved into here. Still, with this basic introduction I hope to have whetted your appetite for understanding more about Matlab internals, and provided some simple tools to help you explore. I want to reiterate that in general MathWorks’s approach of an opaque mxArray type with access abstracted through an API layer is a good policy. The last thing you would want to do is take the information here and write a bunch of code that relies on the structure of mxArray to work; next time MathWorks needs to add a new feature and change mxArray, all your code will break. So in general we are all better off playing within the API that MathWorks provides. And remember: poking into memory can crash your computer, so save your data!

On the other hand, occasionally there are cases, like in-place editing, where it is useful to push the capabilities of Matlab a little beyond what MathWorks envisioned. In these cases, having an understanding of Matlab’s internals can be critical, for example in understanding how to avoid conflicting with copy-on-write. Therefore I hope the information presented here will prove useful. Ideally, someone will be motivated to take this starting point and repair some of the tools like headerdump that made Matlab’s internal workings more transparent in the past. I believe that having more of this information out in the Matlab community is good for the community as a whole.

Matlab mex in-place editing

Yair Altman — Wed, 08 Feb 2012 17:00:25 +0000

I would like to welcome Matlab Mex power-user Peter Li for the first in a short series of articles about undocumented aspects of Mex programming.

Editing Matlab arrays in-place can be an important technique for optimizing calculations, especially when handling data that use large blocks of memory. The Matlab language itself has some limited support for in-place editing, but when we are really concerned with speed we often turn to writing C/C++ extensions using the Mex interface. Unfortunately, editing arrays in-place from Mex extensions is not officially supported in Matlab, and doing it incorrectly can cause data inconsistencies or can even cause Matlab to crash. In this article, I will introduce the problem and show a simple solution that exposes the basic implementation details of Matlab’s internal copy-on-write mechanism.

Why edit in-place?

To demonstrate the techniques in this article, I use the fast_median function, which is part of my nth_element package on Matlab’s File Exchange. You can download the package and play with the code if you want. The examples are fairly self-explanatory, so if you do not want to try the code you should be okay just following along.

Let us try a few function calls to see how editing in-place can save time and memory:

>> A = rand(100000000, 1);
>> tic; median(A); toc    
Elapsed time is 4.122654 seconds.
 
>> tic; fast_median(A); toc
Elapsed time is 1.646448 seconds.
 
>> tic; fast_median_ip(A); toc
Elapsed time is 0.927898 seconds.

If you try running this, be careful not to make A too large; tune the example according to the memory available on your system. In terms of the execution time for the different functions, your mileage may vary depending on factors such as: your system, what Matlab version you are running, and whether your test data is arranged in a single vector or a multicolumn array.

This example illustrates a few general points: first, fast_median is significantly faster than Matlab’s native median function. This is because fast_median uses a more efficient algorithm; see the nth_element page for more details. Besides being a shameless plug, this demonstrates why we might want to write a Mex function in the first place: rewriting the median function in pure Matlab would be slow, but using C++ we can significantly improve on the status quo.

The second point is that the in-place version, fast_median_ip, yields an additional speed improvement. What is the difference? Let us look behind the scenes; here are the CPU and memory traces from my system monitor after running the above:

Memory and CPU usage for median vs. fast_median_ip

You can see four spikes in CPU use, and some associated changes in memory allocation:

The first spike in CPU is when we created the test data vector; memory use also steps up at that time.

The second CPU spike is the largest; that is Matlab’s median function. You can see that over that period memory use stepped up again, and then stepped back down; the median function makes a copy of the entire input data, and then throws its copy away when it is finished; this is expensive in terms of time and resources.

The fast_median function is the next CPU spike; it has a similar step up and down in memory use, but it is much faster.

Finally, in the case of fast_median_ip we see something different; there is a spike in CPU use, but memory use stays flat; the in-place version is faster and more memory efficient because it does not make a copy of the input data.

There is another important difference with the in-place version; it modifies its input array. This can be demonstrated simply:

>> A = randi(100, [10 1]);
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
 
>> median(A)
ans = 60
 
>> fast_median(A)
ans = 60
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
 
>> fast_median_ip(A)
ans = 60
>> A'
ans = 39     6    25    42    56    64    75    71    98    89

As you can see, all three methods get the same answer, but median and fast_median do not modify A in the workspace, whereas after running fast_median_ip, input array A has changed. This is how the in-place method is able to run without using new memory; it operates on the existing array rather than making a copy.
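
The reordering seen in A is what a partial sort leaves behind. As an illustration only (this is not the code from the nth_element package), here is a standalone C++ sketch that assumes the median is computed with std::nth_element; the names median_in_place and median_copy are made up.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median via partial sorting. Reorders v, much like fast_median_ip reorders its input.
double median_in_place(std::vector<double>& v) {
    assert(!v.empty());
    const std::size_t mid = v.size() / 2;
    std::nth_element(v.begin(), v.begin() + mid, v.end());
    const double upper = v[mid];
    if (v.size() % 2 == 1) return upper;
    // Even length: the other middle value is the largest element left of `mid`.
    const double lower = *std::max_element(v.begin(), v.begin() + mid);
    return (lower + upper) / 2.0;
}

// Well-behaved variant: the by-value parameter makes the copy, so the caller's data is untouched.
double median_copy(std::vector<double> v) {
    return median_in_place(v);
}
```

For the ten values shown above, both functions return 60; only median_in_place leaves the caller's vector reordered, with 64 in its sixth position.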

Pitfalls with in-place editing

Modifying a function’s input is common in many languages, but in Matlab there are only a few special conditions under which this is officially sanctioned. This is not necessarily a bad thing; many people feel that modifying input data is bad programming practice and makes code harder to maintain. But as we have shown, it can be an important capability to have if speed and memory use are critical to an application.

Given that in-place editing is not officially supported in Matlab Mex extensions, what do we have to do to make it work? Let us look at the normal, input-copying fast_median function as a starting point:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxArray *incopy = mxDuplicateArray(prhs[0]);
   plhs[0] = run_fast_median(incopy);
}

This is a pretty simple function (I have taken out a few lines of boiler plate input checking to keep things clean). It relies on helper function run_fast_median to do the actual calculation, so the only real logic here is copying the input array prhs[0]. Importantly, run_fast_median edits its inputs in-place, so the call to mxDuplicateArray ensures that the Mex function is overall well behaved, i.e. that it does not change its inputs.

Who wants to be well behaved though? Can we save time and memory just by taking out the input duplication step? Let us try it:

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   plhs[0] = run_fast_median(const_cast<mxArray *>(prhs[0]));  // no mxDuplicateArray: edits the caller's array directly
}

Very bad behavior; note that we cast the original const mxArray* input to a mxArray* so that the compiler will let us mess with it; naughty.

But does this accomplish edit in-place for fast_median? Be sure to save any work you have open and then try it:

>> mex fast_median_tweaked.cpp
>> A = randi(100,[10 1]);
>> fast_median_tweaked(A)
ans = 43

Hmm, it looks like this worked fine. But in fact there are subtle problems:

>> A = randi(100,[10 1]);
>> A'
ans = 65    92    14    26    41     2    45    85    53     2
>> B = A;
>> B'
ans = 65    92    14    26    41     2    45    85    53     2
 
>> fast_median_tweaked(A)
ans = 43
>> A'
ans = 2     2    41    26    14    45    65    85    53    92
>> B'
ans = 2     2    41    26    14    45    65    85    53    92

Uhoh, spooky; we expected that running fast_median_tweaked would change input A, but somehow it has also changed B, even though B is supposed to be an independent copy. Not good. In fact, under some conditions this kind of uncontrolled editing in-place can crash the entire Matlab environment with a segfault. What is going on?

Matlab’s copy-on-write mechanism

The answer is that our simple attempt to edit in-place conflicts with Matlab’s internal copy-on-write mechanism. Copy-on-write is an optimization built into Matlab to help avoid expensive copying of variables in memory (actually similar to what we are trying to accomplish with edit in-place). We can see copy-on-write in action with some simple tests:

Matlab's Copy-on-Write memory and CPU usage

% Test #1: update, then copy
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.000008 seconds.
>> tic; B = A; toc   
Elapsed time is 0.000020 seconds.
 
% Test #2: copy, then update
>> clear
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; B = A; toc   
Elapsed time is 0.000020 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.678160 seconds.

In the first set of operations, time and memory are used to create A, but updating A and “copying” A into B take no memory and essentially no time. This may come as a surprise since supposedly we have made an independent copy of A in B; why does creating B take no time or memory when A is clearly a large, expensive block?

The second set of operations makes things more clear. In this case, we again create A and then copy it to B; again this operation is fast and cheap. But assigning into A at this point takes time and consumes a new block of memory, even though we are only assigning into a single index of A. This is copy-on-write: Matlab tries to save you from copying large blocks of memory unless you need to. So when you first assign B to equal A, nothing is copied; the variable B is simply set to point to the same memory location already used by A. Only after you try to change A (or B), does Matlab decide that you really need to have two copies of the large array.
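
The behaviour in these two tests can be mimicked with a toy class. The sketch below is standalone C++ and uses a reference count (std::shared_ptr) purely as a convenient stand-in; Matlab's real bookkeeping is internal and differs from this. The class name CowArray and its methods are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy copy-on-write array: copies share one buffer until somebody writes.
class CowArray {
public:
    explicit CowArray(std::size_t n)
        : data_(std::make_shared<std::vector<double>>(n, 0.0)) {}

    double get(std::size_t i) const { return (*data_)[i]; }

    void set(std::size_t i, double value) {
        assert(i < data_->size());
        if (data_.use_count() > 1)  // shared: make a private copy before writing
            data_ = std::make_shared<std::vector<double>>(*data_);
        (*data_)[i] = value;
    }

    const double* buffer() const { return data_->data(); }  // where the data lives
    bool shared() const { return data_.use_count() > 1; }

private:
    std::shared_ptr<std::vector<double>> data_;  // copying a CowArray copies only this pointer
};
```

As in Test #1, writing to an unshared CowArray touches the existing buffer. As in Test #2, after CowArray B = A the two objects report the same buffer() until the first set() on either of them, at which point the writer pays for a full copy.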

There are some additional tricks Matlab does with copy-on-write. Here is another example:

>> clear
>> tic; A{1} = zeros(100000000, 1); toc
Elapsed time is 0.573240 seconds.
>> tic; A{2} = zeros(100000000, 1); toc
Elapsed time is 0.560369 seconds.
 
>> tic; B = A; toc                     
Elapsed time is 0.000016 seconds.
 
>> tic; A{1}(1) = 0; toc               
Elapsed time is 0.690690 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.695758 seconds.
 
>> tic; A{1}(1) = 0; toc
Elapsed time is 0.000011 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.000004 seconds.

This shows that for the purposes of copy-on-write, different elements of cell array A are treated independently. When we assign B equal to A, nothing is copied. Then when we change any part of A{1}, that whole element must be copied over. When we subsequently change A{2}, that whole element must also be copied over; it was not copied earlier. At this point, A and B are truly independent of each other, as both elements have experienced copy-on-write, so further assignments into either A or B are fast and require no additional memory.

Try playing with some struct arrays and you will find that copy-on-write also works independently for the elements of structs.

Reconciling edit in-place with copy-on-write: mxUnshareArray

Now it is clear why we cannot simply edit arrays in-place from Mex functions; not only is it naughty, it fundamentally conflicts with copy-on-write. Naively changing an array in-place can inadvertently change other variables that are still waiting for a copy-on-write, as we saw above when fast_median_tweaked inadvertently changed B in the workspace. This is, in the best case, an unmaintainable mess. Under more aggressive in-place editing, it can cause Matlab to crash with a segfault.

Luckily, there is a simple solution, although it requires calling internal, undocumented Matlab functions.

Essentially what we need is a Mex function that can be run on a Matlab array that will do the following:

Check whether the current array is sharing data with any other arrays that are waiting for copy-on-write.
If the array is shared, it must be unshared; the underlying memory must be copied and all the relevant pointers need to be fixed so that the array we want to work on is no longer accessible by anyone else.
If the array is not currently shared, simply proceed; the whole point is to avoid copying memory if we do not need to, so that we can benefit from the efficiencies of edit in-place.
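
The three steps above can be sketched in standalone C++ on a toy header. This is not the implementation of any Matlab internal. It assumes, for illustration, that shared arrays are chained in a circular doubly-linked list whose links are null for an unshared array. The names Header and unshare are hypothetical, and std::shared_ptr is used only so that the sketch cannot leak memory.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical array header: two ring links plus a data buffer that may be shared.
struct Header {
    Header* prev = nullptr;                     // null while the array is unshared
    Header* next = nullptr;
    std::shared_ptr<std::vector<double>> data;  // buffer, possibly used by other headers too
};

// Returns true if a private copy had to be made, false if the array was already unshared.
bool unshare(Header& h) {
    assert(h.data != nullptr);
    if (h.next == nullptr) return false;        // steps 1 and 3: not shared, edit in place

    // Step 2a: give h its own copy of the data.
    h.data = std::make_shared<std::vector<double>>(*h.data);

    // Step 2b: fix the pointers so that nobody else can reach h.
    Header* before = h.prev;
    Header* after  = h.next;
    if (before == after) {                      // ring of two: the other array is now unshared too
        before->prev = before->next = nullptr;
    } else {
        before->next = after;
        after->prev  = before;
    }
    h.prev = h.next = nullptr;
    return true;
}
```

An in-place routine would call unshare first and only then modify h.data; the copy is paid for only when the array really was shared.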

If you think about it, this is exactly the operation that Matlab needs to run internally when it is deciding whether an assignment operation requires a copy-on-write. So it should come as no surprise that such a Mex function already exists in the form of a Matlab internal called mxUnshareArray. Here is how you use it:

extern "C" bool mxUnshareArray(mxArray *array_ptr, bool noDeepCopy);
 
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxUnshareArray(const_cast<mxArray *>(prhs[0]), true);  // copies the data first, but only if the input is currently shared
   plhs[0] = run_fast_median(const_cast<mxArray *>(prhs[0]));  // now safe to edit in-place
}

This is the method actually used by fast_median_ip to efficiently edit in-place without risking conflicts with copy-on-write. Of course, if the array turns out to need to be unshared, then you do not get the benefit of edit in-place because the memory ends up getting copied. But at least things are safe and you get the in-place benefit as long as the input array is not being shared.

Further topics

The method shown here should allow you to edit normal Matlab numerical or character arrays in-place from Mex functions safely. For a Mex function in C rather than C++, omit the “C” in the extern declaration and of course you will have to use C-style casting rather than const_cast. If you need to edit cell or struct arrays in-place, or especially if you need to edit subsections of shared cell or struct arrays safely and efficiently while leaving the rest of the data shared, then you will need a few more tricks. A good place to get started is this article by Benjamin Schubert.

Unfortunately, over the last few years MathWorks seems to have decided to make it more difficult for users to access these kinds of internal methods to make our code more efficient. So be aware of the risk that in some future version of Matlab this method will no longer work in its current form.

Ultimately much of what is known about mxUnshareArray as well as the internal implementation details of how Matlab keeps track of which arrays are shared goes back to the work of Peter Boettcher, particularly his headerdump.c utility. Unfortunately, it appears that HeaderDump fails with Matlab releases >=R2010a, as MathWorks have changed some of the internal memory formats. Perhaps some smart reader can pick up the work and adapt HeaderDump to the new memory format.

In a future article, I hope to discuss headerdump.c and its relevance for copy-on-write and edit in-place, and some other related tools for the latest Matlab releases that do not support HeaderDump.

Reasons for undocumented Matlab aspects

Yair Altman — Wed, 30 Nov 2011 18:00:10 +0000

Why are there so many undocumented aspects in Matlab?

This is a great question, recently asked by a reader of this blog, and I wanted to expand on it in today’s article.

Unofficial explanations

There appear to be several different reasons for Matlab’s undocumented/unsupported features, whose types I categorized in last week’s article. Note that the following are all my personal interpretation and are not officially sanctioned in any manner:

Pre-release (beta mode)
MathWorks takes pride in only releasing features/functions for mass use after extensive quality testing. Some features/functions are simply not at the required level for public use under this criterion. For example, they may work only in certain situations and not in others, or may not have bullet-proof error handling.
Internal use within Matlab
These are functions or properties which are used by other (documented) Matlab functions, but deemed unsuitable for mass use for similar reasons, or because TMW did not believe they might be of any practical use to users.
Grandfathered (deprecated)
Some features and functions are superseded by newer ones in later Matlab releases. The old features/functions are often preserved for the sake of backward compatibility, for one or more subsequent releases.
Mis-documentation / documentation error / oversight
Matlab is an extensively documented product, and as in any such technical documentation of a rapidly-evolving product, documentation errors are bound to appear. Unfortunately, Matlab users rely on this documentation in order to use Matlab, so any mis- or un-documentation directly affects usage.
Java
This is perhaps the largest source of undocumented features within Matlab. For a variety of reasons, the Matlab-Java interface has not been documented to the same extent as the interface with other programming languages like Fortran or C/C++. Matlab itself is based on Java to a large extent, and perhaps TMW does not feel comfortable with users playing around with Matlab internals. Or perhaps TMW is afraid that improper use of the EDT (Java's Event Dispatch Thread, discussed later in this book) would result in system hangs or crashes and/or an overall bad user experience.
Mex
Mex is the Matlab interface to pre-compiled external functions, typically coded in C/C++. Mex functions have access to important Matlab engine functionality, but many of these functionalities are undocumented.

Official explanations

Here is an important CSSM thread snippet about the official reasons, with a rare addendum from Cleve Moler – Matlab’s inventor and TMW founder & chairman. It is over a decade old, yet still just as relevant today as it was back then:

There are parts of MATLAB that we *don’t* want users to use. This includes new functionality that’s not yet completely implemented, experimental code, and code that we know will change in the future.
The functionality that we don’t document is not ready for use and *shouldn’t* be used unless its users are aware of its experimental nature. These undocumented functions/options are ones whose behavior is likely to change between versions. Future uses of it may be either invalid or produce completely different results, even between minor revisions of MATLAB. These functions are essentially “developmentally unstable;” they are not fit for use by anyone who needs a reliable development environment.
We have absolutely no plans to document these under-development areas of MATLAB until we feel they are ready for general use. Until then, we only share these features to our test audience and interested parties on CSSM. Both of these groups consist of our most dedicated MATLAB users who give excellent feedback concerning their use. But it would be irresponsible of us to document these features to the general MATLAB public when we know their use and implementation may change dramatically in the short term, making the development environment unstable. We don’t want to mislead you into using features we know not to be properly implemented.
This situation is not unique to The MathWorks. What is unique to us, however, is our honesty in letting you know these features are there. If you are in a situation where you may need to use a yet unsupported feature and we feel that it’s appropriate for us to do so, we’ll let you know that it exists. We have no intention of trying to make you look foolish; pointing out these hidden functions is meant to be helpful. We could adopt the policies taken by other software companies and simply not reveal any of these features in a public forum such as this one. However, we feel that would be counterproductive to our intentions. So instead we document the features we support, and are very willing to discuss with you unsupported and experimental features that we feel may be useful to you in your work. But we won’t document functionality whose use we aren’t sure about supporting.
We also have a responsibility to our customers to make sure that MATLAB is a stable development environment that can be used consistently from version to version. It’s difficult enough for us to advance the functionality of the code that we do support while still remaining backwards compatible with code written 10-15 years ago (check comments made in a few earlier threads). The last thing we’d want is for the general population of MATLAB users to rely on code that we don’t intend to fully support even in the near future. These undocumented features/ functions/etc fall into that category.
Naturally, this isn’t the only stance we could take. We could document everything or just not even include these features in released versions. But we’ve chosen to provide them to individuals on a case-by-case basis as we see fit; this seems to be most in line with our goals. You may not agree that this is the best choice, but it’s the one we’ve currently taken. Naturally, this could change in the future.
…
— Nabeel

It’s not so much the time to do complete documentation. Rather, it’s the commitment to continued support of a particular feature. We have things in MATLAB that are experimental, unfinished or incomplete. They may be good ideas, but they may not. If we fully advertise and document them, then we feel strongly obligated to support them, and to keep them in MATLAB forever.
— Cleve Moler

A similar explanation was provided by Doug Hull in a comment on one of Matlab’s official blogs:

UITable was undocumented, but available, for some releases in the past.
Some features like this are available, but undocumented, for various reasons. Sometimes they are there so that we can use them in making tools for MATLAB. These features are undocumented because we need the flexibility to change the interface for a while. If you find undocumented features, you need to be aware that the interface can change or be removed in future releases.
-Doug

Note: For everyone who reported cprintf’s issues with R2011b: a new version was uploaded on Monday (November 28) that fixes those problems, as well as the issue of the space (on R2011b only). In R2011b, colored text segments can now be placed immediately adjacent to each other, without an intervening space character. Unfortunately, this relies on a fix that MathWorks made to the Command Window in R2011b, which is not available in R2011a and earlier. cprintf automatically checks the Matlab release version and adds a space if necessary.

While I was at it, I also updated the findjobj utility (a few bug fixes).