ismembc – undocumented helper function

Matlab has a variety of internal helper functions which are used by the main (documented) functions. Some of these helper functions are undocumented and unsupported, but may be helpful in their own right – not just as internal support functions.

In this post I want to present Matlab’s built-in ismembc helper function. This function is used within the stock Matlab ismember and setxor functions for fast processing of the core ismember functionality in “regular” cases: arrays of sorted, non-sparse, non-NaN data in which we’re only interested in the logical membership information (not the index locations of the found members). In such cases, ismembc can be used directly, avoiding the overhead of ismember’s sanity checks. ismembc uses the same interface (two inputs, single logical output) as ismember and can be a drop-in replacement for ismember for these “regular” cases.
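As a minimal sketch of this drop-in usage, assuming ismembc is still available in your Matlab release (it is undocumented and may change), the set being searched can be sorted first and the result compared with ismember's. The variable names are purely illustrative:

```matlab
% Illustrative data: real, full (non-sparse), NaN-free doubles
a = [3 5 8 13];                 % values to look up
s = sort([9 1 5 4 2 3]);        % set to search, sorted ascending first

tfRef  = ismember(a, s);        % documented function
tfFast = ismembc(a, s);         % undocumented helper, same two-input call

% Both calls should agree when the preconditions hold
sameResult = isequal(tfRef, tfFast)
```

If sameResult is false on your release, the preconditions or the helper's behavior differ from what is described here, and ismember is the safer choice.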

The performance improvement may be significant: In a recent post, MathWorks’ Loren Shure presented different approaches for fast data retrieval, highlighting the ismember function. Let’s compare:

>> % Initial setup
>> n=2e6; a=ceil(n*rand(n,1)); b=ceil(n*rand(n,1));

>> % Run ismember several times, to rule-out JIT compilation overheads
>> tic;ismember(a,b);toc;
Elapsed time is 2.882907 seconds.
>> tic;ismember(a,b);toc;
Elapsed time is 2.818318 seconds.
>> tic;ismember(a,b);toc;
Elapsed time is 3.005967 seconds.

>> % Now use ismembc:
>> tic;ismembc(a,b);toc;
Elapsed time is 0.162108 seconds.
>> tic;ismembc(a,b);toc;
Elapsed time is 0.204108 seconds.
>> tic;ismembc(a,b);toc;
Elapsed time is 0.156963 seconds.
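
The tic/toc runs above pass unsorted data to ismembc. A possibly fairer comparison, sketched here under the assumption that timeit is available (R2013b and later), sorts the searched set and counts the sort cost in the timing. No timings are quoted because they depend on the machine and release:

```matlab
% Same random data as above, but the searched set is sorted before ismembc sees it
n = 2e6;
a = ceil(n*rand(n,1));
b = ceil(n*rand(n,1));

% timeit calls each function handle several times and returns a typical run time
tMember = timeit(@() ismember(a, b));        % handles unsorted input itself
tMembc  = timeit(@() ismembc(a, sort(b)));   % sort cost is counted in the timing

% Check that the fast path returns the same answer before trusting the timing
agree = isequal(ismember(a, b), ismembc(a, sort(b)));
fprintf('ismember: %.3f s, sort+ismembc: %.3f s, results agree: %d\n', ...
        tMember, tMembc, agree);
```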

ismembc is a MEX file (%matlabroot%\toolbox\matlab\ops\ismembc.mexw32 on 32-bit Windows; the extension varies by platform). Its source code is included in the same folder (%matlabroot%\toolbox\matlab\ops\ismembc.cpp) and is very readable. From the source-code comments we learn that the comment in setxor about ismembc usage is misleading: that comment states that the inputs must be real, but the source code indicates that imaginary numbers are also accepted and that only the real part needs to be sorted.

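ismembc should not be used carelessly: as noted, its inputs must be sorted non-sparse non-NaN values. In the general case we should either ensure this programmatically (as done in setxor) or use ismember, which handles this for us. Note that the timing example above passes unsorted random data to ismembc, so it illustrates only the speed difference; the values returned for such data cannot be relied upon.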
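
One way to ensure the preconditions programmatically is a small wrapper that takes the fast path only when the checks pass. The sketch below is not how setxor does it. The function name ismembc_checked is hypothetical, the restriction to real double data is a conservative assumption, and, following the reader comments, only the second (searched) input is tested for sortedness:

```matlab
function tf = ismembc_checked(a, s)
%ISMEMBC_CHECKED  Hypothetical wrapper: call ismembc only when it looks safe.
%   Falls back to the documented ismember otherwise.

    % Assumed preconditions: real, full (non-sparse) double data
    okType = isa(a, 'double') && isa(s, 'double') ...
             && isreal(a) && isreal(s) ...
             && ~issparse(a) && ~issparse(s);

    % No NaN values in either input
    okValues = okType && ~any(isnan(a(:))) && ~any(isnan(s(:)));

    % The searched set must be non-empty and in ascending order
    okSorted = okValues && ~isempty(s) && issorted(s(:));

    if okSorted
        tf = ismembc(a, s);      % fast path
    else
        tf = ismember(a, s);     % general path does its own checks and sorting
    end
end
```

Saved as ismembc_checked.m, a call such as ismembc_checked([3 5], [1 2 3 4 9 5]) takes the ismember path because the second input is not sorted. The checks themselves cost time, so the gain over ismember is smaller than for a bare ismembc call.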

The nice thing about ismembc is that its source code (ismembc.cpp) is included, so even if future Matlab releases stop using this function, you can always mex-compile the source code and use it.
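
A rough sketch of such a build follows, assuming a C++ compiler has already been configured with mex -setup and that the source file is still shipped at the location given above (it may not be in newer releases). The output name my_ismembc is a hypothetical choice, used so that the stock function is not shadowed:

```matlab
% Source location as given in the text; newer releases may not ship this file
src = fullfile(matlabroot, 'toolbox', 'matlab', 'ops', 'ismembc.cpp');

if exist(src, 'file') == 2
    % Build into the current folder under a different (hypothetical) name
    mex('-output', 'my_ismembc', src);
    tf = my_ismembc([3 5], [1 2 3 4 5 9])   % try the freshly built copy
else
    warning('ismembc.cpp not found under %s', fileparts(src));
end
```

Whether the file compiles unchanged depends on the release and compiler, so treat this as a starting point only.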

Readers interested in ismembc might also be interested in its sibling helper function, ismembc2, which is also a MEX file located (with source code) in the same folder as ismembc. Whereas ismembc returns an array of logical values, ismembc2 returns the index locations of the found members.
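
The difference between the two helpers can be sketched as follows. This assumes both functions are available in your release and that ismembc2 reports 0 for values that are not found, as the second output of ismember does. The result is cross-checked against ismember instead of being taken on trust:

```matlab
s = [10 20 30 40];          % sorted set to search (full, no NaNs)
a = [20 25 40];             % values to look up

tf  = ismembc(a, s);        % logical membership only
loc = ismembc2(a, s);       % index into s for each element of a

% Cross-check against the documented function's two outputs
[tfRef, locRef] = ismember(a, s);
agree = isequal(tf, tfRef) && isequal(loc, locRef)
```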

Categories: Medium risk of breaking in future versions, Stock Matlab function, Undocumented function

8 Responses to ismembc – undocumented helper function

  1. Pingback: Datenum performance | Undocumented Matlab

  2. Rob says:

    Interestingly, it appears that the variable “a” does not have to be sorted when using ismembc(), but the algorithm runs much more quickly if it is.

  3. Pingback: Undocumented Matlab at Nordt Blog

  4. Vivien says:

    It doesn’t work if it’s not sorted. The function stops as soon as it meets a higher value. The given example is bad because the function stops most of the time too early (but ismembc is indeed faster).

    >> a = [3,5]; b = [1,2,3,4,9,5];
    >> ismembc(a,b)
    ans =
         1     0

    • @Vivien – the preconditions required by ismembc were indeed mentioned in the article:

      ismembc should not be used carelessly: as noted, its inputs must be sorted non-sparse non-NaN values. In the general case we should either ensure this programmatically (as done in setxor) or use ismember, which handles this for us.

  5. Pingback: sprintfc – undocumented helper function | Undocumented Matlab

  6. Ramy says:

    There is almost no difference in the new 2013b version of Matlab:

    >> n=2e6; a=ceil(n*rand(n,1)); b=ceil(n*rand(n,1));
    >> tic;ismember(a,b);toc;
    Elapsed time is 0.846894 seconds.
    >> tic;ismember(a,b);toc;
    Elapsed time is 0.817701 seconds.
    >> tic;ismember(a,b);toc;
    Elapsed time is 0.808824 seconds.
    >> tic;ismember(a,b);toc;
    Elapsed time is 0.817153 seconds.
    >> tic;ismember(a,b);toc;
    Elapsed time is 0.817318 seconds.
    >> tic;ismember(a,b);toc;
    Elapsed time is 0.810535 seconds.
    
    • @Ramy – I am not sure I understand: you ran the same ismember command several times, so of course it takes a similar amount of time on each run. If you run the same thing with ismembc you’ll see that it’s much faster. On the other hand, remember that ismembc must have sorted inputs, and your inputs are currently random…
