cellfun - undocumented performance boost

Matlab’s built-in cellfun function has traditionally accepted a small set of processing functions specified by name (as a string), such as ‘isempty’. The relevant code would look like this:

data = cellfun('isempty',cellArray);

In recent years, newer Matlab releases have added support for function handles, so the previous code snippet can now be written as follows:

data = cellfun(@isempty,cellArray);

The newer function-handle format is “cleaner” and more flexible than the older format: it accepts any function, not just the limited list of pre-specified processing function names (‘isreal’, ‘islogical’, ‘length’, ‘ndims’, ‘prodofsize’). Some have even reported that the older format has limitations vis-a-vis compilation etc.
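As a minimal sketch of the two calling styles side by side (the sample cell array and variable names here are illustrative, not from any Matlab documentation), the following shows that both forms return the same values for the predefined names, while only the handle form accepts an arbitrary function:

```matlab
% Sample cell array mixing empty and non-empty elements (illustrative data)
cellArray = {[], 'abc', 1:5, {}, true};

% Legacy string form and function-handle form give the same answer
emptyByName   = cellfun('isempty', cellArray);
emptyByHandle = cellfun(@isempty,  cellArray);

% Only the handle form accepts arbitrary functions, e.g. an anonymous one
longerThanTwo = cellfun(@(x) numel(x) > 2, cellArray);

% 'prodofsize' has no same-named function; @numel is its handle equivalent
countByName   = cellfun('prodofsize', cellArray);
countByHandle = cellfun(@numel,       cellArray);
```
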

All this is well known and documented. However, it turns out that, counter-intuitively (and undocumented), the older format is actually much faster than the newer format for those pre-specified processing function names. The reason appears to be that ‘isempty’ (as well as the other predefined string functions) uses specific code-branches optimized for performance:

>> c = mat2cell(1:1e6,1,repmat(1,1,1e6));
>> tic, d=cellfun('isempty',c); toc
Elapsed time is 0.115583 seconds.
>> tic, d=cellfun(@isempty,c); toc
Elapsed time is 7.493989 seconds.
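A single tic/toc run like the one above can be noisy. A slightly more robust sketch, assuming a release that includes timeit (R2013b or later), is shown below. The variable names are illustrative, num2cell builds the same kind of 1-by-N cell array of scalars as the mat2cell call above, and a smaller N is used here only to keep the repeated runs short:

```matlab
% Build a 1-by-N cell array of scalars (N reduced from 1e6 to keep runs short)
N = 1e5;
c = num2cell(1:N);

% timeit calls each function several times and returns a typical run time
tName   = timeit(@() cellfun('isempty', c));
tHandle = timeit(@() cellfun(@isempty,  c));

% Report the ratio, which matters more than the absolute values
fprintf('string form: %.4f s, handle form: %.4f s, ratio: %.1f\n', ...
        tName, tHandle, tHandle / tName);

% Sanity check: both forms must produce identical results
sameResult = isequal(cellfun('isempty', c), cellfun(@isempty, c));
```

The measured ratio will vary with the Matlab release and the platform, so no particular figure should be expected from this sketch.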

Perhaps a future Matlab release will improve cellfun’s internal code, to check for function-handle equality to the optimized functions, and use the optimized code branch if possible. When I posted this issue today as a correction to a reader’s misconception, Matlab’s Loren Shure commented as follows:

We could improve cellfun to check function handles to see if they match specified strings. Even then MATLAB would have to be careful in case the user has overridden the built-in version of whatever the string points to.

While this comment seems to imply that the performance boost feature will be maintained and possibly improved in future releases, users should note that this is not guaranteed. One could even argue that future code optimizations would be applied to the new (function-handle) format rather than the old string format. The balance might also differ from one platform to another. Therefore, users for whom performance is critical should always test both versions on their target system: ‘isempty’ vs. @isempty etc. (note that the corresponding function for ‘prodofsize’ is @numel).
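Until cellfun performs such a check itself, the idea in Loren's comment can be approximated in user code. The following is only a rough sketch under stated assumptions: the name table, the variable names, and the use of which to detect a shadowed built-in are my own choices, not anything cellfun does internally. In particular, this check does not detect class-specific overloads for objects stored inside the cell array, so it should not be trusted for cell arrays that contain user-defined objects:

```matlab
% Sample input (illustrative data) and the function handle the caller wants
c  = {[], 1:3, 'ab'};
fh = @numel;

% Assumed mapping from function names to the legacy cellfun string names
legacyNames = containers.Map( ...
    {'isempty', 'isreal', 'islogical', 'length', 'ndims', 'numel'}, ...
    {'isempty', 'isreal', 'islogical', 'length', 'ndims', 'prodofsize'});

% func2str returns e.g. 'numel' for @numel, or '@(x)...' for anonymous functions
name = func2str(fh);

% Use the string form only if the name is known AND still resolves to the
% built-in, i.e. it is not shadowed by a user file or variable on the path
useLegacy = isKey(legacyNames, name) && strncmp(which(name), 'built-in', 8);

if useLegacy
    result = cellfun(legacyNames(name), c);   % optimized string branch
else
    result = cellfun(fh, c);                  % general function-handle branch
end
```

Whether the extra lookup pays off depends on the size of the cell array, so this too is worth timing on the target system.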

5 Responses to “cellfun - undocumented performance boost”

  1. Ashish Sadanadnan says:

    They seem to already have improved it quite a bit in R2009a; here are my results from running your code:

    >> c = mat2cell(1:1e6,1,repmat(1,1,1e6));
    >> tic, d=cellfun('isempty',c); toc
    Elapsed time is 0.032880 seconds.
    >> tic, d=cellfun(@isempty,c); toc
    Elapsed time is 0.563284 seconds.

  2. Yair Altman says:

    Ashish - actually your results show a factor of 17 between the slower @isempty and the faster ‘isempty’, consistent with the results I posted above (my reported factor of 65 is almost the same order of magnitude as 17, and may be due to external platform-dependent factors).

    The absolute values of the results of course depend on the platform: my results were for a run-down heavily-loaded laptop… The important thing here is the factor between @isempty and ‘isempty’ - not the absolute values. And a factor of 17 is still high enough to be taken into consideration in a performance-critical application.

  3. Naor says:

    wow. that’s a pretty significant unnecessary slowdown. at least this would be easy to catch with the profiler.

  4. Loren S says:

    Yair-

    As I noted to you on my blog, MATLAB doesn’t convert from FH to string method because the user might have overridden whatever the method, e.g., isempty. MATLAB could, at runtime, see if it’s overridden, and if not, call the optimized version. But it can’t do that blindly without risk of wrong answers.

    –Loren

  5. Ashish Sadanadnan says:

    Yair,
    I wasn’t disputing your results. I just wanted to show that the factor has improved significantly in the newer version (from 65 to 17). Of course, 17 times faster is still very significant, as you pointed out.

    - Ashish.
