Undocumented Matlab

rmfield performance

May 25, 2016

Once again I would like to introduce guest blogger Hanan Kavitz of Applied Materials. Several months ago Hanan discussed some quirks with compiled Matlab DLLs. Today Hanan will discuss how they overcame a performance bottleneck with Matlab’s builtin rmfield function, exemplifying the general idea that we can sometimes improve performance by profiling the core functionality that causes a performance hotspot and optimizing it, even when it is part of a builtin Matlab function. For additional ideas on improving Matlab performance, search this blog for “Performance” articles, and/or get the book “Accelerating MATLAB Performance”.
I’ve been using Matlab for many years now, and from time to time I need to profile low-throughput code. When I profile such code, I sometimes find that the computational ‘bottleneck’ is a builtin Matlab function (part of the core language). I can often find ways to accelerate such builtin functions and get a significant speedup in my code.
I recently found Matlab’s builtin rmfield function to be too slow for my needs. It works great when one needs to remove a few fields from a small structure, but in our case we needed to remove thousands of fields from a structure containing about 5000 fields – and this was done in a function that is called many times inside an external loop. The program was noticeably sluggish.
It started when a co-worker asked me to look at some code that looked just slightly more intelligent than this:

for i = 1:5000
    myStruct = rmfield(myStruct,fieldNames{i});
end

Running this code within a tic/toc pair yielded the following results:

>> tic; myFunc(); t1 = toc
t1 =
      25.7713

In my opinion, 25.77 secs for such simple functionality seems like an eternity…

The obvious thing was to change the code to the documented faster (vectorized) version:

>> tic; myStruct = rmfield(myStruct,fieldNames); t2 = toc
t2 =
      0.6097

This is obviously much better but since rmfield is called many times in my application, I needed something even better. So I profiled rmfield and was not happy with the result.
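For readers who want to reproduce this kind of measurement, here is a minimal sketch. The test struct, its field names (f0001 to f5000) and the number of removed fields are assumptions chosen for illustration, not the data from the application described here, and absolute timings will differ between Matlab releases and machines. The sketch times the per-field loop against a single vectorized call, then profiles the vectorized call and lists the most expensive functions using profile('info'):

```matlab
% Hypothetical test data: a scalar struct with 5000 numeric fields
nFields  = 5000;
names    = arrayfun(@(k) sprintf('f%04d',k), (1:nFields)', 'UniformOutput', false);
s0       = cell2struct(num2cell((1:nFields)'), names, 1);
toRemove = names(1:4000);      % assumed subset of fields to remove

% Variant 1: one rmfield call per field
s1 = s0;
tic
for i = 1:numel(toRemove)
    s1 = rmfield(s1, toRemove{i});
end
tLoop = toc;

% Variant 2: a single call with the whole list of names
tic
s2 = rmfield(s0, toRemove);
tVec = toc;
fprintf('loop: %.4f s, vectorized: %.4f s\n', tLoop, tVec);

% Profile the vectorized call and print the five most expensive functions
profile on
s3 = rmfield(s0, toRemove);
profile off
p = profile('info');
[~, order] = sort([p.FunctionTable.TotalTime], 'descend');
for k = order(1:min(5,numel(order)))
    fprintf('%-40s %8.4f s (%d calls)\n', p.FunctionTable(k).FunctionName, ...
        p.FunctionTable(k).TotalTime, p.FunctionTable(k).NumCalls);
end
```

If your release implements rmfield as a builtin rather than as an m-file, expect the profile to show little or no internal detail for it.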
The original code of rmfield (%matlabroot%/toolbox/matlab/datatypes/rmfield.m) looks something like this (I deleted some non-essential code for brevity):

function t = rmfield(s,field)
% get fieldnames of struct
f = fieldnames(s);
% Determine which fieldnames to delete.
idxremove = [];
for i=1:length(field)
   j = find(strcmp(field{i},f) == true);
   idxremove = [idxremove;j];
end
% set indices of fields to keep
idxkeep = 1:length(f);
idxkeep(idxremove) = [];
% remove the specified fieldnames from the list of fieldnames.
f(idxremove,:) = [];
% convert struct to cell array
c = struct2cell(s);
% find size of cell array
sizeofarray = size(c);
newsizeofarray = sizeofarray;
% adjust size for fields to be removed
newsizeofarray(1) = sizeofarray(1) - length(idxremove);
% rebuild struct
t = cell2struct(reshape(c(idxkeep,:),newsizeofarray),f);

When I profiled the code, the highlighted row (the find(strcmp(...)) line inside the for loop) was the bottleneck I was looking for.
First, I noticed the part that compares the string-comparison result to true – while '==true' is not the cause of the bottleneck, it does leave an impression of bad coding style 🙁 Perhaps this code was created as some apprentice project, which might also explain its suboptimal performance.
The real performance problem here is that for each field that we wish to remove, rmfield compares it to all existing fields to find its location in a cell array of field names, so removing m fields from a struct with n fields costs on the order of m×n string comparisons. This is algorithmically inefficient and makes the code hard to understand (just try – it took me several long, hard minutes).
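To make the cost concrete, here is a small sketch with made-up field names. It computes the indices to remove twice: once with the same per-name loop that rmfield.m uses, and once with a single ismember call, which is one possible way to match the two name lists without a separate scan per removed field. The variable names idxLoop and idxOnce are illustrative only:

```matlab
f     = {'a'; 'b'; 'c'; 'd'; 'e'};   % hypothetical field names of the struct
field = {'d', 'b'};                  % hypothetical names to remove

% Loop approach, as in rmfield.m: one full scan of f per removed name
idxLoop = [];
for i = 1:numel(field)
    idxLoop = [idxLoop; find(strcmp(field{i}, f))]; %#ok<AGROW>
end

% Single-call approach: match both lists at once
idxOnce = find(ismember(f, field));

% Both approaches select the same positions (the loop returns them in
% the order of 'field', so sort before comparing)
assert(isequal(sort(idxLoop), idxOnce));
```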
So, I created a variant of rmfield.m called fast_rmfield.m, as follows (again, omitting some non-essential code):

function t = fast_rmfield(s,field)
% get fieldnames of struct
f = fieldnames(s);
[f,ia] = setdiff(f,field,'R2012a');
% convert struct to cell array
c = squeeze(struct2cell(s));
% rebuild struct
t = cell2struct(c(ia,:),f)';

This code is much shorter, easier to explain and maintain, but also (and most importantly) much faster:

>> tic; myStruct = fast_rmfield(myStruct,fieldNames); t3 = toc
t3 =
      0.0302
>> t2/t3
ans =
      20.1893

This resulted in a speedup of ~850x compared to the original version (of 25.77 secs), and ~20x compared to the vectorized version. A nice improvement in my humble opinion…
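One caveat worth checking before adopting this approach in your own code: setdiff returns its output sorted, so the remaining fields come back in alphabetical order rather than in their original order. If field order matters, a logical mask from ismember is one way to preserve it. The sketch below uses a made-up struct and illustrative variable names, and unlike rmfield it silently ignores names that are not fields of the struct:

```matlab
s     = struct('zeta',1, 'alpha',2, 'mid',3, 'beta',4);  % hypothetical data
field = {'alpha', 'beta'};                               % fields to remove

f = fieldnames(s);

% setdiff-based selection: the remaining names are returned sorted
fSorted = setdiff(f, field);          % order: mid, zeta

% Mask-based selection: the remaining names keep their original order
keep  = ~ismember(f, field);
c     = struct2cell(s);
sz    = size(c);
sz(1) = nnz(keep);                    % first dimension = number of kept fields
t     = cell2struct(reshape(c(keep,:), sz), f(keep), 1);   % order: zeta, mid
```

Whether the mask-based variant is as fast as the setdiff-based one for your data is something to measure rather than assume.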
The point in all this is that we can and should rewrite Matlab builtin functions when they are too slow for our needs, whether the cause is an algorithmic flaw (as in this case), extraneous sanity checks (as in the case of ismember or datenum), bad default parameters (as in the case of fopen/fwrite or scatter), or merely a slow implementation (as in the case of save, cellfun, or the conv family of functions).
A good pattern is to save such code pieces in files whose names hint at the original code. In our case, I used fast_rmfield to suggest that it is a faster alternative to rmfield.
Do you know of any other example of a slow implementation in a built-in Matlab function that can be optimized? If so, please leave a comment below.

5 Responses
  1. Peter May 26, 2016 at 03:08

    I recently wrote a function that needed to calculate many many thousands of dot products. When I profiled my function, it was spending a ton of time in the dot function. When I opened dot.m, it was mostly sanity checks I didn’t need, so I just inlined the dot calculation. The function went from minutes to about a second to complete.

    • Yair Altman May 26, 2016 at 13:02

      @Peter – excellent usage example. Thanks for sharing.

  2. Fernando June 5, 2016 at 21:36

    This is really cool. I remember years ago having to accelerate an algorithm that used Matlab’s builtin Kronecker tensor product. Luckily, I was able to find this: http://www.mathworks.com/matlabcentral/fileexchange/23606-fast-and-efficient-kronecker-multiplication

  3. Malcolm Lidierth June 6, 2016 at 11:34

    I found this too with a package that was heavily profiled up to R2012a. Maybe things have changed as JIT acceleration has improved but there were two ‘tricks’ I used often.
    Conditional statements were frequently the bottleneck, but served little purpose in the specific context e.g.

    if ~isa(x,'double')
       x=double(x);
    end

    could often safely be replaced with

    x=double(x);

    Also, a try-catch sequence in place of conditional tests was often faster.

    ML

  4. Hoi Wong January 30, 2017 at 21:32

    Thanks for pointing that out. I didn’t even notice (or expect) that rmfield() is not a built-in low level function.

    Since the rmfield() code calls struct2cell() then cell2struct(), it looks like it’s saying that behind the scenes, struct() is basically a high level wrapper around cells using hash keys to map the name to indices: a useful piece of information to keep in mind for performance tuning. Actually, table() and dataset() objects deal with cells under the hood, I just wasn’t expecting struct() to be the same given its origins in C.

    I found rmfield() so slow that I actually wrote a keepField() a long time ago for the exact same reason as your application scenario: if I need to remove 5000 fields, I might as well keep what I want by adding to an empty (fieldless) struct one field at a time, i.e.

    for k=1:length(fieldsToKeep)
      Y.(fieldsToKeep{k}) = X.(fieldsToKeep{k});
    end

    It turned out to be much faster too because there are no names to search for. Dynamic field names are implemented with a hash table (checked with TMW, it’s not documented), so access takes O(1) time on average. It boils down to the same O(n log(n)) time as the proposed setdiff() if you ultimately have to identify the fields to remove instead.

    Unfortunately MATLAB has only rmfield(), so I suspect a lot of people might have done a set-op (spending O(n log(n)) time) to get the list to keep (the complement of the set to remove) and then run through the O(n) algorithm in rmfield(), when they could have done it in average O(1) time per field by just transferring the wanted fields.

Undocumented Matlab © 2009 - Yair Altman
This website and Octahedron Ltd. are not affiliated with The MathWorks Inc.; MATLAB® is a registered trademark of The MathWorks Inc.