Undocumented Matlab

Speeding-up builtin Matlab functions – part 2

May 6, 2018 8 Comments

Last week I showed how we can speed up built-in Matlab functions, by creating local copies of the relevant m-files and then optimizing them for improved speed using a variety of techniques. Today I will show another example of such speed-up, this time of the Financial Toolbox’s maxdrawdown function, which is widely used to estimate the relative risk of a trading strategy or asset. One might think that such a basic indicator would be optimized for speed, but experience shows otherwise. In fact, this function turned out to be the main run-time performance hotspot for one of my clients. The vast majority of his code was optimized for speed, and he naturally assumed that the built-in Matlab functions were optimized as well, but this was not the case. Fortunately, I was able to create an equivalent version that was 30-40 times faster (!), and my client remains a loyal Matlab fan.
In today’s post I will show how I achieved this speed-up, using different methods than the ones I showed last week. A combination of these techniques can be used in a wide range of other Matlab functions. Additional speed-up techniques can be found in other performance-related posts on this website, as well as in my book Accelerating MATLAB Performance.

Profiling

As I explained last week, the first step in speeding up any function is to profile its current run-time behavior using Matlab’s builtin Profiler tool, which can either be started from the Matlab Editor toolstrip (“Run and Time”) or via the profile function.
The profile report for the client’s function showed that it had two separate performance hotspots:

  1. Code that checks the drawdown format (optional 2nd input argument) against a set of allowed formats
  2. Main code section that iteratively loops over the data-series values to compute the maximal drawdown
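
For readers who prefer the command line, a profiling session might look roughly like the sketch below. The workload handle is only a stand-in that I made up for illustration; replace it with a call to your own code (for example a loop that calls maxdrawdown, which requires the Financial Toolbox).

```matlab
% Stand-in workload (an assumption for illustration); replace with your own code
workload = @() sort(rand(1e6,1));

profile on                  % start collecting run-time statistics
workload();                 % run the code that we want to analyze
profile off                 % stop collecting
stats = profile('info');    % results as a struct, for programmatic inspection
% profile viewer            % uncomment to open the interactive report

fprintf('The profiler recorded %d functions\n', numel(stats.FunctionTable));
```

The interactive report is usually the easiest way to spot hotspots; the stats struct is handy mainly when you want to compare runs in a script.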

In order to optimize the code, I copied %matlabroot%/toolbox/finance/finance/maxdrawdown.m to a local folder on the Matlab path, renaming the file (and the function) maxdrawdn, in order to avoid conflict with the built-in version.
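
The copy step could be scripted along the following lines. This is only a sketch: localFolder is a hypothetical location that I chose for illustration, and the source file exists only if the Financial Toolbox is installed.

```matlab
% Source path as given above (exists only with the Financial Toolbox installed)
src = fullfile(matlabroot, 'toolbox', 'finance', 'finance', 'maxdrawdown.m');

% Hypothetical local folder (an assumption); any folder on the Matlab path will do
localFolder = fullfile(tempdir, 'myfuncs');
if ~exist(localFolder, 'dir'), mkdir(localFolder); end
dst = fullfile(localFolder, 'maxdrawdn.m');

if exist(src, 'file')
    copyfile(src, dst);         % create the local copy under the new name
    fileattrib(dst, '+w');      % toolbox files may be read-only
    addpath(localFolder);       % make the local copy callable
else
    warning('maxdrawdown.m was not found; is the Financial Toolbox installed?');
end
```

Remember that the function declaration line inside the copied file must also be renamed to maxdrawdn by hand.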

Optimizing input args pre-processing

The main problem with the pre-processing of the optional format input argument is the string comparisons, which are done even when the default format is used (which is by far the most common case). String comparisons are often more expensive than numerical computations. Each comparison by itself is very short, but when maxdrawdown is run in a loop (as it often is), the run-time adds up.
Here’s a snippet of the original code:

if nargin < 2 || isempty(Format)
    Format = 'return';
end
if ~ischar(Format) || size(Format,1) ~= 1
    error(message('finance:maxdrawdown:InvalidFormatType'));
end
choice = find(strncmpi(Format,{'return','arithmetic','geometric'},length(Format)));
if isempty(choice)
    error(message('finance:maxdrawdown:InvalidFormatValue'));
end

An equivalent code, which avoids any string processing in the common default case, is faster, simpler and more readable:

if nargin < 2 || isempty(Format)
    choice = 1;
elseif ~ischar(Format) || size(Format,1) ~= 1
    error(message('finance:maxdrawdown:InvalidFormatType'));
else
    choice = find(strncmpi(Format,{'return','arithmetic','geometric'},length(Format)));
    if isempty(choice)
        error(message('finance:maxdrawdown:InvalidFormatValue'));
    end
end

The general rule is that whenever you have a common case, you should check it first, avoiding unnecessary processing downstream. Moreover, for improved run-time performance (although not necessarily maintainability), it is generally preferable to work with numbers rather than strings (choice rather than Format, in our case).
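
To get a rough feel for what the string lookup costs on your own system, something like the following timing sketch could be used. The variable names are mine, and the absolute numbers depend heavily on the machine and Matlab release, so I do not quote any here. For such tiny operations timeit may warn that the function is too fast to measure accurately; treat the result as an order-of-magnitude indication only.

```matlab
% Illustrative micro-benchmark (names are assumptions, not from the toolbox code)
Format  = 'return';
formats = {'return', 'arithmetic', 'geometric'};

% Cost of the string lookup that the original code runs on every call
tString = timeit(@() find(strncmpi(Format, formats, length(Format))));

% Cost of the simple check that the optimized code runs in the default case
tEmpty = timeit(@() isempty(Format));

fprintf('strncmpi lookup: %.3g sec, isempty check: %.3g sec\n', tString, tEmpty);
```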

Vectorizing the main loop

The main processing loop uses a very simple yet inefficient iterative loop. I assume that the code was originally developed this way in order to assist debugging and to ensure correctness, and that once it was ready nobody took the time to also optimize it for speed. It looks something like this:

MaxDD = zeros(1,N);
MaxDDIndex = ones(2,N);
...
if choice == 1   % 'return' format
    MaxData = Data(1,:);
    MaxIndex = ones(1,N);
    for i = 1:T
        MaxData = max(MaxData, Data(i,:));
        q = MaxData == Data(i,:);
        MaxIndex(1,q) = i;
        DD = (MaxData - Data(i,:)) ./ MaxData;
        if any(DD > MaxDD)
            p = DD > MaxDD;
            MaxDD(p) = DD(p);
            MaxDDIndex(1,p) = MaxIndex(p);
            MaxDDIndex(2,p) = i;
        end
    end
else             % 'arithmetic' or 'geometric' formats
    ...

This loop can relatively easily be vectorized, making the code much faster, and arguably also simpler, more readable, and more maintainable:

if choice == 3
    Data = log(Data);
end
MaxDDIndex = ones(2,N);
MaxData = cummax(Data);
MaxIndexes = find(MaxData==Data);
DD = MaxData - Data;
if choice == 1	% 'return' format
    DD = DD ./ MaxData;
end
MaxDD = cummax(DD);
MaxIndex2 = find(MaxDD==DD,1,'last');
MaxIndex1 = MaxIndexes(find(MaxIndexes<=MaxIndex2,1,'last'));
MaxDDIndex(1,:) = MaxIndex1;
MaxDDIndex(2,:) = MaxIndex2;
MaxDD = MaxDD(end,:);
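
To see what each vectorized step produces, here is a small worked example on a made-up 6-element column vector, using the 'return' format. The data values are invented purely for illustration.

```matlab
Data = [100; 120; 90; 110; 80; 130];        % made-up price series (column vector)

MaxData    = cummax(Data);                  % running peak: [100;120;120;120;120;130]
MaxIndexes = find(MaxData == Data);         % rows where the data equals the running peak: [1;2;6]
DD         = (MaxData - Data) ./ MaxData;   % drawdown relative to the running peak
MaxDD      = cummax(DD);                    % running maximal drawdown
MaxIndex2  = find(MaxDD == DD, 1, 'last');  % row of the deepest trough
MaxIndex1  = MaxIndexes(find(MaxIndexes <= MaxIndex2, 1, 'last'));  % last peak before it
MaxDD      = MaxDD(end);                    % final maximal drawdown

fprintf('MaxDD = %.4f, peak at row %d, trough at row %d\n', MaxDD, MaxIndex1, MaxIndex2);
% prints: MaxDD = 0.3333, peak at row 2, trough at row 5
```

The deepest drawdown is the fall from 120 (row 2) to 80 (row 5), i.e. 40/120.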

Let’s make a short run-time and accuracy check – we can see that we achieved a 31-fold speedup (YMMV), and received the exact same results:

>> data = rand(1,1e7);
>> tic, [MaxDD1, MaxDDIndex1] = maxdrawdown(data); toc  % builtin Matlab function
Elapsed time is 7.847140 seconds.
>> tic, [MaxDD2, MaxDDIndex2] = maxdrawdn(data); toc  % our optimized version
Elapsed time is 0.253130 seconds.
>> speedup = round(7.847140 / 0.253130)
speedup =
    31
>> isequal(MaxDD1,MaxDD2) && isequal(MaxDDIndex1,MaxDDIndex2)  % check accuracy
ans =
  logical
   1

Disclaimer: The code above seems to work (quite well in fact) for a 1D data vector. You’d need to modify it a bit to handle 2D data: the returned maximal drawdowns are still computed correctly, but the returned indices are not, because they are computed using the find function, which returns a single index rather than one index per column. This modification is left as an exercise for the reader…
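
One possible way to approach the 2D case is to keep the vectorized cummax computations and only loop over the columns for the index lookup. The sketch below is my own assumption of how this could look, not a tested replacement for the toolbox function; it uses the 'return' format and made-up data with one series per column.

```matlab
Data = [100 50; 120 40; 90 60; 110 30; 80 70];   % made-up T-by-N data
[T, N] = size(Data);

MaxData = cummax(Data);                 % running peak of each column (along dim 1)
DD      = (MaxData - Data) ./ MaxData;  % 'return' format drawdown
RunMax  = cummax(DD);                   % running maximal drawdown per column
MaxDD   = RunMax(end, :);               % final maximal drawdown per column

MaxDDIndex = ones(2, N);
for k = 1 : N                           % the index lookup is done per column
    peakRows = find(MaxData(:,k) == Data(:,k));
    idx2 = find(RunMax(:,k) == DD(:,k), 1, 'last');
    idx1 = peakRows(find(peakRows <= idx2, 1, 'last'));
    MaxDDIndex(:, k) = [idx1; idx2];
end

disp(MaxDD);        % approximately [0.3333 0.5000]
disp(MaxDDIndex);   % [2 3; 5 4]
```

The loop runs over the N columns rather than the T rows, so for the typical case of long series and few assets it should add little overhead.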

Related functions

Very similar code could be used for a corresponding maxdrawup function. Although draw-ups are used less often than draw-downs, they are still widely used, and their computation is very similar to that of maxdrawdown, so it is surprising that such a function is missing from the Financial Toolbox. Here is the corresponding code snippet:

% Code snippet for maxdrawup (similar to maxdrawdn)
MaxDUIndex = ones(2,N);
MinData = cummin(Data);
MinIndexes = find(MinData==Data);
DU = Data - MinData;
if choice == 1	% 'return' format
    DU = DU ./ MinData;
end
MaxDU = cummax(DU);
MaxIndex = find(MaxDU==DU,1,'last');
MinIndex = MinIndexes(find(MinIndexes<=MaxIndex,1,'last'));
MaxDUIndex(1,:) = MinIndex;
MaxDUIndex(2,:) = MaxIndex;
MaxDU = MaxDU(end,:);
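
As a quick sanity check of the draw-up logic, here is a small worked example on a made-up column vector, using the 'return' format. The data values are invented for illustration only.

```matlab
Data = [100; 80; 120; 90; 130];             % made-up price series (column vector)

MinData    = cummin(Data);                  % running trough: [100;80;80;80;80]
MinIndexes = find(MinData == Data);         % rows where the data equals the running trough: [1;2]
DU         = (Data - MinData) ./ MinData;   % draw-up relative to the running trough
MaxDU      = cummax(DU);                    % running maximal draw-up
MaxIndex   = find(MaxDU == DU, 1, 'last');  % row of the highest rise
MinIndex   = MinIndexes(find(MinIndexes <= MaxIndex, 1, 'last'));  % last trough before it
MaxDU      = MaxDU(end);                    % final maximal draw-up

fprintf('MaxDU = %.4f, trough at row %d, peak at row %d\n', MaxDU, MinIndex, MaxIndex);
% prints: MaxDU = 0.6250, trough at row 2, peak at row 5
```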

Similar vectorization could be applied to the emaxdrawdown function. This too is left as an exercise for the reader…

Conclusions

Matlab is composed of thousands of internal functions. Each and every one of these functions was meticulously developed and tested by engineers, who are after all only human. Whereas supreme emphasis is always placed on the accuracy of Matlab functions, run-time performance sometimes takes a back-seat. Make no mistake about this: code accuracy is almost always more important than speed (the exceptions are cases where some accuracy may be sacrificed for improved run-time performance). So I’m not complaining about the current state of affairs.
But when we run into a specific run-time problem in our Matlab program, we should not despair if we see that built-in functions cause slowdown. We can try to avoid calling those functions (for example, by reducing the number of invocations, or limiting the target accuracy, etc.), or optimize these functions in our own local copy, as I’ve shown last week and today. There are multiple techniques that we could employ to improve the run time. Just use the profiler and keep an open mind about alternative speed-up mechanisms, and you’d be half-way there.
Let me know if you’d like me to assist with your Matlab project, either developing it from scratch or improving your existing code, or just training you in how to improve your Matlab code’s run-time/robustness/usability/appearance. I will be visiting Boston and New York in early June and would be happy to meet you in person to discuss your specific needs.

8 Responses
  1. John May 18, 2018 at 01:42

    Hi Yair

    just wondering if the find function is really required
    for example is: MaxIndexes = MaxData==Data; equivalent to MaxIndexes = find(MaxData==Data);

    best regards

    • Yair Altman May 18, 2018 at 02:01

      @John – the reason is that MaxIndexes is compared to another indices array, MaxIndex2, below. For this comparison we need numeric indices, not a logical array.

    • John May 18, 2018 at 03:19

      Yair, thanks for the quick reply and explanation

  2. Marshall May 19, 2018 at 20:58

    Hi Yair,

    One of the lines in your code

    MaxIndex2 = find(MaxDD==DD,1,'last');

    is one that is incredibly common but is undoubtedly inefficient because the == compares every element, despite the fact that our call to find requests that we stop. As an example, the following two are roughly comparable in time:

    % compare using vectorization
    X = zeros(1e9,1); X(5e8) = 9;
    tic; find(X==9,1); t_vec = toc;
     
    % compare using a for-loop
    tic;
    for i = 1:length(X)
        if(X(i)==9), break; end
    end
    t_iter = toc;
     
    fprintf('Vectorized: %2.2f\n',t_vec);
    fprintf('Iterative: %2.2f\n',t_iter);
     
    Vectorized: 3.94
    Iterative: 3.58

    However, in the following scenario, the issue becomes obvious:

    X = zeros(1e9,1); X(1) = 9;
    tic; find(X==9,1); t_vec = toc;
     
    % compare using a for-loop
    tic;
    for i = 1:length(X)
        if(X(i)==9), break; end
    end
    t_iter = toc;
     
    fprintf('Vectorized: %2.2f\n',t_vec);
    fprintf('Iterative: %2.2f\n',t_iter);
     
    Vectorized: 2.19
    Iterative: 0.00

    From this we can deduce that the call to find() in the first example takes about 2 seconds, with the other 2 seconds due to the comparison X==9. However, in the second example, the entirety of X is still compared to 9 before the find is called

    A smart compiler would optimize away the common code pattern of find(X==y), but it appears Matlab’s JIT doesn’t. Is there a more obvious idiomatic Matlab method of avoiding performing a full comparison when it’s not necessary?

    • Yair Altman May 20, 2018 at 20:53

      @Marshall – I accept that additional speedup is possible for the line you noted, in cases involving 1e9 data elements (as in your example). As I noted in my post, the new function is ~40 times faster than Matlab’s built-in function, to the point where additional speedups might not have been very cost-effective, especially since my use-case involved up to a few million (not billions of) data elements. In such cases, shaving off a few dozen millisecs did not seem to be worth the extra coding time and risk.

      If you are interested in extra speedup, an additional tip would be to enclose the entire section that computes MaxIndex1, MaxIndex2 and MaxDDIndex with a check that the MaxDDIndex output arg is in fact requested – in many use cases it is not, and in such cases there is no need to compute these indexes:

      if nargout > 1
         MaxIndex2 = find(MaxDD==DD,1,'last');
         MaxIndex1 = MaxIndexes(find(MaxIndexes<=MaxIndex2,1,'last'));
         MaxDDIndex(1,:) = MaxIndex1;
         MaxDDIndex(2,:) = MaxIndex2;
      end

    • Marshall May 21, 2018 at 04:47

      Thanks Yair–I wasn’t trying to say “look you could have sped this up!”, I was instead asking if there was a better way of performing the loop I made in the second example, using vectorization instead of a for loop, which is the un-Matlab way to do it. I was mainly disappointed that Matlab wasn’t optimizing away vectorization of X==9 in conjunction with find(X==9,1).

  3. Michelle Hirsch June 25, 2018 at 21:06

    Yair – just a quick note to let you know that the development team investigated these suggestions for maxdrawdown and emaxdrawdown and the changes look promising. Assuming everything pans out as expected, we should see the performance improvement in R2019a. (Usual disclaimers about the reality of software development apply).

    Thanks for bringing the issue to our attention.

    • Yair Altman June 25, 2018 at 21:19

      @Michelle – thank you for the update.

      Please note the extra tip here, in case the developers missed it, since it was not included in the main post body.

      Also, please consider adding corresponding vectorized functions for maxdrawup/emaxdrawup – Whereas draw-ups are used less frequently than draw-downs, they too are widely used. If you wish, I’ll be happy to send you my version of maxdrawup to serve as a baseline for your developers.

Undocumented Matlab © 2009 - Yair Altman
This website and Octahedron Ltd. are not affiliated with The MathWorks Inc.; MATLAB® is a registered trademark of The MathWorks Inc.