Undocumented Matlab

datestr performance

October 5, 2011 5 Comments

A few months ago, I posted an article showing how we can use some internal helper functions to significantly improve the performance of the commonly-used datenum function. The catch is that several preconditions must hold before we can use this method, otherwise we might get incorrect results.
Today I wish to share a related experience that happened to me yesterday, when I needed to improve the performance of a client’s application. When profiling the application, I found that a major performance hotspot was a repeated call to the datestr function, each time with several thousand date items.
datestr is the opposite function of datenum: it receives date values and returns date strings. Unlike datenum, however, datestr does not use a highly optimized native-code library function that we could use directly. Instead, it loops over all date values and sequentially applies the requested string pattern.
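To make the relationship between the two functions concrete, here is a minimal round-trip sketch. The sample date and the explicit format argument are my own choices for illustration:

```matlab
% Round trip: date string -> serial date number -> date string
fmt = 'dd-mmm-yyyy';               % the format used throughout this post
n = datenum('05-Oct-2011', fmt);   % serial date number (a double, in days)
s = datestr(n, fmt);               % formatted back into a date string
disp(n)
disp(s)
```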
The natural reaction in such a case would perhaps be to vectorize the code (something that MathWorks should have done in the first place, I guess). But in this case I used a different solution, which I would like to share today:
In any programming language, Matlab included, the most effective performance tip is to cache processing results. Caching often makes the code slightly more complex and less maintainable, but the performance benefits are immediate and significant. In Matlab, the benefits of caching can often surpass even those of vectorization (using both vectorization and caching is of course even better).
In the case of datestr, if we can be certain of the precondition that the output string format is the same, we can cache the results, and even use vectorization. In my case, I plotted historical daily stock quotes data and so I was assured that (1) all dates are integers and that (2) I always use the same date-string format ‘dd-mmm-yyyy’.
First, let’s define the wrapper function datestr2 with the necessary caching and vectorization:

% datestr2 - faster variant of datestr, for integer date values since 1/1/2000
function dateStrs = datestr2(dateVals,varargin)
  persistent dateStrsCache
  persistent dateValsCache
  if isempty(dateStrsCache)
      origin = datenum('1-Jan-2000');
      dateValsCache = origin:(now+100);
      dateStrsCache = datestr(dateValsCache,varargin{:});
  end
  [tf,loc] = ismember(dateVals, dateValsCache);
  if all(tf)
      dateStrs = dateStrsCache(loc,:);
  else
      dateStrs = datestr(dateVals,varargin{:});
  end
end  % datestr2


As can be seen, the first time that datestr2 is called, it computes and caches the datestr values for all dates from Jan 1, 2000 until 100 days from now, using whatever format arguments were passed in that first call. Subsequent calls to datestr2 simply retrieve the relevant cached values, which is why the format must stay the same between calls. Note that the input date entries need not be sorted.
If any input date number is not found in the cache, datestr2 automatically falls back to using the built-in datestr for the entire input list. This could of course be improved to add the new entries to the cache; I leave this as a reader exercise.
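For readers who want a starting point for that exercise, here is one possible sketch, not a definitive implementation. The name datestr3 is hypothetical, and the code assumes finite date values and a fixed-width format such as 'dd-mmm-yyyy' (so that new strings can be appended as rows of the cached char matrix). It converts only the missing dates, appends them to the cache, and rebuilds the cache if the format changes:

```matlab
% datestr3 - hypothetical variant of datestr2 that extends its cache on misses
% Assumes finite date values and a fixed-width format such as 'dd-mmm-yyyy'
function dateStrs = datestr3(dateVals, fmt)
  persistent cacheVals cacheStrs cacheFmt
  if isempty(cacheVals) || ~isequal(fmt, cacheFmt)
      % First call, or the format changed: (re)build the cache
      cacheFmt  = fmt;
      cacheVals = datenum('1-Jan-2000') : (floor(now)+100);
      cacheStrs = datestr(cacheVals, fmt);
  end
  dateVals = dateVals(:);                 % work on a column vector
  [tf,loc] = ismember(dateVals, cacheVals);
  if ~all(tf)
      % Convert only the missing dates and append them to the cache
      newVals   = unique(dateVals(~tf));  % column vector, no duplicates
      cacheVals = [cacheVals, newVals'];
      cacheStrs = [cacheStrs; datestr(newVals, fmt)];
      [~,loc]   = ismember(dateVals, cacheVals);
  end
  dateStrs = cacheStrs(loc,:);
end  % datestr3
```

The trade-off is that the cache can now grow without bound if the inputs contain many distinct non-cached values (for example, non-integer dates), so this variant only pays off when the set of distinct dates is reasonably small.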
The bottom line was a 150-times (!!!) speed improvement for a date vector of about 1000 items (50 ms => 0.3 ms on my system), once the one-time cache preparation has been done:

% Prepare a vector of 1001 daily dates, from 1000 days ago (about 2.7 years) until today
>> dateVals = fix(now)+(-1000:0);
% Run the standard datestr function => 50 ms
>> tic; s1=datestr(dateVals); toc
Elapsed time is 0.049089 seconds.
>> tic; s1=datestr(dateVals); toc
Elapsed time is 0.048086 seconds.
% Now run our datestr2 function => 0.3 ms once the cache has been prepared
>> tic; s2=datestr2(dateVals); toc
Elapsed time is 0.222031 seconds.   % initial cache preparation takes 222 ms
>> tic; s2=datestr2(dateVals); toc
Elapsed time is 0.000313 seconds.   % subsequent datestr2 calls take 0.3 ms
>> tic; s2=datestr2(dateVals); toc
Elapsed time is 0.000296 seconds.
% Ensure that the two functions give exactly the same results
>> isequal(s1,s2)
ans =
     1


So what have we learned from this?

  1. To improve performance we must profile the code. Very often the performance bottlenecks occur in non-intuitive, very specific places that can be handled surgically without requiring any major redesign. In this case, I simply had to replace calls to datestr with datestr2 in the application’s code.
  2. Vectorization is not always as cost-effective as caching.
  3. Major performance improvements do NOT necessarily involve undocumented functions or tricks: in fact, today’s post about caching uses fully-documented pure-Matlab code.
  4. Different performance hotspots can have different solutions: caching, vectorization, internal library functions, undocumented graphics properties, smart property selection, smart function selection, smart indexing, smart parameter selection etc.
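As a rough illustration of the first point, the following sketch shows how such a hotspot could be located programmatically with the built-in profiler. The workload here (3000 daily dates formatted 20 times) is a made-up stand-in for the client application, not the actual code that I profiled:

```matlab
% Sketch: profile a hypothetical workload that repeatedly formats dates
dateVals = fix(now) + (-2999:0);      % 3000 daily dates (illustrative only)
profile on
for k = 1:20
    s = datestr(dateVals, 'dd-mmm-yyyy');
end
profile off
stats = profile('info');
% List the five functions with the largest total time, slowest first
[~,order] = sort([stats.FunctionTable.TotalTime], 'descend');
for k = order(1:min(5,numel(order)))
    fprintf('%-40s %8.3f s\n', stats.FunctionTable(k).FunctionName, ...
            stats.FunctionTable(k).TotalTime);
end
```

In interactive use, calling profile viewer after profile off displays the same data in the Profiler window, which is usually the more convenient way to drill down into a hotspot.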

In a later post I will show how similar modifications to internal Matlab functions can dramatically improve the performance of the uitable function. Anyone who has tried using uitable with more than a few dozen cells will surely understand why this is important…
Do you have a favorite performance trick not mentioned above? If so, please post a comment.

Related posts:

  1. Zero-testing performance – Subtle changes in the way that we test for zero/non-zero entries in Matlab can have a significant performance impact. ...
  2. Performance: accessing handle properties – Handle object property access (get/set) performance can be significantly improved using dot-notation. ...
  3. Datenum performance – The performance of the built-in Matlab function datenum can be significantly improved by using an undocumented internal help function...
  4. Plot performance – Undocumented inner plot mechanisms can significantly improve plotting performance ...
  5. rmfield performance – The performance of the builtin rmfield function (as with many other builtin functions) can be improved by simple profiling. ...
  6. Preallocation performance – Preallocation is a standard Matlab speedup technique. Still, it has several undocumented aspects. ...
5 Responses
  1. Rory October 6, 2011 at 14:23 Reply

    I love this blog. This is fantastic.

  2. MJJ October 7, 2011 at 02:11 Reply

    Also, logical indexing is fast, both to write and to process, but if you are only after a select few values within a large/huge array find may be faster (and less memory hungry) especially with the ‘k’ parameter.

  3. marco October 15, 2011 at 10:05 Reply

    Really cool idea. I normally cringe when I see persistent variables, but this seems like a great use of them. Interesting indentation scheme you have there – kind of c-like to have your function guts indented like that.

  4. EBS October 28, 2011 at 11:22 Reply

    Yair, good tip – what about separating out the cache creation into its own function, and putting that call into startup.m? Then you’d move the first-run time hit to startup, which is innocuous… You could next try making the cache creation one-time by saving it into an mat-file stored in the prefdir(?), using similar logic to check if it exists at the entry to the cache_creation function. This may not be any faster though 🙂

    One power-user tip that used to work is to move any often-called/performance-critical ‘user library’ functions into the MATLAB library directory, and then update the function cache. This used to decrease the time required by ML to lookup the function, but recent improvements may make this untrue in recent versions? A related but little-known tip is that ‘clear all’ also deletes the internal function cache and causes it to be rebuilt, which can reduce performance esp. for user functions until it has been built over time as funcs are called.

    Cheers,
    EBS

    • Yair Altman October 29, 2011 at 08:52 Reply

      @Eric – Thanks for the new ideas


Undocumented Matlab © 2009 - Yair Altman
This website and Octahedron Ltd. are not affiliated with The MathWorks Inc.; MATLAB® is a registered trademark of The MathWorks Inc.