Undocumented Matlab

Datenum performance

May 4, 2011 14 Comments

A few days ago, a reader on StackOverflow asked whether it is possible to improve the performance of Matlab’s built-in datenum function. This question reminded me of a similar case that I answered exactly two years ago, of improving the performance of the built-in ismember function.
In both cases, the solution to the performance question can be found by simply using Matlab’s built-in profiler to isolate the core processing functionality. Often, a particular situation does not need all of the input-argument validity checks, and under some known limitations we can call the core functionality directly.
In the case of ismember, it turned out that if we are assured in advance that the input data are sorted non-sparse non-NaN values, then we can use the undocumented built-in helper functions ismembc or ismembc2 for much-improved performance over the standard ismember. Both ismembc and ismembc2 happen to be mex files, although this is not always the case for helper functions.
Our datenum case is very similar. It turns out that datenum uses the undocumented built-in helper function dtstr2dtnummx for the actual processing – converting a date from text to floating-point number. As I noted in my response to the StackOverflow question, we can directly use this helper function for improved performance: On my particular computer, dtstr2dtnummx is over 3 times faster than the standard datenum function:

% Fast - using dtstr2dtnummx
>> tic, for i=1:1000; dateNum=dtstr2dtnummx({'2010-12-12 12:21:12.123'},'yyyy-MM-dd HH:mm:ss'); end; dateNum,toc
dateNum =
          734484.514722222
Elapsed time is 0.218423 seconds.
% Slower - using datenum
>> tic, for i=1:1000; dateNum=datenum({'2010-12-12 12:21:12.123'},'yyyy-mm-dd HH:MM:SS'); end; dateNum,toc
dateNum =
          734484.514722222   % Same value as dtstr2dtnummx - good!
Elapsed time is 0.658352 seconds.   % 3x slower than dtstr2dtnummx - bad!

While the difference in timing may appear negligible, if you are using this function to parse a text file with thousands of lines, each with its own timestamp, then these seemingly negligible time differences quickly add up. Of course, this only makes sense to do if you find out (using the profiler again) that this date parsing is a performance hotspot in your particular application. It was indeed such a performance hotspot in one of my applications, as it apparently was also for the original poster on StackOverflow.
Like ismembc, dtstr2dtnummx is an internal mex function. On my Windows system it is located in C:\Program Files\Matlab\R2011a\toolbox\matlab\timefun\private\dtstr2dtnummx.mexw32. It will have a different extension on non-Windows systems, but you will easily find it in its containing folder.
To gain access to dtstr2dtnummx, simply add its folder to the Matlab path using the addpath function, or copy the dtstr2dtnummx.mexw32 file to another folder that is already on your Matlab path.
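As a minimal sketch of the addpath route (not a definitive recipe), the snippet below builds the private folder from matlabroot, following the folder layout quoted above, and falls back to the documented datenum when the helper cannot be reached. The variable names are mine, the folder layout may differ between releases, and I am assuming (not asserting) that some releases may warn about, or refuse, adding a private folder to the path; in that case copying the mex file remains the alternative.

```matlab
% Folder layout taken from the R2011a path quoted above; it may differ in other releases
privDir = fullfile(matlabroot, 'toolbox', 'matlab', 'timefun', 'private');

% Add the folder only if the helper is not already reachable and the folder exists
if exist('dtstr2dtnummx') == 0 && exist(privDir, 'dir') == 7
    addpath(privDir);   % assumption: some releases may warn about or reject private folders
end

dateStrs = {'2010-12-12 12:21:12'};
if exist('dtstr2dtnummx') ~= 0
    % Undocumented helper: note the different format-string convention
    dateNums = dtstr2dtnummx(dateStrs, 'yyyy-MM-dd HH:mm:ss');
else
    % Documented fallback when the helper is unavailable
    dateNums = datenum(dateStrs, 'yyyy-mm-dd HH:MM:SS');
end
```

Keeping the datenum fallback means the code still runs on installations where the helper was moved or removed, at the cost of the slower documented path.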
Note that the format string differs between dtstr2dtnummx and datenum: In the test case above, dtstr2dtnummx used 'yyyy-MM-dd HH:mm:ss', while datenum required 'yyyy-mm-dd HH:MM:SS'. I have no idea why MathWorks did not keep the formatting strings consistent. But because of this, we need to be extra careful (example1, example2). If you are interested in finding out how a datenum format string translates into a dtstr2dtnummx format string, take a look at the helper function cnv2icudf, which is a very readable m-file located in the same folder as dtstr2dtnummx.
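To make the difference concrete, here is a small illustrative translation that covers only the tokens used in the test case above (month, minutes and seconds). It is my own sketch, not the logic of cnv2icudf, and it deliberately ignores every other token (milliseconds, month names, AM/PM and so on), so treat it as a reading aid rather than a replacement for that helper.

```matlab
% Illustrative only: swap the month/minute/second tokens of a datenum format
datenumFmt = 'yyyy-mm-dd HH:MM:SS';

placeholder = char(1);                              % temporary marker for the month token
icuFmt = strrep(datenumFmt, 'mm', placeholder);     % datenum month 'mm'   -> marker
icuFmt = strrep(icuFmt,     'MM', 'mm');            % datenum minutes 'MM' -> 'mm'
icuFmt = strrep(icuFmt,     'SS', 'ss');            % datenum seconds 'SS' -> 'ss'
icuFmt = strrep(icuFmt, placeholder, 'MM');         % marker               -> month 'MM'

disp(icuFmt)   % yyyy-MM-dd HH:mm:ss
```

The placeholder step is needed because the month and minute tokens trade places: replacing 'mm' with 'MM' directly and then 'MM' with 'mm' would turn the month back into minutes.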
To those interested, the folder that contains dtstr2dtnummx also contains some other interesting date conversion functions, so explore and enjoy!
Perhaps the main lesson that can be learned from this article, and its ismembc predecessor of two years ago, is that it is very useful to profile the code for performance hotspots. When such a hotspot is found, don’t stop your profiling at the built-in Matlab functions – keep digging in the profiler results and perhaps you’ll find that you can improve performance by taking an internal shortcut.
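A rough sketch of that profiling workflow, using only the documented profile function: profile a batch of datenum calls, then list the functions that consumed the most time. The sample data and variable names are mine, and the functions that actually show up in the list will depend on your Matlab release.

```matlab
% Stand-in for timestamps parsed from a text file
dateStrs = repmat({'2010-12-12 12:21:12'}, 1000, 1);

profile on
dateNums = datenum(dateStrs, 'yyyy-mm-dd HH:MM:SS');
profile off

% Sort the profiled functions by total time and print the top few
stats = profile('info');
[~, idx] = sort([stats.FunctionTable.TotalTime], 'descend');
for k = idx(1:min(5, numel(idx)))
    fprintf('%-40s %8.3f sec\n', ...
            stats.FunctionTable(k).FunctionName, ...
            stats.FunctionTable(k).TotalTime);
end
```

Running profile viewer afterwards opens the same data in the interactive report, which is where I usually drill down from the built-in function into its internal helpers.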
Have you discovered any other performance shortcuts in a built-in Matlab function? If so, please post a comment to tell us all about it.

14 Responses
  1. Jan Simon May 4, 2011 at 12:52

    I’ve published a C-Mex function for the conversion to date numbers: http://www.mathworks.com/matlabcentral/fileexchange/28093-datestr2num
    E.g. for the ‘yyyy-mm-dd HH:MM:SS’ format I get these timings (Matlab 2009a, 1.5GHz Pentium-M, 1000 iterations as in your example):

    DATENUM: 0.93 sec, DTSTR2DTNUMMX: 0.21 sec, DateStr2Num: 0.0087 sec

    And for a {1 x 1000} cell string:

    DATENUM: 2.52 sec, DTSTR2DTNUMMX: 1.77 sec, DateStr2Num: 0.027 sec

    The speed comes from two restrictions: 1. The format must be one of a very limited set of the 6 most common formats. 2. The value is not checked for validity: While DATENUM and DTSTR2DTNUMMX recognize ‘2011-04-180’ more or less correctly as the 179th day after 2011-04-01, DateStr2Num fails and does not even catch ‘2011-04-AB’ as an error. Fortunately, the overflow can be ignored when calculating the fractional part for ‘25:61:62’.

    Therefore, if you know that the date string is valid, very simple C code can be 100 times faster than DATENUM and 25 times faster than DTSTR2DTNUMMX.

    The C-Mex DATENUMMX.c was part of Matlab 6.5. What a pity that modern Matlab versions include fewer such cookies.

    Kind regards, Jan

    • Yair Altman May 5, 2011 at 10:33

      @Jan – thanks for the tip. I remember being impressed with your datestr2num utility back when, but I simply forgot it lately. So your comment is right on.

      In general the basic lesson here, as elsewhere in Matlab, is that wherever we can remove unnecessary input formats, options and validity checks, this could greatly increase the performance.

  3. Jan Simon May 11, 2011 at 01:58

    Hi,

    The built-in DATENUMMX converts [1 x 6] date vectors 4 times faster than DATENUM. It is at least included in Matlab 5.3 to 2009a – I cannot check this in newer versions. As said already, the source code datenummx.c was shipped with Matlab 6.5.

    • Yair Altman May 11, 2011 at 03:20

      DATENUMMX is still available in the latest Matlab release (R2011a)

  4. Teegee May 11, 2011 at 06:02

    I have another Tip for such processing. For given vectors, it could save a huge amount of time:

    Suppose you want to convert the vector A and get the results in the variable Result:

    Result=datestr(A); % To avoid

    Use the unique function first:

    [b, m, n] = unique(A); % Reduce your vector A
    b=datestr(b); % Apply your function to b which is much smaller
    Result=b(n,:); % Assign it to a vector the same size as A

    For a vector A with many repeated values it saved my life 🙂

  7. JakubT April 29, 2014 at 01:52

    Hello,
    Mathworks seem to have done some black magic – when I used older Matlab (2011), datenum conversion of a time vector (2.1*10^6 entries) with a given format took 600+ seconds. With DTSTR2DTNUMMX, it took 220 s. In Matlab 2013, it takes 28s only and seems to give a correct answer too!
    Best,
    Jakub

  8. Phillip April 2, 2015 at 02:00

    Hi

    Any idea where dtstr2dtnummx has disappeared to in the newer versions? I can’t find it in 2015a, for example. I know it’s there because I can call it, but for the life of me I can’t find it.

    Regards,
    Phil

    • Yair Altman April 2, 2015 at 02:04
      which dtstr2dtnummx

  9. David Long June 23, 2016 at 22:41

    Yair, Thanks so much on this. I have a particular problem that using dtstr2dtnummx doesn’t solve, and I was wondering if you knew of a simple fix. You are correct that dtstr2dtnummx is much faster but if you need milliseconds, this doesn’t seem to catch that. For instance, using your code above but adding milliseconds to the time string gives two different results.

    tic, 
    for i=1:1000 
        dateNum=dtstr2dtnummx({'2010-12-12 12:21:12.123'},'yyyy-MM-dd HH:mm:ss.FFF');
    end; 
    dateNum
    toc
     
    dateNum =
              734484.514722222
     
    Elapsed time is 0.099181 seconds.
     
    % Slower - using datenum
    tic
    for i=1:1000 
        dateNum=datenum({'2010-12-12 12:21:12.123'},'yyyy-mm-dd HH:MM:SS.FFF');
    end; 
    dateNum
    toc
     
    dateNum =
              734484.514723646
     
    Elapsed time is 0.172265 seconds.

    The difference is the added millisecond value. Even adding the “.FFF” to the format string doesn’t seem to catch the milliseconds in the faster case. This must happen outside of the dtstr2dtnummx function.

    • Yair Altman June 24, 2016 at 15:24

      @David – there is indeed an answer to this but I make it a personal point not to answer any pro-bono questions from JHU-APL following a few cases in previous years where I felt that my goodwill was taken advantage of by your peers. If you want a professional answer to your question then email me for a paid consulting request.

    • David July 13, 2016 at 21:02

      Yair, sorry for the late response and sorry for your previous experience dealing with APL. Not sure what happened but APL usually has very nice people. On my question and the paid consultation issue…I would if I could but I’m just an engineer and that is way above my pay grade. Anyway, thanks so much.

Undocumented Matlab © 2009 - Yair Altman
This website and Octahedron Ltd. are not affiliated with The MathWorks Inc.; MATLAB® is a registered trademark of The MathWorks Inc.