Stock Matlab function – Undocumented Matlab

Improving graphics interactivity

Yair Altman — Sun, 21 Apr 2019 21:03:10 +0000

Matlab release R2018b added the concept of axes-specific toolbars and default axes mouse interactivity. Plain 2D plot axes have the following interactions enabled by default: PanInteraction, ZoomInteraction, DataTipInteraction and RulerPanInteraction.

Unfortunately, I find that while the default interactions set is much more useful than the non-interactive default axes behavior in R2018a and earlier, it could still be improved in two important ways:

Performance – Matlab’s builtin Interaction objects are very inefficient. In cases of multiple overlapping axes (which is very common in multi-tab GUIs or cases of various types of axes), instead of processing events for just the top visible axes, they process all the enabled interactions for *all* axes (including non-visible ones!). This is particularly problematic with the default DataTipInteraction – it includes a Linger object whose apparent purpose is to detect when the mouse lingers for enough time on top of a chart object, and displays a data-tip in such cases. Its internal code is both inefficient and processed multiple times (for each of the axes), as can be seen via a profiling session.
Usability – In my experience, RegionZoomInteraction (which enables defining a region zoom-box via click-&-drag) is usually much more useful than PanInteraction for most plot types. ZoomInteraction, which is enabled by default, only enables zooming in and out using the mouse wheel, which is much less useful and more cumbersome than RegionZoomInteraction. The panning functionality can still be accessed interactively with the mouse by dragging the X and Y rulers (ticks) to each side.

For these reasons, I typically use the following function whenever I create new axes, to replace the default sluggish DataTipInteraction and PanInteraction with RegionZoomInteraction:

function axDefaultCreateFcn(hAxes, ~)
    try
        hAxes.Interactions = [zoomInteraction regionZoomInteraction rulerPanInteraction];
        hAxes.Toolbar = [];
    catch
        % ignore - old Matlab release
    end
end

The purpose of these two axes property changes shall become apparent below.

This function can either be called directly (axDefaultCreateFcn(hAxes)), or as part of the containing figure’s creation script to ensure that any axes created in this figure has this fix applied:

set(hFig,'defaultAxesCreateFcn',@axDefaultCreateFcn);
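A default CreateFcn only affects axes that are created after it is set. For axes that already exist in a figure, the same two property changes can be applied retroactively. The following is a minimal sketch, assuming R2018b or newer; the invisible demo figure and the variable names are only for illustration:

```matlab
% Illustrative setup: an invisible figure with two axes
hFig = figure('Visible','off');
subplot(1,2,1); plot(1:5,'-ob');
subplot(1,2,2); plot(5:-1:1,'-or');

% Retrofit every axes that already exists in the figure
hAllAxes = findall(hFig, 'Type','axes');
for k = 1 : numel(hAllAxes)
    try
        hAllAxes(k).Interactions = [zoomInteraction regionZoomInteraction rulerPanInteraction];
        hAllAxes(k).Toolbar = [];
    catch
        % ignore - old Matlab release without axes interactions
    end
end
```

The try/catch mirrors the one in axDefaultCreateFcn, so the loop silently does nothing on releases that predate the Interactions property.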

Test setup

Figure with default axes toolbar and interactivity

To test the changes, let’s prepare a figure with 10 tabs, with 10 overlapping panels and a single axes in each tab:

hFig = figure('Pos',[10,10,400,300]);
hTabGroup = uitabgroup(hFig);
for iTab = 1 : 10
    hTab = uitab(hTabGroup, 'title',num2str(iTab));
    hPanel = uipanel(hTab);
    for iPanel = 1 : 10
        hPanel = uipanel(hPanel);
    end
    hAxes(iTab) = axes(hPanel); %see MLint note below
    plot(hAxes(iTab),1:5,'-ob');
end
drawnow

p.s. – there’s an incorrect MLint (Code Analyzer) warning in line 9 about the call to axes(hPanel) being inefficient in a loop. Apparently, MLint incorrectly parses this function call as a request to make the axes in-focus, rather than as a request to create the axes in the specified hPanel parent container. We can safely ignore this warning.

Now let’s create a run-time test script that simulates 2000 mouse movements using java.awt.Robot:

tic
monitorPos = get(0,'MonitorPositions');
y0 = monitorPos(1,4) - 200;
robot = java.awt.Robot;
for iEvent = 1 : 2000
    robot.mouseMove(150, y0+mod(iEvent,100));
    drawnow
end
toc

This takes ~45 seconds to run on my laptop: ~23ms per mouse movement on average, with noticeable “linger” when the mouse pointer is near the plotted data line. Note that this figure is extremely simplistic – in a real-life program, the processing of mouse events lags behind the mouse movements, making the GUI far more sluggish than the same GUI on R2018a or earlier. In fact, in one of my more complex GUIs, the entire GUI and Matlab itself came to a standstill that required killing the Matlab process, just by moving the mouse for several seconds.

Notice that at any time, only a single axes is actually visible in our test setup. The other 9 axes are not visible although their Visible property is 'on'. Despite this, when the mouse moves within the figure, these other axes unnecessarily process the mouse events.

Changing the default interactions

Let’s modify the axes creation script as I mentioned above, by changing the default interactions (note the highlighted code addition):

hFig = figure('Pos',[10,10,400,300]);
hTabGroup = uitabgroup(hFig);
for iTab = 1 : 10
    hTab = uitab(hTabGroup, 'title',num2str(iTab));
    hPanel = uipanel(hTab);
    for iPanel = 1 : 10
        hPanel = uipanel(hPanel);
    end
    hAxes(iTab) = axes(hPanel);
    plot(hAxes(iTab),1:5,'-ob');
    hAxes(iTab).Interactions = [zoomInteraction regionZoomInteraction rulerPanInteraction];
end
drawnow

The test script now takes only 12 seconds to run – 4x faster than the default and yet IMHO with better interactivity (using RegionZoomInteraction).

The axes-specific toolbar, another innovation of R2018b, does not just have interactivity aspects, which are by themselves much-contested. A much less discussed aspect of the axes toolbar is that it degrades the overall performance of axes. The reason is that the axes toolbar’s transparency, visibility, background color and contents continuously update whenever the mouse moves within the axes area.

Since we have set up the default interactivity to a more-usable set above, and since we can replace the axes toolbar with figure-level toolbar controls, we can simply delete the axes-level toolbars for even more-improved performance:

hFig = figure('Pos',[10,10,400,300]);
hTabGroup = uitabgroup(hFig);
for iTab = 1 : 10
    hTab = uitab(hTabGroup, 'title',num2str(iTab));
    hPanel = uipanel(hTab);
    for iPanel = 1 : 10
        hPanel = uipanel(hPanel);
    end
    hAxes(iTab) = axes(hPanel);
    plot(hAxes(iTab),1:5,'-ob');
    hAxes(iTab).Interactions = [zoomInteraction regionZoomInteraction rulerPanInteraction];
    hAxes(iTab).Toolbar = [];
end
drawnow

This brings the test script’s run-time down to 6 seconds – 7x faster than the default run-time. At ~3ms per mouse event, the GUI is now as performant and snappy as in R2018a, even with the new interactive mouse actions of R2018b active.
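The per-event costs and speedup factors quoted in this post follow directly from the measured run-times (the text rounds 3.75 and 7.5 to 4x and 7x). A quick check of the arithmetic, using only the numbers reported above:

```matlab
nEvents  = 2000;        % simulated mouse movements in the test script
runTimes = [45 12 6];   % seconds: default, custom interactions, custom interactions + no axes toolbar

msPerEvent = 1000 * runTimes / nEvents   % [22.5 6 3] milliseconds per mouse event
speedup    = runTimes(1) ./ runTimes     % [1 3.75 7.5] relative to the default setup
```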

Conclusions

MathWorks definitely did not intend for this slow-down aspect, but it is an unfortunate by-product of the choice to auto-enable DataTipInteraction and of its sub-optimal implementation. Perhaps this side-effect was never noticed by MathWorks because the testing scripts probably had only a few axes in a very simple figure – in such a case the performance lags are very small and might have slipped under the radar. But I assume that many real-life complex GUIs will display significant lags in R2018b and newer Matlab releases, compared to R2018a and earlier releases. I assume that such users will be surprised/dismayed to discover that in R2018b their GUI not only interacts differently but also runs slower, although the program code has not changed.

One of the common claims that I often hear against using undocumented Matlab features is that the program might break in some future Matlab release that would not support some of these features. But users certainly do not expect that their programs might break in new Matlab releases when they only use documented features, as in this case. IMHO, this case (and others over the years) demonstrates that using undocumented features is usually not much riskier than using the standard documented features with regards to future compatibility, making the risk/reward ratio more favorable. In fact, of the ~400 posts that I have published in the past decade (this blog is already 10 years old, time flies…), very few tips no longer work in the latest Matlab release. When such forward compatibility issues do arise, whether with fully-documented or undocumented features, we can often find workarounds as I have shown above.

If your Matlab program could use a performance boost, I would be happy to assist making your program faster and more responsive. Don’t hesitate to reach out to me for a consulting quote.

Undocumented plot marker types

Yair Altman — Wed, 13 Mar 2019 11:05:14 +0000

I wanted to take a break from my miniseries on the Matlab toolstrip to describe a nice little undocumented aspect of plot line markers. Plot line marker types have remained essentially unchanged in user-facing functionality for the past two+ decades, allowing the well-known marker types (.,+,o,^ etc.). Internally, lots of things changed in the graphics engine, particularly in the transition to HG2 in R2014b and the implementation of markers using OpenGL primitives. I suspect that during the massive amount of development work that was done at that time, important functionality improvements that were implemented in the engine were forgotten and did not percolate all the way up to the user-facing functions. I highlighted a few of these in the past, for example transparency and color gradient for plot lines and markers, or various aspects of contour plots.

Fortunately, Matlab usually exposes the internal objects that we can customize and which enable these extra features, in hidden properties of the top-level graphics handle. For example, the standard Matlab plot-line handle has a hidden property called MarkerHandle that we can access. This returns an internal object that enables marker transparency and color gradients. We can also use this object to set the marker style to a couple of formats that are not available in the top-level object:

>> x=1:10; y=10*x; hLine=plot(x,y,'o-'); box off; drawnow;
>> hLine.MarkerEdgeColor = 'r';
 
>> set(hLine, 'Marker')'  % top-level marker styles
ans =
  1×14 cell array
    {'+'} {'o'} {'*'} {'.'} {'x'} {'square'} {'diamond'} {'v'} {'^'} {'>'} {'<'} {'pentagram'} {'hexagram'} {'none'}
 
>> set(hLine.MarkerHandle, 'Style')'  % low-level marker styles
ans =
  1×16 cell array
    {'plus'} {'circle'} {'asterisk'} {'point'} {'x'} {'square'} {'diamond'} {'downtriangle'} {'triangle'} {'righttriangle'} {'lefttriangle'} {'pentagram'} {'hexagram'} {'vbar'} {'hbar'} {'none'}

We see that the top-level marker styles directly correspond to the low-level styles, except for the low-level ‘vbar’ and ‘hbar’ styles. Perhaps the developers forgot to add these two styles to the top-level object in the enormous upheaval of HG2. Luckily, we can set the hbar/vbar styles directly, using the line’s MarkerHandle property:

hLine.MarkerHandle.Style = 'hbar';
set(hLine.MarkerHandle, 'Style','hbar');  % alternative

hLine.MarkerHandle.Style='hbar'

hLine.MarkerHandle.Style='vbar'
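Since MarkerHandle is a hidden, undocumented property, it may be safer to guard the customization. The following is a minimal sketch, assuming that the internal marker object only becomes available once the line has been rendered (hence the drawnow, as in the example above); the warning identifier is made up for this illustration:

```matlab
x = 1:10;
y = 10*x;
hLine = plot(x, y, 'o-');
drawnow   % render the line first, so that the internal marker object exists

try
    hLine.MarkerHandle.Style = 'vbar';   % low-level style with no top-level equivalent
catch err
    % Undocumented feature: fall back gracefully if it is missing or changed
    warning('demo:markerStyle', 'Could not set low-level marker style: %s', err.message);
end
```

If the assignment fails, the plot simply keeps its original 'o' markers.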

USA visit

I will be travelling in the US in May/June 2019. Please let me know (altmany at gmail) if you would like to schedule a meeting or onsite visit for consulting/training, or perhaps just to explore the possibility of my professional assistance to your Matlab programming needs.

Speeding-up builtin Matlab functions – part 2

Yair Altman — Sun, 06 May 2018 16:26:19 +0000

Last week I showed how we can speed-up built-in Matlab functions, by creating local copies of the relevant m-files and then optimizing them for improved speed using a variety of techniques. Today I will show another example of such speed-up, this time of the Financial Toolbox’s maxdrawdown function, which is widely used to estimate the relative risk of a trading strategy or asset. One might think that such a basic indicator would be optimized for speed, but experience shows otherwise. In fact, this function turned out to be the main run-time performance hotspot for one of my clients. The vast majority of his code was optimized for speed, and he naturally assumed that the built-in Matlab functions were optimized as well, but this was not the case. Fortunately, I was able to create an equivalent version that was 30-40 times faster (!), and my client remains a loyal Matlab fan.

In today’s post I will show how I achieved this speed-up, using different methods than the ones I showed last week. A combination of these techniques can be used in a wide range of other Matlab functions. Additional speed-up techniques can be found in other performance-related posts on this website, as well as in my book Accelerating MATLAB Performance.

Profiling

As I explained last week, the first step in speeding up any function is to profile its current run-time behavior using Matlab’s builtin Profiler tool, which can either be started from the Matlab Editor toolstrip (“Run and Time”) or via the profile function.

The profile report for the client’s function showed that it had two separate performance hotspots:

Code that checks the drawdown format (optional 2nd input argument) against a set of allowed formats
Main code section that iteratively loops over the data-series values to compute the maximal drawdown

In order to optimize the code, I copied %matlabroot%/toolbox/finance/finance/maxdrawdown.m to a local folder on the Matlab path, renaming the file (and the function) maxdrawdn, in order to avoid conflict with the built-in version.

Optimizing input args pre-processing

The main problem with the pre-processing of the optional format input argument is the string comparisons, which are being done even when the default format is used (which is by far the most common case). String comparisons are often more expensive than numerical computations. Each comparison by itself is very short, but when maxdrawdown is run in a loop (as it often is), the run-time adds up.

Here’s a snippet of the original code:

if nargin < 2 || isempty(Format)
    Format = 'return';
end
if ~ischar(Format) || size(Format,1) ~= 1
    error(message('finance:maxdrawdown:InvalidFormatType'));
end
choice = find(strncmpi(Format,{'return','arithmetic','geometric'},length(Format)));
if isempty(choice)
    error(message('finance:maxdrawdown:InvalidFormatValue'));
end

An equivalent code, which avoids any string processing in the common default case, is faster, simpler and more readable:

if nargin < 2 || isempty(Format)
    choice = 1;
elseif ~ischar(Format) || size(Format,1) ~= 1
    error(message('finance:maxdrawdown:InvalidFormatType'));
else
    choice = find(strncmpi(Format,{'return','arithmetic','geometric'},length(Format)));
    if isempty(choice)
        error(message('finance:maxdrawdown:InvalidFormatValue'));
    end
end

The general rule is that whenever you have a common case, you should check it first, avoiding unnecessary processing downstream. Moreover, for improved run-time performance (although not necessarily maintainability), it is generally preferable to work with numbers rather than strings (choice rather than Format, in our case).
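To get a feel for the difference on your own machine, the two variants can be timed side by side. This is only a rough sketch: the loop count and variable names are arbitrary choices for this illustration, and the absolute timings will vary between machines and Matlab releases:

```matlab
allowed = {'return','arithmetic','geometric'};
nRuns = 1e5;

% Variant 1: always run the string comparison (default format spelled out)
Format = 'return';
tic
for k = 1 : nRuns
    choice = find(strncmpi(Format, allowed, length(Format)));
end
tAlways = toc;

% Variant 2: common case first - an empty Format short-circuits to choice=1
Format = [];
tic
for k = 1 : nRuns
    if isempty(Format)
        choice = 1;
    else
        choice = find(strncmpi(Format, allowed, length(Format)));
    end
end
tCommonFirst = toc;

fprintf('always compare: %.3f s, common case first: %.3f s\n', tAlways, tCommonFirst);
```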

Vectorizing the main loop

The main processing loop uses a very simple yet inefficient iterative loop. I assume that the code was originally developed this way in order to assist debugging and to ensure correctness, and that once it was ready nobody took the time to also optimize it for speed. It looks something like this:

MaxDD = zeros(1,N);
MaxDDIndex = ones(2,N);
...
if choice == 1   % 'return' format
    MaxData = Data(1,:);
    MaxIndex = ones(1,N);
    for i = 1:T
        MaxData = max(MaxData, Data(i,:));
        q = MaxData == Data(i,:);
        MaxIndex(1,q) = i;
        DD = (MaxData - Data(i,:)) ./ MaxData;
        if any(DD > MaxDD)
            p = DD > MaxDD;
            MaxDD(p) = DD(p);
            MaxDDIndex(1,p) = MaxIndex(p);
            MaxDDIndex(2,p) = i;
        end
    end
else             % 'arithmetic' or 'geometric' formats
    ...

This loop can relatively easily be vectorized, making the code much faster, and arguably also simpler, more readable, and more maintainable:

if choice == 3
    Data = log(Data);
end
MaxDDIndex = ones(2,N);
MaxData = cummax(Data);
MaxIndexes = find(MaxData==Data);
DD = MaxData - Data;
if choice == 1	% 'return' format
    DD = DD ./ MaxData;
end
MaxDD = cummax(DD);
MaxIndex2 = find(MaxDD==DD,1,'last');
MaxIndex1 = MaxIndexes(find(MaxIndexes<=MaxIndex2,1,'last'));
MaxDDIndex(1,:) = MaxIndex1;
MaxDDIndex(2,:) = MaxIndex2;
MaxDD = MaxDD(end,:);

Let’s make a short run-time and accuracy check – we can see that we achieved a 31-fold speedup (YMMV), and received the exact same results:

>> data = rand(1,1e7);
 
>> tic, [MaxDD1, MaxDDIndex1] = maxdrawdown(data); toc  % builtin Matlab function
Elapsed time is 7.847140 seconds.
 
>> tic, [MaxDD2, MaxDDIndex2] = maxdrawdn(data); toc  % our optimized version
Elapsed time is 0.253130 seconds.
 
>> speedup = round(7.847140 / 0.253130)
speedup =
    31
 
>> isequal(MaxDD1,MaxDD2) && isequal(MaxDDIndex1,MaxDDIndex2)  % check accuracy
ans =
  logical
   1

Disclaimer: The code above seems to work (quite well in fact) for a 1D data vector. You’d need to modify it a bit to handle 2D data – the returned maximal drawdowns are still computed correctly but not the returned indices, due to their computation using the find function. This modification is left as an exercise for the reader…
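For readers who want to experiment with the idea in isolation, here is a small self-contained sketch of the 1D 'return' case, checked against a plain loop. It is a simplified variant that uses max rather than the find-based indexing shown above, so its handling of ties may differ from the built-in function; the data values are made up:

```matlab
data = [100; 120; 90; 110; 80; 130];   % made-up price series (column vector)

% Vectorized: running peak, then the relative drop from that peak
runMax = cummax(data);
dd = (runMax - data) ./ runMax;
[maxDD, iTrough] = max(dd);             % deepest drawdown and where it bottoms out
[~, iPeak] = max(data(1:iTrough));      % the peak that preceded that trough

% Reference: straightforward loop over the series
maxDDLoop = 0;
peakSoFar = data(1);
for i = 1 : numel(data)
    peakSoFar = max(peakSoFar, data(i));
    maxDDLoop = max(maxDDLoop, (peakSoFar - data(i)) / peakSoFar);
end

fprintf('maxDD = %.4f (loop: %.4f), peak index = %d, trough index = %d\n', ...
        maxDD, maxDDLoop, iPeak, iTrough);
```

In this series the largest drop is from 120 (index 2) down to 80 (index 5), a drawdown of one third.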

Very similar code could be used for the corresponding maxdrawup function. Although this function is used much less often than maxdrawdown, it is in fact widely used and very similar to maxdrawdown, so it is surprising that it is missing in the Financial Toolbox. Here is the corresponding code snippet:

% Code snippet for maxdrawup (similar to maxdrawdn)
MaxDUIndex = ones(2,N);
MinData = cummin(Data);
MinIndexes = find(MinData==Data);
DU = Data - MinData;
if choice == 1	% 'return' format
    DU = DU ./ MinData;
end
MaxDU = cummax(DU);
MaxIndex = find(MaxDU==DU,1,'last');
MinIndex = MinIndexes(find(MinIndexes<=MaxIndex,1,'last'));
MaxDUIndex(1,:) = MinIndex;
MaxDUIndex(2,:) = MaxIndex;
MaxDU = MaxDU(end,:);

Similar vectorization could be applied to the emaxdrawdown function. This too is left as an exercise for the reader…

Conclusions

Matlab is composed of thousands of internal functions. Each and every one of these functions was meticulously developed and tested by engineers, who are after all only human. Whereas supreme emphasis is always placed with Matlab functions on their accuracy, run-time performance sometimes takes a back-seat. Make no mistake about this: code accuracy is almost always more important than speed (an exception are cases where some accuracy may be sacrificed for improved run-time performance). So I’m not complaining about the current state of affairs.

But when we run into a specific run-time problem in our Matlab program, we should not despair if we see that built-in functions cause slowdown. We can try to avoid calling those functions (for example, by reducing the number of invocations, or limiting the target accuracy, etc.), or optimize these functions in our own local copy, as I’ve shown last week and today. There are multiple techniques that we could employ to improve the run time. Just use the profiler and keep an open mind about alternative speed-up mechanisms, and you’d be half-way there.

Let me know if you’d like me to assist with your Matlab project, either developing it from scratch or improving your existing code, or just training you in how to improve your Matlab code’s run-time/robustness/usability/appearance. I will be visiting Boston and New York in early June and would be happy to meet you in person to discuss your specific needs.

Speeding-up builtin Matlab functions – part 1

Yair Altman — Sun, 29 Apr 2018 09:46:29 +0000

A client recently asked me to assist with his numeric analysis function – it took 45 minutes to run, which was unacceptable (5000 runs of ~0.55 secs each). The code had to run in 10 minutes or less to be useful. It turns out that 99% of the run-time was taken up by Matlab’s built-in fitdist function (part of the Statistics Toolbox), which my client was certain is already optimized for maximal performance. He therefore assumed that to get the necessary speedup he must either switch to another programming language (C/Java/Python), and/or upgrade his computer hardware at considerable expense, since parallelization was not feasible in his specific case.

Luckily, I was there to assist and was able to quickly speed-up the code down to 7 minutes, well below the required run-time. In today’s post I will show how I did this, which is relevant for a wide variety of other similar performance issues with Matlab. Many additional speed-up techniques can be found in other performance-related posts on this website, as well as in my book Accelerating MATLAB Performance.

Profiling

The first step in speeding up any function is to profile its current run-time behavior using Matlab’s builtin Profiler tool, which can either be started from the Matlab Editor toolstrip (“Run and Time”) or via the profile function.
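A profiling session can also be driven entirely from code, which is handy for repeatable measurements. The sketch below profiles an arbitrary stand-in workload (not the client’s function) and prints the five most expensive functions from the data returned by profile('info'):

```matlab
profile on
% Stand-in workload, just something to measure
for k = 1 : 20
    txt = num2str(rand(200));
    srt = sortrows(rand(1e4,3));
end
profile off

info = profile('info');
[~, order] = sort([info.FunctionTable.TotalTime], 'descend');
nShow = min(5, numel(order));
for k = 1 : nShow
    f = info.FunctionTable(order(k));
    fprintf('%-40s %8.3f s  (%d calls)\n', f.FunctionName, f.TotalTime, f.NumCalls);
end
% profile viewer   % uncomment to open the interactive report instead
```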

The profile report for the client’s function showed that 99% of the time was spent in the Statistics Toolbox’s fitdist function. Drilling into this function in the profiling report, we see onion-like functions that processed input parameters, ensured data validation etc. The core processing is done inside a class that is unique to each required distribution (e.g., prob.StableDistribution, prob.BetaDistribution etc.) that is invoked within fitdist using an feval call, based on the distribution name that was specified by the user.

In our specific case, the external onion layers of sanity checks were unnecessary and could be avoided. In general, I advise not to discard such checks, because you never know whether future uses might have a problem with outlier data or input parameters. Moreover, in the specific case of fitdist they take only a very minor portion of the run-time (this may be different in other cases, such as the ismember function that I described years ago, where the sanity checks have a significant run-time impact compared to the core processing in the internal ismembc function).

However, since we wanted to significantly improve the total run-time and this was spent within the distribution class (prob.StableDistribution in the case of my client), we continue to drill-down into this class to determine what can be done.

It turns out that prob.StableDistribution basically does 4 things in its main fit() method:

pre-process the input data (prob.ToolboxFittableParametricDistribution.processFitArgs() and .removeCensoring() methods) – this turned out to be unnecessary in my client’s data, but has little run-time impact.
call stablefit() in order to get fitting parameters – this took about half the run-time
call stablelike() in order to get likelihood data – this took about half the run-time as well
call prob.StableDistribution.makeFitted() to create a probability-distribution object returned to the caller – this also took little run-time that was not worth bothering about.

The speed-up improvement process

With user-created code we could simply modify our code in-place. However, a more careful process is necessary when modifying built-in Matlab functions (either in the core Matlab distribution or in one of the toolboxes).

The basic idea here is to create a side-function that would replicate the core processing of fitdist. This is preferable to modifying Matlab’s installation files because we could then reuse the new function in multiple computers, without having to mess around in the Matlab installation (which may not even be possible if we do not have admin privileges). Also, if we ever upgrade our Matlab we won’t need to remember to update the installed files (and obviously retest).

I called the new function fitdist2 and inside it I initially placed only the very core functionality of prob.StableDistribution.fit():

% fitdist2 - fast replacement for fitdist(data,'stable')
% equivalent to pd = prob.StableDistribution.fit(data);
function pd = fitdist2(data)
    % Bypass pre-processing (not very important)
    [cens,freq,opt] = deal([]);
    %[data,cens,freq,opt] = prob.ToolboxFittableParametricDistribution.processFitArgs(data);
    %data = prob.ToolboxFittableParametricDistribution.removeCensoring(data,cens,freq,'stable');
 
    % Main processing in stablefit(), stablelike()
    params = stablefit(data,0.05,opt);
    [nll,cov] = stablelike(params,data);
 
    % Combine results into the returned probability distribution object
    pd = prob.StableDistribution.makeFitted(params,nll,cov,data,cens,freq);
end

If we try to run this as-is, we’d see errors because stablefit() and stablelike() are both sub-functions within %matlabroot%/toolbox/stats/stats/+prob/StableDistribution.m. So we copy these sub-functions (and their dependent subfunctions infoMtxCal(), intMle(), tabpdf(), neglog_pdf(), stable_nloglf(), varTrans) to the bottom of our fitdist2.m file, about 400 lines in total.

We also remove places that call checkargs(…) since that seems to be unnecessary – if you want to keep it then add the checkargs() function as well.

Now we re-run our code, verifying after each speedup iteration that the pd object returned by our fitdist2 is equivalent to the original object returned by fitdist.
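One way to run such a check is to compare the fitted parameters within a relative tolerance rather than demanding exact equality, since some speedups trade a little accuracy for run-time. A minimal sketch with made-up numbers; in practice refParams and newParams would be taken from the ParameterValues property of the objects returned by fitdist and fitdist2, and the tolerance is an assumption to be adapted to your needs:

```matlab
refParams = [1.8 0.1 1.0 0.0];          % made-up stand-in for the reference fit
newParams = refParams .* (1 + 5e-4);    % made-up stand-in for the faster fit
tol = 1e-3;                             % acceptable relative error (0.1%)

% Relative error per parameter, guarding against division by zero
relErr = abs(newParams - refParams) ./ max(abs(refParams), eps);
isClose = all(relErr < tol);

fprintf('max relative error: %.2g, within tolerance: %d\n', max(relErr), isClose);
```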

Speeding-up stablefit()

A new profiling run shows that the vast majority of the time in stablefit() is spent in two lines:

s = load('private/StablePdfTable.mat');
[parmhat,~,err,output] = fminsearch(@(params)stable_nloglf(x,params),phi0,options);

The first of these lines is reloading a static data file. The very same static data file is later reloaded in stablelike(). Both of these data-loads are done in every single invocation of fitdist, so if we have 5000 data fits, we load the same static data file 10,000 times! This is certainly not indicative of good programming practice. It would be much faster to load the static data once into memory, and then use this cached data for the next 9,999 invocations. Since the original authors of StableDistribution.m seem to like single-character global variables (another bad programming practice, for multiple reasons), we’ll follow their example (added lines are highlighted):

persistent s  % this should have a more meaningful name (but at least is not global...)!
if isempty(s)
    fit_path = fileparts(which('fitdist'));
    s = load([fit_path '/private/StablePdfTable.mat']);
    a = s.a;
    b = s.b;
    xgd = s.xgd;
    p = s.p;
end
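The same caching pattern can be tried in isolation. The following sketch is a script with a local function (which requires R2016b or newer); cachedLoad is a hypothetical helper written for this illustration, and the temporary MAT file merely stands in for the static data file:

```matlab
% Create a small MAT file to stand in for the static data file
tmpFile = [tempname '.mat'];
a = 1:3;
save(tmpFile, 'a');

s1 = cachedLoad(tmpFile);   % first call: reads the file from disk
delete(tmpFile);            % remove the file to show that the next call does not touch the disk
s2 = cachedLoad(tmpFile);   % second call: served from memory

disp(isequal(s1.a, s2.a))

function s = cachedLoad(fileName)
    % Hypothetical helper: loads the file once, then reuses the in-memory copy.
    % Note: the cache is not keyed by fileName, so it only suits a single static file.
    persistent cache
    if isempty(cache)
        cache = load(fileName);
    end
    s = cache;
end
```

The cached copy lives until the function is cleared (for example via clear cachedLoad), so a changed data file is not picked up automatically.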

In order to speed-up the second line (that calls fminsearch), we can loosen the tolerances used by this function, by updating the options structure passed to it, so that we use tolerances of 1e-3 rather than the default 1e-6 (in our specific case this resulted in acceptable errors of ~0.1%). Specifically, we modify the code from this:

function [parmhat,parmci] = stablefit(x,alpha,options)
...
if nargin < 3 || isempty(options)
    options = statset('stablefit');
else
    options = statset(statset('stablefit'),options);
end
 
% Maximize the log-likelihood with respect to the transformed parameters
[parmhat,~,err,output] = fminsearch(@(params)stable_nloglf(x,params),phi0,options);
...
end

to this (note that the line that actually calls fminsearch remains unchanged):

function [parmhat,parmci] = stablefit(x,alpha,unused_options)
...
persistent options
if isempty(options)
    options = statset('stablefit');
    options.TolX   = 1e-3;
    options.TolFun = 1e-3;
    options.TolBnd = 1e-3;
end
 
% Maximize the log-likelihood with respect to the transformed parameters
[parmhat,~,err,output] = fminsearch(@(params)stable_nloglf(x,params),phi0,options);
...
end
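The effect of looser tolerances on fminsearch can be seen on a toy problem. This sketch uses the well-known Rosenbrock test function rather than the stable-distribution likelihood, and the base-Matlab optimset rather than statset, so the numbers only illustrate the trend:

```matlab
rosen = @(x) 100*(x(2) - x(1)^2)^2 + (1 - x(1))^2;   % classic test function, minimum at [1 1]
x0 = [-1.2 1];

optsTight = optimset('TolX',1e-6, 'TolFun',1e-6);
optsLoose = optimset('TolX',1e-3, 'TolFun',1e-3);

[xTight,~,~,outTight] = fminsearch(rosen, x0, optsTight);
[xLoose,~,~,outLoose] = fminsearch(rosen, x0, optsLoose);

fprintf('tight: %d evaluations, loose: %d evaluations\n', outTight.funcCount, outLoose.funcCount);
fprintf('largest parameter difference: %.2g\n', max(abs(xTight - xLoose)));
```

Both runs follow the same search path from the same starting point, so the looser run simply stops earlier. Whether the resulting parameter difference is acceptable depends on the application.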

The fminsearch internally calls tabpdf() repeatedly. Drilling down in the profiling report we see that it recomputes a griddedInterpolant object that is essentially the same for all iterations (and therefore a prime candidate for caching), and also that it uses the costly cubic interpolation rather than a more straight-forward linear interpolation:

function y = tabpdf(x,alpha,beta)
...
persistent G  % this should have a more meaningful name (but at least is not global...)!
if isempty(G)
    G = griddedInterpolant({b, a, xgd}, p, 'linear','none');  % 'linear' instead of 'cubic'
end
%G = griddedInterpolant({b, a, xgd}, p, 'cubic','none');  % original
y = G({beta,alpha,x});
...

These cases illustrate two important speedup technique categories: caching data in order to reduce the number of times that a certain code hotspot is being run, and modifying the parameters/inputs in order to reduce the individual run-time of the hotspot code. Variations of these techniques form the essence of effective speedup and can often be achieved by just reflecting on the problem and asking yourself two questions:

can I reduce the number of times that this code is being run? and
can I reduce the run-time of this specific code section?

Additional important speed-up categories include parallelization, vectorization and algorithmic modification. These are sometimes more difficult programmatically and riskier in terms of functional equivalence, but may be required in case the two main techniques above are not feasible. Of course, we can always combine these techniques, we don’t need to choose just one or the other.

Speeding-up stablelike()

We now turn our attention to stablelike(). As for the loaded static file, we could simply use the cached s to load the data in order to avoid repeated reloading of the data from disk. But it turns out that this data is actually not used at all inside the function (!) so we just comment-out the old code:

%s = load('private/StablePdfTable.mat');
%a = s.a;
%b = s.b;
%xgd = s.xgd;
%p = s.p;

Think about this – the builtin Matlab code loads a data file from disk and then does absolutely nothing with it – what a waste!

Another important change is to reduce the run-time of the integral function, which is called thousands of times within a double loop. We do this by loosening the tolerances specified in the integral call (AbsTol from 1e-6 and RelTol from 1e-4, both to 1e-3):

F(i,j) = integral(@(x)infoMtxCal(x,params,step,i,j),-Inf,Inf,'AbsTol',1e-6,'RelTol',1e-4); % original
F(i,j) = integral(@(x)infoMtxCal(x,params,step,i,j),-Inf,Inf,'AbsTol',1e-3,'RelTol',1e-3); % new

You can see that once again these two cases follow the two techniques that I mentioned above: we reduced the number of times that we load the data file (to 0 in our case), and we improved the run-time of the individual integral calculation by reducing its tolerances.
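The accuracy cost of looser integral tolerances can be gauged on an integrand with a known answer. This sketch uses a Gaussian, whose integral over the real line is sqrt(pi), as a stand-in for the actual infoMtxCal integrand:

```matlab
f = @(x) exp(-x.^2);   % stand-in integrand; exact integral over (-Inf,Inf) is sqrt(pi)
exact = sqrt(pi);

tic; qTight = integral(f, -Inf, Inf, 'AbsTol',1e-6, 'RelTol',1e-4); tTight = toc;
tic; qLoose = integral(f, -Inf, Inf, 'AbsTol',1e-3, 'RelTol',1e-3); tLoose = toc;

fprintf('tight: error %.2g in %.4f s\n', abs(qTight - exact), tTight);
fprintf('loose: error %.2g in %.4f s\n', abs(qLoose - exact), tLoose);
```

For a single call on such a simple integrand the timing difference may be negligible; the benefit shows up when the call is repeated thousands of times on a harder integrand, as in stablelike().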

Conclusions

The final result of applying the techniques above was a 6-fold speedup, reducing the total run-time from 45 minutes down to 7 minutes. I could probably have improved the run-time even further, but since we reached our target run-time I stopped there. The point after all was to make the code usable, not to reach a world speed record.

In my next article I will present another example of dramatically improving the run-time speed of a built-in Matlab function, this time a very commonly-used function in the Financial Toolbox that I was able to speed-up by a factor of 40.

Matlab releases improve continuously, so hopefully my techniques above (or alternatives) would find their way into the builtin Matlab functions, making them faster than today, out-of-the-box.

Until this happens, we should not lose hope when faced with a slow Matlab function, even if it is a built-in/internal one, as I hope to have clearly illustrated today, and will also show in my next article. Improving the performance is often easy. In fact, it took me much longer to write this article than it took to improve my client’s code…

Customizing axes tick labels

Yair Altman — Wed, 24 Jan 2018 13:38:26 +0000

In last week’s post, I discussed various ways to customize bar/histogram plots, including customization of the tick labels. While some of the customizations that I discussed indeed rely on undocumented properties/features, many Matlab users are not aware that tick labels can be individually customized, and that this is a fully documented/supported functionality. This relies on the fact that the default axes TickLabelInterpreter property value is 'tex', which supports a wide range of font customizations, individually for each label. This includes any combination of symbols, superscript, subscript, bold, italic, slanted, face-name, font-size and color – even intermixed within a single label. Since tex is the default interpreter, we don’t need any special preparation – simply set the relevant X/Y/ZTickLabel string to include the relevant tex markup.

To illustrate this, have a look at the following excellent answer by user Ubi on Stack Overflow:

Axes with Tex-customized tick labels

plot(1:10, rand(1,10))
ax = gca;
 
% Simply color an XTickLabel
ax.XTickLabel{3} = ['\color{red}' ax.XTickLabel{3}];
 
% Use TeX symbols
ax.XTickLabel{4} = '\color{blue} \uparrow';
 
% Use multiple colors in one XTickLabel
ax.XTickLabel{5} = '\color[rgb]{0,1,0}green\color{orange}?';
 
% Color YTickLabels with colormap
nColors = numel(ax.YTickLabel);
cm = jet(nColors);
for i = 1:nColors
    ax.YTickLabel{i} = sprintf('\\color[rgb]{%f,%f,%f}%s', cm(i,:), ax.YTickLabel{i});
end

In addition to 'tex', we can also set the axes object’s TickLabelInterpreter to 'latex' for a Latex interpreter, or 'none' if we want to use no string interpretation at all.
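
A minimal sketch of switching the interpreter, assuming an otherwise default axes (the label strings are arbitrary examples, and the figure is hidden only to keep the example unobtrusive):

```matlab
% Minimal sketch: switch the tick-label interpreter of a specific axes
hFig = figure('Visible','off');   % hidden figure, for illustration only
ax = axes(hFig);
plot(ax, 1:3, [2 4 8]);

ax.XTick = 1:3;
ax.TickLabelInterpreter = 'latex';
ax.XTickLabel = {'$\alpha$', '$\beta^2$', '$\gamma_i$'};  % LaTeX math markup

% With 'none', markup characters would be displayed literally:
% ax.TickLabelInterpreter = 'none';
% ax.XTickLabel = {'a_1', 'b^2', '\gamma'};
```

Note that the interpreter applies to all tick labels of the axes, so labels that contain tex markup need to be rewritten when switching to 'latex'.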

As I showed in last week’s post, we can control the gap between the tick labels and the axle line, using the Ruler object’s undocumented TickLabelGapOffset, TickLabelGapMultiplier properties.

Also, as I explained in other posts (here and here), we can also control the display of the secondary axle label (typically exponent or units) using the Ruler’s similarly-undocumented SecondaryLabel property. Note that the related Ruler’s Exponent property is documented/supported, but simply sets a basic exponent label (e.g., '\times10^{6}' when Exponent==6) – to set a custom label string (e.g., '\it\color{gray}Millions'), or to modify its other properties (position, alignment etc.), we should use SecondaryLabel.
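
The difference between the two properties can be sketched roughly as follows. This assumes R2015b or newer (for the documented YAxis/Exponent part); the SecondaryLabel part relies on the undocumented property described above, so its behavior may differ between releases and Matlab may overwrite the string when the ruler is next updated:

```matlab
hFig = figure('Visible','off');   % hidden figure, for illustration only
ax = axes(hFig);
plot(ax, 1:5, (1:5)*1e6);

% Documented: force a specific exponent on the Y ruler
ax.YAxis.Exponent = 6;

% Undocumented (may change without notice): customize the secondary label
drawnow;                          % let the ruler update before touching its label
hLabel = ax.YRuler.SecondaryLabel;
hLabel.String = '\it\color{gray}Millions';
```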

Customizing histogram plots

Yair Altman — Wed, 17 Jan 2018 20:41:15 +0000

Earlier today, I was given the task of displaying a histogram plot of a list of values. In today’s post, I will walk through a few customizations that can be done to bar plots and histograms in order to achieve the desired results.

We start by binning the raw data into pre-selected bins. This can easily be done using the builtin histc (deprecated) or histcounts functions. We can then use the bar function to plot the results:

[binCounts, binEdges] = histcounts(data);
hBars = bar(hAxes, binEdges(1:end-1), binCounts);

Basic histogram bar plot
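
When the bins are pre-selected, the edges can be passed to histcounts explicitly. The following sketch uses made-up sample values (not the actual data) to show how values are assigned: each bin includes its left edge and excludes its right edge, except for the last bin, which includes both edges:

```matlab
% Hypothetical sample data (percent returns), for illustration only
data  = [4.6, 5.0, 5.4, 5.9, 6.0, 7.2];
edges = 4:8;                       % bins: [4,5) [5,6) [6,7) [7,8]
[binCounts, binEdges] = histcounts(data, edges);

% Show each bin's left edge above its count
disp([binEdges(1:end-1); binCounts])
```

Here the value 5.0 is counted in the bin that starts at 5, not in the bin that ends at 5.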

Let’s improve the appearance: In my specific case, the data was financial return (percentage) values, so let’s modify the x-label format accordingly and display a title. To make the labels and title more legible, we decrease the axes FontSize to 8 and remove the axes box:

hAxes = hBars.Parent;
xtickformat(hAxes, '%g%%');
title(hAxes, 'Distribution of total returns (monthly %)');
set(hAxes, 'FontSize',8, 'Box','off')

Improved histogram bar plot

So far nothing undocumented. Note that the xtickformat/ytickformat functions were only introduced in R2016b – for earlier Matlab releases see this post (which does rely on undocumented aspects).

Now, let’s use a couple of undocumented properties: to remove the excess white-space margin around the axes we’ll set the axes’ LooseInset property, and to remove the annoying white space between the tick labels and the X-axis we’ll set the XRuler’s TickLabelGapOffset property to -2 (default: +2):

set(hAxes, 'LooseInset',[0,0,0,0]);    % default = [.13,.11,.095,.075]
hAxes.XRuler.TickLabelGapOffset = -2;  % default = +2

Even better histogram bar plot

Note that I used the undocumented axes XRuler property instead of the axes’ documented XAxis property, because XAxis is only available since R2015b, whereas XRuler (which points to the exact same object as XAxis) exists ever since R2014b, and so is better from a backward-compatibility standpoint. In either case, the ruler’s TickLabelGapOffset property is undocumented. Note that the ruler also contains another associated and undocumented TickLabelGapMultiplier property (default: 0.2), which I have not modified in this case.

Now let’s take a look at the bin labels: The problem with the bar plot above is that it’s not intuitively clear whether the bin for “5%”, for example, includes data between 4.5-5.5 or between 5.0-6.0 (which is the correct answer). It would be nicer if the labels were matched to the actual bin edges. There are 3 basic ways to fix this:

We could modify the bar plot axes tick values and labels, in essence “cheating” by moving the tick labels half a bin leftward of their tick values (don’t forget to add the extra tick label on the right):

hAxes.XTick(end+1) = hAxes.XTick(end) + 1;  % extra tick label on the right
labels = hAxes.XTickLabel;        % preserve tick labels for later use below
hAxes.XTick = hAxes.XTick - 0.5;  % move tick labels 1/2 bin leftward
hAxes.XTickLabel = labels;        % restore pre-saved tick labels
hAxes.XLim = hAxes.XLim - 0.5;    % ...and align the XLim

Improved labels

We could use the bar function’s optional 'histc' flag, in order to display the bars in histogram mode. The problem in histogram mode is that while the labels are now placed correctly, the bars touch each other – I personally find distinct bars that are separated by a small gap easier to understand.
```
hBars = bar(..., 'histc');
% [snip] - same customizations to hAxes as done above
```
Basic histogram plot
With the original bar chart we could use the built-in BarWidth to set the bar/gap width (default: 0.8 meaning a 10% gap on either side of the bar). Unfortunately, calling bar with 'hist' or 'histc' (i.e. histogram mode) results in a Patch (not Bar) object, and patches do not have a BarWidth property. However, we can modify the resulting patch vertices in order to achieve the same effect:
```
% Modify the patch vertices (5 vertices per bar, row-based)
hBars.Vertices(:,1) = hBars.Vertices(:,1) + 0.1;
hBars.Vertices(4:5:end,1) = hBars.Vertices(4:5:end,1) - 0.2;
hBars.Vertices(5:5:end,1) = hBars.Vertices(5:5:end,1) - 0.2;
 
% Align the bars & labels at the center of the axes
hAxes.XLim = hAxes.XLim + 0.5;
```
This now looks the same as option #1 above, except that the top-level handle is a Patch (not Bar) object. For various additional customizations, either Patch or Bar might be preferable, so you have a choice.
Improved histogram plot

Lastly, we could have used the builtin histogram function instead of bar. This function also displays a plot with touching bars, as above, using Quadrilateral objects (a close relative of Patch). The solution here is very similar to option #2 above, but we need to dig a bit harder to modify the patch faces, since their vertices are not exposed as a public property of the Histogram object. To modify the vertices, we first get the private Face property (explanation), and then modify its vertices, keeping in mind that in this specific case the bars have 4 vertices per bar and a different vertices matrix orientation:

hBars = histogram(data, 'FaceAlpha',1.0, 'EdgeColor','none');
% [snip] - same customizations to hAxes as done above
 
% Get access to *ALL* the object's properties
oldWarn = warning('off','MATLAB:structOnObject');
warning off MATLAB:hg:EraseModeIgnored
hBarsStruct = struct(hBars);
warning(oldWarn);
 
% Modify the patch vertices (4 vertices per bar, column-based)
drawnow;  % this is important, won't work without this!
hFace = hBarsStruct.Face;  % a Quadrilateral object (matlab.graphics.primitive.world.Quadrilateral)
hFace.VertexData(1,:) = hFace.VertexData(1,:) + 0.1;
hFace.VertexData(1,3:4:end) = hFace.VertexData(1,3:4:end) - 0.2;
hFace.VertexData(1,4:4:end) = hFace.VertexData(1,4:4:end) - 0.2;

In conclusion, there are many different ways to improve the appearance of charts in Matlab. Even if at first glance it may seem that some visualization function does not have the requested customization property or feature, a little digging will often find either a relevant undocumented property, or an internal object whose properties could be modified. If you need assistance with customizing your charts for improved functionality and appearance, then consider contacting me for a consulting session.

Customizing contour plots part 2

Yair Altman — Sun, 12 Nov 2017 11:03:37 +0000

A few weeks ago a user posted a question on Matlab’s Answers forum, asking whether it is possible to display contour labels in the same color as their corresponding contour lines. In today’s post I’ll provide some insight that may assist users with similar customizations in other plot types.

Matlab does not provide, for reasons that escape my limited understanding, documented access to the contour plot’s component primitives, namely its contour lines, labels and patch faces. Luckily however, these handles are accessible (in HG2, i.e. R2014b onward) via undocumented hidden properties aptly named EdgePrims, TextPrims and FacePrims, as I explained in a previous post about contour plots customization, two years ago.

Let’s start with a simple contour plot of the peaks function:

[X,Y,Z] = peaks;
[C,hContour] = contour(X,Y,Z, 'ShowText','on', 'LevelStep',1);

The result is the screenshot on the left:

Standard Matlab contour labels

Customized Matlab contour labels

In order to update the label colors (to get the screenshot on the right), we create a short updateContours function that updates the TextPrims color to their corresponding EdgePrims color:

The updateContours() function

function updateContours(hContour)
    % Update the text label colors
    drawnow  % very important!
    levels = hContour.LevelList;
    labels = hContour.TextPrims;  % undocumented/unsupported
    lines  = hContour.EdgePrims;  % undocumented/unsupported
    for idx = 1 : numel(labels)
        labelValue = str2double(labels(idx).String);
        lineIdx = find(abs(levels-labelValue)<10*eps, 1);  % avoid FP errors using eps
        labels(idx).ColorData = lines(lineIdx).ColorData;  % update the label color
        %labels(idx).Font.Size = 8;                        % update the label font size
    end
    drawnow  % optional
end

Note that in this function we don’t directly equate the numeric label values to the contour levels’ values: this would work well for integer values but would fail with floating-point ones. Instead I used a very small 10*eps tolerance in the numeric comparison.
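
A short illustration of why the tolerance is needed (the values here are arbitrary and unrelated to the contour data):

```matlab
a = 0.1 * 3;
b = 0.3;
isExactlyEqual = (a == b);              % false, due to binary round-off
isNearlyEqual  = abs(a - b) < 10*eps;   % true
```

Keep in mind that eps is the floating-point spacing near 1.0, so for contour levels of a much larger magnitude a relative tolerance such as 10*eps(b) may be more appropriate.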

Also note that I was careful to call drawnow at the top of the update function, in order to ensure that EdgePrims and TextPrims are updated when the function is called (this might not be the case before the call to drawnow). The final drawnow at the end of the function is optional: it is meant to reduce the flicker caused by the changing label colors, but it can be removed to improve the rendering performance in case of rapidly-changing contour plots.

Finally, note that I added a commented line that shows we can modify other label properties (in this case, the font size from 10 to 8). Feel free to experiment with other label properties.

Putting it all together

The final stage is to call our new updateContours function directly, immediately after creating the contour plot. We also want to call updateContours asynchronously whenever the contour is redrawn, for example, upon a zoom/pan event, or when one of the relevant contour properties (e.g., LevelStep or *Data) changes. To do this, we add a callback listener to the contour object’s [undocumented] MarkedClean event that reruns our updateContours function:

[X,Y,Z] = peaks;
[C,hContour] = contour(X,Y,Z, 'ShowText','on', 'LevelStep',1);
 
% Update the contours immediately, and also whenever the contour is redrawn
updateContours(hContour);
addlistener(hContour, 'MarkedClean', @(h,e)updateContours(hContour));

Contour level values

As noted in my comment reply below, the contour lines (hContour.EdgePrims) correspond to the contour levels (hContour.LevelList).

For example, to make all negative contour lines dotted, you can do the following:

[C,hContour] = contour(peaks, 'ShowText','on', 'LevelStep',1); drawnow
set(hContour.EdgePrims(hContour.LevelList<0), 'LineStyle', 'dotted');

Customized Matlab contour lines

Prediction about forward compatibility

As I noted in my previous post on contour plot customization, I am marking this article as “High risk of breaking in future Matlab versions”, not because of the basic functionality (being important enough I don’t presume it will go away anytime soon) but because of the property names: TextPrims, EdgePrims and FacePrims don’t seem to be very user-friendly property names. So far MathWorks has been very diligent in making its object properties have meaningful names, and so I assume that when the time comes to expose these properties, they will be renamed (perhaps to TextHandles, EdgeHandles and FaceHandles, or perhaps LabelHandles, LineHandles and FillHandles). For this reason, even if you find out in some future Matlab release that TextPrims, EdgePrims and FacePrims don’t exist, perhaps they still exist and simply have different names. Note that these properties have not changed their names or functionality in the past 3 years, so while it could well happen next year, it could also remain unchanged for many years to come. The exact same thing can be said for the MarkedClean event.

Professional assistance anyone?

As shown by this and many other posts on this site, a polished interface and functionality is often composed of small professional touches, many of which are not exposed in the official Matlab documentation for various reasons. So if you need top-quality professional appearance/functionality in your Matlab program, or maybe just a Matlab program that is dependable, robust and highly-performant, consider employing my consulting services.

Runtime code instrumentation

Yair Altman — Thu, 28 Sep 2017 13:36:17 +0000

I regularly follow the MathWorks Pick-of-the-Week (POTW) blog. In a recent post, Jiro Doke highlighted Per Isakson’s tracer4m utility. Per is an accomplished Matlab programmer, who has had a solid reputation in the Matlab user community for many years. His utility uses temporary conditional breakpoints to enable users to trace calls to their Matlab functions and class methods. This uses a little-known trick that I wish to highlight in this post.

tracer4m utility uses conditional breakpoints that evaluate but never become live

Matlab breakpoints are documented and supported functionality, and yet their documented use is typically focused at interactive programming in the Matlab editor, or as interactive commands that are entered in the Matlab console using the set of db* functions: dbstop, dbclear, dbstatus, dbstack etc. However, nothing prevents us from using these db* functions directly within our code.

For example, the dbstack function can help us diagnose the calling tree for the current function, in order to do action A if one of the calling ancestors was FunctionX, or to do action B otherwise (for example, to avoid nested recursions).
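
A minimal sketch of such a check follows. The helper name isCalledFrom is hypothetical (it is not part of Matlab or tracer4m), and it matches simple function names only; class methods and nested functions appear in the stack with qualified names:

```matlab
function tf = isCalledFrom(ancestorName)
%ISCALLEDFROM True if a function with the given name is on the call stack
%   (hypothetical helper, for illustration only)
    stack = dbstack(1);   % omit the frame of isCalledFrom itself
    tf = any(strcmp({stack.name}, ancestorName));
end

% Usage inside some function:
%   if isCalledFrom('FunctionX')
%       % do action A
%   else
%       % do action B
%   end
```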

Similarly, we could add a programmatic call to dbstop in order to stop at a certain code location downstream (for debugging), if a certain condition happens upstream.

Per extended this idea very cleverly in tracer4m: conditional breakpoints evaluate a string in run-time: if the result is true (non-zero) then the code run is stopped at that location, but if it’s false (or zero) then the code run continues normally. To instrument calls to specific functions, Per created a function tracer() that logs the function call (using dbstack) and always returns the value false. He then dynamically created a string that contains a call to this new function and used the dbstop function to create a conditional breakpoint based on this function, something similar to this:

dbstop('in', filename, 'at', location, 'if', 'tracer()');

We can use this same technique for other purposes. For example, if we want to do some action (not necessarily log – perhaps do something else) when a certain code point is reached. The benefit here is that we don’t need to modify the code at all – we’re adding ad-hoc code pieces using the conditional breakpoint mechanism without affecting the source code. This is particularly useful when we do not have access to the source code (such as when it’s compiled or write-protected). All you need to do is to ensure that the instrumentation function always returns false so that the breakpoint does not become live and for code execution to continue normally.
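
The instrumentation function could look something like the following sketch. The name logCall, the file name and the line number are placeholders, and this is not tracer4m’s actual code; the exact stack frames that dbstack reports during a breakpoint-condition evaluation may also vary between releases:

```matlab
function tf = logCall()
%LOGCALL Instrumentation hook for a conditional breakpoint (hypothetical name)
    stack = dbstack(1);      % frames above this helper
    if ~isempty(stack)
        fprintf('%s reached %s (line %d)\n', ...
                datestr(now,'HH:MM:SS.FFF'), stack(1).name, stack(1).line);
    end
    tf = false;              % never let the breakpoint become live
end

% Usage, e.g. in the command window (file name and line are placeholders):
%   dbstop('in', 'myFunction.m', 'at', '12', 'if', 'logCall()');
%   myFunction();            % runs normally, logging whenever line 12 is reached
%   dbclear('in', 'myFunction.m');
```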

The tracer4m utility is quite sophisticated in the sense that it uses mlint and smart regexp to parse the code and know which functions/methods occur on which line numbers and have which type (more details). In this sense, Per used undocumented functionality. I’m certain that Jiro was not aware of the dependency on undocumented features when he posted about the utility, so please don’t take this to mean that Jiro or MathWorks officially support this or any other undocumented functionality. Undocumented aspects are often needed to achieve top functionality, and I’m happy that the POTW blog highlights utilities based on their importance and merit, even if they do happen to use some undocumented aspect.

tracer4m’s code also contains references to the undocumented profiler option -history, but this is not in fact used by the code itself, only in comments. I use this feature in my profile_history utility, which displays the function call/timing history in an interactive GUI window. This utility complements tracer4m by providing a lot more information, but this can result in a huge amount of information for large and/or long-running programs. In addition, tracer4m has the benefit of only logging those functions/methods that the user finds useful, rather than all the function calls, which enables easier debugging when the relevant code area is known. In short, I wish I had known about tracer4m when I created profile_history. Now that I know about it, maybe I’ll incorporate some of its ideas into profile_history in order to make it more useful. Perhaps another moral of this is that we should actively monitor the POTW blog, because true gems are quite often highlighted there.

Function call timeline profiling

For anyone who missed the announcement in my previous post, I’m hosting a series of live webinars on advanced Matlab topics in the upcoming 2 weeks – I’ll be happy if you would join.

Sending HTML emails from Matlab

Yair Altman — Wed, 02 Aug 2017 21:19:42 +0000

A few months ago I wrote about various tricks for sending email/text messages from Matlab. Unfortunately, Matlab only sends text emails by default and provides no documented way to send HTML-formatted emails. Text-only emails are naturally very bland and all mail clients in the past 2 decades support HTML-formatted emails. Today I will show how we can send such HTML emails from Matlab.

A quick recap: Matlab’s sendmail function uses Java (specifically, the standard javax.mail package) to prepare and send emails. The Java classes are extremely powerful and so there is no wonder that Mathworks chose to use them rather than reinventing the wheel. However, Matlab’s sendmail function only uses part of the functionality exposed by these classes (admittedly, the most important parts that deal with the basic mail-sending mechanism), and does not expose external hooks or input args that would enable the user to take full advantage of the more advanced features, HTML formatting included.

Only two small changes are needed in sendmail.m to support HTML formatting:

HTML formatting requires calling the message-object’s setContent() method, rather than setText().
We need to specify 'text/html' as part of the message’s encoding.

To implement these features, change the following (lines #119-130 in the original sendmail.m file of R2017a, changed lines highlighted):

% Construct the body of the message and attachments.
body = formatText(theMessage);
if numel(attachments) == 0
    if ~isempty(charset)
        msg.setText(body, charset);
    else
        msg.setText(body);
    end
else
    % Add body text.
    messageBodyPart = MimeBodyPart;
    if ~isempty(charset)
        messageBodyPart.setText(body, charset);
    ...

to this (changed lines highlighted):

% Construct the body of the message and attachments.
body = formatText(theMessage);
isHtml = ~isempty(body) && body(1) == '<';  % msg starting with '<' indicates HTML
if isHtml
    if isempty(charset)
        charset = 'text/html; charset=utf-8';
    else
        charset = ['text/html; charset=' charset];
    end
end
if numel(attachments) == 0  && ~isHtml
    if isHtml
        msg.setContent(body, charset);
    elseif ~isempty(charset)
        msg.setText(body, charset);
    else
        msg.setText(body);
    end
else
    % Add body text.
    messageBodyPart = MimeBodyPart;
    if isHtml
        messageBodyPart.setContent(body, charset);
    elseif ~isempty(charset)
        messageBodyPart.setText(body, charset);
    ...

In addition, I also found it useful to remove the hard-coded 75-character line-wrapping in text messages. This can be done by changing the following (line #291 in the original sendmail.m file of R2017a):

maxLineLength = 75;

to this:

maxLineLength = inf;  % or some other large numeric value

Deployment

It’s useful to note two alternatives for making these fixes:

Making the changes directly in %matlabroot%/toolbox/matlab/iofun/sendmail.m. You will need administrator rights to edit this file. You will also need to redo the fix whenever you install Matlab, whether on a different machine or as a new Matlab release. In general, I discourage changing Matlab’s internal files because it is simply not very maintainable.
Copying %matlabroot%/toolbox/matlab/iofun/sendmail.m into a dedicated wrapper function (e.g., sendEmail.m) that has a similar function signature and exists on the Matlab path. This has the benefit of working on multiple Matlab releases, and being copied along with the rest of our m-files when we install our Matlab program on a different computer. The downside is that our wrapper function will be stuck with the version of sendmail.m that we copied into it, and we’d lose any possible improvements that Mathworks may implement in future Matlab releases.

The basic idea for the second alternative, the sendEmail.m wrapper, is something like this (the top highlighted lines are the additions made to the original sendmail.m, with everything placed in sendEmail.m on the Matlab path):

function sendEmail(to,subject,theMessage,attachments)
%SENDEMAIL Send e-mail wrapper (with HTML formatting)
   sendmail(to,subject,theMessage,attachments);

% The rest of this file is copied from %matlabroot%/toolbox/matlab/iofun/sendmail.m (with the modifications mentioned above):
function sendmail(to,subject,theMessage,attachments)
%SENDMAIL Send e-mail.
%   SENDMAIL(TO,SUBJECT,MESSAGE,ATTACHMENTS) sends an e-mail.  TO is either a
%   character vector specifying a single address, or a cell array of character vector
...

We would then call the wrapper function as follows:

sendEmail('abc@gmail.com', 'email subject', 'regular text message');     % will send a regular text message
sendEmail('abc@gmail.com', 'email subject', '<b>HTML-formatted</b> <i>message</i>');  % HTML-formatted message

In this case, the code automatically infers HTML formatting based on whether the first character in the message body is a '<' character. Instead, we could just as easily have passed an additional input argument (isHtml) to our sendEmail wrapper function.
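
The detection rule can be tried out in isolation. The following sketch wraps the same test in an anonymous function (the name isHtmlBody is just for illustration) and shows one caveat of the heuristic, namely that leading whitespace defeats it:

```matlab
% Same test as in the modified sendmail.m, wrapped for experimentation
isHtmlBody = @(body) ~isempty(body) && body(1) == '<';

r1 = isHtmlBody('regular text message');   % false
r2 = isHtmlBody('<b>bold</b> message');    % true
r3 = isHtmlBody('  <b>bold</b> message');  % false: leading whitespace
r4 = isHtmlBody('');                       % false: empty message
```

If such cases matter, an explicit isHtml input argument is the more predictable choice.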

Hopefully, in some future Matlab release Mathworks will be kind enough to enable sending 21st-century HTML-formatted emails without needing such hacks. Until then, note that sendmail.m relies on standard non-GUI Java networking classes, which are expected to be supported far into the future, well after Java-based GUI may cease to be supported in Matlab. For this reason I believe that while it seems a bit tricky, the changes that I outlined in today’s post actually have a low risk of breaking in a future Matlab release.

Do you have some other advanced email feature that you use in your Matlab program by some crafty customization to sendmail? If so, please share it in a comment below.

Matlab compilation quirks – take 2

Yair Altman — Wed, 31 May 2017 18:00:42 +0000

Once again I would like to welcome guest blogger Hanan Kavitz of Applied Materials. Hanan posted a couple of guest posts here over the past few years, including a post last year about quirks with Matlab-compiled DLLs. Today Hanan will follow up on that post by discussing several additional quirks that they have encountered with Matlab compilations/deployment.

Don’t fix it, if it ain’t broke…

In Applied Materials Israel (PDC) we use Matlab code for both algorithm development and deployment (production). As part of the dev-ops build system, which builds our product software versions, we build Matlab artifacts (binaries) from the Matlab source code.
A typical software version has several hundred Matlab artifacts that are automatically rebuilt on a daily basis, and we have many such versions – totaling many thousands of compilations each day.
This process takes a long time, so we were looking for a way to make it more efficient.
The idea that we chose to implement sounds simple – take a single binary module in any software version (e.g., foo.exe – a Matlab-compiled exe) and check it: if the source code for this module has not changed since the last compilation then simply don’t compile it, just copy it from the previous software version’s repository. Since most of our code doesn’t change daily (some of it hasn’t changed in years), we can skip the compilation time of most binaries and just copy them from some repository of previously compiled binaries.

In a broader look, avoiding lengthy compilation cycles by not compiling unchanged code is a common programming practice, implemented by all modern compilers. For example, the ‘make’ utility uses a ‘makefile’ to check the time stamps of all dependencies of every object file in order to decide which object requires recompilation. In reality, this is not always the best solution as time stamps may be incorrect, but it works well in the vast majority of cases.
Coming back to Matlab, now comes the hard part – how could our build system know that nothing has changed in module X and that something has changed in module Y? How does it even know which source files it needs to ensure didn’t change?
The credit for the idea goes to my manager, Lior Cohen, as follows: You can actually check the dependency of a given binary after compilation. The basis of the solution is that a Matlab executable is in fact a compressed (zip) file. The idea is then to:
Compile the binary once
Unzip the binary and “see” all your dependencies (source files are encrypted and resources are not, but we only need the list of file names – not their content).
Now build a list of all your dependency files and compute the CRC value of each from the source control. Save it for the next time you are required to compile this module.
In the next compilation cycle, find this dependency list, review it one dependency source file at a time, and make sure that the CRC of each dependency hasn’t changed since last time.
If no dependency CRC has changed, then copy the binary from the repository of the previous software version, without compiling.
Otherwise, recompile the binary and rebuild the CRC list of all dependencies again, in preparation for the next compilation cycle.
That’s it! That simple? Well… not really – the reality is a bit more complex since there are many other dependencies that need to be checked. Some of them are:
Did the requested Matlab version of the binary change since the last compilation?
Did the compilation instructions themselves (we have a sort of ‘makefile’) change?
Basically, I implemented a policy that if anything changed, or if the dependency check itself failed, then we don’t take any chances and just compile this binary. Keeping in mind that this dependencies check and file copying is much faster than a Matlab compilation, we save a lot of actual compilation time using this method.
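
The CRC comparison at the heart of this policy could be sketched as follows. This is only an illustration and not our actual build-system code: the function names and the MAT-file format of the saved list are hypothetical, the dependency list is assumed to have already been extracted, and the extra checks (Matlab version, compilation instructions) are omitted. The CRC is computed using Java’s standard java.util.zip.CRC32 class:

```matlab
function tf = needsRebuild(depFiles, crcListFile)
%NEEDSREBUILD True unless all dependency CRCs match the saved list
%   depFiles    - cell array of dependency file names (hypothetical input)
%   crcListFile - MAT-file holding depFiles & crcs from the last compilation
    tf = true;   % when in doubt, recompile
    try
        newCrcs = cellfun(@fileCrc32, depFiles);
        if exist(crcListFile, 'file')
            saved = load(crcListFile, 'depFiles', 'crcs');
            tf = ~isequal(saved.depFiles, depFiles) || ...
                 ~isequal(saved.crcs, newCrcs);
        end
    catch
        % any failure of the check itself => take no chances, recompile
    end
end

function crc = fileCrc32(filename)
%FILECRC32 CRC-32 of a file's contents, using java.util.zip.CRC32
    fid = fopen(filename, 'r');
    assert(fid > 0, 'Cannot open %s', filename);
    bytes = fread(fid, Inf, '*uint8');
    fclose(fid);
    crcObj = java.util.zip.CRC32;
    if ~isempty(bytes)
        crcObj.update(bytes, 0, numel(bytes));
    end
    crc = crcObj.getValue();
end
```

After each real compilation, the list would be regenerated, for example with crcs = cellfun(@fileCrc32, depFiles); save(crcListFile, 'depFiles', 'crcs').
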
Bottom line: Given a software version containing hundreds of compilation instructions to execute and assuming not much has changed in the version (which is often the case), we skip over 90% of compilations altogether and only rebuild what really changed. The result is a version build that takes about half an hour, instead of many hours. Moreover, since the compilation process is working significantly less, we get fewer failures, fewer stuck or crashed mcc processes, and [no less importantly] less maintenance required by me.
Note that in our implementation we rely on the undocumented fact that Matlab binaries are in fact compressed zip archives. If and when a future Matlab release changes the implementation such that the binaries are no longer zip archives, another way will need to be devised in order to ensure the consistency of the target executable with its dependent source files.

Don’t kill it, if it ain’t bad…

I want to share a very weird issue that I investigated over a year ago when using a Matlab-compiled exe. It started when a user showed me a Matlab-compiled exe that didn’t run – I’m not talking about a regular Matlab exception: the process was crashing with an MS Windows popup window stating something very obscure.
It was a very weird behavior that I couldn’t explain – the compiler seemed to work well but the compiled executable process kept crashing. Compiling completely different code showed the same behavior.
This issue has to do with the system compiler configuration that is being used. As you might know, when installing the Matlab compiler, before the first compilation is ever made, the user has to state the C compiler that the Matlab compiler should use in its compilation process. This is done using the command ‘mbuild -setup’. This command asks the user to choose the C compiler and saves the configuration (a batch file back then, xml in the newer versions of Matlab) in the user’s prefdir folder. At the time we were using Microsoft Visual C++ compiler 9.0 SP1.
The breakthrough in the investigation came when I ran the mcc command with the -verbose flag, which outputs much more compilation info than I would typically ever want… I discovered that although the target executable file had been created, a post-compilation step failed to execute, while issuing a very cryptic error message:
mt.exe : general error c101008d: Failed to write the updated manifest to the resource of file “…”. Access is denied.
Cryptic compilation error
The failure was in one of the ‘post link’ commands in the configuration batch file – something obscure such as this:
set POSTLINK_CMDS2=mt.exe -outputresource:%MBUILD_OUTPUT_FILE_NAME%;%MANIFEST_RESOURCE% -manifest "%MANIFEST_FILE_NAME%"
This line of code takes an XML manifest file and inserts it into the generated binary file (additional details).
If you open a valid R2010a (and probably other old versions as well) Matlab-generated exe in a text editor you can actually see a small XML code embedded in it, while in a non-functioning exe I could not see this XML code.
So why would this command fail?
It turned out, as funny as it sounds, to be an antivirus issue – our IT department updated its antivirus policies and this ‘post link’ command suddenly became an illegal operation. Once our IT eased the policy, this command worked well again and the compiled executables stopped crashing, to our great joy.

Additional license data

Yair Altman — Wed, 15 Feb 2017 18:01:55 +0000

Matlab’s license function returns the primary license number/ID used by Matlab, but no information about the various toolboxes that may be installed. The ver function returns a bit more information, listing the version number and installation date of installed toolboxes (even user toolboxes, such as my IB-Matlab toolbox). However, no additional useful information is provided beyond that:
>> license
ans =
123456    % actual number redacted

>> ver
----------------------------------------------------------------------------------------------------
MATLAB Version: 9.1.0.441655 (R2016b)
MATLAB License Number: 123456
Operating System: Microsoft Windows 7 Professional Version 6.1 (Build 7601: Service Pack 1)
Java Version: Java 1.7.0_60-b19 with Oracle Corporation Java HotSpot(TM) 64-Bit Server VM mixed mode
----------------------------------------------------------------------------------------------------
MATLAB                                                Version 9.1         (R2016b)
Curve Fitting Toolbox                                 Version 3.5.4       (R2016b)
Database Toolbox                                      Version 7.0         (R2016b)
Datafeed Toolbox                                      Version 5.4         (R2016b)
Financial Instruments Toolbox                         Version 2.4         (R2016b)
Financial Toolbox                                     Version 5.8         (R2016b)
GUI Layout Toolbox                                    Version 2.2.1       (R2015b)
Global Optimization Toolbox                           Version 3.4.1       (R2016b)
IB-Matlab - Matlab connector to InteractiveBrokers    Version 1.89        Expires: 1-Apr-2018
Image Processing Toolbox                              Version 9.5         (R2016b)
MATLAB Coder                                          Version 3.2         (R2016b)
MATLAB Report Generator                               Version 5.1         (R2016b)
Optimization Toolbox                                  Version 7.5         (R2016b)
Parallel Computing Toolbox                            Version 6.9         (R2016b)
Statistical Graphics Toolbox                          Version 1.2
Statistics and Machine Learning Toolbox               Version 11.0        (R2016b)

>> v = ver
v =
  1×16 struct array with fields:
    Name
    Version
    Release
    Date

>> v(1)
ans =
  struct with fields:
       Name: 'Curve Fitting Toolbox'
    Version: '3.5.4'
    Release: '(R2016b)'
       Date: '25-Aug-2016'

>> v(8)
ans =
  struct with fields:
       Name: 'IB-Matlab - Matlab connector to InteractiveBrokers'
    Version: '1.89'
    Release: 'Expires: 1-Apr-2018'
       Date: '02-Feb-2017'
It is sometimes useful to know which license number “owns” which product/toolbox, and the expiration date associated with each of them. Unfortunately, there is no documented way to retrieve this information in Matlab – the only documented way is to go to your account section on the MathWorks website and check there.
Luckily, there is a simpler way that can be used to retrieve additional information, from right inside Matlab, using matlab.internal.licensing.getFeatureInfo:

>> all_data = matlab.internal.licensing.getFeatureInfo
all_data =
  23×1 struct array with fields:
    feature
    expdate
    keys
    license_number
    entitlement_id

>> all_data(20)
ans =
  struct with fields:
           feature: 'optimization_toolbox'
           expdate: '31-mar-2018'
              keys: 0
    license_number: '123456'
    entitlement_id: '1409891'

>> all_data(21)
ans =
  struct with fields:
           feature: 'optimization_toolbox'
           expdate: '07-mar-2017'
              keys: 0
    license_number: 'DEMO'
    entitlement_id: '3749959'
As can be seen in this example, I have the Optimization toolbox licensed under my main Matlab license (123456 [actual number redacted]) until 31-mar-2018, and also licensed under a trial (DEMO) license that expires in 3 weeks. As long as a toolbox has any future expiration date, it will continue to function, so in this case I’m covered until March 2018.
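The rule above (a toolbox keeps working as long as any of its licenses has a future expiration date) can be sketched in a few lines. This is only an illustration: the sample struct array is made up to mimic the fields shown above, and the code assumes that every expdate is a dd-mmm-yyyy string; any other value would need special handling.

```matlab
% Sketch: find the latest expiration date of each licensed feature.
% The sample data mimics the fields shown above; in a live session use:
%   all_data = matlab.internal.licensing.getFeatureInfo;
all_data = struct( ...
    'feature', {'optimization_toolbox', 'optimization_toolbox', 'matlab'}, ...
    'expdate', {'31-mar-2018',          '07-mar-2017',          '31-mar-2018'});

features = unique({all_data.feature});   % sorted list of distinct feature names
latest   = cell(size(features));
for k = 1 : numel(features)
    isMatch   = strcmp({all_data.feature}, features{k});
    dateNums  = datenum({all_data(isMatch).expdate}, 'dd-mmm-yyyy');
    latest{k} = datestr(max(dateNums), 'dd-mmm-yyyy');   % the latest date wins
    fprintf('%-25s valid until %s\n', features{k}, latest{k});
end
```

Taking the maximum date per feature mirrors the DEMO vs. main license situation above: the trial license simply does not matter once a later-expiring license exists.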
We can also request information about a specific toolbox (“feature”):
>> data = matlab.internal.licensing.getFeatureInfo('matlab')
data =
  3×1 struct array with fields:
    feature
    expdate
    keys
    license_number
    entitlement_id

>> data(1)
ans =
  struct with fields:
           feature: 'matlab'
           expdate: '31-mar-2018'
              keys: 0
    license_number: '123456'
    entitlement_id: '1409891'
The drawback of this functionality is that it only provides information about MathWorks’ toolboxes, not any user-provided toolboxes (such as my IB-Matlab connector, or MathWorks’ own GUI Layout toolbox). Also, some of the toolbox names may be difficult to understand (“gads_toolbox” apparently stands for the Global Optimization Toolbox, for example):
>> {all_data.feature}
ans =
  1×23 cell array
  Columns 1 through 4
    'curve_fitting_toolbox'    'database_toolbox'    'datafeed_toolbox'    'distrib_computing_toolbox'
  Columns 5 through 8
    'distrib_computing_toolbox'    'excel_link'    'fin_instruments_toolbox'    'financial_toolbox'
  Columns 9 through 15
    'gads_toolbox'    'gads_toolbox'    'image_toolbox'    'image_toolbox'    'matlab'    'matlab'    'matlab'
  Columns 16 through 20
    'matlab_coder'    'matlab_coder'    'matlab_report_gen'    'matlab_report_gen'    'optimization_toolbox'
  Columns 21 through 23
    'optimization_toolbox'    'optimization_toolbox'    'statistics_toolbox'
A related undocumented builtin function is matlab.internal.licensing.getLicInfo:
% Information on a single toolbox/product:
>> matlab.internal.licensing.getLicInfo('matlab')
ans =
  struct with fields:
     license_number: {'123456'  'Prerelease'  'T3749959'}
    expiration_date: {'31-mar-2018'  '30-sep-2016'  '07-mar-2017'}

% Information on multiple toolboxes/products:
>> matlab.internal.licensing.getLicInfo({'matlab', 'image_toolbox'})  % cell array of toolbox/feature names
ans =
  1×2 struct array with fields:
    license_number
    expiration_date

% The full case-insensitive names of the toolboxes can also be used:
>> matlab.internal.licensing.getLicInfo({'Matlab', 'Image Processing toolbox'})
ans =
  1×2 struct array with fields:
    license_number
    expiration_date

% And here's how to get the full list (MathWorks products only):
>> v=ver; data=matlab.internal.licensing.getLicInfo({v.Name})
data =
  1×16 struct array with fields:
    license_number
    expiration_date
I have [still] not found any way to associate a user toolbox/product (such as my IB-Matlab) in a way that will report it in a unified manner with the MathWorks products. If anyone finds a way to do this, please do let me know.
p.s. – don’t even think of asking questions or posting comments on this website related to illegal uses or hacks of the Matlab license…

Parsing XML strings

Yair Altman — Wed, 01 Feb 2017 09:52:45 +0000

I have recently consulted in a project where data was provided in XML strings and needed to be parsed in Matlab memory in an efficient manner (in other words, as quickly as possible). Now granted, XML is rather inefficient in storing data (JSON would be much better for this, for example). But I had to work with the given situation, and that required processing the XML.
I basically had two main alternatives:
I could either create a dedicated string-parsing function that searches for a particular pattern within the XML string, or
I could use a standard XML-parsing library to create the XML model and then parse its nodes
The first alternative is quite error-prone, since it relies on the exact format of the data in the XML. Since the same data can be represented in multiple equivalent XML ways, making the string-parsing function robust as well as efficient would be challenging. I was ~~lazy~~ expedient, so I chose the second alternative.
Unfortunately, Matlab’s xmlread function only accepts input filenames (of *.xml files); it cannot directly parse XML strings. Yummy!
The obvious and simple solution is to simply write the XML string into a temporary *.xml file, read it with xmlread, and then delete the temp file:
% Store the XML data in a temp *.xml file
filename = [tempname '.xml'];
fid = fopen(filename,'Wt');
fwrite(fid,xmlString);
fclose(fid);

% Read the file into an XML model object
xmlTreeObject = xmlread(filename);

% Delete the temp file
delete(filename);

% Parse the XML model object
...
This works well and we could move on with our short lives. But cases such as this, where a built-in function seems to have a silly limitation, really fire up the investigative reporter in me. I decided to drill into xmlread to discover why it couldn’t parse XML strings directly in memory, without requiring costly file I/O. It turns out that xmlread accepts not just file names as input, but also Java object references (specifically, java.io.File, java.io.InputStream or org.xml.sax.InputSource). In fact, there are quite a few other inputs that we could use, to specify a validation parser etc. – I wrote about this briefly back in 2009 (along with other similar semi-documented input alternatives in xmlwrite and xslt).

In our case, we could simply send xmlread as input a java.io.StringBufferInputStream(xmlString) object (which is an instance of java.io.InputStream) or org.xml.sax.InputSource(java.io.StringReader(xmlString)):
% Read the xml string directly into an XML model object
inputObject = java.io.StringBufferInputStream(xmlString);  % alternative #1
inputObject = org.xml.sax.InputSource(java.io.StringReader(xmlString));  % alternative #2
xmlTreeObject = xmlread(inputObject);

% Parse the XML model object
...
If we don’t want to depend on undocumented functionality (which might break in some future release, although it has remained unchanged for at least the past decade), and in order to improve performance even further by bypassing xmlread’s internal validity checks and processing, we can use xmlread’s core functionality to parse our XML string directly. We can add a fallback to the standard (fully-documented) functionality, just in case something goes wrong (which is good practice whenever using any undocumented functionality):
try
    % The following avoids the need for file I/O:
    inputObject = java.io.StringBufferInputStream(xmlString);  % or: org.xml.sax.InputSource(java.io.StringReader(xmlString))
    try
        % Parse the input data directly using xmlread's core functionality
        parserFactory = javaMethod('newInstance','javax.xml.parsers.DocumentBuilderFactory');
        p = javaMethod('newDocumentBuilder',parserFactory);
        xmlTreeObject = p.parse(inputObject);
    catch
        % Use xmlread's semi-documented inputObject input feature
        xmlTreeObject = xmlread(inputObject);
    end
catch
    % Fallback to standard xmlread usage, using a temporary XML file:

    % Store the XML data in a temp *.xml file
    filename = [tempname '.xml'];
    fid = fopen(filename,'Wt');
    fwrite(fid,xmlString);
    fclose(fid);

    % Read the file into an XML model object
    xmlTreeObject = xmlread(filename);

    % Delete the temp file
    delete(filename);
end

% Parse the XML model object
...
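For completeness, here is a minimal sketch of what the final "parse the XML model object" step might look like. The XML string and its element names (orders, order, symbol, price) are hypothetical, made up for this illustration; the traversal uses only the standard org.w3c.dom methods of the object that xmlread returns. The sketch uses the InputSource/StringReader variant, since java.io.StringBufferInputStream is deprecated in Java for not converting characters to bytes properly.

```matlab
% Hypothetical XML string, for illustration only
xmlString = '<orders><order id="17"><symbol>IBM</symbol><price>99.99</price></order></orders>';

% Parse the string in memory, without any file I/O
inputObject   = org.xml.sax.InputSource(java.io.StringReader(xmlString));
xmlTreeObject = xmlread(inputObject);

% Walk the DOM using standard org.w3c.dom methods (Java indices are 0-based)
orderNodes = xmlTreeObject.getElementsByTagName('order');
for k = 0 : orderNodes.getLength-1
    orderNode = orderNodes.item(k);
    id     = char(orderNode.getAttribute('id'));
    symbol = char(orderNode.getElementsByTagName('symbol').item(0).getTextContent);
    price  = str2double(char(orderNode.getElementsByTagName('price').item(0).getTextContent));
    fprintf('Order %s: %s @ %g\n', id, symbol, price);
end
```

The char() calls convert the returned java.lang.String objects into regular Matlab strings before further processing.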

Quirks with parfor vs. for

Yair Altman — Thu, 05 Jan 2017 17:15:48 +0000

A few months ago, I discussed several tips regarding Matlab’s parfor command, which is used by the Parallel Computing Toolbox (PCT) for parallelizing loops. Today I wish to extend that post with some unexplained oddities when using parfor, compared to a standard for loop.
Data serialization quirks
Dimitri Shvorob may not appear at first glance to be a prolific contributor on Matlab Central, but from the little he has posted over the years I regard him to be a Matlab power-user. So when Dimitri reports something, I take it seriously. Such was the case several months ago, when he contacted me regarding very odd behavior that he saw in his code: the for loop worked well, but the parfor version returned different (incorrect) results. Eventually, Dimitri traced the problem to something originally reported by Dan Austin on his Fluffy Nuke It blog.
The core issue is that if we have a class object that is used within a for loop, Matlab can access the object directly in memory. But with a parfor loop, the object needs to be serialized in order to be sent over to the parallel workers, and deserialized within each worker. If this serialization/deserialization process involves internal class methods, the workers might see a different version of the class object than the one seen in the serial for loop. This could happen, for example, if the serialization/deserialization method croaks on an error, or depends on some dynamic (or random) conditions to create data.
In other words, when we use data objects in a parfor loop, the data object is not necessarily sent “as-is”: additional processing may be involved under the hood that modify the data in a way that may be invisible to the user (or the loop code), resulting in different processing results of the parallel (parfor) vs. serial (for) loops.
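To make the mechanism concrete, here is a minimal sketch using a hypothetical class (CachedData is made up for this illustration). It mimics serialization with a documented save/load round-trip through a MAT file, on the assumption that parfor relies on the same object save/load process; I have not verified what the workers actually receive in this specific case.

```matlab
% File CachedData.m (hypothetical class, for illustration only)
classdef CachedData
    properties
        Values = [];
    end
    properties (Transient)
        Scale = 1;   % Transient properties are not saved, so loading resets them to the default
    end
    methods
        function result = total(obj)
            result = obj.Scale * sum(obj.Values);
        end
    end
end

% Usage (run from the command line or a separate script):
%   obj = CachedData;  obj.Values = 1:3;  obj.Scale = 10;
%   before = obj.total();                    % computed with Scale = 10
%   filename = [tempname '.mat'];
%   save(filename, 'obj');                   % serialize ...
%   s = load(filename);  delete(filename);   % ... and deserialize
%   after = s.obj.total();                   % computed with the default Scale = 1
```

The object that comes back from the round-trip silently differs from the one in memory, which is exactly the kind of invisible change that can make a parallel loop disagree with its serial counterpart.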
For additional aspects of Matlab serialization/deserialization, see my article from 2 years ago (and its interesting feedback comments).
Data precision quirks
The following section was contributed by guest blogger Lior Perlmuter-Shoshany, head algorithmician at a private equity fund.
In my work, I had to work with matrices on the order of 10⁹ cells. To reduce the memory footprint (and hopefully also improve performance), I decided to work with data of type single instead of Matlab’s default double. Furthermore, in order to speed up the calculation I use parfor rather than for in the main calculation. At the end of the run I run a small for loop to see the best results.
What I discovered to my surprise is that the results from the parfor and for loop variants are not the same!

The following simplified code snippet illustrates the problem by calculating a simple standard-deviation (std) over the same data, in both single- and double-precision. Note that the loops are run with only a single iteration, to illustrate the fact that the problem is with the parallelization mechanism (probably the serialization/deserialization parts once again), not with the distribution of iterations among the workers.
clear
rng('shuffle','twister');

% Prepare the data in both double and single precision
arr_double = rand(1,100000000);
arr_single = single(arr_double);

% No loop - direct computation
std_single0 = std(arr_single);
std_double0 = std(arr_double);

% Loop #1 - serial for loop
std_single = 0;
std_double = 0;
for i=1
    std_single(i) = std(arr_single);
    std_double(i) = std(arr_double);
end

% Loop #2 - parallel parfor loop
par_std_single = 0;
par_std_double = 0;
parfor i=1
    par_std_single(i) = std(arr_single);
    par_std_double(i) = std(arr_double);
end

% Compare results of for loop vs. non-looped computation
isForSingleOk = isequal(std_single, std_single0)
isForDoubleOk = isequal(std_double, std_double0)

% Compare results of single-precision data (for vs. parfor)
isParforSingleOk = isequal(std_single, par_std_single)
parforSingleAccuracy = std_single / par_std_single

% Compare results of double-precision data (for vs. parfor)
isParforDoubleOk = isequal(std_double, par_std_double)
parforDoubleAccuracy = std_double / par_std_double
Output example:
isForSingleOk =
    1                   % <= true (of course!)
isForDoubleOk =
    1                   % <= true (of course!)
isParforSingleOk =
    0                   % <= false (odd!)
parforSingleAccuracy =
    0.73895227413361    % <= single-precision results are radically different in parfor vs. for
isParforDoubleOk =
    0                   % <= false (odd!)
parforDoubleAccuracy =
    1.00000000000021    % <= double-precision results are almost [but not exactly] the same in parfor vs. for
From my testing, the larger the data array, the bigger the difference is between the results of single-precision data when running in for vs. parfor.
In other words, my experience has been that if you have a huge data matrix, it’s better to parallelize it in double-precision if you wish to get [nearly] accurate results. But even so, I find it deeply disconcerting that the results are not exactly identical (at least on R2015a-R2016b on which I tested) even for the native double-precision.
Hmmm… bug?
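One plausible contributor (an assumption on my part, not something verified against parfor's internals) is that the same elements get accumulated in a different order on the workers, and floating-point addition is sensitive to that order, far more so in single than in double precision. The following sketch has nothing to do with parfor itself; it only shows how much the order of accumulation can matter for single-precision data, using explicit loops so that the order is fully under our control:

```matlab
% Sum one million copies of single(0.1) in two different orders
n = 1e6;
x = single(0.1) * ones(1, n, 'single');

% Order #1: one long running total
seqSum = single(0);
for k = 1 : n
    seqSum = seqSum + x(k);
end

% Order #2: 1000 partial sums of 1000 elements each, then combined
blockSum = single(0);
for b = 1 : 1000
    partial = single(0);
    for k = (b-1)*1000 + (1:1000)
        partial = partial + x(k);
    end
    blockSum = blockSum + partial;
end

exact = 0.1 * n;   % reference value, computed in double precision
fprintf('sequential: %.4f, blocked: %.4f, reference: %.4f\n', seqSum, blockSum, exact);

% A tolerance-based comparison is more meaningful here than isequal()
relTol  = 1e-3;
isClose = abs(double(blockSum) - exact) <= relTol * exact
```

Whatever the root cause turns out to be, comparing floating-point results with a relative tolerance rather than with isequal is the safer habit whenever the computation path may differ.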
Upcoming travels – Zürich & Geneva
I will shortly be traveling to clients in Zürich and Geneva, Switzerland. If you are in the area and wish to meet me to discuss how I could bring value to your work with some advanced Matlab consulting or training, then please email me (altmany at gmail):
Zürich: January 15-17
Geneva: January 18-21
Happy new year everybody!

Sending email/text messages from Matlab

Yair Altman — Wed, 07 Dec 2016 21:24:03 +0000

In this day and age, applications are expected to communicate with users by sending email/text messages to alert them about applicative events (“IBM stock purchased @$99.99” or “House is on fire!”). Matlab has included the sendmail function to handle this for many years. Unfortunately, sendmail requires some tweaking to be useful on all but the most basic/insecure mail servers. Today’s post will hopefully fill the missing gaps.
None of the information I’ll present today is really new – it was all there already if you just knew what to search for online. But hopefully today’s post will concentrate all these loose ends in a single place, so it may have some value:
Using a secure mail server
Emailing multiple recipients
Sending text messages
User configuration panel
Using a secure mail server
All modern mail servers use TLS/SSL encryption. The sendmail function needs extra configuration to handle such connections, since it is configured for a non-encrypted connection by default. Here’s the code that does this for gmail, using SMTP server smtp.gmail.com and default port #465 (for other SMTP servers, see here):
setpref('Internet', 'E_mail', from_address);  % sender "from" address, typically same as username, e.g. 'xyz@gmail.com'
setpref('Internet', 'SMTP_Username', username);
setpref('Internet', 'SMTP_Password', password);
setpref('Internet', 'SMTP_Server', 'smtp.gmail.com');

props = java.lang.System.getProperties;
props.setProperty('mail.smtp.auth', 'true');  % Note: 'true' as a string, not a logical value!
props.setProperty('mail.smtp.starttls.enable', 'true');  % Note: 'true' as a string, not a logical value!
props.setProperty('mail.smtp.socketFactory.port', '465');  % Note: '465' as a string, not a numeric value!
props.setProperty('mail.smtp.socketFactory.class', 'javax.net.ssl.SSLSocketFactory');

sendmail(recipient, title, body, attachments);  % e.g., sendmail('recipient@gmail.com', 'Hello world', 'What a nice day!', 'C:\images\sun.jpg')
All this is not enough to enable Matlab to connect to gmail’s SMTP servers. In addition, we need to set the Google account to allow access from “less secure apps” (details, direct link). Without this, Google will not allow Matlab to relay emails. Other mail servers may require similar server-side account configurations to enable Matlab’s access.
Note: This code snippet uses a bit of Java as you can see. Under the hood, all networking code in Matlab relies on Java, and sendmail is no exception. For some reason that I don’t fully understand, MathWorks chose to label the feature of using sendmail with secure mail servers as a feature that relies on “undocumented commands” and is therefore not listed in sendmail’s documentation. Considering the fact that all modern mail servers are secure, this seems to make sendmail rather useless without the undocumented extension. I assume that TMW are well aware of this, which is the reason they posted partial documentation in the form of an official tech-support answer. I hope that one day MathWorks will incorporate it into sendmail as optional input args, so that using sendmail with secure servers would become fully documented and officially supported.
Emailing multiple recipients
To specify multiple email recipients, it is not enough to set sendmail’s recipient input arg to a string with , or ; delimiters. Instead, we need to provide a cell array of individual recipient strings. For example:
sendmail({'recipient1@gmail.com','recipient2@gmail.com'}, 'Hello world', 'What a nice day!')
Note: this feature is actually fully documented in sendmail’s doc-page, but for some reason I see that some users are not aware of it (to which it might be said: RTFM!).
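If the recipients arrive as a single delimited string (from a config file or an edit-box, for instance), a couple of lines are enough to convert it into the cell array that sendmail expects. This is just a sketch; the variable names and the sample addresses are made up for illustration:

```matlab
% Convert a delimited recipients string into a cell array of individual addresses
recipientsStr = 'recipient1@gmail.com; recipient2@gmail.com,recipient3@gmail.com';
recipients = strtrim(regexp(recipientsStr, '[,;]', 'split'));   % split on , or ; and trim spaces
recipients = recipients(~cellfun('isempty', recipients));        % drop empty entries (e.g., trailing delimiter)

% sendmail(recipients, 'Hello world', 'What a nice day!');  % uncomment once sendmail is configured
```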
Sending phone text (SMS) messages
With modern smartphones, text (SMS) messages have become rather outdated, as most users get push notifications of incoming emails. Still, for some users text messages may be useful. To send such messages, all we need is to determine our mobile carrier’s email gateway for SMS messages, and send a simple text message to that email address. For example, to send a text message to T-Mobile number 123-456-7890 in the US, simply email the message to 1234567890@tmomail.net (details).
Ke Feng posted a nice Matlab File Exchange utility that wraps this messaging for a wide variety of US carriers.
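For a single known carrier, the gateway address can also be assembled by hand. A minimal sketch, using the T-Mobile example above (the variable names are made up for illustration; other carriers use other gateway domains):

```matlab
% Build the email address of a carrier's SMS gateway from a phone number
phoneNumber   = '123-456-7890';   % example number from the text above
gatewayDomain = 'tmomail.net';    % T-Mobile US gateway, as mentioned above

digitsOnly = regexprep(phoneNumber, '\D', '');   % keep only the digits
smsAddress = [digitsOnly '@' gatewayDomain];

% sendmail(smsAddress, 'Alert', 'House is on fire!');  % uncomment once sendmail is configured
```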
User configuration panel
Many GUI programs contain configuration panels/tabs/windows. Enabling the user to set up their own email provider is a typical use-case for such a configuration. Naturally, you’d want your config panel not to display plain-text password, nor non-integer port numbers. You’d also want the user to be able to test the email connection.
Here’s a sample implementation for such a panel that I implemented for a recent project – I plan to discuss the implementation details of the password and port (spinner) controls in my next post, so stay tuned:

User configuration of emails in Matlab GUI
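Independently of the GUI controls, the port value itself is easy to validate before it is used. The following is only a sketch of one possible check; isValidPort is a hypothetical helper, not part of the panel shown above:

```matlab
% Hypothetical validation helper for a port-number input
isValidPort = @(p) isnumeric(p) && isscalar(p) && isfinite(p) && ...
                   p == fix(p) && p >= 1 && p <= 65535;

ok1 = isValidPort(465);      % an integer within the valid TCP port range
ok2 = isValidPort(465.5);    % non-integer value
ok3 = isValidPort('465');    % a string rather than a number

% The Java mail properties expect the port as a string, so convert a valid number:
portStr = sprintf('%d', 465);
```

Since the && operator short-circuits, the numeric comparisons are never evaluated for non-numeric or non-scalar input.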

Afterthoughts on implicit expansion

Yair Altman — Wed, 30 Nov 2016 20:28:44 +0000

Matlab release R2016b introduced implicit arithmetic expansion, which is a great and long-awaited natural expansion of Matlab’s arithmetic syntax (if you are still unaware of this or what it means, now would be a good time to read about it). This is a well-documented new feature. The reason for today’s post is that this new feature contains an undocumented aspect that should very well have been documented and even highlighted.
The undocumented aspect that I’m referring to is the fact that code that until R2016a produced an error, in R2016b produces a valid result:
% R2016a
>> [1:5] + [1:3]'
Error using +
Matrix dimensions must agree.

% R2016b
>> [1:5] + [1:3]'
ans =
     2     3     4     5     6
     3     4     5     6     7
     4     5     6     7     8
This incompatibility is indeed documented, but not where it matters most (read on).
I first discovered this feature by chance when trying to track down a very strange phenomenon with client code that produced different numeric results on R2015b and earlier, compared to R2016a Pre-release. After some debugging the problem was traced to a code snippet in the client’s code that looked something like this (simplified):
% Ensure compatible input data
try
    dataA + dataB;  % this will (?) error if dataA, dataB are incompatible
catch
    dataB = dataB';
end
The code snippet relied on the fact that incompatible data (row vs. col) would error when combined, as it did up to R2015b. But in R2016a Pre-release it just gave a valid numeric matrix, which caused numerically incorrect results downstream in the code. The program never crashed, so everything appeared to be in order, it just gave different numeric results. I looked at the release notes and none of the mentioned release incompatibilities appeared relevant. It took me quite some time, using side-by-side step-by-step debugging on two separate instances of Matlab (R2015b and R2016aPR) to trace the problem to this new feature.
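A more robust way to write such a check is to test the dimensions explicitly, rather than rely on the + operator to raise an error. The following sketch shows one way to do it; the sample data and the error ID are made up for illustration:

```matlab
% Sample data: a row vector and a column vector of the same length
dataA = 1:5;
dataB = (1:5)';

% Check the dimensions explicitly, rather than relying on "+" to raise an error
if ~isequal(size(dataA), size(dataB))
    if isvector(dataA) && isvector(dataB) && numel(dataA) == numel(dataB)
        dataB = reshape(dataB, size(dataA));   % same length, different orientation
    else
        error('myApp:incompatibleData', 'dataA and dataB have incompatible sizes');  % hypothetical error ID
    end
end
result = dataA + dataB;   % same-size inputs: behaves identically with or without implicit expansion
```

Because the inputs are guaranteed to have the same size by the time they are added, the result no longer depends on which Matlab release runs the code.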
This implicit expansion feature was removed from the official R2016a release for performance reasons. This was apparently fixed in time for R2016b’s release.
I’m totally in favor of this great new feature, don’t get me wrong. I’ve been an ardent user of bsxfun for many years and (unlike many) have even grown fond of it, but I still find the new feature to be better. I use it wherever there is no significant performance penalty, a need to support older Matlab releases, or a possibility of incorrect results due to dimensional mismatch.
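When the same code needs to run on both old and new releases, a version check can select between the two forms. A minimal sketch (version 9.1 corresponds to R2016b; in practice the bsxfun form alone works on all releases, so the branch is mainly useful when the newer syntax is preferred for readability):

```matlab
rowVec = 1:5;
colVec = (1:3)';

if verLessThan('matlab', '9.1')               % releases older than R2016b
    result = bsxfun(@plus, rowVec, colVec);   % explicit expansion
else
    result = rowVec + colVec;                 % implicit expansion
end
```

Both branches produce the same 3x5 matrix that was shown in the R2016b output above.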
So what’s my point?
What I am concerned about is that I have not seen the new feature highlighted as a potential backward compatibility issue in the documentation or the release notes. Issues of far lesser importance are clearly marked for their backward incompatibility in the release notes, but not this important major change. Simply marking the new feature with the warning icon, and listing it in the “Functionality being removed or changed” section, would have saved my client and me a lot of time and frustration.
MathWorks are definitely aware of the potential problems that the new feature might cause in rare use cases such as this. As Steve Eddins recently noted, there were plenty of internal discussions about this very thing. MathWorks were careful to ensure that the feature’s benefits far outweigh its risks (and I concur). But this also highlights the fact that MathWorks were fully aware that in some rare cases it might indeed break existing code. For those cases, I believe that they should have clearly marked the incompatibility implications in the release notes and elsewhere.
I have several clients who scour Matlab’s release notes before each release, trying to determine the operational risk of a Matlab upgrade. Having a program that returns different results in R2016b compared to R2016a, without being aware of this risk, is simply unacceptable to them, and leaves users with a disinclination to upgrade Matlab, to MathWorks’ detriment.
MathWorks in general are taking a very serious methodical approach to compatibility issues, and are clearly investing a lot of energy in this (a recent example). It’s too bad that sometimes this chain is broken. I find it a pity, and think that this can still be corrected in the online doc pages. If and when this is fixed, I’ll be happy to post an addendum here.
In my humble opinion from the backbenches, increasing the transparency on compatibility issues and open bugs will increase user confidence and result in greater adoption and upgrades of Matlab. Just my 2 cents…
Addendum December 27, 2016:
Today MathWorks added the following compatibility warning to the release notes (R2016b, Mathematics section, first item) – thanks for listening MathWorks

Customizing axes part 5 – origin crossover and labels

Yair Altman — Wed, 27 Jul 2016 17:00:02 +0000

When HG2 graphics was finally released in R2014b, I posted a series of articles about various undocumented ways by which we can customize Matlab’s new graphic axes: rulers (axles), baseline, box-frame, grid, back-drop, and other aspects. Today I extend this series by showing how we can customize the axes rulers’ crossover location.
Non-default axes crossover location

The documented/supported stuff
Until R2015b, we could only specify the axes’ YAxisLocation as 'left' (default) or 'right', and XAxisLocation as 'bottom' (default) or 'top'. For example:
x = -2*pi : .01 : 2*pi;
plot(x, sin(x));
hAxis = gca;
hAxis.YAxisLocation = 'left';    % 'left' (default) or 'right'
hAxis.XAxisLocation = 'bottom';  % 'bottom' (default) or 'top'
Default axis locations: axes crossover is non-fixed
The crossover location is non-fixed in the sense that if we zoom or pan the plot, the axes crossover will remain at the bottom-left corner, which changes its coordinates depending on the X and Y axes limits.
Since R2016a, we can also specify 'origin' for either of these properties, such that the X and/or Y axes pass through the chart origin (0,0) location. For example, move the YAxisLocation to the origin:
hAxis.YAxisLocation = 'origin';
Y-axis location at origin: axes crossover at 0 (fixed), -1 (non-fixed)
And similarly also for XAxisLocation:
hAxis.XAxisLocation = 'origin';
X and Y-axis location at origin: axes crossover fixed at (0,0)
The axes crossover location is now fixed at the origin (0,0), so as we move or pan the plot, the crossover location changes its position in the chart area, without changing its coordinates. This functionality has existed in other graphic packages (outside Matlab) for a long time and until now required quite a bit of coding to emulate in Matlab, so I’m glad that we now have it in Matlab by simply updating a single property value. MathWorks did a very nice job here of dynamically updating the axles, ticks and labels as we pan (drag) the plot towards the edges – try it out!
The undocumented juicy stuff
So much for the documented stuff. The undocumented aspect is that we are not limited to using the (0,0) origin point as the fixed axes crossover location. We can use any x,y crossover location, using the FirstCrossoverValue property of the axes’ hidden XRuler and YRuler properties. In fact, we could do this since R2014b, when the new HG2 graphics engine was released, not just starting in R2016a!
% Set a fixed crossover location of (pi/2,-0.4)
hAxis.YRuler.FirstCrossoverValue = pi/2;
hAxis.XRuler.FirstCrossoverValue = -0.4;
Custom fixed axes crossover location at (π/2,-0.4)
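Since FirstCrossoverValue is undocumented and might change in some future release, it makes sense to wrap it in a try-catch block with a documented fallback. A minimal sketch, which uses an invisible figure only so that it can run unattended (the usedUndocumented flag is made up for illustration):

```matlab
hFig  = figure('Visible', 'off');   % invisible figure, just for this sketch
x     = -2*pi : .01 : 2*pi;
plot(x, sin(x));
hAxis = gca;

try
    % Undocumented: fixed axes crossover at (pi/2,-0.4)
    hAxis.YRuler.FirstCrossoverValue = pi/2;
    hAxis.XRuler.FirstCrossoverValue = -0.4;
    usedUndocumented = true;
catch
    % Documented fallback (R2016a or newer): fixed axes crossover at (0,0)
    hAxis.XAxisLocation = 'origin';
    hAxis.YAxisLocation = 'origin';
    usedUndocumented = false;
end
```

The fallback does not place the crossover at the requested location, but it keeps the program running with a reasonable fixed crossover if the hidden properties ever disappear.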
For some reason (bug?), setting XAxisLocation/YAxisLocation to ‘origin’ has no visible effect in 3D plots, nor is there any corresponding ZAxisLocation property. Luckily, we can set the axes crossover location(s) in 3D plots using FirstCrossoverValue just as easily as for 2D plots. The rulers also have a SecondCrossoverValue property (default = -inf) that controls the Z-axis crossover, as Yaroslav pointed out in a comment below. For example:
N = 49;
x = linspace(-10,10,N);
M = peaks(N);
mesh(x,x,M);
Default crossover locations at (-10,±10,-10)
hAxis.XRuler.FirstCrossoverValue  = 0;  % X crossover with Y axis
hAxis.YRuler.FirstCrossoverValue  = 0;  % Y crossover with X axis
hAxis.ZRuler.FirstCrossoverValue  = 0;  % Z crossover with X axis
hAxis.ZRuler.SecondCrossoverValue = 0;  % Z crossover with Y axis
Custom fixed axes crossover location at (0,0,-10)
hAxis.XRuler.SecondCrossoverValue = 0;  % X crossover with Z axis
hAxis.YRuler.SecondCrossoverValue = 0;  % Y crossover with Z axis
Custom fixed axes crossover location at (0,0,0)
Labels
Users will encounter the following unexpected behavior (bug?) when using either the documented *AxisLocation or the undocumented FirstCrossoverValue properties: when setting an x-label (using the xlabel function, or the internal axes properties), the label moves from the center of the axes (as happens when XAxisLocation='top' or 'bottom') to the right side of the axes, where the secondary label (e.g., exponent) usually appears, whereas the secondary label is moved to the left side of the axis:
Unexpected label positions
In such cases, we would expect the label locations to be reversed, with the main label on the left and the secondary label in its customary location on the right. The exact same situation occurs with the Y labels, where the main label unexpectedly appears at the top and the secondary at the bottom. Hopefully MathWorks will fix this in the next release (it is probably too late to make it into R2016b, but hopefully R2017a). Until then, we can simply switch the strings of the main and secondary labels to make them appear at the expected locations:
% Switch the Y-axes labels:
ylabel(hAxis, '\times10^{3}'); % display secondary ylabel (x10^3) at top
set(hAxis.YRuler.SecondaryLabel, 'Visible','on', 'String','main y-label'); % main label at bottom

% Switch the X-axes labels:
xlabel(hAxis, '2^{nd} label'); % display secondary xlabel at right
set(hAxis.XRuler.SecondaryLabel, 'Visible','on', 'String','xlabel'); % main label at left
As can be seen from the screenshot, there’s an additional nuisance: the main label appears a bit larger than the axes font size (the secondary label uses the correct font size). This is because by default Matlab uses a 110% font-size for the main axes label, ostensibly to make them stand out. We can modify this default factor using the rulers’ hidden LabelFontSizeMultiplier property (default=1.1). For example:
hAxis.YRuler.LabelFontSizeMultiplier = 1; % use 100% font-size (same as tick labels)
hAxis.XRuler.LabelFontSizeMultiplier = 0.8; % use 80% (smaller than standard) font-size
Note: I described the ruler objects in my first article of the axes series. Feel free to read it for more ideas on customizing the axes rulers.
Related posts:
Customizing axes rulers – HG2 axes can be customized in numerous useful ways. This article explains how to customize the rulers. ...
Customizing axes part 2 – Matlab HG2 axes can be customized in many different ways. This article explains some of the undocumented aspects. ...
Undocumented scatter plot jitter – Matlab's scatter plot can automatically jitter data to enable better visualization of distribution density. ...
HG2 update – HG2 appears to be nearing release. It is now a stable mature system. ...

rmfield performance

Yair Altman — Wed, 25 May 2016 07:00:48 +0000

Once again I would like to introduce guest blogger Hanan Kavitz of Applied Materials. Several months ago Hanan discussed some quirks with compiled Matlab DLLs. Today Hanan will discuss how they overcame a performance bottleneck with Matlab’s builtin rmfield function, exemplifying the general idea that we can sometimes improve performance by profiling the core functionality that causes a performance hotspot and optimizing it, even when it is part of a builtin Matlab function. For additional ideas on improving Matlab performance, search this blog for “Performance” articles, and/or get the book “Accelerating MATLAB Performance”.

I’ve been using Matlab for many years now and from time to time I need to profile low-throughput code. When I profile this code sometimes I realize that a computational ‘bottleneck’ is due to a builtin Matlab function (part of the core language). I can often find ways to accelerate such builtin functions and get significant speedup in my code.
I recently found Matlab’s builtin rmfield function being too slow for my needs. It works great when one needs to remove a few fields from a small structure, but in our case we needed to remove thousands of fields from a structure containing about 5000 fields – and this is executed in a function that is called many times inside an external loop. The program was significantly sluggish.
It started when a co-worker asked me to look at some code that looked just slightly more intelligent than this:
for i = 1:5000
    myStruct = rmfield(myStruct,fieldNames{i});
end
Running this code within a tic/toc pair yielded the following results:
>> tic; myFunc(); t1 = toc
t1 =
      25.7713
In my opinion 25.77 secs for such a simple functionality seems like an eternity…

The obvious thing was to change the code to the documented faster (vectorized) version:
>> tic; myStruct = rmfield(myStruct,fieldNames); t2 = toc
t2 =
       0.6097
This is obviously much better but since rmfield is called many times in my application, I needed something even better. So I profiled rmfield and was not happy with the result.
The original code of rmfield (%matlabroot%/toolbox/matlab/datatypes/rmfield.m) looks something like this (I deleted some non-essential code for brevity):
function t = rmfield(s,field)
  % get fieldnames of struct
  f = fieldnames(s);

  % Determine which fieldnames to delete.
  idxremove = [];
  for i=1:length(field)
    j = find(strcmp(field{i},f) == true);
    idxremove = [idxremove;j];
  end

  % set indices of fields to keep
  idxkeep = 1:length(f);
  idxkeep(idxremove) = [];

  % remove the specified fieldnames from the list of fieldnames.
  f(idxremove,:) = [];

  % convert struct to cell array
  c = struct2cell(s);

  % find size of cell array
  sizeofarray = size(c);
  newsizeofarray = sizeofarray;

  % adjust size for fields to be removed
  newsizeofarray(1) = sizeofarray(1) - length(idxremove);

  % rebuild struct
  t = cell2struct(reshape(c(idxkeep,:),newsizeofarray),f);
When I profiled the code, the find(strcmp(...)) row inside the for loop was the bottleneck I was looking for.
First, I noticed the part where the string comparison result is compared to true. While '==true' is not the cause of the bottleneck, it does leave an impression of bad coding style. Perhaps this code was created as some apprentice project, which might also explain its suboptimal performance.
The real performance problem here is that for each field that we wish to remove, rmfield compares it to all existing fields to find its location in a cell array of field names. This is algorithmically inefficient and makes the code hard to understand (just try – it took me hard, long minutes).
So, I created a variant of rmfield.m called fast_rmfield.m, as follows (again, omitting some non-essential code):
function t = fast_rmfield(s,field)
  % get fieldnames of struct
  f = fieldnames(s);
  [f,ia] = setdiff(f,field,'R2012a');

  % convert struct to cell array
  c = squeeze(struct2cell(s));

  % rebuild struct
  t = cell2struct(c(ia,:),f)';
This code is much shorter, easier to explain and maintain, but also (and most importantly) much faster:
>> tic; myStruct = fast_rmfield(myStruct,fieldNames); t3 = toc
t3 =
       0.0302

>> t2/t3
ans =
      20.1893
This resulted in a speedup of ~850x compared to the original version (of 25.77 secs), and ~20x compared to the vectorized version. A nice improvement in my humble opinion…
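To reproduce this kind of comparison on your own system, the following sketch builds a synthetic scalar struct and times the vectorized builtin against the setdiff-based idea. The field names, field count and the choice to remove every other field are assumptions made up for the demo; it only handles a scalar struct, and the timings will naturally differ from those reported above:

```matlab
% Sketch: compare vectorized rmfield with a setdiff-based removal
% Assumptions: synthetic field names, scalar struct, every other field is removed
nFields = 5000;
names = arrayfun(@(k) sprintf('field%04d',k), (1:nFields)', 'UniformOutput',false);
s = cell2struct(num2cell((1:nFields)'), names, 1);   % scalar struct with 5000 fields
toRemove = names(1:2:end);

% Vectorized builtin version
tic; s1 = rmfield(s, toRemove); tBuiltin = toc;

% setdiff-based version (same idea as fast_rmfield, inlined for a scalar struct)
tic;
[keepNames, ia] = setdiff(fieldnames(s), toRemove);  % ia indexes the fields to keep
c  = struct2cell(s);                                 % nFields-by-1 cell array of values
s2 = cell2struct(c(ia), keepNames, 1);               % rebuild from the kept fields only
tSetdiff = toc;

% setdiff returns the names sorted, so compare the results independently of field order
sameFields = isequal(sort(fieldnames(s1)), sort(fieldnames(s2)));
sameValues = isequal(orderfields(s1), orderfields(s2));
fprintf('builtin: %.4f secs, setdiff: %.4f secs, identical: %d\n', ...
        tBuiltin, tSetdiff, sameFields && sameValues);
```

Note that the setdiff-based result has its fields in sorted order rather than in the original order, which matters only for code that relies on field order.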
The point in all this is that we can and should rewrite Matlab builtin functions when they are too slow for our needs, whether it is found to be an algorithmic flaw (as in this case), extraneous sanity checks (as in the case of ismember or datenum), bad default parameters (as in the case of fopen/fwrite or scatter), or merely slow implementation (as in the case of save, cellfun, or the conv family of functions).
A good pattern is to save such code pieces in file names that hint to the original code. In our case, I used fast_rmfield to suggest that it is a faster alternative to rmfield.
Do you know of any other example of a slow implementation in a built-in Matlab function that can be optimized? If so, please leave a comment below.
Related posts:
tic / toc – undocumented option – Matlab's built-in tic/toc functions have an undocumented option enabling multiple nested clockings...
Solving a MATLAB bug by subclassing – Matlab's Image Processing Toolbox's impoint function contains an annoying bug that can be fixed using some undocumented properties....
Plot LimInclude properties – The plot objects' XLimInclude, YLimInclude, ZLimInclude, ALimInclude and CLimInclude properties are an important feature, that has both functional and performance implications....
Class object creation performance – Performance aspects of Matlab class object creation are discussed, with specific suggestions. ...

Viewing saved profiling results

Yair Altman — Wed, 18 May 2016 18:00:37 +0000

Many Matlab users know and utilize Matlab’s built-in Profiler tool to identify performance bottlenecks and code-coverage issues. Unfortunately, not many are aware of the Profiler’s programmatic interface. In past articles as well as my performance book I explained how we can use this programmatic interface to save profiling results and analyze them offline. In fact, I took this idea further and even created a utility (profile_history) that displays the function call timeline in a standalone Matlab GUI, something that is a sorely missed feature in the built-in profiler:

Function call timeline profiling
Today I will discuss a related undocumented feature of the Profiler: loading and viewing pre-saved profiling results.

Programmatic access to profiling results
Matlab’s syntax for returning the detailed profiling results in a data struct is clearly documented in the profile function’s doc page. Although the documentation does not explain the resulting struct and sub-struct fields, they have meaningful names and we can relatively easily infer what each of them means (I added a few annotation comments for clarity):
>> profile('on','-history')
>> surf(peaks); drawnow
>> profile('off')
>> profData = profile('info')
profData =
      FunctionTable: [26x1 struct]
    FunctionHistory: [2x56 double]
     ClockPrecision: 4.10517962829241e-07
         ClockSpeed: 2501000000
               Name: 'MATLAB'
           Overhead: 0

>> profData.FunctionTable(1)
ans =
          CompleteName: 'C:\Program Files\Matlab\R2016a\toolbox\matlab\specgraph\peaks.m>peaks'
          FunctionName: 'peaks'
              FileName: 'C:\Program Files\Matlab\R2016a\toolbox\matlab\specgraph\peaks.m'
                  Type: 'M-function'
              Children: [1x1 struct]
               Parents: [0x1 struct]
         ExecutedLines: [9x3 double]
           IsRecursive: 0
    TotalRecursiveTime: 0
           PartialData: 0
              NumCalls: 1
             TotalTime: 0.0191679078068094

>> profData.FunctionTable(1).Children
ans =
        Index: 2   % index in profData.FunctionTable array
     NumCalls: 1
    TotalTime: 0.00136415141013509

>> profData.FunctionTable(1).ExecutedLines   % line number, number of calls, duration in secs
ans =
    43    1    0.000160102031282782
    44    1    2.29890096200918e-05
    45    1    0.00647592190637408
    56    1    0.0017093970724654
    57    1    0.00145036019621044
    58    1    0.000304193859437286
    60    1    4.39254290955326e-05
    62    1    3.44835144301377e-05
    63    1    0.000138755093778411

>> profData.FunctionHistory(:,1:5)
ans =
     0     0     1     1     0   % 0=enter, 1=exit
     1     2     2     1     6   % index in profData.FunctionTable array
As we can see, this is pretty intuitive so far.
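As an illustration of how these fields can be used programmatically, here is a small sketch that lists the most time-consuming functions and replays the call history as an indented tree. The profiled workload (magic and sortrows) is an arbitrary assumption chosen only to generate some data, and the interpretation of the two FunctionHistory rows follows the annotations above rather than any official documentation:

```matlab
% Sketch: summarize a profData struct (the profiled workload is an arbitrary example)
profile('on','-history');
A = magic(200);  B = sortrows(A);  %#ok<NASGU>
profile('off');
profData = profile('info');
ft = profData.FunctionTable;

% List the (up to) 5 functions having the largest total time
[~, order] = sort([ft.TotalTime], 'descend');
for idx = order(1 : min(5,numel(order)))
    fprintf('%-30s %4d calls %10.6f secs\n', ...
            ft(idx).FunctionName, ft(idx).NumCalls, ft(idx).TotalTime);
end

% Replay the call history as an indented tree:
% row 1 is 0=enter/1=exit, row 2 is the function's index in the FunctionTable array
callHistory = profData.FunctionHistory;
depth = 0;
for col = 1 : size(callHistory,2)
    if callHistory(1,col) == 0              % function entry
        disp([repmat('  ',1,depth), ft(callHistory(2,col)).FunctionName]);
        depth = depth + 1;
    else                                    % function exit
        depth = max(depth-1, 0);
    end
end
```

The same loop works on a profData struct that was loaded from a MAT file, since it relies only on the saved data and not on the live Profiler state.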
Loading and viewing saved profiling results
If we wish to save these results in a file and later load and display them in the Profiler’s visualization browser, then we need to venture deeper into undocumented territory. It seems that while retrieving the profiling results (via profile('info')) is fully documented, doing the natural complementary action (namely, loading this data into the viewer) is not. For the life of me I cannot understand the logic behind this decision, but that’s the way it is.
Luckily, the semi-documented built-in function profview does exactly what we need: profview accepts 2 input args (function name and the profData struct) and displays the resulting profiling info. The first input arg (function name) accepts either a string (e.g., 'peaks' or 'view>isAxesHandle'), or the numeric value 0 which signifies the home (top-level) page:
profview(0, profData); % display profiling home (top-level) page
profview('peaks', profData); % display a specific profiling page
I use the 0 input value much more frequently than the string inputs, because I often don’t know which functions exactly were profiled, and starting at the home page enables me to easily drill-down the profiling results interactively.
Loading saved profiling results from a different computer
Things get slightly complicated if we try to load saved profiling results from a different computer. If the other computer has exactly the same folder structure as our computer, and all our Matlab functions reside in exactly the same disk folders/path, then everything will work out of the box. The problem is that in general the other computer will have the functions in different folders. When we then try to load the profData on our computer, it will not find the associated Matlab functions in order to display the line-by-line profiling results. We will only see the profiling data at the function level, not line level. This significantly reduces the usefulness of the profiling data. The Profiler page will display the following error message:
This file was modified during or after profiling. Function listing disabled.
We can solve this problem in either of two ways:
Modify our profData to use the correct folder path on the local computer, rather than the other computer’s path (which is invalid on the local computer). For example:
% Save the profData on computer #1:
profData = profile('info');
save('profData.mat', 'profData');

% Load the profData on computer #2:
fileData = load('profData.mat');
profData = fileData.profData;
path1 = 'N:\Users\Juan\programs\myProgram';
path2 = 'C:\Yair\consulting\clients\Intel\code';
for idx = 1 : numel(profData.FunctionTable)
    funcData = profData.FunctionTable(idx);
    funcData.FileName = strrep(funcData.FileName, path1, path2);
    funcData.CompleteName = strrep(funcData.CompleteName, path1, path2);
    profData.FunctionTable(idx) = funcData;
end
% note: this loop can be vectorized if you wish
As an alternative, we can modify Matlab’s profview.m function (%matlabroot%/toolbox/matlab/codetools/profview.m) to search for the function’s source code in the current Matlab path, if the specified direct path is not found (note that changing profview.m may require administrator privileges). For example, the following is the code from R2016a’s profview.m file, line #506:
% g894021 - Make sure the MATLAB code file still exists
if ~exist(fullName, 'file')
    [~,fname,fext] = fileparts(fullName);  % Yair
    fname = which([fname fext]);  % Yair
    if isempty(fname)  % Yair
        mFileFlag = 0;
    else  % Yair
        fullName = fname;  % Yair
    end  % Yair
end
These two workarounds complement each other: the first workaround does not require changing any installed Matlab code, and so is platform- and release-independent, but would require rerunning the code snippet for each and every profiling data file that we receive from external computers. On the other hand, the second workaround is a one-time operation that should work for multiple saved profiling results, although we would need to redo it whenever we install Matlab.
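For the first workaround, the path-replacement loop can be vectorized along the following lines. This sketch runs on a small mock function table so that it is self-contained; the file names under path1 are hypothetical, and in real use funcTable would simply be profData.FunctionTable (assigned back into profData at the end):

```matlab
% Sketch: vectorized path replacement on a mock function table
% Assumptions: the two file names below are hypothetical stand-ins for real profiling data
path1 = 'N:\Users\Juan\programs\myProgram';
path2 = 'C:\Yair\consulting\clients\Intel\code';
funcTable = struct( ...
    'FileName',     {[path1 '\main.m'];      [path1 '\utils\helper.m']}, ...
    'CompleteName', {[path1 '\main.m>main']; [path1 '\utils\helper.m>helper']});

% strrep accepts a cell array, so all the entries are processed in a single call
newFileNames     = strrep({funcTable.FileName},     path1, path2);
newCompleteNames = strrep({funcTable.CompleteName}, path1, path2);

% Distribute the updated strings back into the struct array elements
[funcTable.FileName]     = newFileNames{:};
[funcTable.CompleteName] = newCompleteNames{:};

disp({funcTable.FileName}');
```

Entries that do not contain path1 (for example, builtin Matlab functions that reside under the Matlab installation folder) are left unchanged by strrep.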
Additional profview customizations
Modifying the profview.m function can be used for different improvements as well.
For example, several years ago I explained how this function can be modified to display 1 ms timing resolutions, rather than the default 10 ms.
Another customization that I often do after I install Matlab is to change the default setting of truncating function lines longer than 40 characters – I typically modify this to 60 or 80 (depending on the computer monitor’s size…). All we need to do is to update the truncateDisplayName sub-function within profview.m as follows (taken from R2016a again, line #1762):
function shortFileName = truncateDisplayName(longFileName,maxNameLen)
%TRUNCATEDISPLAYNAME  Truncate the name if it gets too long
maxNameLen = max(60,maxNameLen);  % Yair
shortFileName = escapeHtml(longFileName);
if length(longFileName) > maxNameLen,
    shortFileName = char(com.mathworks.util.FileUtils.truncatePathname( ...
        shortFileName, maxNameLen));
end
You can see additional undocumented profiling features in the “Related posts” section below, as well as in Chapter 2 of my book “Accelerating MATLAB Performance”.
Do you have any other customization to the profiling results? If so, please share it in a comment.
Related posts:
Plot performance – Undocumented inner plot mechanisms can significantly improve plotting performance ...
Undocumented Profiler options part 2 – Several undocumented features of the Matlab Profiler can make it much more useful - part 2 of series. ...
Undocumented Profiler options part 3 – An undocumented feature of the Matlab Profiler can report call history timeline - part 3 of series. ...
Undocumented Profiler options part 4 – Several undocumented features of the Matlab Profiler can make it much more useful - part 4 of series. ...

Convolution performance

Yair Altman — Wed, 03 Feb 2016 19:00:35 +0000

MathWorks’ latest MATLAB Digest (January 2016) featured my book “Accelerating MATLAB Performance”. I am deeply honored and appreciative of MathWorks for this.
I would like to dedicate today’s post to a not-well-known performance trick from my book, that could significantly improve the speed when computing the convolution of two data arrays. Matlab’s internal implementation of convolution (conv, conv2 and convn) appears to rely on a sliding window approach, using implicit (internal) multithreading for speed.
However, this can often be sped up significantly if we use the Convolution Theorem, which states in essence that conv(a,b) = ifft(fft(a,N) .* fft(b,N)), an idea proposed by Bruno Luong. In the following usage example we need to remember to zero-pad the data to get comparable results:
% Prepare the input vectors (1M elements each)
x = rand(1e6,1);
y = rand(1e6,1);

% Compute the convolution using the builtin conv()
tic, z1 = conv(x,y); toc
 => Elapsed time is 360.521187 seconds.

% Now compute the convolution using fft/ifft: 780x faster!
n = length(x) + length(y) - 1;  % we need to zero-pad
tic, z2 = ifft(fft(x,n) .* fft(y,n)); toc
 => Elapsed time is 0.463169 seconds.

% Compare the relative accuracy (the results are nearly identical)
disp(max(abs(z1-z2)./abs(z1)))
 => 2.75200348450538e-10
This latest result shows that the results are nearly identical, up to a tiny difference, which is certainly acceptable in most cases when considering the enormous performance speedup (780x in this specific case). Bruno’s implementation (convnfft) is made even more efficient by using MEX in-place data multiplications, power-of-2 FFTs, and use of GPU/Jacket where available.
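The power-of-2 idea can be tried in pure Matlab without any MEX code. The following sketch is not Bruno's convnfft implementation; it simply pads the FFT length to the next power of 2 and trims the result. The vector size, the use of nextpow2 and the 'symmetric' flag of ifft (which forces a real result for real inputs) are choices made for this illustration:

```matlab
% Sketch: FFT-based convolution with power-of-2 padding (not the convnfft code)
% Assumptions: real-valued column vectors, 1e4 elements each
x = rand(1e4,1);
y = rand(1e4,1);

n    = numel(x) + numel(y) - 1;   % length of the full linear convolution
nfft = 2^nextpow2(n);             % pad to the next power of 2 for the FFT

tic
z2 = ifft(fft(x,nfft) .* fft(y,nfft), 'symmetric');  % real result for real inputs
z2 = z2(1:n);                     % discard the extra padding elements
tFFT = toc;

tic, z1 = conv(x,y); tConv = toc; % builtin reference result

relErr = max(abs(z1-z2)) / max(abs(z1));
fprintf('conv: %.4f secs, fft: %.4f secs, max relative error: %g\n', tConv, tFFT, relErr);
```

Any padded length of at least n gives the same linear convolution after trimming, so the power-of-2 length is purely a speed consideration.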
It should be noted that the builtin Matlab functions can still be faster for relatively small data arrays, or if your machine has a large number of CPU cores and free memory that Matlab’s builtin conv* functions can utilize, and of course also depending on the Matlab release. So, your mileage might well vary. But given the significant speedup potential, I contend that you should give it a try and see how well it performs on your specific system and data.
If you have read my book, please be kind enough to post your feedback about it on Amazon (link), for the benefit of others. Thanks in advance!
Related posts:
tic / toc – undocumented option – Matlab's built-in tic/toc functions have an undocumented option enabling multiple nested clockings...
Plot LimInclude properties – The plot objects' XLimInclude, YLimInclude, ZLimInclude, ALimInclude and CLimInclude properties are an important feature, that has both functional and performance implications....
Matrix processing performance – Matrix operations performance is affected by internal subscriptions in a counter-intuitive way....
Preallocation performance – Preallocation is a standard Matlab speedup technique. Still, it has several undocumented aspects. ...

Graphic sizing in Matlab R2015b

Yair Altman — Wed, 20 Jan 2016 18:00:31 +0000

I would like to introduce Daniel Dolan of Sandia National Laboratories. Dan works on a variety of data analysis projects in Matlab, and is an active lurker on MATLAB Central. Dan has a habit of finding interesting bugs for the Mac version of Matlab. Today he will discuss graphic sizing in Matlab and important changes that occurred in release R2015b.
Matlab-generated graphics are often not displayed at their requested size. This problem has been known for some time and has a well-known solution: setting the root object’s ScreenPixelsPerInch property to the display’s actual DPI (dots per inch) value. Release R2015b no longer supports this solution, creating problems for publication graphics and general readability.
Physical sizing in R2015a vs. R2015b

Physical sizing
Matlab supports graphic sizing in various physical units: inches, centimeters, and points. For example:
figure; axes('Box','on', 'Units','inches','Position',[0.3 0.3 4 4]);
requests an axes whose sides each measure exactly 4″ (101.6 mm). It is evident, however, that the displayed axes is smaller than 4″. The mismatch between requested and physical size depends on the display and operating system (go ahead, try it on your system). The problem is particularly severe on Mac laptops, presumably even worse for those with Retina displays.
The problem is that Matlab cannot determine pixel size, which varies from one display to the other. Generating a figure spanning a particular number of pixels (e.g., 1024 x 768) is easy, but absolute physical units require a conversion factor called ScreenPixelsPerInch, which is a root property (see related post on setting/getting default graphics property values):
DPI = 110;                             % dots per inch for my 27" Apple Cinema Display
set(0,    'ScreenPixelsPerInch',DPI);  % all releases prior to R2015b
set(groot,'ScreenPixelsPerInch',DPI);  % R2014b through R2015a
DPI values tend to be higher for laptops, usually in the 120-130 range. Retina displays are supposed to be >300 DPI, but I have not been able to test that myself.
There are several ways to determine the correct DPI setting for a particular display. It may be available in the hardware specifications, and it can be calculated from the diagonal size and the number of pixels. Unfortunately these methods are not always reliable. If you really care about physical sizing, the best approach is to actually calibrate your display. There are tools for doing this at Matlab Central, but it’s not hard to do manually:
Create a figure.
Manually resize the figure to match a convenient width. I often use a piece of US letter paper as an 8.5″ guide on the display.
Determine the width of the figure in pixels:
set(gcf,'Units','pixels');
pos = get(gcf,'Position');
width = 8.5; % inches
DPI = pos(3) / width;
I usually apply the DPI settings in my startup file so that Matlab begins with a calibrated display.
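The alternative of calculating the DPI from the display's diagonal size and pixel count can be sketched as follows. The 2560 x 1440 resolution is an assumption made for the illustration (it is not stated above), and as noted, the nominal specifications do not always match what is actually displayed:

```matlab
% Sketch: estimate the DPI from the display's diagonal size and pixel resolution
% Assumption: a 27" display running at 2560 x 1440 pixels (illustrative values)
diagInches = 27;
resPixels  = [2560 1440];                 % [horizontal vertical] pixel counts

diagPixels = sqrt(sum(resPixels.^2));     % length of the screen diagonal, in pixels
DPI = diagPixels / diagInches;            % pixels per inch along the diagonal

fprintf('Estimated DPI: %.1f\n', DPI);
```

For these assumed values the estimate is close to the ~110 DPI value that was found by manual calibration.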
What changed in 2015b?
ScreenPixelsPerInch is a read-only property in R2015b, so display calibration no longer works. The following sequence of commands:
figure('Units','inches', 'PaperPositionMode','auto', 'Position',[0 0 4 4]);
set(gcf, 'MenuBar','none', 'ToolBar','none', 'DockControls','off', 'NumberTitle','off');
axes('FontUnits','points', 'FontSize',10);
image
now renders differently in R2015b than it does for a calibrated display in R2015a. Differences between the two outputs are shown in the screenshot at the top of this post. The grid behind the figures was rendered at 8.5″ x 8.5″ on my display; if your browser’s zoom level isn’t 100%, it may appear larger or smaller.
A side effect of improper graphic sizing is that text is difficult to read: the uncalibrated axes labels are clearly smaller than 10 points. These examples were rendered on a ~110 DPI display. Matlab assumes that Macs use 72 DPI (96 DPI on Windows), so graphics appear at 65% of the requested size.
The loss of ScreenPixelsPerInch as an adjustable setting strongly affects anyone using Matlab for publication graphics. Scientific and engineering journals are extremely strict about figure widths. With a calibrated screen, figures appear exactly as they will when printed to a file (usually EPS or PDF). Figures are often made as small as possible and densely packed to save journal space, and an accurately sized display helps the author determine legibility. Displaying accurately sized graphics is very difficult in R2015b, which is unfortunate given the many enhancements in this release.
Developers who create graphical interfaces for other users should also care about this change. A common complaint I get is that text and control labels are too small to easily read. Screen calibration deals with this problem, but this option is no longer available.
Where do we go from here?
I reported the above issues to MathWorks several months ago. It does not appear as a formal bug, but technical support is aware of the problem. The change is part of the “DPI aware” nature of release R2015b. So far I have found no evidence this release is any more aware of pixel size than previous releases, but my experience is limited to non-Retina Macs. I welcome input from users on other operating systems, particularly those with high-resolution displays.
To be fair, correct physical sizing is not easy across the many platforms that Matlab runs on. Display resolution is particularly tricky when it changes during a Matlab session, such as when a computer is connected to a projector/television or a laptop is connected to a docking station.
Thankfully, printed graphic sizes are rendered correctly when a figure’s PaperPositionMode property is 'auto'. Many users can (and will) ignore the display problem if they aren’t dealing with strict size requirements and text legibility isn’t too bad. Some users may be willing to periodically print publication figures to externally verify sizing, but this breaks the interactive nature of Matlab figures.
A potential workaround is to create a new figure class that oversizes figures (as needed) to account for a particular display. I started working on such a class, but the problem is more complicated than one might think:
Child objects (axes, uicontrols, etc.) also must be resized if they are based on physical units.
Resized objects must be temporarily restored to their original size for printing, and new objects must be tracked whenever they are added.
Figure resolution may need to be changed when moving to different computer systems.
These capabilities are quite possible to implement, but this is a complicated solution to a problem that was once easy to fix.
Retina displays don’t suffer as badly as one might think from the DPI mismatch. Even though the display specification may be greater than 200 DPI, OS X and/or Matlab must perform some intermediate size transformations. The effective DPI in R2015a is 110-120 for 13-15″ MacBook Pro laptops (at the default resolution). Objects sized with physical units still appear smaller than they should (~72/110), but not as small as I expected (<72/200).
Effective pixel size can also be changed by switching between different monitor scalings. This isn’t entirely surprising, but it can lead to some interesting results because Matlab only reads these settings at startup. Changing the display scaling during a session can cause square figures to appear rectangular. Also, the effective DPI changes for each setting: I could reach values of ~60-110 DPI on an Apple Cinema Display.
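The basic arithmetic behind the oversizing idea is simple, even if a complete figure class is not. The sketch below only computes the scaling factors. The 72 and 110 DPI values are taken from the example above, the variable names are hypothetical, and on a real system the assumed value could be queried via get(groot,'ScreenPixelsPerInch'):

```matlab
% Sketch: the scale factors behind the figure-oversizing idea (arithmetic only)
% Assumptions: Matlab assumes 72 DPI (Mac), the display's actual DPI is ~110
assumedDPI = 72;      % on a real system: get(groot,'ScreenPixelsPerInch')
actualDPI  = 110;     % from the display calibration

requestedSize = [4 4];                           % requested [width height] in inches
scale         = assumedDPI / actualDPI;          % displayed size relative to requested size
displayedSize = requestedSize * scale;           % physical size on an uncalibrated display
oversizedSize = requestedSize / scale;           % size to request, in order to get 4" x 4"

fprintf('Scale: %.0f%%, displayed: %.2f", request instead: %.2f"\n', ...
        100*scale, displayedSize(1), oversizedSize(1));
```

This covers only the top-level figure size. The complications listed above (child objects, printing, moving between computer systems) are not addressed by this sketch.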
So where does this leave us? Display calibration was always a finicky matter, but at least in principle one could make graphics appear exactly the same size on two different displays. Now it seems that sizing is completely variable between operating systems, displays, and display settings. For publication graphics, there will almost always be a disconnect between figure size on the screen and the printed output; some iteration may be needed to ensure everything looks right in the finished output. For graphical interfaces, font sizes may need to be generated in normalized units and then converted to pixels (to avoid resizing).
Physical accuracy may not be important for non-publication figures, but the issue of text legibility remains. Some text objects, such as axes and tick labels, can easily be resized because the parent axes automatically adjusts itself as needed. Free floating text objects and uicontrols are much more difficult to deal with. Controls are often sized around the extent of their text label, so changing font sizes may require changes to the control position; adjacent controls may overlap after resizing for text clarity. Normalized units partially solve this problem, but their effect on uicontrols is not always desirable: do you really want push buttons to get larger/smaller when the figure is resized?
Can you think of a better workaround to this problem? If so, then please post a comment below. I will be very happy to hear your ideas, as I’m sure others who have high resolution displays would as well.
(cross-reference: CSSM newsgroup post)
Addendum Dec 31, 2016: Dan Dolan just posted a partial workaround on the MathWorks File Exchange. Also see the related recent article on working with non-standard DPI values.
Related posts:
HG2 update – HG2 appears to be nearing release. It is now a stable mature system. ...
Modifying default toolbar/menubar actions – The default Matlab figure toolbar and menu actions can easily be modified using simple pure-Matlab code. This article explains how....
FIG files format – FIG files are actually MAT files in disguise. This article explains how this can be useful in Matlab applications....
A couple of internal Matlab bugs and workarounds – A couple of undocumented Matlab bugs have simple workarounds. ...
