<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Memory &#8211; Undocumented Matlab</title>
	<atom:link href="https://undocumentedmatlab.com/articles/tag/memory/feed" rel="self" type="application/rss+xml" />
	<link>https://undocumentedmatlab.com</link>
	<description>Professional Matlab consulting, development and training</description>
	<lastBuildDate>Thu, 05 Jan 2017 17:15:48 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.7.2</generator>
	<item>
		<title>Quirks with parfor vs. for</title>
		<link>https://undocumentedmatlab.com/articles/quirks-with-parfor-vs-for?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=quirks-with-parfor-vs-for</link>
					<comments>https://undocumentedmatlab.com/articles/quirks-with-parfor-vs-for#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Thu, 05 Jan 2017 17:15:48 +0000</pubDate>
				<category><![CDATA[Guest bloggers]]></category>
		<category><![CDATA[Medium risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Stock Matlab function]]></category>
		<category><![CDATA[Undocumented feature]]></category>
		<category><![CDATA[Bug]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pure Matlab]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6821</guid>

					<description><![CDATA[<p>Parallelizing loops with Matlab's parfor might generate unexpected results. Users beware! </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/quirks-with-parfor-vs-for">Quirks with parfor vs. for</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/a-few-parfor-tips" rel="bookmark" title="A few parfor tips">A few parfor tips </a> <small>The parfor (parallel for) loops can be made faster using a few simple tips. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-compilation-quirks-take-2" rel="bookmark" title="Matlab compilation quirks &#8211; take 2">Matlab compilation quirks &#8211; take 2 </a> <small>A few hard-to-trace quirks with Matlab compiler outputs are explained. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/quirks-with-compiled-matlab-dlls" rel="bookmark" title="Quirks with compiled Matlab DLLs">Quirks with compiled Matlab DLLs </a> <small>Several quirks with Matlab-compiled DLLs are discussed and workarounds suggested. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/preallocation-performance" rel="bookmark" title="Preallocation performance">Preallocation performance </a> <small>Preallocation is a standard Matlab speedup technique. Still, it has several undocumented aspects. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p>A few months ago, I discussed several <a href="/articles/a-few-parfor-tips" target="_blank">tips regarding Matlab&#8217;s <i><b>parfor</b></i></a> command, which is used by the Parallel Computing Toolbox (PCT) for parallelizing loops. Today I wish to extend that post with some unexplained oddities when using <i><b>parfor</b></i>, compared to a standard <i><b>for</b></i> loop.</p>
<h3 id="serialization">Data serialization quirks</h3>
<p><a href="http://www.mathworks.com/matlabcentral/profile/authors/870050-dimitri-shvorob" rel="nofollow" target="_blank">Dimitri Shvorob</a> may not appear at first glance to be a prolific contributor on Matlab Central, but from the little he has posted over the years I regard him as a Matlab power-user. So when Dimitri reports something, I take it seriously. Such was the case several months ago, when he contacted me regarding very odd behavior that he saw in his code: the <i><b>for</b></i> loop worked well, but the <i><b>parfor</b></i> version returned different (incorrect) results. Eventually, Dimitri traced the problem to something <a href="http://fluffynukeit.com/tag/loadobj" rel="nofollow" target="_blank">originally reported</a> by Dan Austin on his <a href="http://fluffynukeit.com" rel="nofollow" target="_blank">Fluffy Nuke It blog</a>.<br />
The core issue is that if we have a class object that is used within a <i><b>for</b></i> loop, Matlab can access the object directly in memory. But with a <i><b>parfor</b></i> loop, the object needs to be serialized in order to be sent over to the parallel workers, and deserialized within each worker. If this serialization/deserialization process involves internal class methods, the workers might see a different version of the class object than the one seen in the serial <i><b>for</b></i> loop. This could happen, for example, if the serialization/deserialization method croaks on an error, or depends on some dynamic (or random) conditions to create data.<br />
In other words, when we use data objects in a <i><b>parfor</b></i> loop, the data object is not necessarily sent &#8220;as-is&#8221;: additional processing may be involved under the hood that modifies the data in a way that may be invisible to the user (or the loop code), resulting in different results from the parallel (<i><b>parfor</b></i>) vs. serial (<i><b>for</b></i>) loops.<br />
For additional aspects of Matlab serialization/deserialization, see <a href="/articles/serializing-deserializing-matlab-data" target="_blank">my article</a> from 2 years ago (and its interesting feedback comments).</p>
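<p>To make the mechanism concrete, here is a minimal sketch of a class whose deserialization step alters its data. The class name <code>NoisyData</code> and its <code>values</code> property are hypothetical, invented purely for illustration; <i>loadobj</i> is Matlab&#8217;s documented load-time hook. The sketch assumes that <i><b>parfor</b></i> transfers objects to the workers via the same save/load process, so whether and how the difference shows up may depend on the Matlab release and pool configuration:</p>
<pre lang="matlab">
% NoisyData.m - hypothetical class, for illustration only
classdef NoisyData
    properties
        values = [];
    end
    methods
        function obj = NoisyData(values)
            if nargin > 0
                obj.values = values;
            end
        end
    end
    methods (Static)
        function obj = loadobj(s)
            % Invoked when a serialized copy of the object is re-created.
            % s is normally the object itself, or a struct if loading failed.
            if isstruct(s)
                obj = NoisyData(s.values);
            else
                obj = s;
            end
            % Deliberately dynamic step: each re-created copy gets different data
            obj.values = obj.values + rand(size(obj.values));
        end
    end
end
</pre>
<p>With such a class, a serial <i><b>for</b></i> loop that reads <code>d.values</code> from <code>d = NoisyData([1 2 3])</code> sees the original in-memory values, whereas each <i><b>parfor</b></i> worker would be expected to receive a copy that has passed through <i>loadobj</i>, and hence different values.</p>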
<h3 id="precision">Data precision quirks</h3>
<p><i>The following section was contributed by guest blogger Lior Perlmuter-Shoshany, head algorithmician at a private equity fund.</i><br />
In my work, I had to work with matrices on the order of 10<sup>9</sup> cells. To reduce the memory footprint (and hopefully also improve performance), I decided to work with data of type <code>single</code> instead of Matlab&#8217;s default <code>double</code>. Furthermore, in order to speed up the calculation I used <i><b>parfor</b></i> rather than <i><b>for</b></i> in the main calculation. At the end of the run I ran a mini <i><b>for</b></i>-loop to see the best results.<br />
What I discovered to my surprise is that the results from the <b><i>parfor</i></b> and <i><b>for</b></i> loop variants are not the same!<br />
<span id="more-6821"></span><br />
The following simplified code snippet illustrates the problem by calculating a simple standard-deviation (<i><b>std</b></i>) over the same data, in both <code>single</code>- and <code>double</code>-precision. Note that the loops are run with only a single iteration, to illustrate the fact that the problem is with the parallelization mechanism (probably the serialization/deserialization parts once again), not with the distribution of iterations among the workers.</p>
<pre lang="matlab">
clear
rng('shuffle','twister');
% Prepare the data in both double and single precision
arr_double = rand(1,100000000);
arr_single = single(arr_double);
% No loop - direct computation
std_single0 = std(arr_single);
std_double0 = std(arr_double);
% Loop #1 - serial for loop
std_single = 0;
std_double = 0;
for i=1
    std_single(i) = std(arr_single);
    std_double(i) = std(arr_double);
end
% Loop #2 - parallel parfor loop
par_std_single = 0;
par_std_double = 0;
parfor i=1
    par_std_single(i) = std(arr_single);
    par_std_double(i) = std(arr_double);
end
% Compare results of for loop vs. non-looped computation
isForSingleOk = isequal(std_single, std_single0)
isForDoubleOk = isequal(std_double, std_double0)
% Compare results of single-precision data (for vs. parfor)
isParforSingleOk = isequal(std_single, par_std_single)
parforSingleAccuracy = std_single / par_std_single
% Compare results of double-precision data (for vs. parfor)
isParforDoubleOk = isequal(std_double, par_std_double)
parforDoubleAccuracy = std_double / par_std_double
</pre>
<p>Example output:</p>
<pre lang="matlab">
isForSingleOk =
    1                   % <= true (of course!)
isForDoubleOk =
    1                   % <= true (of course!)
isParforSingleOk =
    0                   % <= false (odd!)
parforSingleAccuracy =
    0.73895227413361    % <= single-precision results are radically different in parfor vs. for
isParforDoubleOk =
    0                   % <= false (odd!)
parforDoubleAccuracy =
    1.00000000000021    % <= double-precision results are almost [but not exactly] the same in parfor vs. for
</pre>
<p>From my testing, the larger the data array, the bigger the difference is between the results of <code>single</code>-precision data when running in <i><b>for</b></i> vs. <i><b>parfor</b></i>.<br />
In other words, my experience has been that if you have a huge data matrix, it's better to parallelize it in <code>double</code>-precision if you wish to get [nearly] accurate results. But even so, I find it deeply disconcerting that the results are not exactly identical (at least on R2015a-R2016b, on which I tested) even for the native <code>double</code>-precision.<br />
Hmmm... bug?</p>
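<p>Whatever the root cause turns out to be, one general property of floating-point arithmetic is worth keeping in mind when comparing such results: the outcome can depend on the order in which the same terms are combined, and far more visibly in <code>single</code> than in <code>double</code> precision. The sketch below demonstrates this with two made-up values, and then compares two made-up results using a relative tolerance rather than <i><b>isequal</b></i>. This is offered only as a possible contributing factor and a pragmatic comparison technique, not as a confirmed explanation of the results above; the tolerance value is an arbitrary choice:</p>
<pre lang="matlab">
% The same three terms, combined in a different order
a = single(1e8);
b = single(1);
r1 = (a + b) - a;    % b is absorbed, since adjacent singles near 1e8 are 8 apart
r2 = (a - a) + b;    % b survives
orderMatters = ~isequal(r1, r2)
% Tolerance-based comparison (relTol is an arbitrary choice for this sketch)
relTol  = 1e-5;
isClose = @(x,y) abs(x-y) <= relTol * max(abs(x), abs(y));
sameEnough = isClose(single(0.2886751), single(0.2886753))   % made-up sample values
</pre>
<p>Such a tolerance check would accept the tiny <code>double</code>-precision discrepancy shown above, while still flagging the large <code>single</code>-precision one.</p>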
<h3 id="travels">Upcoming travels - Zürich &amp; Geneva</h3>
<p>I will shortly be traveling to clients in Zürich and Geneva, Switzerland. If you are in the area and wish to meet me to discuss how I could bring value to your work with some advanced Matlab consulting or training, then please email me (altmany at gmail):</p>
<ul>
<li><b>Zürich</b>: January 15-17</li>
<li><b>Geneva</b>: January 18-21</li>
</ul>
<p>Happy new year everybody!</p>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/quirks-with-parfor-vs-for/feed</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
		<item>
		<title>Assessing Java object size in Matlab</title>
		<link>https://undocumentedmatlab.com/articles/assessing-java-object-size-in-matlab?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=assessing-java-object-size-in-matlab</link>
					<comments>https://undocumentedmatlab.com/articles/assessing-java-object-size-in-matlab#respond</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Wed, 29 Jan 2014 18:00:43 +0000</pubDate>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Medium risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=4530</guid>

					<description><![CDATA[<p>Java object sizes are not reported by Matlab, but we can still estimate them using two free external utilities. </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/assessing-java-object-size-in-matlab">Assessing Java object size in Matlab</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/findjobj-find-underlying-java-object" rel="bookmark" title="FindJObj &#8211; find a Matlab component&#039;s underlying Java object">FindJObj &#8211; find a Matlab component&#039;s underlying Java object </a> <small>The FindJObj utility can be used to access and display the internal components of Matlab controls and containers. This article explains its uses and inner mechanism....</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-callbacks-for-java-events-in-r2014a" rel="bookmark" title="Matlab callbacks for Java events in R2014a">Matlab callbacks for Java events in R2014a </a> <small>R2014a changed the way in which Java objects expose events as Matlab callbacks. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-callbacks-for-java-events" rel="bookmark" title="Matlab callbacks for Java events">Matlab callbacks for Java events </a> <small>Events raised in Java code can be caught and handled in Matlab callback functions - this article explains how...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/using-pure-java-gui-in-deployed-matlab-apps" rel="bookmark" title="Using pure Java GUI in deployed Matlab apps">Using pure Java GUI in deployed Matlab apps </a> <small>Using pure-Java GUI in deployed Matlab apps requires a special yet simple adaptation. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p>Have you noticed that all Java object references are displayed as using 0 bytes in the Matlab Workspace browser and the <i><b>whos</b></i> function? This is not a bug, but in fact a deliberate design decision, in order to avoid the need to calculate the deep-memory usage of Java references (i.e., objects that include references to other objects etc.).<br />
Well, sometimes it so happens that we really need to know the size of the Java object, or the size difference between two objects (to help resolve memory leaks, for example). There are several resources online that explain how to do this in Matlab (examples <a target="_blank" rel="nofollow" href="http://stackoverflow.com/questions/52353/in-java-what-is-the-best-way-to-determine-the-size-of-an-object">1</a>, <a target="_blank" rel="nofollow" href="http://stackoverflow.com/questions/757300/programatically-calculate-memory-occupied-by-a-java-object-including-objects-it">2</a>, <a target="_blank" rel="nofollow" href="http://www.javapractices.com/topic/TopicAction.do?Id=83">3</a>). Today I will show two alternatives that I found useful within the context of Matlab:</p>
<ul>
<li><a href="/articles/assessing-java-object-size-in-matlab/#ObjectProfiler">ObjectProfiler</a></li>
<li><a href="/articles/assessing-java-object-size-in-matlab/#Classmexer">Classmexer</a></li>
</ul>
<p><span id="more-4530"></span></p>
<h3 id="ObjectProfiler">ObjectProfiler</h3>
<p><span class="alignright"><img fetchpriority="high" decoding="async" src="https://undocumentedmatlab.com/images/java_weigh.jpg" width="200" height="480" /></span> A full decade ago, Vladimir Roubtsov posted a <a target="_blank" rel="nofollow" href="http://www.javaworld.com/javaworld/javaqa/2003-12/02-qa-1226-sizeof.html">very detailed article</a> in the JavaWorld magazine explaining how to profile and report Java object sizes. The article contained a <a target="_blank" rel="nofollow" href="http://images.techhive.com/downloads/idge/imported/article/jvw/2003/12/02-qa-1226-sizeof.zip">downloadable Java archive</a> that we can easily use in Matlab. After downloading the zip file, extract the contained <i>objectprofiler.jar</i> file, add it to the Java classpath and start using it, as follows:</p>
<pre lang='matlab'>
>> javaaddpath 'C:\path\to\where\you\placed\your\copy\of\objectprofiler.jar'
>> com.vladium.utils.ObjectProfiler.sizeof(java.awt.Color.red)
ans =
    28
</pre>
<p>Note that the reported sizes may be different on different Matlab releases (=JVM versions) and platforms. Also note that the reported size (28 bytes for the Java Color object) is much smaller than the size required to <a target="_blank" href="/articles/serializing-deserializing-matlab-data/">serialize the object</a> into a byte stream (408 bytes in this case), as I&#8217;ve shown in last week&#8217;s article.<br />
Running the <i>sizeof</i> method on deeply referenced objects could quickly exhaust Matlab&#8217;s memory:</p>
<div class="wp_syntax">
<div class="code">
<pre class="matlab" style="font-family:monospace;">
>> jDesktop = com.mathworks.mde.desk.MLDesktop.getInstance;
<span style="color: green;">% on R2012a: takes a long time and finally reports</span>
>> com.vladium.utils.ObjectProfiler.sizeof(jDesktop)
ans =
    <span style="color: blue;">72011200</span>
<span style="color: green;">% on R2014a: takes a long time and finally croaks
% (which is not surprising, considering the Desktop's new toolstrip)</span>
>> com.vladium.utils.ObjectProfiler.sizeof(jDesktop)
<span style="color: red;">Java exception occurred:
java.lang.OutOfMemoryError: Java heap space
	at java.util.IdentityHashMap.resize(Unknown Source)
	at java.util.IdentityHashMap.put(Unknown Source)
	at com.vladium.utils.ObjectProfiler.computeSizeof(ObjectProfiler.java:329)
	at com.vladium.utils.ObjectProfiler.sizeof(ObjectProfiler.java:85)</span>
</pre>
</div>
</div>
<p><code>ObjectProfiler</code> has a very handy feature of enabling a visual display of the object&#8217;s reference tree. For example:</p>
<div class="wp_syntax">
<div class="code">
<pre class="text" style="font-family:monospace;">
>> jObject = java.util.Hashtable
jObject =
{}
>> com.vladium.utils.ObjectProfiler.sizeof(jObject)
ans =
   105
>> com.vladium.utils.ObjectProfiler.profile(jObject).dump
ans =
  105 -> &lt;INPUT> : Hashtable
    60 (57.1%) -> Hashtable#table : Hashtable$Entry[]
      60 (57.1%) -> &lt;shell: Hashtable$Entry[], length=11>
    45 (42.9%) -> &lt;shell: 6 prim/4 ref fields>
>> jObject.put(<span style="color:#A020F0;">'key1'</span>,<span style="color:blue;">1.23</span>);
>> com.vladium.utils.ObjectProfiler.sizeof(jObject)
ans =
   189
>> com.vladium.utils.ObjectProfiler.profile(jObject).dump
ans =
  189 -> &lt;INPUT> : Hashtable
    144 (76.2%) -> Hashtable#table : Hashtable$Entry[]
      84 (44.4%) -> Hashtable#table[4] : Hashtable$Entry
        44 (23.3%) -> Hashtable$Entry#key : String
          24 (12.7%) -> String#value : char[]
            24 (12.7%) -> &lt;shell: char[], length=4>
          20 (10.6%) -> &lt;shell: 2 prim/1 ref fields>
        24 (12.7%) -> &lt;shell: 1 prim/3 ref fields>
        16 (8.5%) -> Hashtable$Entry#value : Double
          16 (8.5%) -> &lt;shell: 1 prim/0 ref fields>
      60 (31.7%) -> &lt;shell: Hashtable$Entry[], length=11>
    45 (23.8%) -> &lt;shell: 6 prim/4 ref fields>
</pre>
</div>
</div>
<p>As we can see, adding the <code>'key1'</code> key to the hashtable object actually added 2 new references: a 44-byte <code>String</code> and a 16-byte <code>Double</code>, plus 24 additional overhead bytes, for a total addition of 84 bytes.<br />
<code>ObjectProfiler</code> has a convenience method <i>sizedelta(jObject1,jObject2)</i> which returns the size delta in bytes between the two specified objects. There are a few additional methods for <code>ObjectProfiler</code> and the <code>ObjectProfiler.profile()</code> object &#8211; interested readers are referred to the original article and to the source code (which is included within the zip file that we downloaded).</p>
<h3 id="Classmexer">Classmexer</h3>
<p>The <a target="_blank" rel="nofollow" href="http://www.javamex.com/classmexer/"><code>Classmexer</code> utility</a> works a bit differently but is also very easy to use once the initial setup is done. First we need to <a target="_blank" rel="nofollow" href="http://www.javamex.com/classmexer/classmexer-0_03.zip">download</a> the zip file, then extract the <i>classmexer.jar</i> and place it in your Matlab&#8217;s startup folder. In that same folder, edit (create if necessary) a <i>java.opts</i> file with the following line:</p>
<blockquote>
<pre lang='text'>-javaagent:classmexer.jar</pre>
</blockquote>
<p>After restarting Matlab, we can use <code>Classmexer</code> as follows:</p>
<pre lang='matlab'>
>> com.javamex.classmexer.MemoryUtil.deepMemoryUsageOf(java.awt.Color.red)
ans =
    32
>> jObject = java.util.Hashtable;
>> com.javamex.classmexer.MemoryUtil.deepMemoryUsageOf(jObject)
ans =
   120
>> jObject.put('key1',1.23); jObject
jObject =
{key1=1.23}
>> com.javamex.classmexer.MemoryUtil.deepMemoryUsageOf(jObject)
ans =
   264
</pre>
<p>Note how the values reported by <code>Classmexer</code> differ from those of <code>ObjectProfiler</code>. To tell the truth, I&#8217;m not sure which of them to believe: <code>ObjectProfiler</code> seems more detailed, but <code>Classmexer</code> uses Java&#8217;s preferred mechanism of using an <a target="_blank" rel="nofollow" href="http://www.javamex.com/tutorials/memory/instrumentation.shtml">instrumentation agent</a>.</p>
<h3 id="related">Related resources</h3>
<p>We can also use <code>java.lang.Runtime.getRuntime</code>&#8216;s methods (<i>maxMemory()</i>, <i>freeMemory()</i> and <i>totalMemory()</i>) to monitor overall Java memory (note a MathWorks <a target="_blank" rel="nofollow" href="http://blogs.mathworks.com/community/2009/08/17/calling-java-from-matlab-memory-issues/">blog article</a> on this). Note that these methods report JVM-wide totals rather than the size of any specific object, and that the reported values fluctuate (sometimes dramatically) from second to second, as Matlab&#8217;s desktop and other Java-heavy tools create Java objects, which the JVM garbage-collects.<br />
Jeff Gullet has <a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/296813#797410">suggested</a> monitoring these values and programmatically activating a synchronous Java garbage-collection when the memory appears too &#8220;crowded&#8221; (I fixed Jeff&#8217;s posted idea in the snippet below):</p>
<pre lang='matlab'>
r = java.lang.Runtime.getRuntime;
if (r.freeMemory/r.totalMemory) < 0.1
    r.gc();
end
</pre>
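<p>As a small companion sketch, the three <code>Runtime</code> methods mentioned above can be combined into a quick report of the Java heap state. The variable names and the MB conversion are arbitrary choices made for this example; only the <code>java.lang.Runtime</code> methods themselves come from the standard Java API:</p>
<pre lang='matlab'>
r  = java.lang.Runtime.getRuntime;
MB = 2^20;                          % bytes per megabyte
maxMB   = r.maxMemory()   / MB;     % upper limit that the JVM heap may grow to
totalMB = r.totalMemory() / MB;     % heap currently reserved by the JVM
freeMB  = r.freeMemory()  / MB;     % unused part of the reserved heap
usedMB  = totalMB - freeMB;         % heap currently occupied by Java objects
fprintf('Java heap: %.1f MB used, %.1f MB reserved, %.1f MB limit\n', usedMB, totalMB, maxMB);
</pre>
<p>Since these values change from second to second, a single reading is only a rough snapshot; sampling them several times gives a more reliable picture.</p>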
<p>A MathWorks <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/answers/95990">technical article</a> provided some assistance on using the <code>JConsole</code> utility to profile Java memory in Matlab. We can also use the <code>JMap</code> and <code>JHat</code> <a target="_blank" rel="nofollow" href="http://docs.oracle.com/javase/7/docs/technotes/tools/index.html">utilities</a>. All these utilities are part of the free Java Development Kit (JDK) that can be downloaded online; just ensure that you're using the same Java version as reported by Matlab:</p>
<pre lang='matlab'>
>> version -java
ans =
Java 1.7.0_11-b21   % i.e., Java 7 update 11
</pre>
<p>In addition to the JDK tools, I find the open-source <a target="_blank" rel="nofollow" href="http://visualvm.java.net/"><code>JVisualVM</code></a> utility informative and easy to use. We can also use <a target="_blank" rel="nofollow" href="http://www.khelekore.org/jmp">JMP</a> (R2007a and earlier), <a target="_blank" rel="nofollow" href="http://www.khelekore.org/jmp/tijmp">TIJMP</a> (R2007b and later), or other 3rd-party tools. A list of Java-centric resources is available in the <a target="_blank" rel="nofollow" href="http://www.oracle.com/technetwork/java/javase/index-138283.html">Java SE Troubleshooting guide</a>.<br />
To complete the picture, a couple of years ago I posted an article on <a target="_blank" href="/articles/profiling-matlab-memory-usage/">profiling Matlab&#8217;s memory usage</a>, which included a section on Java memory. You may also find useful another article I wrote, on finding and <a target="_blank" href="/articles/matlab-java-memory-leaks-performance/">fixing a Java memory leak in Matlab</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/assessing-java-object-size-in-matlab/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Allocation performance take 2</title>
		<link>https://undocumentedmatlab.com/articles/allocation-performance-take-2?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=allocation-performance-take-2</link>
					<comments>https://undocumentedmatlab.com/articles/allocation-performance-take-2#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Wed, 14 Aug 2013 18:00:05 +0000</pubDate>
				<category><![CDATA[Low risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Undocumented feature]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pure Matlab]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=4086</guid>

					<description><![CDATA[<p>The clear function has some non-trivial effects on Matlab performance. </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/allocation-performance-take-2">Allocation performance take 2</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/performance-scatter-vs-line" rel="bookmark" title="Performance: scatter vs. line">Performance: scatter vs. line </a> <small>In many circumstances, the line function can generate visually-identical plots as the scatter function, much faster...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/zero-testing-performance" rel="bookmark" title="Zero-testing performance">Zero-testing performance </a> <small>Subtle changes in the way that we test for zero/non-zero entries in Matlab can have a significant performance impact. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/preallocation-performance" rel="bookmark" title="Preallocation performance">Preallocation performance </a> <small>Preallocation is a standard Matlab speedup technique. Still, it has several undocumented aspects. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/array-resizing-performance" rel="bookmark" title="Array resizing performance">Array resizing performance </a> <small>Several alternatives are explored for dynamic array growth performance in Matlab loops. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p>Last week, Mike Croucher posted a very interesting <a target="_blank" rel="nofollow" href="http://www.walkingrandomly.com/?p=5043">article</a> on the fact that <i><b>cumprod</b></i> can be used to generate a vector of powers much more quickly than the built-in <code>.^</code> operator. Trying to improve on Mike&#8217;s results, I used my finding that <code>zeros(n,m)+scalar</code> is often faster than <code>ones(n,m)*scalar</code> (see my article on <a target="_blank" href="/articles/preallocation-performance/#non-default">pre-allocation performance</a>). Applying this to Mike&#8217;s powers-vector example, <code>zeros(n,m)+scalar</code> only gave me a 25% performance boost (i.e., 1-1/1.25 or 20% faster), rather than the x5 speedup that I received in my original article.<br />
Naturally, the difference could be due to different conditions: a different running platform, OS, Matlab release, and allocation size. But the difference felt intriguing enough to warrant a small investigation. I came up with some interesting new findings that I cannot fully explain:<br />
<figure style="width: 495px" class="wp-caption alignright"><img decoding="async" alt="The performance of allocating zeros, ones" src="https://undocumentedmatlab.com/images/clear_performance.gif" title="The performance of allocating zeros, ones" width="495" height="392" /><figcaption class="wp-caption-text">The performance of allocating zeros, ones (x100)</figcaption></figure>            </p>
<pre lang='matlab'>
function t=perfTest
    % Run tests multiple times, for multiple allocation sizes
    n=100; tidx = 1; iters = 100;
    while n < 1e8
        t(tidx,1) = n;
        clear y; tic, for idx=1:iters, clear y; y=ones(n,1);  end, t(tidx,2)=toc;  % clear; ones()
        clear y; tic, for idx=1:iters, clear y; y=zeros(n,1); end, t(tidx,3)=toc;  % clear; zeros()
        clear y; tic, for idx=1:iters,          y=ones(n,1);  end, t(tidx,4)=toc;  % only ones()
        clear y; tic, for idx=1:iters,          y=zeros(n,1); end, t(tidx,5)=toc;  % only zeros()
        n = n * 2;
        tidx = tidx + 1;
    end
    % Normalize result on a per-element basis
    t2 = bsxfun(@rdivide, t(:,2:end), t(:,1));
    % Display the results
    h  = loglog(t(:,1), t(:,2:end));  % overall durations
    %h = loglog(t(:,1), t2);  % normalized durations
    set(h, 'LineSmoothing','on');  % see https://undocumentedmatlab.com/articles/plot-linesmoothing-property/
    set(h(2), 'LineStyle','--', 'Marker','+', 'MarkerSize',5, 'Color',[0,.5,0]);
    set(h(3), 'LineStyle',':',  'Marker','o', 'MarkerSize',5);
    set(h(4), 'LineStyle','-.', 'Marker','*', 'MarkerSize',5);
    legend(h, 'clear; ones', 'clear; zeros', 'ones', 'zeros', 'Location','NorthWest');
    xlabel('# allocated elements');
    ylabel('duration [secs]');
    box off
end
</pre>
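<p>For reference, the two pairs of equivalent expressions mentioned at the top of this article can be checked side by side with a sketch like the following. The array size, scalar value and base are arbitrary values picked for this example, and the timings will naturally vary by platform and Matlab release:</p>
<pre lang='matlab'>
n = 1e6;  scalar = 3.7;  base = 1.0001;    % arbitrary example values
% Filling an array with a constant, in two equivalent ways
tic, y1 = ones(n,1)  * scalar;  tOnes  = toc;
tic, y2 = zeros(n,1) + scalar;  tZeros = toc;
sameFill = isequal(y1, y2);                % both variants hold the same values
% Generating a vector of powers, in two equivalent ways
tic, p1 = base .^ (1:n);              tPower   = toc;
tic, p2 = cumprod(repmat(base,1,n));  tCumprod = toc;
% cumprod accumulates rounding errors, so compare using a relative difference
maxRelDiff = max(abs(p1 - p2) ./ p1);
fprintf('ones*scalar: %.4f s, zeros+scalar: %.4f s (identical: %d)\n', tOnes, tZeros, sameFill);
fprintf('.^: %.4f s, cumprod: %.4f s (max relative difference: %.2g)\n', tPower, tCumprod, maxRelDiff);
</pre>
<p>A single <i><b>tic</b></i>/<i><b>toc</b></i> pair like this is noisy, which is why the test function above repeats each allocation many times.</p>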
<p><span id="more-4086"></span><br />
The full results were (R2013a, Win 7 64b, 8MB):<br />
<figure style="width: 495px" class="wp-caption alignright"><img decoding="async" alt="The same data normalized per-element" src="https://undocumentedmatlab.com/images/clear_performance2.gif" title="The same data normalized per-element" width="495" height="392" /><figcaption class="wp-caption-text">The same data normalized per-element (x100)</figcaption></figure>               </p>
<pre lang='matlab'>
       n  clear,ones  clear,zeros    only ones   only zeros
========  ==========  ===========    =========   ==========
     100    0.000442     0.000384     0.000129     0.000124
     200    0.000390     0.000378     0.000150     0.000121
     400    0.000404     0.000424     0.000161     0.000151
     800    0.000422     0.000438     0.000165     0.000176
    1600    0.000583     0.000516     0.000211     0.000206
    3200    0.000656     0.000606     0.000325     0.000296
    6400    0.000863     0.000724     0.000587     0.000396
   12800    0.001289     0.000976     0.000975     0.000659
   25600    0.002184     0.001574     0.001874     0.001360
   51200    0.004189     0.002776     0.003649     0.002320
  102400    0.010900     0.005870     0.010778     0.005487
  204800    0.051658     0.000966     0.049570     0.000466
  409600    0.095736     0.000901     0.095183     0.000463
  819200    0.213949     0.000984     0.219887     0.000817
 1638400    0.421103     0.001023     0.429692     0.000610
 3276800    0.886328     0.000936     0.877006     0.000609
 6553600    1.749774     0.000972     1.740359     0.000526
13107200    3.499982     0.001108     3.550072     0.000649
26214400    7.094449     0.001144     7.006229     0.000712
52428800   14.039551     0.001853    14.396687     0.000822
</pre>
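<p>Since each figure in the table is the total over 100 loop iterations, per-call and per-element durations need a small conversion. The following is a minimal sketch of that arithmetic, using three rows copied from the table above. The variable names are illustrative only and are not part of the original test function:</p>
<pre lang='matlab'>
iters = 100;                       % loop iterations per measurement
n = [100; 102400; 204800];         % allocation sizes of the three sample rows
% Columns: clear+ones, clear+zeros, only ones, only zeros (total secs)
tTotal = [0.000442 0.000384 0.000129 0.000124
          0.010900 0.005870 0.010778 0.005487
          0.051658 0.000966 0.049570 0.000466];
tPerCall = tTotal / iters;                       % secs per single allocation
tPerElem = bsxfun(@rdivide, tPerCall, n);        % secs per allocated element
clearOverhead = tPerCall(1,1) - tPerCall(1,3);   % cost of clear at n=100
zerosJump = tTotal(2,4) / tTotal(3,4);           % zeros: 102400 vs. 204800
fprintf('clear overhead: %.2f microsecs per call\n', clearOverhead*1e6);
fprintf('zeros at 204800 vs. 102400 elements: %.1fx faster\n', zerosJump);
</pre>
<p>For these rows, the <i><b>clear</b></i> overhead works out to about 3 microseconds per call, and <i><b>zeros</b></i> at 204800 elements runs about 12 times faster than at 102400 elements, despite allocating twice as many elements.</p>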
<p>(Note: all numbers should be divided by the number of loop iterations <code>iters=100</code>)<br />
As can be seen from the data and resulting plots (log-log scale), the more elements we allocate, the longer this takes. It is not surprising that in all cases the allocation duration is roughly linear, since when twice as many elements need to be allocated, this roughly takes twice as long. It is also not surprising to see that the allocation has some small overhead, which is apparent when allocating a small number of elements.<br />
A potentially surprising result, namely that allocating 200-400 elements is in some cases a bit faster than allocating only 100 elements, can actually be attributed to measurement inaccuracies and JIT warm-up time.<br />
Another potentially surprising result, that <i><b>zeros</b></i> is consistently faster than <i><b>ones</b></i> can perhaps be explained by <i><b>zeros</b></i> being able to use more efficient low-level functions (<code>bzero</code>) for clearing memory, than <i><b>ones</b></i> which needs to <code>memset</code> a value.<br />
A somewhat more surprising effect is that of the <i><b>clear</b></i> command: As can be seen in the code, calling <i><b>clear</b></i> within the timed loops has no functional use, because in all cases the variable <code>y</code> is being overwritten with new values. However, we clearly see that the overhead of calling <i><b>clear</b></i> is an extra 3&#181;s or so per call. Calling <i><b>clear</b></i> is important in cases where we deal with very large memory constructs: clearing them from memory enables additional memory allocations (of the same or different variables) without requiring virtual memory paging, which would be disastrous for performance. But if we have a very large loop which calls <i><b>clear</b></i> numerous times without serving such a purpose, then it is better to remove this call: although the overhead is small, it accumulates and might be an important factor in very large loops.<br />
Another aspect that is surprising is the fact that <i><b>zeros</b></i> (with or without <i><b>clear</b></i>) is <i>much</i> faster when allocating 200K+ elements, compared to 100K elements. This is indicative of an internal switch to a more optimized allocation algorithm, which apparently has constant speed rather than linear with allocation size. At the very same point, there is a corresponding performance <i>degradation</i> in the allocation of <i><b>ones</b></i>. I suspect that 100K is the point at which <a target="_blank" rel="nofollow" href="http://www.mathworks.com/support/solutions/en/data/1-4PG4AN/">Matlab&#8217;s internal parallelization</a> (multi-threading) kicks in. This occurs at varying points for different functions, but it is normally some multiple of 20K elements (20K, 40K, 100K or 200K &#8211; a detailed list was <a target="_blank" rel="nofollow" href="http://www.walkingrandomly.com/?p=1894">posted</a> by Mike Croucher again). Apparently, it kicks in at 100K for <i><b>zeros</b></i>, but for some reason not for <i><b>ones</b></i>.<br />
The performance degradation at 100K elements has been around in Matlab for ages &#8211; I see it as far back as R12 (Matlab 6.0), for both <i><b>zeros</b></i> and <i><b>ones</b></i>. The reason for it is unknown to me; if anyone could enlighten me, I&#8217;d be happy to hear. The new thing is the implementation of a faster internal mechanism (presumably multi-threading) in R2008b (Matlab 7.7) for <i><b>zeros</b></i>, at the very same point (100K elements), although for some unknown reason this was not done for <i><b>ones</b></i> as well.<br />
Another aspect that is strange here is that the speedup for <i><b>zeros</b></i> at 200K elements is ~12 &#8211; much higher than the expected optimal speedup of 4 on my quad-core system. The higher speedup may perhaps be explained by hyper-threading or <a target="_blank" rel="nofollow" href="http://en.wikipedia.org/wiki/SIMD">SIMD</a> at the CPU level.<br />
In any case, going back to what originally started this investigation, the reason for the wide disparity in speedups between using <i><b>zeros</b></i> and <i><b>ones</b></i> for 10K elements (as in Mike Croucher&#8217;s post), and for 3M elements (as in my pre-allocation performance article) now becomes clear: in the case of 10K elements, multi-threading is not active, and <i><b>zeros</b></i> is indeed only 20-30% faster than <i><b>ones</b></i>; in the case of 3M elements, the superior multi-threading of <i><b>zeros</b></i> over <i><b>ones</b></i> enables much larger speedups, increasing with allocated size.<br />
Some take-away lessons:</p>
<ul>
<li>Using <i><b>zeros</b></i> is generally preferable to <i><b>ones</b></i>, especially for more than 100K elements on Matlab 7.7 (R2008b) and newer.</li>
<li><code>zeros(n,m)+scalar</code> is consistently faster than <code>ones(n,m)*scalar</code>, especially for more than 100K elements, and for the same reason.</li>
<li>In some cases, it may be worthwhile to use a built-in function with more elements than actually necessary, just to benefit from its internal multi-threading.</li>
<li>Never take performance assumptions for granted. Always test on your specific system using a representative data-set.</li>
</ul>
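<p>The last point is easy to act on. Here is a minimal sketch for checking the <code>zeros(n,m)+scalar</code> vs. <code>ones(n,m)*scalar</code> claim on your own system. It assumes that the built-in <i><b>timeit</b></i> function is available (R2013b and newer); the size <code>n</code> and the scalar <code>c</code> are arbitrary values chosen for illustration:</p>
<pre lang='matlab'>
n = 1e6;                          % number of elements (arbitrary choice)
c = 5;                            % fill value (arbitrary choice)
fZeros = @() zeros(n,1) + c;      % candidate 1: zeros plus scalar
fOnes  = @() ones(n,1) * c;       % candidate 2: ones times scalar
% Sanity check: both candidates must produce identical data
assert(isequal(fZeros(), fOnes()));
% timeit calls each handle several times and returns a typical duration
tZeros = timeit(fZeros);
tOnes  = timeit(fOnes);
fprintf('zeros+c: %.3f ms,  ones*c: %.3f ms,  ratio: %.2f\n', ...
        tZeros*1e3, tOnes*1e3, tOnes/tZeros);
</pre>
<p>The ratio will differ between Matlab releases, platforms and array sizes, so it is worth repeating the measurement with sizes that are representative of the real application.</p>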
<p>p.s. &#8211; readers who are interested in the historical evolution of the <i><b>zeros</b></i> function are referred to <a target="_blank" rel="nofollow" href="http://blogs.mathworks.com/loren/2013/08/09/zero-evolution/">Loren Shure&#8217;s latest post</a>, only a few days ago (a fortunate coincidence indeed). Unfortunately, Loren&#8217;s post does not illuminate the mysteries above.</p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/allocation-performance-take-2">Allocation performance take 2</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/performance-scatter-vs-line" rel="bookmark" title="Performance: scatter vs. line">Performance: scatter vs. line </a> <small>In many circumstances, the line function can generate visually-identical plots as the scatter function, much faster...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/zero-testing-performance" rel="bookmark" title="Zero-testing performance">Zero-testing performance </a> <small>Subtle changes in the way that we test for zero/non-zero entries in Matlab can have a significant performance impact. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/preallocation-performance" rel="bookmark" title="Preallocation performance">Preallocation performance </a> <small>Preallocation is a standard Matlab speedup technique. Still, it has several undocumented aspects. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/array-resizing-performance" rel="bookmark" title="Array resizing performance">Array resizing performance </a> <small>Several alternatives are explored for dynamic array growth performance in Matlab loops. ...</small></li>
</ol>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/allocation-performance-take-2/feed</wfw:commentRss>
			<slash:comments>6</slash:comments>
		
		
			</item>
		<item>
		<title>The Java import directive</title>
		<link>https://undocumentedmatlab.com/articles/the-java-import-directive?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=the-java-import-directive</link>
					<comments>https://undocumentedmatlab.com/articles/the-java-import-directive#respond</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Wed, 13 Jun 2012 18:00:37 +0000</pubDate>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Low risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=2959</guid>

					<description><![CDATA[<p>The import function can be used to clarify Java code used in Matlab. </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/the-java-import-directive">The Java import directive</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/converting-java-vectors-to-matlab-arrays" rel="bookmark" title="Converting Java vectors to Matlab arrays">Converting Java vectors to Matlab arrays </a> <small>Converting Java vectors to Matlab arrays is pretty simple - this article explains how....</small></li>
<li><a href="https://undocumentedmatlab.com/articles/udd-and-java" rel="bookmark" title="UDD and Java">UDD and Java </a> <small>UDD provides built-in convenience methods to facilitate the integration of Matlab UDD objects with Java code - this article explains how...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/using-pure-java-gui-in-deployed-matlab-apps" rel="bookmark" title="Using pure Java GUI in deployed Matlab apps">Using pure Java GUI in deployed Matlab apps </a> <small>Using pure-Java GUI in deployed Matlab apps requires a special yet simple adaptation. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-java-memory-leaks-performance" rel="bookmark" title="Matlab-Java memory leaks, performance">Matlab-Java memory leaks, performance </a> <small>Internal fields of Java objects may leak memory - this article explains how to avoid this without sacrificing performance. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p>A recent <a target="_blank" rel="nofollow" href="https://thilinasameera.wordpress.com/2012/05/30/undocumented-matlab-java-codes-in-matlab-m-files/">blog post</a> on a site I came across showed me that some users who are using Java in Matlab take the unnecessary precaution of always using the fully-qualified class-name (FQCN) of the Java classes, and are not familiar with the <i><b>import</b></i> directive in Matlab. Today I&#8217;ll show how to use <i><b>import</b></i> to simplify Java usage in Matlab.<br />
Basically, the <i><b>import</b></i> function enables Matlab users to declare that a specific class name belongs to a particular Java namespace, without having to specifically state the full namespace in each use. In this regard, Matlab&#8217;s <i><b>import</b></i> closely mimics <a target="_blank" rel="nofollow" href="http://docs.oracle.com/javase/tutorial/java/package/usepkgs.html">Java&#8217;s <code>import</code></a>, and not surprisingly also has similar syntax:</p>
<pre lang='matlab'>
% Alternative 1 - using explicit namespaces
jFrame = javax.swing.JFrame;
jDim = java.awt.Dimension(50,120);
jPanel.add(jButton, java.awt.GridBagConstraints(0, 0, 1, 1, 1.0, 1.0, ...
                    java.awt.GridBagConstraints.NORTHWEST, ...
                    java.awt.GridBagConstraints.NONE, ...
                    java.awt.Insets(6, 12, 6, 6), 1, 1));
% Alternative 2 - using import
import javax.swing.*
import java.awt.*
jFrame = JFrame;
jDim = Dimension(50,120);
jPanel.add(jButton, GridBagConstraints(0, 0, 1, 1, 1.0, 1.0, ...
                    GridBagConstraints.NORTHWEST, ...
                    GridBagConstraints.NONE, ...
                    Insets(6, 12, 6, 6), 1, 1));
</pre>
<p>Note how much cleaner Alternative #2 looks compared to Alternative #1. However, as with Java&#8217;s <code>import</code>, there is a tradeoff here: by removing the namespaces from the code, it could become confusing as to which namespace a particular object belongs. For example, by specifying <code>java.awt.Insets</code>, we immediately know that it&#8217;s an AWT insets object, rather than, say, a book&#8217;s insets. There is no clear-cut answer to this dilemma, and in fact there are many Java developers who prefer one way or the other. As in Java, the choice is yours to make also in Matlab.<br />
Perhaps a good compromise, one which I often use, is to stay away from the <code>import something.*</code> format and directly specify the imported classes. In the example above, I would have written:</p>
<pre lang='matlab'>
% Alternative 3 - using explicit import
import javax.swing.JFrame;
import java.awt.Dimension;
import java.awt.GridBagConstraints;
import java.awt.Insets;
jFrame = JFrame;
jDim = Dimension(50,120);
jPanel.add(jButton, GridBagConstraints(0, 0, 1, 1, 1.0, 1.0, ...
                    GridBagConstraints.NORTHWEST, ...
                    GridBagConstraints.NONE, ...
                    Insets(6, 12, 6, 6), 1, 1));
</pre>
<p>This alternative has the benefit that it is immediately clear that Insets belongs to the AWT package, without having to explicitly use the <code>java.awt</code> prefix everywhere. Obviously, if the list of imported classes becomes too large, we could always revert to the <code>import java.awt.*</code> format.<br />
Interestingly, we can also use the functional form of <i><b>import</b></i>:</p>
<pre lang='matlab'>
% Alternative #1
import java.lang.String;
% Alternative #2
import('java.lang.String');
% Alternative #3
classname = 'java.lang.String';
import(classname);
</pre>
<p>Using the third alternative format, that of dynamic import, enables us to decide <b><u>at run-time</u></b>(!) whether to use a class C from package PA or PB. This is a cool feature but must be used with care, since it could lead to very difficult-to-diagnose errors, for example if the code later invokes a method that exists in PA.C but not in PB.C. The correct way to do this would probably be to define a class hierarchy where PA.C and PB.C both inherit from the same superclass. But in some cases this is simply not feasible (for example, when you have 2 JARs from different vendors, which use the same classname) and dynamic importing can help.<br />
It is possible to specify multiple input parameters to <i><b>import</b></i> in the same directive. However, note that Matlab 7.5 R2007b and older releases crash (at least on WinXP) when one of the imported parameters is any MathWorks-derived (<code>com.mathworks...</code>) package/class. This bug was fixed in Matlab 7.6 R2008a, but to support earlier releases simply separate such imports into different lines:</p>
<pre lang='matlab'>
% This crashes Matlab 7.5 R2007b and earlier;  OK on Matlab 7.6 R2008a and later
import javax.swing.* com.mathworks.mwswing.*
% This is ok in all Matlab releases
import javax.swing.*
import com.mathworks.mwswing.*
</pre>
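<p>For the run-time class selection described earlier, a related approach that avoids <i><b>import</b></i> altogether is the built-in <i><b>javaObject</b></i> function, which accepts the fully-qualified class name as a string. Note that this is an alternative technique, not the dynamic-import form itself. In the sketch below, the <code>useLinkedList</code> flag is hypothetical, and both candidate classes implement <code>java.util.List</code>, so the methods used later exist in either case:</p>
<pre lang='matlab'>
useLinkedList = true;                 % hypothetical run-time switch
if useLinkedList
    classname = 'java.util.LinkedList';
else
    classname = 'java.util.ArrayList';
end
jList = javaObject(classname);        % construct by FQCN, no import needed
jList.add('first item');
jList.add('second item');
fprintf('%s holds %d items\n', class(jList), jList.size());
</pre>
<p>Restricting the subsequent code to methods of the shared interface sidesteps the hard-to-diagnose errors that arise when one candidate class lacks a method that the other has.</p>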
<p><i><b>import</b></i> does NOT load the Java class into memory &#8211; it just declares its namespace for the JVM. This mechanism is sometimes called <i>lazy loading</i> (compare to the <a target="_blank" href="/articles/internal-matlab-memory-optimizations/">lazy copying mechanism</a> that I described a couple of weeks ago). To force-load a class into memory, either use it directly (for example, by declaring an object of it, or by using one of its methods), or use a classloader to load it. The issue of JVM classloaders in Matlab is non-trivial (there are several non-identical alternatives), and will be covered in a future article.<br />
A few additional notes:</p>
<ul>
<li>Although not strictly mandatory, it is good practice to place all the <i><b>import</b></i> directives at the top of the function, for visibility and code maintainability reasons</li>
<li>There is no need to end the <i><b>import</b></i> declaration with a semicolon (;). It&#8217;s really a matter of style consistency. I usually omit it because I find that it is a bit intrusive when placed after a *</li>
<li><i><b>import</b></i> by itself, without any input arguments (class/package names) returns the current list of imported classes/packages</li>
<li>Imported classes and packages can be un-imported using the <i><b>clear import</b></i> directive from the Command Window</li>
<li>It has been <a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/285933">reported</a> that in some cases using <i><b>import</b></i> in deployed (compiled) applications fails &#8211; the solution is to use the FQCN in such cases</li>
</ul>
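<p>A short sketch of some of these notes in practice, using standard JDK classes (any Java class on Matlab&#8217;s classpath should work equally well). The variable names are illustrative only:</p>
<pre lang='matlab'>
import java.util.ArrayList       % explicit import, no trailing semicolon
import java.awt.Dimension
importList = import;             % cell array of the currently-imported names
disp(importList);
jList = ArrayList;               % resolved via the import list
jList.add(Dimension(50,120));
jDim = jList.get(0);             % Java indexing is 0-based
fprintf('%d x %d\n', jDim.getWidth(), jDim.getHeight());
</pre>
<p>When run from the Command Window, a subsequent <i><b>clear import</b></i> resets the import list.</p>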
<p><i><b>Note: This topic is covered and extended in Chapter 1 of my <a target="_blank" href="/matlab-java-book/">Matlab-Java programming book</a></b></i></p>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/the-java-import-directive/feed</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Internal Matlab memory optimizations</title>
		<link>https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=internal-matlab-memory-optimizations</link>
					<comments>https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Wed, 30 May 2012 12:09:16 +0000</pubDate>
				<category><![CDATA[Low risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Stock Matlab function]]></category>
		<category><![CDATA[JIT]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pure Matlab]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=2952</guid>

					<description><![CDATA[<p>Copy-on-write and in-place data manipulations are very useful Matlab performance improvement techniques. </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations">Internal Matlab memory optimizations</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation" rel="bookmark" title="Matlab&#039;s internal memory representation">Matlab&#039;s internal memory representation </a> <small>Matlab's internal memory structure is explored and discussed. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/profiling-matlab-memory-usage" rel="bookmark" title="Profiling Matlab memory usage">Profiling Matlab memory usage </a> <small>mtic and mtoc were a couple of undocumented features that enabled users of past Matlab releases to easily profile memory usage. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-java-memory-leaks-performance" rel="bookmark" title="Matlab-Java memory leaks, performance">Matlab-Java memory leaks, performance </a> <small>Internal fields of Java objects may leak memory - this article explains how to avoid this without sacrificing performance. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/couple-of-bugs-and-workarounds" rel="bookmark" title="A couple of internal Matlab bugs and workarounds">A couple of internal Matlab bugs and workarounds </a> <small>A couple of undocumented Matlab bugs have simple workarounds. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p>Yesterday I attended a seminar on developing trading strategies using Matlab. This is of interest to me because of my <a target="_blank" href="/ib-matlab/">IB-Matlab</a> product, and since many of my clients are traders in the financial sector. In the seminar, the issue of memory and performance naturally arose. It seemed to me that there was some confusion with regards to Matlab&#8217;s built-in memory optimizations. Since I discussed related topics in the past two weeks (<a target="_blank" href="/articles/preallocation-performance/">preallocation performance</a>, <a target="_blank" href="/articles/array-resizing-performance/">array resizing performance</a>), these internal optimizations seemed a natural topic for today&#8217;s article.<br />
The specific mechanisms I&#8217;ll describe today are <i><b>Copy on Write</b></i> (aka <i>COW</i> or <i>Lazy Copying</i>) and <i><b>in-place data manipulations</b></i>. Both mechanisms were already documented (for example, on <a target="_blank" rel="nofollow" href="http://blogs.mathworks.com/loren/2006/05/10/memory-management-for-functions-and-variables/">Loren&#8217;s blog</a> or on <a target="_blank" href="/articles/matlab-mex-in-place-editing/#COW">this blog</a>). But apparently, they are still not well known. Understanding them could help Matlab users modify their code to improve performance and reduce memory consumption. So although this article is not entirely &#8220;undocumented&#8221;, I&#8217;ll give myself some slack today.</p>
<h3 id="COW">Copy on Write (COW, Lazy Copy)</h3>
<p>Matlab implements an automatic <a target="_blank" rel="nofollow" href="http://blogs.mathworks.com/loren/2006/05/10/memory-management-for-functions-and-variables/">copy-on-write</a> (sometimes called <i>copy-on-update</i> or <i>lazy copying</i>) mechanism, which transparently allocates a temporary copy of the data only when it sees that the input data is modified. This improves run-time performance by delaying actual memory block allocation until absolutely necessary. COW has two variants: during regular variable copy operations, and when passing data as input parameters into a function:</p>
<h4 id="COW-vars">1. Regular variable copies</h4>
<p>When a variable is copied, as long as the data is not modified, both variables actually use the same shared memory block. The data is only copied onto a newly-allocated memory block when one of the variables is modified. The modified variable is assigned the newly-allocated block of memory, which is initialized with the values in the shared memory block before being updated:</p>
<pre lang='matlab'>
data1 = magic(5000);  % 5Kx5K elements = 191 MB
data2 = data1;        % data1 & data2 share memory; no allocation done
data2(1,1) = 0;       % data2 allocated, copied and only then modified
</pre>
<p>If we profile our code using any of <a target="_blank" href="/articles/profiling-matlab-memory-usage/">Matlab&#8217;s memory-profiling options</a>, we will see that the copy operation <code>data2=data1</code> takes negligible time to run and allocates no memory. On the other hand, the simple update operation <code>data2(1,1)=0</code>, which we could otherwise have assumed to take minimal time and memory, actually takes a relatively long time and allocates 191 MB of memory.<br />
<center><figure style="width: 449px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" alt="Copy-on-write effect monitored using the Profiler's -memory option" src="https://undocumentedmatlab.com/images/Copy-on-Write1c.png" title="Copy-on-write effect monitored using the Profiler's -memory option" width="449" height="96"/><figcaption class="wp-caption-text">Copy-on-write effect monitored using the Profiler's -memory option</figcaption></figure><br />
<figure style="width: 439px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" alt="Copy-on-write effect monitored using Windows Process Explorer" src="https://undocumentedmatlab.com/images/Copy-on-Write1a.png" title="Copy-on-write effect monitored using Windows Process Explorer" width="439" height="515"/><figcaption class="wp-caption-text">Copy-on-write effect monitored using Windows Process Explorer</figcaption></figure></center><br />
We first see a memory spike (used during the computation of the magic square data), closely followed by a leveling off at 190.7MB above the baseline (this is due to allocation of data1). Copying <code>data2=data1</code> has no discernible effect on either CPU or memory. Only when we set <code>data2(1,1)=0</code> does CPU activity return, in order to allocate the extra 190MB for data2. When we exit the test function, data1 and data2 are both deallocated, returning the Matlab process memory to its baseline level.<br />
There are several lessons that we can draw from this simple example:<br />
Firstly, creating copies of data does not necessarily or immediately impact memory and performance. Rather, it is the update of these copies which may be problematic. If we can modify our code to use more read-only data and less updated data copies, then we would improve performance. The Profiler report will show us exactly where in our code we have memory and CPU hotspots – these are the places we should consider optimizing.<br />
Secondly, when we see such odd behavior in our Profiler reports (i.e., memory and/or CPU spikes that occur on seemingly innocent code lines), we should be aware of the copy-on-write mechanism, which could be the cause for the behavior.</p>
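<p>Without the Profiler, the same effect can be observed roughly using <i><b>tic</b></i>/<i><b>toc</b></i>. The following is only a sketch: it uses a smaller array than the example above in order to reduce memory pressure, and single-statement timings are noisy, so the absolute numbers should not be trusted too much:</p>
<pre lang='matlab'>
n = 3000;                               % 3Kx3K doubles = ~69 MB
data1 = rand(n);
tic; data2 = data1;    tCopy   = toc;   % lazy copy: memory is shared
tic; data2(1,1) = -1;  tWrite1 = toc;   % first write: triggers the real copy
tic; data2(2,2) = -1;  tWrite2 = toc;   % data2 now owns its memory block
fprintf('copy: %.6f s, 1st write: %.6f s, 2nd write: %.6f s\n', ...
        tCopy, tWrite1, tWrite2);
</pre>
<p>On most systems the first write should be the slowest of the three statements by a wide margin, since this is where the new memory block is actually allocated and populated.</p>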
<h4 id="COW-functions">2. Function input parameters</h4>
<p>The copy-on-write mechanism behaves similarly for input parameters in functions: whenever a function is invoked (called) with input data, the memory allocated for this data is used up until the point that one of its copies is modified. At that point, the copies diverge: a new memory block is allocated, populated with data from the shared memory block, and assigned to the modified variable. Only then is the update done on the new memory block.</p>
<pre lang='matlab'>
data1 = magic(5000);      % 5Kx5K elements = 191 MB
data2 = perfTest(data1);
function outData = perfTest(inData)
   outData = inData;   % inData & outData share memory; no allocation
   outData(1,1) = 0;   % outData allocated, copied and then modified
end
</pre>
<p><center><figure style="width: 534px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" alt="Copy-on-write effect monitored using the Profiler's -memory option" src="https://undocumentedmatlab.com/images/Copy-on-Write2a.png" title="Copy-on-write effect monitored using the Profiler's -memory option" width="534" height="82"/><figcaption class="wp-caption-text">Copy-on-write effect monitored using the Profiler's -memory option</figcaption></figure></center><br />
One lesson that can be drawn from this is that whenever possible we should attempt to use functions that do not modify their input data. This is particularly true if the modified input data is very large. Read-only functions will be faster than functions that do even the simplest of data updates.<br />
Another lesson is that, perhaps counterintuitively, it does not make a difference from a performance standpoint to pass read-only data to functions as input parameters. We might think that passing large data objects around as function parameters will involve multiple memory allocations and deallocations of the data. In fact, it is only the data&#8217;s reference (or more precisely, its <a target="_blank" href="/articles/matlabs-internal-memory-representation/">mxArray structure</a>) which is being passed around and placed on the function&#8217;s call stack. Since this reference/structure is quite small in size, there are no real performance penalties. In fact, this only benefits code clarity and maintainability.<br />
The only case where we may wish to use other means of passing data to functions is when a large data object needs to be updated. In such cases, the updated copy will be allocated to a new memory block with an associated performance cost.</p>
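<p>The difference between read-only and modifying functions can be sketched as follows. The function names here (<code>cowFunctionDemo</code>, <code>readOnlySum</code>, <code>modifyFirst</code>) are hypothetical, and the array size is an arbitrary choice:</p>
<pre lang='matlab'>
function [tRead, tWrite] = cowFunctionDemo
    data = rand(3000);                    % ~69 MB of doubles
    tic; readOnlySum(data);  tRead  = toc;
    tic; modifyFirst(data);  tWrite = toc;
    fprintf('read-only: %.6f s, modifying: %.6f s\n', tRead, tWrite);
end
function s = readOnlySum(x)
    s = x(1,1) + x(end,end);   % input is only read: no copy is made
end
function s = modifyFirst(x)
    x(1,1) = 0;                % input is modified: the array is copied first
    s = x(1,1) + x(end,end);
end
</pre>
<p>Both functions do a trivial amount of computation, so any consistent timing gap between them should be attributable to the copy-on-write allocation in the modifying variant.</p>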
<h3 id="inplace">In-place data manipulation</h3>
<p>Matlab&#8217;s interpreter, at least in recent releases, has a very sophisticated algorithm for using in-place data manipulation (<a target="_blank" rel="nofollow" href="http://blogs.mathworks.com/loren/2007/03/22/in-place-operations-on-data">report</a>). Modifying data in-place means that the original data block is modified, rather than creating a new block with the modified data, thus saving any memory allocations and deallocations.<br />
For example, let us manipulate a simple 4Kx4K (122MB) numeric array:</p>
<pre lang='matlab'>
>> m = magic(4000);   % 4Kx4K = 122MB
>> memory
Maximum possible array:            1022 MB (1.072e+09 bytes)
Memory available for all arrays:   1218 MB (1.278e+09 bytes)
Memory used by MATLAB:              709 MB (7.434e+08 bytes)
Physical Memory (RAM):             3002 MB (3.148e+09 bytes)
% In-place array data manipulation: no memory allocated
>> m = m * 0.5;
>> memory
Maximum possible array:            1022 MB (1.072e+09 bytes)
Memory available for all arrays:   1214 MB (1.273e+09 bytes)
Memory used by MATLAB:              709 MB (7.434e+08 bytes)
Physical Memory (RAM):             3002 MB (3.148e+09 bytes)
% New variable allocated, taking an extra 122MB of memory
>> m2 = m * 0.5;
>> memory
Maximum possible array:            1022 MB (1.072e+09 bytes)
Memory available for all arrays:   1092 MB (1.145e+09 bytes)
Memory used by MATLAB:              831 MB (8.714e+08 bytes)
Physical Memory (RAM):             3002 MB (3.148e+09 bytes)
</pre>
<p>The extra memory allocation of the not-in-place manipulation naturally translates into a performance loss:</p>
<pre lang='matlab'>
% In-place data manipulation, no memory allocation
>> tic, m = m * 0.5; toc
Elapsed time is 0.056464 seconds.
% Regular data manipulation (122MB allocation) – 50% slower
>> clear m2; tic, m2 = m * 0.5; toc;
Elapsed time is 0.084770 seconds.
</pre>
<p>The difference may not seem large, but placed in a loop it could become significant indeed, and might be much more important if virtual memory swapping comes into play, or when Matlab&#8217;s memory space is exhausted (out-of-memory error).<br />
Similarly, when returning data from a function, we should try to update the original data variable whenever possible, <a target="_blank" rel="nofollow" href="http://www.mathworks.com/company/newsletters/news_notes/june07/patterns.html">avoiding the need for allocation</a> of a new variable:</p>
<pre lang='matlab'>
% In-place data manipulation, no memory allocation
>> d=0:1e-7:1; tic, d = sin(d); toc
Elapsed time is 0.083397 seconds.
% Regular data manipulation (76MB allocation) – 50% slower
>> clear d2, d=0:1e-7:1; tic, d2 = sin(d); toc
Elapsed time is 0.121415 seconds.
</pre>
<p>Within the function itself we should ensure that we return the modified input variable, and not assign the output to a new variable, so that in-place optimization can also be applied within the function. The in-place optimization mechanism is smart enough to override Matlab&#8217;s default copy-on-write mechanism, which automatically allocates a new copy of the data when it sees that the input data is modified:</p>
<pre lang='matlab'>
% Suggested practice: use in-place optimization within functions
function x = function1(x)
   x = someOperationOn(x);   % temporary variable x is NOT allocated
end
% Standard practice: prevents future use of in-place optimizations
function y = function2(x)
   y = someOperationOn(x);   % new temporary variable y is allocated
end
</pre>
<p>In order to benefit from in-place optimizations of function results, we must both use the same variable in the caller workspace (x = function1(x)) and also ensure that the called function is optimizable (e.g., function x = function1(x)). If either of these two requirements is not met, in-place function-call optimization is not performed.<br />
Also, for the in-place optimization to be active, we need to call the in-place function from within another function, not from a script or the Matlab Command Window.<br />
A related performance trick is to use masks on the original data rather than temporary data copies. For example, suppose we wish to get the result of a function that acts on only a portion of some large data. If we create a temporary variable that holds the data subset and then process it, it would create an unnecessary copy of the original data:</p>
<pre lang='matlab'>
% Original data
data = 0 : 1e-7 : 1;     % 10^7 elements, 76MB allocated
% Unnecessary copy of data into data2 (extra ~69MB allocated)
data2 = data(data>0.1);  % ~9x10^6 elements, ~69MB allocated
results = sin(data2);    % another ~9x10^6 elements, ~69MB allocated
% Use of data masks obviates the need for temporary variable data2:
results = sin(data(data>0.1));  % no need for the data2 allocation
</pre>
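<p>The two requirements for in-place function calls described above can be sketched as follows. This is only an illustration: <i>inplaceCallerDemo</i> and <i>halve</i> are hypothetical names, and whether the optimization is actually applied depends on the Matlab release. The code should be saved in a function file, since the call needs to be made from within a function:</p>
<pre lang='matlab'>
% Hypothetical caller: the call to halve() is made from within a function
function [x, y] = inplaceCallerDemo()
   x = ones(1, 1e6);
   x = halve(x);   % same variable on both sides: candidate for in-place update
   y = halve(x);   % different output variable: a new array must be allocated
end

% Input and output arguments share the same name, so the function is optimizable
function x = halve(x)
   x = x * 0.5;
end
</pre>
<p>Both calls return correct results (all elements of x are 0.5 and all elements of y are 0.25); only the first form leaves room for the in-place mechanism.</p>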
<p>A note of caution: we should not invest undue effort in using in-place data manipulation if the overall benefits would be negligible. It would only help if we have a real memory limitation issue and the data matrix is very large.<br />
Matlab in-place optimization is a topic of continuous development. Code which is not in-place optimized today (for example, in-place manipulation on class object properties) may possibly be optimized in next year&#8217;s release. For this reason, it is important to write the code in a way that would facilitate the future optimization (for example, obj.x=2*obj.x rather than y=2*obj.x).<br />
Some in-place optimizations were added to the JIT Accelerator as early as Matlab 6.5 R13, but Matlab 7.3 R2006b saw a major boost. As Matlab&#8217;s JIT Accelerator improves from release to release, we should expect in-place data manipulations to be automatically applied in an increasingly larger number of code cases.<br />
In some older Matlab releases, and in some complex data manipulations where the JIT Accelerator cannot implement in-place processing, temporary storage is allocated and then assigned to the original variable when the computation is done. To implement in-place data manipulations in such cases we could develop an external function (e.g., <a target="_blank" href="/articles/matlab-mex-in-place-editing/">using Mex</a>) that directly works on the original data block. Note that the officially supported mex update method is to always create deep copies of the data using <i>mxDuplicateArray()</i> and then modify the new array rather than the original; modifying the original data directly is both <a target="_blank" rel="nofollow" href="http://stackoverflow.com/questions/1708433/matlab-avoiding-memory-allocation-in-mex">discouraged</a> and <a target="_blank" rel="nofollow" href="http://blogs.mathworks.com/loren/2007/03/22/in-place-operations-on-data/#comment-16202">not officially supported</a>. Doing it incorrectly can easily crash Matlab. If you do directly overwrite the original input data, at least ensure that you <a target="_blank" rel="nofollow" href="http://www.mk.tu-berlin.de/Members/Benjamin/mex_sharedArrays">unshare any variables</a> that share the same data memory block, thus mimicking the copy-on-write mechanism.<br />
Using Matlab&#8217;s internal in-place data manipulation is very useful, especially since it is done automatically without need for any major code changes on our part. But sometimes we need certainty of actually processing the original data variable without having to guess or check whether the automated in-place mechanism will be activated or not. This can be achieved using several alternatives:</p>
<ul>
<li>Using a global or persistent variable</li>
<li>Using a parent-scope variable within a nested function</li>
<li>Modifying a reference (handle class) object&#8217;s internal properties</li>
</ul>
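<p>As a minimal sketch of the second alternative (the function and variable names here are hypothetical), a nested function can update a variable in its parent function&#8217;s workspace directly, without receiving it as an input argument or returning it as an output:</p>
<pre lang='matlab'>
function total = nestedUpdateDemo()
   % Parent-scope variable, shared with the nested function below
   data = zeros(1, 1e6);

   for k = 1 : 3
      addValue(k);      % updates data directly; nothing is passed or returned
   end
   total = sum(data);   % each element is now 1+2+3 = 6

   % Nested function: reads and writes the parent's data variable
   function addValue(v)
      data = data + v;
   end
end
</pre>
<p>Calling <i>nestedUpdateDemo()</i> returns 6000000. The sketch only shows the shared-variable mechanics; it does not by itself measure memory usage.</p>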
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations">Internal Matlab memory optimizations</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation" rel="bookmark" title="Matlab&#039;s internal memory representation">Matlab&#039;s internal memory representation </a> <small>Matlab's internal memory structure is explored and discussed. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/profiling-matlab-memory-usage" rel="bookmark" title="Profiling Matlab memory usage">Profiling Matlab memory usage </a> <small>mtic and mtoc were a couple of undocumented features that enabled users of past Matlab releases to easily profile memory usage. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-java-memory-leaks-performance" rel="bookmark" title="Matlab-Java memory leaks, performance">Matlab-Java memory leaks, performance </a> <small>Internal fields of Java objects may leak memory - this article explains how to avoid this without sacrificing performance. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/couple-of-bugs-and-workarounds" rel="bookmark" title="A couple of internal Matlab bugs and workarounds">A couple of internal Matlab bugs and workarounds </a> <small>A couple of undocumented Matlab bugs have simple workarounds. ...</small></li>
</ol>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations/feed</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
		<item>
		<title>Array resizing performance</title>
		<link>https://undocumentedmatlab.com/articles/array-resizing-performance?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=array-resizing-performance</link>
					<comments>https://undocumentedmatlab.com/articles/array-resizing-performance#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Wed, 23 May 2012 20:43:03 +0000</pubDate>
				<category><![CDATA[Low risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Stock Matlab function]]></category>
		<category><![CDATA[Undocumented feature]]></category>
		<category><![CDATA[JIT]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pure Matlab]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=2949</guid>

					<description><![CDATA[<p>Several alternatives are explored for dynamic array growth performance in Matlab loops. </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/array-resizing-performance">Array resizing performance</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/preallocation-performance" rel="bookmark" title="Preallocation performance">Preallocation performance </a> <small>Preallocation is a standard Matlab speedup technique. Still, it has several undocumented aspects. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/performance-accessing-handle-properties" rel="bookmark" title="Performance: accessing handle properties">Performance: accessing handle properties </a> <small>Handle object property access (get/set) performance can be significantly improved using dot-notation. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/convolution-performance" rel="bookmark" title="Convolution performance">Convolution performance </a> <small>Matlab's internal implementation of convolution can often be sped up significantly using the Convolution Theorem. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/zero-testing-performance" rel="bookmark" title="Zero-testing performance">Zero-testing performance </a> <small>Subtle changes in the way that we test for zero/non-zero entries in Matlab can have a significant performance impact. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p>As I <a target="_blank" href="/articles/preallocation-performance/">explained</a> last week, the best way to avoid the performance penalties associated with dynamic array resizing (typically, growth) in Matlab is to pre-allocate the array to its expected final size. I have shown different alternatives for such preallocation, but in all cases the performance is much better than using a naïve dynamic resize.<br />
Unfortunately, such simple preallocation is not always possible. Still, all is not lost: there are a few things we can do to mitigate the performance pain. As with last week&#8217;s article, there is much more here than meets the eye.<br />
The interesting <a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/102704">newsgroup thread from 2005</a> about this issue that I mentioned last week contains two main solutions to this problem. The effect of these solutions is negligible for small data sizes and/or few loop iterations (i.e., few memory reallocations), but could be dramatic for large data arrays and/or a large number of memory reallocations. It could well mean the difference between a usable and an unusable (&#8220;hang&#8221;) program:</p>
<h3 id="Chunks">Factor growth: dynamic allocation by chunks</h3>
<p>The idea is to dynamically grow the array by a certain percentage factor each time. When the array first needs to grow by a single element, we would in fact grow it by a larger chunk (say 40% of the current array size, for example by using the <i><b>repmat</b></i> function, or by concatenating a specified number of <i><b>zeros</b></i>, or by setting some way-forward index to 0), so that it would take the program some time before it needs to reallocate memory.<br />
This method needs only about log(N) reallocations. Even under the pessimistic assumption that each reallocation copies all N elements, the theoretical cost is at most N&middot;log(N), which is nearly linear in N for most practical purposes. It is similar to preallocation in the sense that we are preparing a chunk of memory for future array use in advance. You might say that this is on-the-fly preallocation.</p>
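<p>A minimal sketch of factor growth might look as follows. The 40% factor, the initial capacity and all variable names are illustrative assumptions, not taken from the newsgroup thread or from any specific utility:</p>
<pre lang='matlab'>
growthFactor = 1.4;      % assumed factor: grow by ~40% of the current size
data = zeros(1, 16);     % small initial capacity (arbitrary choice)
numUsed = 0;             % number of elements actually in use
numReallocs = 0;         % how many times the array had to be enlarged

for idx = 1 : 100000
   if numUsed == numel(data)
      % Out of room: enlarge by setting a way-forward index to 0
      newCapacity = ceil(numel(data) * growthFactor);
      data(newCapacity) = 0;
      numReallocs = numReallocs + 1;
   end
   numUsed = numUsed + 1;
   data(numUsed) = idx;
end

data = data(1:numUsed);  % trim the unused headroom when done
fprintf('%d elements stored using %d reallocations\n', numUsed, numReallocs);
</pre>
<p>The array is enlarged only a few dozen times for 100000 appended elements, instead of once per element. The price is the bookkeeping of <i>numUsed</i> and the final trimming step.</p>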
<h3 id="Cells">Using cell arrays</h3>
<p>The idea here is to use cell arrays to store and grow the data, then use cell2mat to convert the resulting cell array to a regular numeric array. Cell elements are <a target="_blank" rel="nofollow" href="http://www.mathworks.com/help/techdoc/matlab_prog/brh72ex-25.html#brh72ex-38">implemented as references</a> to distinct memory blocks, so concatenating an object to a cell array merely concatenates its reference; when a cell array is reallocated, only its internal references (not the referenced data) are moved. Note that this relies on the internal implementation of cell arrays in Matlab, and may possibly change in some future release.<br />
Like factor growth, using cell arrays is faster than quadratic behavior (although <a target="_blank" rel="nofollow" href="http://abandonmatlab.wordpress.com/2009/07/28/no-lists/#comment-113">not quite as fast</a> as we would have liked, of course). Different situations may favor using either the cell arrays method or the factor growth mechanism.</p>
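<p>The cell-array approach can be sketched as follows (variable names and chunk contents are hypothetical). Each iteration stores a new chunk in its own cell, and the numeric array is only assembled once, at the end:</p>
<pre lang='matlab'>
chunks = {};                       % cell array that will hold the data chunks
for idx = 1 : 1000
   chunks{idx} = idx * [1, 2, 3];  % each cell refers to its own 1x3 block
end
data = cell2mat(chunks);           % assemble a single 1x3000 numeric row vector
</pre>
<p>Here all chunks are row vectors of the same height, so <i><b>cell2mat</b></i> can concatenate them horizontally. Chunks of inconsistent dimensions would need to be reshaped before the conversion.</p>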
<h3 id="Growdata">The <i><b>growdata</b></i> utility</h3>
<p>John D&#8217;Errico has posted a well-researched utility called <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/8334-incremental-growth-of-an-array-revisited"><i><b>growdata</b></i></a> that optimizes dynamic array growth for maximal performance. It is based in part on ideas mentioned in the aforementioned 2005 newsgroup thread, where <i><b>growdata</b></i> is also discussed in detail.<br />
As an interesting side-note, John D&#8217;Errico also recently posted an extremely fast <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/34766-the-fibonacci-sequence">implementation</a> of the Fibonacci function. The source code may seem complex, but the resulting performance gain is well worth the extra complexity. I believe that readers who read this utility&#8217;s source code and understand its underlying logic will gain insight into several performance tricks that could be very useful in general.</p>
<h3 id="JIT">Effects of incremental JIT improvements</h3>
<p>The introduction of JIT Acceleration in Matlab 6.5 (R13) caused a dramatic boost in performance (there is an internal distinction between the Accelerator and JIT: JIT is <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/18510-matlab-performance-measurement/content/Documents/MATLABperformance/configinfo.m">apparently</a> only part of the Matlab Accelerator, but this distinction appears to have no practical impact on the discussion here).<br />
Over the years, MathWorks has consistently improved the efficiency of its computational engine and the JIT Accelerator in particular, giving a small improvement with each new Matlab release. In Matlab 7.11 (R2010b), the short Fibonacci snippet used in last week&#8217;s article executed about 30% faster compared to Matlab 7.1 R14SP3. The behavior was still quadratic in nature, and so in these releases, using any of the above-mentioned solutions could prove very beneficial.<br />
In Matlab 7.12 (R2011a), a major improvement was made to the Matlab engine (JIT?). The execution run-times improved significantly, and in addition became linear in nature. This means that multiplying the array size by N only multiplies the run-time by N, not N<sup>2</sup> – an impressive achievement:</p>
<pre lang='matlab'>
% This was run on Matlab 7.14 (R2012a):
clear f, tic, f=[0,1]; for idx=3:10000, f(idx)=f(idx-1)+f(idx-2); end, toc
   => Elapsed time is 0.004924 seconds.  % baseline loop size, & exec time
clear f, tic, f=[0,1]; for idx=3:20000, f(idx)=f(idx-1)+f(idx-2); end, toc
   => Elapsed time is 0.009971 seconds.  % x2 loop size, x2 execution time
clear f, tic, f=[0,1]; for idx=3:40000, f(idx)=f(idx-1)+f(idx-2); end, toc
   => Elapsed time is 0.019282 seconds.  % x4 loop size, x4 execution time
</pre>
<p>In fact, it turns out that using either the cell arrays method or the factor growth mechanism is much slower in R2011a than using the naïve dynamic growth!<br />
This teaches us a very important lesson: It is not wise to program against a specific implementation of the engine, at least not in the long run. While this may yield performance benefits on some Matlab releases, the situation may well be reversed on some future release. This might force us to retest, reprofile and potentially rewrite significant portions of code for each new release. Obviously this is not a maintainable solution. In practice, most code that is written on some old Matlab release would likely be carried over with minimal changes to the newer releases. If this code has release-specific tuning, we could be shooting ourselves in the foot in the long run.<br />
MathWorks strongly <a target="_blank" rel="nofollow" href="http://blogs.mathworks.com/loren/2008/06/25/speeding-up-matlab-applications/#comment-29607">advises</a> (and <a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/284759#784131">again</a>, and once <a target="_blank" href="/articles/undocumented-profiler-options/#comment-64">again</a>), and I concur, to program in a natural manner, rather than in a way that is tailored to a particular Matlab release (unless of course we can be certain that we shall only be using that release and none other). This will improve development time, maintainability and in the long run also performance.<br />
<i>(and of course you could say that a corollary lesson is to hurry up and get the latest Matlab release&#8230;)</i></p>
<h3 id="Variants">Variants for array growth</h3>
<p>If we are left with using a naïve dynamic resize, there are several equivalent alternatives for doing this, with significantly different performance:</p>
<pre lang='matlab'>
% This was run on Matlab 7.12 (R2011a):
% Variant #1: direct assignment into a specific out-of-bounds index
data=[]; tic, for idx=1:100000; data(idx)=1; end, toc
   => Elapsed time is 0.075440 seconds.
% Variant #2: direct assignment into an index just outside the bounds
data=[]; tic, for idx=1:100000; data(end+1)=1; end, toc
   => Elapsed time is 0.241466 seconds.    % 3 times slower
% Variant #3: concatenating a new value to the array
data=[]; tic, for idx=1:100000; data=[data,1]; end, toc
   => Elapsed time is 22.897688 seconds.   % 300 times slower!!!
</pre>
<p>As can be seen, it is much faster to directly index an out-of-bounds element as a means to force Matlab to enlarge a data array, rather than using the end+1 notation, which needs to recalculate the value of end each time.<br />
In any case, try to avoid using the concatenation variant, which is significantly slower than either of the other two alternatives (300 times slower in the above example!). In this respect, there is no discernible difference between using the [] operator or the <i><b>cat</b>()</i> function for the concatenation.<br />
Apparently, the JIT performance boost gained in Matlab R2011a does not work for concatenation. Future JIT improvements may possibly also improve the performance of concatenations, but in the meantime it is better to use direct indexing instead.<br />
The effect of the JIT performance boost is easily seen when we run the same variants on pre-R2011a Matlab releases. The corresponding values are 30.9, 34.8 and 34.3 seconds. Using direct indexing is still the fastest approach, but concatenation is now only 10% slower, not 300 times slower.<br />
When we need to append a non-scalar element (for example, a 2D matrix) to the end of an array, we might think that we have no choice but to use the slow concatenation method. This assumption is incorrect: we can still use the much faster direct-indexing method, as shown below (notice the non-linear growth in execution time for the concatenation variant):</p>
<pre lang='matlab'>
% This was run on Matlab 7.12 (R2011a):
matrix = magic(3);
% Variant #1: direct assignment – fast and linear cost
data=[]; tic, for idx=1:10000; data(:,(idx*3-2):(idx*3))=matrix; end, toc
   => Elapsed time is 0.969262 seconds.
data=[]; tic, for idx=1:100000; data(:,(idx*3-2):(idx*3))=matrix; end, toc
   => Elapsed time is 9.558555 seconds.
% Variant #2: concatenation – much slower, quadratic cost
data=[]; tic, for idx=1:10000; data=[data,matrix]; end, toc
   => Elapsed time is 2.666223 seconds.
data=[]; tic, for idx=1:100000; data=[data,matrix]; end, toc
   => Elapsed time is 356.567582 seconds.
</pre>
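<p>Since the two variants are meant to be interchangeable, a quick sanity check can confirm that they build the same array before we swap one for the other. This is only a sketch, using a smaller loop count and hypothetical variable names:</p>
<pre lang='matlab'>
matrix = magic(3);
numBlocks = 100;

% Variant #1: direct assignment into the next 3 out-of-bounds columns
dataIndexed = [];
for idx = 1 : numBlocks
   dataIndexed(:,(idx*3-2):(idx*3)) = matrix;
end

% Variant #2: concatenation
dataConcat = [];
for idx = 1 : numBlocks
   dataConcat = [dataConcat, matrix];
end

% Both variants should yield the same 3x300 array
sameResult = isequal(dataIndexed, dataConcat);
</pre>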
<p>As the size of the array enlargement element (in this case, a 3&#215;3 matrix) increases, the computer needs to allocate more memory space more frequently, thereby increasing execution time and the importance of preallocation. Even if the system has an internal memory-management mechanism that enables it to expand into adjacent (contiguous) empty memory space, as the size of the enlargement grows the empty space will run out sooner and a new larger memory block will need to be allocated more frequently than in the case of small incremental enlargements of a single 8-byte double.</p>
<h3 id="Alternatives">Other alternatives</h3>
<p>If preallocation is not possible, JIT is not very helpful, vectorization is out of the question, and rewriting the problem so that it doesn&#8217;t need dynamic array growth is impossible &#8211; in short, if none of these is an option, then consider using one of the following alternatives for array growth (read again the interesting <a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/102704">newsgroup thread from 2005</a> about this issue):</p>
<ul>
<li>Dynamically grow the array by a certain percentage factor each time the array runs out of space (on-the-fly preallocation)</li>
<li>Use John D&#8217;Errico&#8217;s <i><b>growdata</b></i> utility</li>
<li>Use cell arrays to store and grow the data, then use cell2mat to convert the resulting cell array to a regular numeric array</li>
<li>Reuse an existing data array that has the necessary storage space</li>
<li>Wrap the data in a referential object (a class object that inherits from handle), then append the reference handle rather than the original data (<a target="_blank" rel="nofollow" href="http://stackoverflow.com/questions/276198/matlab-class-array">ref</a>). Note that if your class object does not inherit from handle, it is not a referential object but rather a value object, and as such it will be appended in its entirety to the array data, losing any performance benefits. Of course, it may not always be possible to wrap our class objects as a handle.<br />
References have a much smaller memory footprint than the objects that they reference. The objects themselves will remain somewhere in memory and will not need to be moved whenever the data array is enlarged and reallocated – only the small-footprint reference will be moved, which is much faster. This is also the reason that cell concatenation is faster than array concatenation for large objects.</li>
</ul>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/array-resizing-performance/feed</wfw:commentRss>
			<slash:comments>7</slash:comments>
		
		
			</item>
		<item>
		<title>Preallocation performance</title>
		<link>https://undocumentedmatlab.com/articles/preallocation-performance?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=preallocation-performance</link>
					<comments>https://undocumentedmatlab.com/articles/preallocation-performance#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Wed, 16 May 2012 12:14:46 +0000</pubDate>
				<category><![CDATA[Low risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Stock Matlab function]]></category>
		<category><![CDATA[Undocumented feature]]></category>
		<category><![CDATA[JIT]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Pure Matlab]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=2940</guid>

					<description><![CDATA[<p>Preallocation is a standard Matlab speedup technique. Still, it has several undocumented aspects. </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/preallocation-performance">Preallocation performance</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/zero-testing-performance" rel="bookmark" title="Zero-testing performance">Zero-testing performance </a> <small>Subtle changes in the way that we test for zero/non-zero entries in Matlab can have a significant performance impact. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/performance-scatter-vs-line" rel="bookmark" title="Performance: scatter vs. line">Performance: scatter vs. line </a> <small>In many circumstances, the line function can generate visually-identical plots as the scatter function, much faster...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/array-resizing-performance" rel="bookmark" title="Array resizing performance">Array resizing performance </a> <small>Several alternatives are explored for dynamic array growth performance in Matlab loops. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/allocation-performance-take-2" rel="bookmark" title="Allocation performance take 2">Allocation performance take 2 </a> <small>The clear function has some non-trivial effects on Matlab performance. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p>Array <a target="_blank" rel="nofollow" href="http://www.mathworks.com/help/techdoc/matlab_prog/f8-784135.html#f8-793781">preallocation</a> is a standard and quite well-known technique for improving Matlab loop run-time performance. Today&#8217;s article will show that there is more than meets the eye for even such a simple coding technique.<br />
A note of caution: in the examples that follow, don&#8217;t take any speedup as an expected actual value &#8211; the actual value may well be different on your system. Your mileage may vary. I only mean to display the relative differences between different alternatives.</p>
<h3 id="problem">The underlying problem</h3>
<p>Memory management has a direct influence on performance. I have already shown <a target="_blank" href="/articles/matlab-java-memory-leaks-performance/">some examples of this</a> in past articles here.<br />
Preallocation solves a basic problem in simple program loops, where an array is iteratively enlarged with new data (dynamic array growth). Unlike other programming languages (such as C, C++, C# or Java) that use static typing, Matlab uses dynamic typing. This means that it is natural and easy to modify array size dynamically during program execution. For example:</p>
<pre lang='matlab'>
fibonacci = [0, 1];
for idx = 3 : 100
   fibonacci(idx) = fibonacci(idx-1) + fibonacci(idx-2);
end
</pre>
<p>While this may be simple to program, it is not wise with regard to performance. The reason is that whenever an array is resized (typically enlarged), Matlab allocates an entirely new contiguous block of memory for the array, copying the old values from the previous block to the new, then releasing the old block for potential reuse. This operation takes time to execute. In some cases, this reallocation might require accessing virtual memory and page swaps, which would take an even longer time to execute. If the operation is done in a loop, then performance could quickly drop off a cliff.<br />
The cost of such naïve array growth is theoretically quadratic. This means that multiplying the number of elements by N multiplies the execution time by about N<sup>2</sup>. The reason for this is that Matlab needs to reallocate N times more than before, and each time takes N times longer due to the larger allocation size (the average block size multiplies by N), and N times more data elements to copy from the old to the new memory blocks.<br />
A very interesting discussion of this phenomenon and various solutions can be found in a <a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/102704">newsgroup thread from 2005</a>. Three main solutions were presented: preallocation, selective dynamic growth (<i>allocating headroom</i>) and using cell arrays. The best solution among these in terms of ease of use and performance is preallocation.</p>
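<p>The quadratic cost can be illustrated with a back-of-the-envelope count. The sketch below assumes the simplified model described above, in which every single-element growth reallocates the array and copies all of its existing elements; actual Matlab releases may behave differently, and the variable names are hypothetical:</p>
<pre lang='matlab'>
sizes = [10000, 20000, 40000];     % final array sizes to compare
numCopied = zeros(size(sizes));
for i = 1 : numel(sizes)
   n = sizes(i);
   % Growing the array to k elements copies the k-1 existing elements
   numCopied(i) = sum(0 : n-1);    % equals n*(n-1)/2
   fprintf('n = %5d  =>  %d element copies\n', n, numCopied(i));
end
</pre>
<p>Under this model the three sizes require 49995000, 199990000 and 799980000 element copies respectively, so doubling the array size roughly quadruples the copying work.</p>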
<h3 id="basics">The basics of pre-allocation</h3>
<p>The basic idea of preallocation is to create a data array in the final expected size before actually starting the processing loop. This saves any reallocations within the loop, since all the data array elements are already available and can be accessed. This solution is useful when the final size is known in advance, as the following snippet illustrates:</p>
<pre lang='matlab'>
% Regular dynamic array growth:
tic
fibonacci = [0,1];
for idx = 3 : 40000
   fibonacci(idx) = fibonacci(idx-1) + fibonacci(idx-2);
end
toc
   => Elapsed time is 0.019954 seconds.
% Now use preallocation – 5 times faster than dynamic array growth:
tic
fibonacci = zeros(40000,1);
fibonacci(1)=0; fibonacci(2)=1;
for idx = 3 : 40000,
   fibonacci(idx) = fibonacci(idx-1) + fibonacci(idx-2);
end
toc
   => Elapsed time is 0.004132 seconds.
</pre>
<p>On pre-R2011a releases the effect of preallocation is even more pronounced: I got a 35-times speedup on the same machine using Matlab 7.1 (R14 SP3). R2011a (Matlab 7.12) had a dramatic performance boost for such cases in the internal accelerator, so newer releases are much faster in dynamic allocations, but preallocation is still 5 times faster even on R2011a.</p>
<h3 id="nondeterministic">Non-deterministic pre-allocation</h3>
<p>Because the effect of preallocation is so dramatic on all Matlab releases, it makes sense to utilize it even in cases where the data array&#8217;s final size is not known in advance. We can do this by estimating an upper bound for the array&#8217;s size, preallocating an array of that size, and removing any excess elements when we&#8217;re done:</p>
<pre lang='matlab'>
% The final array size is unknown – assume 1Kx3K upper bound (~23MB)
data = zeros(1000,3000);  % estimated maximal size
numRows = 0;
numCols = 0;
while (someCondition)
   colIdx = someValue1;   numCols = max(numCols,colIdx);
   rowIdx = someValue2;   numRows = max(numRows,rowIdx);
   data(rowIdx,colIdx) = someOtherValue;
end
% Now remove any excess elements
data(:,numCols+1:end) = [];   % remove excess columns
data(numRows+1:end,:) = [];   % remove excess rows
</pre>
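<p>The snippet above uses placeholders for the condition and the values. Here is a self-contained sketch of the same idea for a simple vector, using made-up data: random samples are collected until one of them exceeds an arbitrary threshold, with an assumed upper bound of 10000 elements (both numbers are illustrative only):</p>
<pre lang='matlab'>
maxLen = 10000;              % assumed upper bound (hypothetical)
data = zeros(maxLen,1);      % preallocate to the upper bound
numUsed = 0;
while numUsed < maxLen       % never write beyond the preallocated size
   value = rand;
   if value > 0.999          % arbitrary stop condition, for illustration only
      break
   end
   numUsed = numUsed + 1;
   data(numUsed) = value;
end
data(numUsed+1:end) = [];    % remove the unused tail
</pre>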
<h3 id="variants">Variants for pre-allocation</h3>
<p>It turns out that MathWorks&#8217; <a target="_blank" rel="nofollow" href="http://www.mathworks.com/help/techdoc/matlab_prog/f8-784135.html#f8-793795">official suggestion</a> for preallocation, namely using the <i><b>zeros</b></i> function, is not the most efficient:</p>
<pre lang='matlab'>
% MathWorks suggested variant
clear data1, tic, data1 = zeros(1000,3000); toc
   => Elapsed time is 0.016907 seconds.
% A much faster alternative - 500 times faster!
clear data1, tic, data1(1000,3000) = 0; toc
   => Elapsed time is 0.000034 seconds.
</pre>
<p>The second variant is so much faster because it only allocates the memory, without worrying about the internal values (they get a default of 0, <i>false</i> or <code>''</code>, in case you wondered). On the other hand, <i><b>zeros</b></i> has to place a value in each of the allocated locations, which takes precious time.<br />
In most cases the differences are immaterial since the preallocation code would only run once in the program, and an extra 17ms isn&#8217;t such a big deal. But in some cases we may have a need to periodically refresh our data, where the extra run-time could quickly accumulate.<br />
<b><u>Update (October 27, 2015)</u></b>: As Marshall <a href="/articles/preallocation-performance#comment-359956">notes below</a>, this behavior changed in R2015b when <a target="_blank" href="/articles/callback-functions-performance">the new LXE</a> (Matlab&#8217;s new execution engine) replaced the previous engine. In R2015b, the <i><b>zeros</b></i> function is faster than the alternative of just setting the last array element to 0. Similar changes may also have occurred to the following post content, so if you are using R2015b onward, be sure to test carefully on your specific system.</p>
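<p>Since the relative speed of the two variants depends on the Matlab release, it is worth measuring them on your own system. The following is a minimal sketch using <i><b>timeit</b></i> (R2013b and later); <i>compareAlloc</i>, <i>allocZeros</i> and <i>allocImplicit</i> are hypothetical names used only for this illustration, and no particular result is implied:</p>
<pre lang='matlab'>
function compareAlloc
   % Hypothetical helper: times both allocation variants on this system
   tZeros    = timeit(@allocZeros);
   tImplicit = timeit(@allocImplicit);
   fprintf('zeros: %.6f s, implicit: %.6f s\n', tZeros, tImplicit);
end
function data = allocZeros
   data = zeros(1000,3000);   % MathWorks suggested variant
end
function data = allocImplicit
   data(1000,3000) = 0;       % data does not exist yet, so this allocates it
end
</pre>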
<h3 id="non-default">Pre-allocating non-default values</h3>
<p>When we need to preallocate a specific value into every data array element, we cannot use the second variant above (assigning only the last array element). The reason is that this variant only sets the very last data element, and all other array elements get assigned the default value (0, <code>''</code> or <i>false</i>, depending on the array&#8217;s data type). In this case, we can use one of the following alternatives (with their associated timings for a 1000&#215;3000 data array):</p>
<pre lang='matlab'>
scalar = pi;  % for example...
data = scalar(ones(1000,3000));           % Variant A: 87.680 msecs
data(1:1000,1:3000) = scalar;             % Variant B: 28.646 msecs
data = repmat(scalar,1000,3000);          % Variant C: 17.250 msecs
data = scalar + zeros(1000,3000);         % Variant D: 17.168 msecs
data(1000,3000) = 0; data = data+scalar;  % Variant E: 16.334 msecs
</pre>
<p>As can be seen, Variants C-E are roughly 1.7 times as fast as Variant B, and about 5 times as fast as Variant A.</p>
<h3 id="non-double">Pre-allocating non-double data</h3>
<p>
When preallocating an array of a type that is not <i><b>double</b></i>, we should be careful to create it using the desired type, to prevent memory and/or performance inefficiencies. For example, if we need to process a large array of small integers (<i><b>int8</b></i>), it would be inefficient to preallocate an array of doubles and type-convert to/from int8 within every loop iteration. Similarly, it would be inefficient to preallocate the array as a double type and then convert it to int8. Instead, we should create the array as an int8 array in the first place:</p>
<pre lang='matlab'>
% Bad idea: allocates 8MB double array, then converts to 1MB int8 array
data = int8(zeros(1000,1000));   % 1M elements
   => Elapsed time is 0.008170 seconds.
% Better: directly allocate the array as a 1MB int8 array – x80 faster
data = zeros(1000,1000,'int8');
   => Elapsed time is 0.000095 seconds.
</pre>
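<p>The memory figures quoted in the comments above can be checked with <i><b>whos</b></i>. This is a small sketch; the variable names are arbitrary:</p>
<pre lang='matlab'>
a = zeros(1000,1000);           % default class: double, 8 bytes per element
b = zeros(1000,1000,'int8');    % int8, 1 byte per element
infoA = whos('a');
infoB = whos('b');
fprintf('%s: %d bytes\n', infoA.class, infoA.bytes);   % double: 8000000 bytes
fprintf('%s: %d bytes\n', infoB.class, infoB.bytes);   % int8: 1000000 bytes
</pre>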
<h3 id="cells">Pre-allocating cell arrays</h3>
<p>To preallocate a cell array we can use the <i><b>cell</b></i> function (explicit preallocation), or assign to the maximal cell index (implicit preallocation). The two are functionally equivalent, but explicit preallocation is faster (note that this is contrary to the experience with allocation of numeric arrays and other arrays):</p>
<pre lang='matlab'>
% Variant #1: Explicit preallocation of a 1Kx3K cell array
data = cell(1000,3000);
   => Elapsed time is 0.004637 seconds.
% Variant #2: Implicit preallocation – x3 slower than explicit
clear('data'), data{1000,3000} = [];
   => Elapsed time is 0.012873 seconds.
</pre>
<h3 id="structs">Pre-allocating arrays of structs</h3>
<p>To preallocate an array of structs or class objects, we can use the <i><b>repmat</b></i> function to replicate copies of a single data element (explicit preallocation), or just use the maximal data index (implicit preallocation). In this case, unlike the case of cell arrays, implicit preallocation is much faster than explicit preallocation, since the single element does not actually need to be copied multiple times (<a target="_blank" rel="nofollow" href="http://www.mathworks.com/support/solutions/en/data/1-7S1YKO/">ref</a>):</p>
<pre lang='matlab'>
% Variant #1: Explicit preallocation of a 100x300 struct array
element = struct('field1',magic(2), 'field2',{[]});
data = repmat(element, 100, 300);
   => Elapsed time is 0.002804 seconds.
% Variant #2: Implicit preallocation – x7 faster than explicit
element = struct('field1',magic(2), 'field2',{[]});
clear('data'), data(100,300) = element;
   => Elapsed time is 0.000429 seconds.
</pre>
<p>When preallocating structs, we can also use a third variant, using the built-in struct feature of replicating the struct when the <i><b>struct</b></i> function is passed a cell array. For example, <code>struct('field1',cell(100,1), 'field2',5)</code> will create 100 structs, each of them having the empty field <i>field1</i> and another field called <i>field2</i> with value 5. Unfortunately, this variant is slower than both of the previous variants.</p>
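<p>A short sketch of this third variant, showing the size of the result and the contents of an arbitrarily chosen element:</p>
<pre lang='matlab'>
data = struct('field1',cell(100,1), 'field2',5);
size(data)                  % 100 1
isempty(data(37).field1)    % true: field1 is empty in every element
data(37).field2             % 5: the scalar value is replicated to all elements
</pre>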
<h3 id="objects">Pre-allocating class objects</h3>
<p>When preallocating in general, ensure that you are using the maximal expected array size. There is no point in preallocating an empty array or an array having a smaller size than the expected maximum, since dynamic memory reallocation will automatically kick in within the processing loop. For this reason, <a target="_blank" rel="nofollow" href="http://stackoverflow.com/questions/2510427/how-to-preallocate-an-array-of-class-in-matlab">do not use</a> the <i>empty()</i> method of class objects to preallocate, but rather <i><b>repmat</b></i> as explained above.<br />
When using <i><b>repmat</b></i> to replicate class objects, always be careful to note whether you are replicating the object itself (this happens if your class does NOT derive from <i><b>handle</b></i>) or its reference handle (which happens if you derive the class from <i><b>handle</b></i>). If you are replicating objects, then you can safely edit any of their properties independently of each other; but if you replicate references, you are merely using multiple copies of the same reference, so that modifying referenced object #1 will also automatically affect all the other referenced objects. This may or may not be suitable for your particular program requirements, so check which behavior you actually need. If you actually need to use independent object copies, you will <a target="_blank" rel="nofollow" href="http://stackoverflow.com/questions/591495/matlab-preallocate-a-non-numeric-vector#591788">need to call</a> the class constructor multiple times, once for each new independent object.</p>
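<p>The difference between replicated references and independent objects can be sketched as follows. <i>Counter</i> is a hypothetical handle class made up for this illustration; the class definition must be saved in its own file (<i>Counter.m</i>), and the remaining lines run from a script or the command prompt:</p>
<pre lang='matlab'>
% --- file Counter.m (hypothetical class, for illustration only) ---
classdef Counter < handle
   properties
      Value = 0;
   end
end

% --- script or command prompt ---
% repmat copies the handle, so all three elements refer to ONE object:
shared = repmat(Counter, 1, 3);
shared(1).Value = 42;
disp(shared(3).Value)         % displays 42

% Calling the constructor once per element yields independent objects:
for k = 3 : -1 : 1            % backward loop: the array gets its full size up front
   independent(k) = Counter;
end
independent(1).Value = 42;
disp(independent(3).Value)    % displays 0
</pre>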
<p />
Next week: what if we can&#8217;t avoid dynamic array resizing? &#8211; apparently, all is not lost. Stay tuned&#8230;<br />
<i><br />
Do you have any similar allocation-related tricks you&#8217;re using? or unexpected differences such as the ones shown above? If so, then please do <a href="/articles/preallocation-performance/#respond">post a comment</a>.<br />
</i> </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/preallocation-performance">Preallocation performance</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/preallocation-performance/feed</wfw:commentRss>
			<slash:comments>30</slash:comments>
		
		
			</item>
		<item>
		<title>Matlab&#039;s internal memory representation</title>
		<link>https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=matlabs-internal-memory-representation</link>
					<comments>https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Thu, 15 Mar 2012 18:11:23 +0000</pubDate>
				<category><![CDATA[Guest bloggers]]></category>
		<category><![CDATA[High risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Mex]]></category>
		<category><![CDATA[Undocumented feature]]></category>
		<category><![CDATA[Peter Li]]></category>
		<category><![CDATA[Pure Matlab]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=2798</guid>

					<description><![CDATA[<p>Matlab's internal memory structure is explored and discussed. </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation">Matlab&#039;s internal memory representation</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><i>Once again I&#8217;d like to welcome guest blogger <a target="_blank" rel="nofollow" href="http://absurdlycertain.blogspot.com/">Peter Li</a>. Peter wrote about <a target="_blank" href="/articles/matlab-mex-in-place-editing/">Matlab Mex in-place editing</a> last month. Today, Peter pokes around in Matlab&#8217;s internal memory representation to the greater good and glory of Matlab Mex programming.</i><br />
<b><i>Disclaimer: The information in this article is provided for informational purposes only.  Be aware that poking into Matlab&#8217;s internals is not condoned or supported by MathWorks, and is not recommended for any regular usage.  Poking into memory has the potential to crash your computer so save your data!  Moreover, be advised (as the text below will show) that the information is highly prone to change without any advance notice in future Matlab releases, which could lead to very adverse effects on any program that relies on it. On the scale of undocumented Matlab topics, this practically breaks the scale, so be EXTREMELY careful when using this.</i></b><br />
A few weeks ago I <a target="_blank" href="/articles/matlab-mex-in-place-editing/">discussed</a> Matlab&#8217;s copy-on-write mechanism as part of my discussion of editing Matlab arrays in-place.  Today I want to explore some behind-the-scenes details of how the copy-on-write mechanism is implemented.  In the process we will learn a little about Matlab&#8217;s internal array representation.  I will also introduce some simple tools you can use to explore more of Matlab&#8217;s internals.  I will only cover basic information, so there are plenty more details left to be filled in by others who are interested.</p>
<h3 id="Copy-on-write">Brief review of copy-on-write and mxArray</h3>
<p>Copy-on-write is Matlab&#8217;s mechanism for avoiding unnecessary duplication of data in memory.  To implement this, Matlab needs to keep track internally of which sets of variables are copies of each other.  As described in <a target="_blank" rel="nofollow" href="http://www.mathworks.com/help/techdoc/matlab_external/f21585.html">MathWorks&#8217;s article</a>, &#8220;<i>the Matlab language works with a single object type: the Matlab array. All Matlab variables (including scalars, vectors, matrices, strings, cell arrays, structures, and objects) are stored as Matlab arrays. In C/C++, the Matlab array is declared to be of type <a target="_blank" rel="nofollow" href="http://www.mathworks.com/help/techdoc/apiref/mxarray.html"><code>mxArray</code></a></i>&#8221;. This means that <code>mxArray</code> defines how Matlab lays out all the information about an array (its Matlab data type, its size, its data, etc.) in memory.  So understanding Matlab&#8217;s internal array representation basically boils down to understanding <code>mxArray</code>.<br />
Unfortunately, MathWorks also tells us that &#8220;<i><code>mxArray</code> is a C language <a target='_blank' rel='nofollow' href='http://en.wikipedia.org/wiki/Opaque_pointer'>opaque type</a></i>&#8221;. This means that MathWorks does not expose the organization of <code>mxArray</code> to users (i.e. Matlab or Mex programmers).  Instead, MathWorks defines <code>mxArray</code> internally, and allows users to interact with it only through an API, a set of functions that know how to handle <code>mxArray</code> in their back end.  So, for example, a Mex programmer does not get the dimensions of an <code>mxArray</code> by directly accessing the relevant field in memory.  Instead, the Mex programmer only has a pointer to the <code>mxArray</code>, and passes this pointer into an API function that knows where in memory to find the requested information and then passes the result back to the programmer.<br />
This is generally a good thing: the API provides an abstraction layer between the programmer and the memory structures so that if MathWorks needs to change the back end organization (to add a new feature for example), we programmers do not need to modify our code; instead MathWorks just updates the API to reflect the new internal organization.  On the other hand, being able to look into the internal structure of <code>mxArray</code> on occasion can help us understand how Matlab works, and can help us write more efficient code if we are careful as in the example of editing arrays in-place.<br />
So how do we get a glimpse inside <code>mxArray</code>?  The first step is simply to find the region of memory where the <code>mxArray</code> lives: its beginning and end.  Finding where in memory the <code>mxArray</code> begins is pretty easy: it is given by its pointer value.  Here is a simple Mex function that takes a Matlab array as input and prints its memory address:</p>
<pre lang='c'>
/* printaddr.cpp */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   if (nrhs < 1) mexErrMsgTxt("One input required.");
   printf("%p\n", prhs[0]);
}
</pre>
<p>This function is nice as it prints the address in a standard hexadecimal format.  The same information can also be received directly in Matlab (i.e., without needing <i><b>printaddr</b></i>), using the undocumented <a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/15485#34519"><i><b>format debug</b></i> command</a> (here's <a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/15988">another reference</a>):</p>
<pre lang='matlab'>
>> format debug
>> A = 1:10
A =
Structure address = 7fc3b8869ae0
m = 1
n = 10
pr = 7fc44922c890
pi = 0
     1     2     3     4     5     6     7     8     9    10
>> printaddr(A)
7fc3b8869ae0
</pre>
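<p>On releases where the undocumented <i><b>format debug</b></i> command still works, it also offers a quick way to watch copy-on-write in action. The following is only a sketch; no output is shown since the addresses differ on every system, and the display format itself may change between releases:</p>
<pre lang='matlab'>
format debug
A = 1:10     % note the pr value (the address of the data)
B = A        % a copy that has not been modified: pr is expected to match that of A
B(1) = 0     % the first write forces a real copy: pr is expected to change
</pre>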
<p>To play with this further from within Matlab however, it&#8217;s nice to have the address returned to us as a 64-bit unsigned integer; here&#8217;s a Mex function that does that:</p>
<pre lang='c'>
/* getaddr.cpp */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   if (nrhs < 1) mexErrMsgTxt("One input required.");
   plhs[0] = mxCreateNumericMatrix(1, 1, mxUINT64_CLASS, mxREAL);
   /* uint64_T (from the Mex headers) is always 64 bits wide, whereas
      unsigned long is only 32 bits on 64-bit Windows */
   uint64_T *out = static_cast<uint64_T *>(mxGetData(plhs[0]));
   out[0] = reinterpret_cast<size_t>(prhs[0]);  /* pointer value as an integer */
}
</pre>
<p>Here&#8217;s <i><b>getaddr</b></i> in action:</p>
<pre lang='matlab'>
>> getaddr(A)
ans =
           139870853618400
% And using pure Matlab:
>> hex2dec('7f36388b5ae0')  % output of printaddr or format debug
ans =
           139870853618400
</pre>
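<p>Going in the other direction, the numeric address can be converted back to hexadecimal for comparison with the output of <i><b>printaddr</b></i> or <i><b>format debug</b></i>. This is a small sketch that assumes <i><b>getaddr</b></i> has been compiled as shown above:</p>
<pre lang='matlab'>
A = 1:10;
addr = getaddr(A);
% User-space addresses are well below 2^53, so the conversion to double is exact
addrHex = lower(dec2hex(double(addr)))   % compare with printaddr(A)
</pre>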
<p>So now we know where to find our array in memory.  With this information we can already learn a lot.  To make our exploration a little cleaner though, it would be nice to know where the array ends in memory too, in other words we would like to know the size of the <code>mxArray</code>.</p>
<h3 id="Structure">Finding the structure of mxArray</h3>
<p>The first thing to understand is that the amount of memory taken by an <code>mxArray</code> does not have anything to do with the dimensions of the array in Matlab.  So a 1&#215;1 Matlab array and a 100&#215;100 Matlab array have the same size <code>mxArray</code> representation in memory.  As you will know if you have experience programming in Mex, this is simply because the Matlab array&#8217;s data contents are not stored directly within <code>mxArray</code>.  Instead, <code>mxArray</code> only stores a pointer to another memory location where the actual data reside.  This is fine; the internal information we want to poke into is all still in <code>mxArray</code>, and it is easy to get the pointer to the array&#8217;s data contents using the API functions <i>mxGetData</i> or <i>mxGetPr</i>.<br />
So we are still left with trying to figure out the size of <code>mxArray</code>.  There are a couple of paths forward.  First I want to talk about a historical tool that used to make a lot of this internal information easily available.  This was a function called <i>headerdump</i>, by Peter Boettcher (described <a target="_blank" rel="nofollow" href="http://www.mit.edu/~pwb/matlab/">here</a> and <a target="_blank" rel="nofollow" href="http://groups.google.com/group/comp.soft-sys.matlab/browse_thread/thread/c241d8821fb90275">here</a>).  <i>headerdump</i> was created for exactly the goal we are currently working towards: to understand Matlab&#8217;s copy-on-write mechanism.  Unfortunately, as Matlab has evolved, newer versions have incrementally broken this useful tool.  So our goal here is to create a replacement.  Still, we can learn a lot from the earlier work.<br />
One of the things that helped people figure out Matlab&#8217;s internals in the past is that in older versions of Matlab <code>mxArray</code> is not a completely opaque type.  Even in recent versions up through at least R2010a, if you look into $MATLAB/extern/include/matrix.h you can find a definition of <code>mxArray_tag</code> that looks something like this:</p>
<pre lang='c'>
/* R2010a */
struct mxArray_tag {
   void  *reserved;
   int    reserved1[2];
   void  *reserved2;
   size_t  number_of_dims;
   unsigned int reserved3;
   struct {
       unsigned int  flag0  : 1;
       unsigned int  flag1  : 1;
       unsigned int  flag2  : 1;
       unsigned int  flag3  : 1;
       unsigned int  flag4  : 1;
       unsigned int  flag5  : 1;
       unsigned int  flag6  : 1;
       unsigned int  flag7  : 1;
       unsigned int  flag7a : 1;
       unsigned int  flag8  : 1;
       unsigned int  flag9  : 1;
       unsigned int  flag10 : 1;
       unsigned int  flag11 : 4;
       unsigned int  flag12 : 8;
       unsigned int  flag13 : 8;
   }   flags;
   size_t reserved4[2];
   union {
       struct {
           void  *pdata;
           void  *pimag_data;
           void  *reserved5;
           size_t reserved6[3];
       }   number_array;
   }   data;
};
</pre>
<p>This is what you could call murky or obfuscated, but not completely opaque.  The fields mostly have unhelpful names like &#8220;reserved&#8221;, but on the other hand we at least have a sense for what fields there are and their layout.<br />
A more informative (yet unofficial) definition was <a target="_blank" rel="nofollow" href="http://groups.google.com/group/comp.soft-sys.matlab/browse_thread/thread/b8dbd91953c494fb">provided</a> by James Tursa and Peter Boettcher:</p>
<pre lang='c'>
#include "mex.h"
/* Definition of structure mxArray_tag for debugging purposes. Might not be fully correct
 * for Matlab 2006b or 2007a, but the important things are. Thanks to Peter Boettcher.
 */
struct mxArray_tag {
  const char *name;
  mxClassID class_id;
  int vartype;
  mxArray    *crosslink;
  int      number_of_dims;
  int      refcount;
  struct {
    unsigned int    scalar_flag : 1;
    unsigned int    flag1 : 1;
    unsigned int    flag2 : 1;
    unsigned int    flag3 : 1;
    unsigned int    flag4 : 1;
    unsigned int    flag5 : 1;
    unsigned int    flag6 : 1;
    unsigned int    flag7 : 1;
    unsigned int    private_data_flag : 1;
    unsigned int    flag8 : 1;
    unsigned int    flag9 : 1;
    unsigned int    flag10 : 1;
    unsigned int    flag11 : 4;
    unsigned int    flag12 : 8;
    unsigned int    flag13 : 8;
  }   flags;
  int  rowdim;
  int  coldim;
  union {
    struct {
      double  *pdata;       // original: void*
      double  *pimag_data;  // original: void*
      void    *irptr;
      void    *jcptr;
      int     nelements;
      int     nfields;
    }   number_array;
    struct {
      mxArray **pdata;
      char    *field_names;
      void    *dummy1;
      void    *dummy2;
      int     dummy3;
      int     nfields;
    }   struct_array;
    struct {
      void  *pdata;  /*mxGetInfo*/
      char  *field_names;
      char  *name;
      int   checksum;
      int   nelements;
      int   reserved;
    }  object_array;
  }   data;
};
</pre>
<p>For comparison, here is another definition from an earlier version of Matlab.</p>
<pre lang='c'>
/* R11 aka Matlab 5.3 (1999) */
struct mxArray_tag {
  char name[mxMAXNAM];
  int  class_id;
  int  vartype;
  mxArray *crosslink;
  int  number_of_dims;
  int  nelements_allocated;
  int  dataflags;
  int  rowdim;
  int  coldim;
  union {
    struct {
      void *pdata;
      void *pimag_data;
      void *irptr;
      void *jcptr;
      int   reserved;
      int   nfields;
    }   number_array;
  }   data;
};
</pre>
<p>I took this R11 definition from the source code to <i>headerdump</i> (specifically, from <i>mxinternals.h</i>, which also has <code>mxArray_tag</code> definitions for R12 (Matlab 6.0) and R13 (Matlab 6.5)), and you can see that it is much more informative, because many fields have been given useful names thanks to the work of Peter Boetcher and others.  Note also that the definition from this old version of Matlab is quite different from the version from R2010a.<br />
At this point, if you are running a much earlier version of Matlab like R11 or R13, you can break off from the current article and start playing around with <i>headerdump</i> directly to try to understand Matlab&#8217;s internals.  For more recent versions of Matlab, we have more work to do.  Getting back to our original goal, if we take the <code>mxArray_tag</code> definition from R2010a and run <i>sizeof</i>, we get an answer for the amount of memory taken up by an <code>mxArray</code> in R2010a: <b>104 bytes</b>.</p>
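<p>The 104-byte figure can be cross-checked by adding up the fields of the R2010a definition by hand. The sketch below assumes a typical 64-bit system (8-byte pointers and <i>size_t</i>, 4-byte <i>int</i>), on which all fields are naturally aligned and no padding is needed:</p>
<pre lang='matlab'>
% Field sizes in bytes, in declaration order (64-bit system assumed)
fieldBytes = [8, ...        void  *reserved
              2*4, ...      int    reserved1[2]
              8, ...        void  *reserved2
              8, ...        size_t number_of_dims
              4, ...        unsigned int reserved3
              4, ...        flags: 12*1 + 4 + 8 + 8 = 32 bits
              2*8, ...      size_t reserved4[2]
              3*8 + 3*8];   % data union: 3 pointers + size_t reserved6[3]
total = sum(fieldBytes)     % 104
</pre>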
<h3 id="Size">Determining the size of mxArray</h3>
<p>It was nice to derive the size of <code>mxArray</code> from actual MathWorks code, but unfortunately this information is no longer available as of R2011a.  Somewhere between R2010a and R2011a, MathWorks stepped up their efforts to make <code>mxArray</code> completely opaque.  So we should find another way to get the size of <code>mxArray</code> for current and future Matlab versions.<br />
One ugly trick that works is to create many new arrays quickly and see where their starting points end up in memory:</p>
<pre lang='matlab'>
>> A = num2cell(1:100)';
>> addrs = sort(cellfun(@getaddr, A));
</pre>
<p>What we did here is create 100 new arrays, and then get all their memory addresses in sorted order.  Now we can take a look at how far apart these new arrays ended up in memory:</p>
<pre lang='matlab'>
>> semilogy(diff(addrs));
</pre>
<p>The resulting plot will look different each time you run this; it is not really predictable where Matlab will put new arrays into memory.  Here is an example from my system:<br />
<center><figure style="width: 483px" class="wp-caption aligncenter"><img loading="lazy" decoding="async" alt="Plot of memory addresses" src="https://undocumentedmatlab.com/images/mxArray_memory.png" title="Plot of memory addresses" width="483" height="381" /><figcaption class="wp-caption-text">Plot of memory addresses</figcaption></figure></center><br />
Your results may look different, and you might have to increase the number of new arrays from 100 to 1000 to get the qualitative result, but the important feature of this plot is that there is a minimum distance between new arrays of about 10<sup>2</sup>.  In fact, if we just go straight for this minimum distance:</p>
<pre lang='matlab'>
>> min(diff(addrs))
ans =
            104
</pre>
<p>we find that although <code>mxArray</code> has gone completely opaque from R2010a to R2011a, the full size of <code>mxArray</code> in memory has stayed the same: 104 bytes.</p>
<h3 id="Dump">Dumping mxArray from memory</h3>
<p>We now have all the information we need to start looking into Matlab&#8217;s array representation.  There are many tools available that allow you to browse memory locations or dump memory contents to disk.  For our purposes though, it is nice to be able to do everything from within Matlab.  Therefore I introduce a simple tool that prints memory locations into the Matlab console:</p>
<pre lang='c'>
/* printmem.cpp */
#include "mex.h"
void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
  if (nrhs < 1 || !mxIsUint64(prhs[0]) || mxIsEmpty(prhs[0]))
    mexErrMsgTxt("First argument must be a uint64 memory address");
  /* uint64_T (from the Mex headers) is always 64 bits wide, whereas
     unsigned long is only 32 bits on 64-bit Windows */
  uint64_T *addr = static_cast<uint64_T *>(mxGetData(prhs[0]));
  unsigned char *mem = reinterpret_cast<unsigned char *>(static_cast<size_t>(addr[0]));
  if (nrhs < 2 || !mxIsDouble(prhs[1]) || mxIsEmpty(prhs[1]))
    mexErrMsgTxt("Second argument must be a double-type integer byte size.");
  unsigned int nbytes = static_cast<unsigned int>(mxGetScalar(prhs[1]));
  /* print the requested bytes in hexadecimal, 16 per row */
  for (unsigned int i = 0; i < nbytes; i++) {
    printf("%.2x ", mem[i]);
    if ((i+1) % 16 == 0) printf("\n");
  }
  printf("\n");
}
</pre>
<p>Here is how you use it in Matlab:</p>
<pre lang='matlab'>
>> A = 0;
>> printmem(getaddr(A), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 70 fa 33 df 6f 7f 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
</pre>
<p>And there you have it: the inner guts of <code>mxArray</code> laid bare.  I have printed each byte as a two character hexadecimal value, as is standard, so there are 16 bytes printed per row.</p>
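<p>When reading such a dump, keep in mind that on little-endian machines (which covers the platforms Matlab currently runs on, as far as I know) multi-byte values appear with their least-significant byte first. As a sketch, here is how the last 8 bytes of the fourth row above can be reassembled into a single 64-bit value; reading these bytes as a memory address is an assumption at this point, and <i><b>typecast</b></i> uses the byte order of the machine it runs on:</p>
<pre lang='matlab'>
% Bytes 57-64 of the dump above, in the order printed
bytes = uint8(hex2dec({'70','fa','33','df','6f','7f','00','00'}))';
value = typecast(bytes, 'uint64');   % reassemble 8 bytes into one uint64
dec2hex(double(value))               % 7F6FDF33FA70 on a little-endian machine
</pre>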
<h3 id="">What does it mean?</h3>
<p>So now we have 104 bytes of Matlab internals to dig into.  We can start playing with this with a few simple examples:</p>
<pre>
>> A = 0; B = 1;
>> printmem(getaddr(A), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 <span style="background-color:#e6b8af;">c0 b0 27 df 6f 7f</span> 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
>> printmem(getaddr(B), 104)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 01 02 00 00 01 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 <span style="background-color:#e6b8af;">70 fa 33 df 6f 7f</span> 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00
</pre>
<p>
In this and subsequent examples, I will highlight bytes that are different or that are of interest.  What we can see from this example is that although arrays A and B have different content, almost nothing is different between their <code>mxArray</code> representations.  What is different, is the memory address stored in the highlighted bytes.  This confirms our earlier assertion that <code>mxArray</code> does not store the array contents, but only a pointer to the content location.<br />
Now let us try to figure out some of the other fields:</p>
<pre>
>> A = 1:3; B = 1:10; C = (1:10)';
>> printmem(getaddr(A), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 <span style="background-color:#a4c2f4;">02</span> 00 00 00 00 00 00 00
00 00 00 00 00 <span style="background-color:#a4c2f4;">02</span> 00 00 <span style="background-color:#e6b8af;">01</span> 00 00 00 00 00 00 00
<span style="background-color:#e6b8af;">03</span> 00 00 00 00 00 00 00 60 80 22 df 6f 7f 00 00
>> printmem(getaddr(B), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 <span style="background-color:#a4c2f4;">02</span> 00 00 00 00 00 00 00
00 00 00 00 00 <span style="background-color:#a4c2f4;">02</span> 00 00 <span style="background-color:#e6b8af;">01</span> 00 00 00 00 00 00 00
<span style="background-color:#e6b8af;">0a</span> 00 00 00 00 00 00 00 80 83 29 df 6f 7f 00 00
>> printmem(getaddr(C), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 <span style="background-color:#a4c2f4;">02</span> 00 00 00 00 00 00 00
00 00 00 00 00 <span style="background-color:#a4c2f4;">02</span> 00 00 <span style="background-color:#e6b8af;">0a</span> 00 00 00 00 00 00 00
<span style="background-color:#e6b8af;">01</span> 00 00 00 00 00 00 00 80 83 29 df 6f 7f 00 00
</pre>
<p>
(Note that this time I only printed the first four lines of each array as this is where the interesting differences are for this example.)<br />
In <span style="background-color:#e6b8af;">red</span> I highlighted the bytes in each array that give its number of rows and columns (note that hexadecimal 0a is 10 in decimal).  In <span style="background-color:#a4c2f4;">blue</span> I highlighted areas that store the value &#8220;02&#8221;, which could be the location for storing the number of dimensions.  Let us look into this more:</p>
<pre>
>> A = rand([3 3 3]);
>> printmem(getaddr(A), 64)
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 <span style="background-color:#e6b8af;">03</span> 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 <span style="background-color:#e6b8af;">30 4a 3f df 6f 7f</span> 00 00
<span style="background-color:#e6b8af;">09</span> 00 00 00 00 00 00 00 b0 d3 24 df 6f 7f 00 00
</pre>
<p>
Two interesting results here:  The first highlighted region changed from 02 to 03, so this must be the place where <code>mxArray</code> indicates a 3-dimensional array rather than 2D.  Another important thing also changed though: we can see in the second highlighted region that there is a new memory address stored where we used to find the number of rows.  And in the third highlighted region we now have the number 09 instead of the number of columns.<br />
Clearly, Matlab represents a 2D matrix differently from arrays of higher dimension such as 3D.  In the 2D case, <code>mxArray</code> simply holds nrows and ncols directly.  In the higher-dimension case it holds the number of dimensions (03), a pointer to another memory location (0x7f6fdf3f4a30) that holds the array of sizes for each dimension, and the value 09.  Note that 09 is not the total number of elements, which is 27 for a 3x3x3 array; it matches the product of the second and third dimension sizes (3x3), which is the column count that the documented <i>mxGetN</i> function reports for N-dimensional arrays.</p>
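<p>The field decoding above can be reproduced without any Mex code.  The following is a minimal sketch, assuming a little-endian machine (as in the dumps above).  It pastes bytes 32 to 63 of the 3x3x3 dump into a string and lets Matlab&#8217;s documented <i><b>typecast</b></i> function reassemble the 8-byte words.  The variable names are mine and purely illustrative:</p>
<pre lang='matlab'>
% Bytes 32..63 of the rand([3 3 3]) dump shown above, in memory order
hexdump = ['00 00 00 00 00 02 00 00 30 4a 3f df 6f 7f 00 00 ' ...
           '09 00 00 00 00 00 00 00 b0 d3 24 df 6f 7f 00 00'];
bytes = uint8(hex2dec(strsplit(hexdump, ' ')));  % 32 byte values
words = typecast(bytes, 'uint64');               % four 8-byte words, native byte order
dimsPtr  = dec2hex(words(2))   % pointer to the per-dimension sizes: 7F6FDF3F4A30
trailing = words(3)            % 9, i.e. 3*3
dataPtr  = dec2hex(words(4))   % pointer to the array data: 7F6FDF24D3B0
</pre>
<p>Because <i><b>typecast</b></i> reinterprets bytes in the machine&#8217;s native order, it performs the same byte reversal that we would otherwise do by eye when reading an address out of a <i><b>printmem</b></i> dump.</p>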
<h3 id="COW">The copy-on-write mechanism</h3>
<p>Finally, we are in a position to understand how Matlab internally implements copy-on-write:</p>
<pre>
>> A = 1:10;
>> printmem(getaddr(A), 64);
00 00 00 00 00 00 00 00 06 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 90 f3 24 df 6f 7f 00 00
>> B = A;
>> printaddr(B);
0x7f6f4c7b6810
>> printmem(getaddr(A), 64);
<span style="background-color:#e6b8af;">10 68 7b 4c 6f 7f</span> 00 00 06 00 00 00 00 00 00 00
<span style="background-color:#e6b8af;">10 68 7b 4c 6f 7f</span> 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 <span style="background-color:#a4c2f4;">90 f3 24 df 6f 7f</span> 00 00
</pre>
<p>
What we see is that by setting B = A, we change the internal representation of A itself.  Two new memory address pointers are added to the <code>mxArray</code> for A.  As it turns out, both of these point to the address for array B, which makes sense; this is how Matlab internally keeps track of arrays that are copies of each other.  Note that because byte order is <a target="_blank" rel="nofollow" href="http://en.wikipedia.org/wiki/Endianness">little-endian</a>, the memory addresses from <i><b>printmem</b></i> are byte-wise, i.e. every two characters, reversed relative to the address from <i><b>printaddr</b></i>.<br />
We can also look into array B:</p>
<pre>
>> printmem(getaddr(B), 64);
<span style="background-color:#e6b8af;">f0 41 7a 4c 6f 7f</span> 00 00 06 00 00 00 00 00 00 00
<span style="background-color:#e6b8af;">f0 41 7a 4c 6f 7f</span> 00 00 02 00 00 00 00 00 00 00
00 00 00 00 00 02 00 00 01 00 00 00 00 00 00 00
0a 00 00 00 00 00 00 00 <span style="background-color:#a4c2f4;">90 f3 24 df 6f 7f</span> 00 00
>> printaddr(A);
<span style="background-color:#e6b8af;">0x7f6f4c7a41f0</span>
</pre>
<p>
As I have highlighted, there are two interesting points here.  First, the red highlights show that array B has pointers back to array A.  Second, the blue highlight shows that the data pointer for array B simply points to the same memory as the data for array A (the values 1:10).<br />
Finally, we would like to understand why there are two pointers added.  Let us see what happens if we add a third linked variable:</p>
<pre>
>> C = B;
>> printaddr(A); printaddr(B); printaddr(C);
<span style="background-color:#e6b8af;">0x7f6f4c7a41f0</span>
<span style="background-color:#a4c2f4;">0x7f6f4c7b6810</span>
<span style="background-color:#00ff00;">0x7f6f4c7b69b0</span>
>> printmem(getaddr(A), 32)
<span style="background-color:#00ff00;">b0 69 7b 4c 6f 7f</span> 00 00 06 00 00 00 00 00 00 00
<span style="background-color:#a4c2f4;">10 68 7b 4c 6f 7f</span> 00 00 02 00 00 00 00 00 00 00
>> printmem(getaddr(B), 32)
<span style="background-color:#e6b8af;">f0 41 7a 4c 6f 7f</span> 00 00 06 00 00 00 00 00 00 00
<span style="background-color:#00ff00;">b0 69 7b 4c 6f 7f</span> 00 00 02 00 00 00 00 00 00 00
>> printmem(getaddr(C), 32)
<span style="background-color:#a4c2f4;">10 68 7b 4c 6f 7f</span> 00 00 06 00 00 00 00 00 00 00
<span style="background-color:#e6b8af;">f0 41 7a 4c 6f 7f</span> 00 00 02 00 00 00 00 00 00 00
</pre>
<p>
So it turns out that Matlab keeps track of a set of linked variables with a kind of cyclical, doubly-linked list structure; array A is linked to B in the forward direction and is also linked to C in the reverse direction by looping back around, etc.  The cyclical nature of this makes sense, as we need to be able to start from any of A, B, or C and find all the linked arrays.  But it is still not entirely clear why the list needs to be cyclical AND linked in both directions.  In fact, in earlier versions of Matlab this cyclical list was only singly-linked.</p>
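<p>To make the list structure concrete, here is a small sketch that models the three headers using the addresses printed above and walks the forward links starting from B.  Calling the word at offset 0 &#8220;prev&#8221; and the word at offset 16 &#8220;next&#8221; is my own labeling for illustration, not MathWorks terminology, and the code only mimics the structure; it does not touch real memory:</p>
<pre lang='matlab'>
% Addresses and link fields copied from the printaddr/printmem output above
names = {'A', 'B', 'C'};
addr  = {'0x7f6f4c7a41f0', '0x7f6f4c7b6810', '0x7f6f4c7b69b0'};
prev  = {'0x7f6f4c7b69b0', '0x7f6f4c7a41f0', '0x7f6f4c7b6810'};  % word at offset 0
next  = {'0x7f6f4c7b6810', '0x7f6f4c7b69b0', '0x7f6f4c7a41f0'};  % word at offset 16

nameOf = containers.Map(addr, names);
nextOf = containers.Map(addr, next);
prevOf = containers.Map(addr, prev);

% Sanity check: one step forward followed by one step back is a no-op
for k = 1:numel(addr)
    assert(strcmp(prevOf(nextOf(addr{k})), addr{k}));
end

% Follow the forward links from B until we arrive back at B
start = addr{2};
cur = start;
visited = {};
while true
    visited{end+1} = nameOf(cur);
    cur = nextOf(cur);
    if strcmp(cur, start), break; end
end
disp(strjoin(visited, ' -> '))   % B -> C -> A
</pre>
<p>Starting from any of the three addresses visits all three headers, which is the property that the cyclical layout provides.</p>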
<h3 id="Conclusions">Conclusions</h3>
<p>Obviously there is a lot more to <code>mxArray</code> and Matlab internals than what we have delved into here.  Still, with this basic introduction I hope to have whetted your appetite for understanding more about Matlab internals, and to have provided some simple tools to help you explore.  I want to reiterate that in general MathWorks&#8217;s approach of an opaque <code>mxArray</code> type with access abstracted through an API layer is a good policy.  The last thing you would want to do is take the information here and write a bunch of code that relies on the structure of <code>mxArray</code> to work; the next time MathWorks needs to add a new feature and change <code>mxArray</code>, all your code will break.  So in general we are all better off playing within the API that MathWorks provides.  And remember: poking into memory can crash your computer, so save your data!<br />
On the other hand, occasionally there are cases, like in-place editing, where it is useful to push the capabilities of Matlab a little beyond what MathWorks envisioned.  In these cases, having an understanding of Matlab&#8217;s internals can be critical, for example in understanding how to avoid conflicting with copy-on-write.  Therefore I hope the information presented here will prove useful.  Ideally, someone will be motivated to take this starting point and repair some of the tools like <i>headerdump</i> that made Matlab&#8217;s internal workings more transparent in the past.  I believe that having more of this information out in the Matlab community is good for the community as a whole.</p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation">Matlab&#039;s internal memory representation</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations" rel="bookmark" title="Internal Matlab memory optimizations">Internal Matlab memory optimizations </a> <small>Copy-on-write and in-place data manipulations are very useful Matlab performance improvement techniques. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/couple-of-bugs-and-workarounds" rel="bookmark" title="A couple of internal Matlab bugs and workarounds">A couple of internal Matlab bugs and workarounds </a> <small>A couple of undocumented Matlab bugs have simple workarounds. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/profiling-matlab-memory-usage" rel="bookmark" title="Profiling Matlab memory usage">Profiling Matlab memory usage </a> <small>mtic and mtoc were a couple of undocumented features that enabled users of past Matlab releases to easily profile memory usage. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/accessing-internal-java-class-members" rel="bookmark" title="Accessing internal Java class members">Accessing internal Java class members </a> <small>Java inner classes and enumerations can be used in Matlab with a bit of tweaking. ...</small></li>
</ol>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation/feed</wfw:commentRss>
			<slash:comments>9</slash:comments>
		
		
			</item>
		<item>
		<title>Profiling Matlab memory usage</title>
		<link>https://undocumentedmatlab.com/articles/profiling-matlab-memory-usage?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=profiling-matlab-memory-usage</link>
					<comments>https://undocumentedmatlab.com/articles/profiling-matlab-memory-usage#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Thu, 01 Mar 2012 00:13:04 +0000</pubDate>
				<category><![CDATA[High risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Stock Matlab function]]></category>
		<category><![CDATA[Undocumented function]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Profiler]]></category>
		<category><![CDATA[Pure Matlab]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=2768</guid>

					<description><![CDATA[<p>mtic and mtoc were a couple of undocumented features that enabled users of past Matlab releases to easily profile memory usage. </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/profiling-matlab-memory-usage">Profiling Matlab memory usage</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations" rel="bookmark" title="Internal Matlab memory optimizations">Internal Matlab memory optimizations </a> <small>Copy-on-write and in-place data manipulations are very useful Matlab performance improvement techniques. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation" rel="bookmark" title="Matlab&#039;s internal memory representation">Matlab&#039;s internal memory representation </a> <small>Matlab's internal memory structure is explored and discussed. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-java-memory-leaks-performance" rel="bookmark" title="Matlab-Java memory leaks, performance">Matlab-Java memory leaks, performance </a> <small>Internal fields of Java objects may leak memory - this article explains how to avoid this without sacrificing performance. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/function-call-timeline-profiling" rel="bookmark" title="Function call timeline profiling">Function call timeline profiling </a> <small>A new utility enables to interactively explore Matlab function call timeline profiling. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p>Anyone who has had experience with real-life applications knows that Memory usage can have a significant impact on the application&#8217;s usability, in aspects such as performance, interactivity, and even (on some lousy memory-management Operating Systems) crashes/hangs.<br />
In Matlab releases of the past few years, this has been addressed by expanding the information reported by the built-in <i><b>memory</b></i> function. In addition, an undocumented feature was added to the Matlab Profiler that <a target="_blank" href="/articles/undocumented-profiler-options/">enables monitoring</a> memory usage.<br />
<center><figure style="width: 450px" class="wp-caption aligncenter"><br />
<img decoding="async" title="Profile report with memory &amp; JIT info" src="https://undocumentedmatlab.com/images/profile2d_450.png" alt="Profile report with memory &amp; JIT info" width="450" /><img decoding="async" title="Profile report with memory &amp; JIT info" src="https://undocumentedmatlab.com/images/profile2c_450.png" alt="Profile report with memory &amp; JIT info" width="450" /><br />
<img decoding="async" title="Profile report with memory &amp; JIT info" src="https://undocumentedmatlab.com/images/profile2.png" alt="Profile report with memory &amp; JIT info" width="416" /><br />
<figcaption class="wp-caption-text">Profile report with memory &amp; JIT info</figcaption></figure></center><br />
In Matlab release R2008a (but not on newer releases) we could also use a nifty parameter of the undocumented <a target="_blank" href="/articles/undocumented-feature-function/"><i><b>feature</b></i> function</a>:</p>
<pre lang='matlab'>
>> feature mtic; a=ones(100); feature mtoc
ans =
      TotalAllocated: 84216
          TotalFreed: 2584
    LargestAllocated: 80000
           NumAllocs: 56
            NumFrees: 43
                Peak: 81640
</pre>
<p>As can easily be seen in this example, allocating 100<sup>2</sup> doubles requires 80000 bytes of allocation, plus about 4KB of other memory that was allocated (and about 2.5KB that was freed) within the function <i><b>ones</b></i>. Running the same code line again gives a very similar result, but now there are 80000 more bytes freed when the matrix <code>a</code> is overwritten:</p>
<pre lang='matlab'>
>> feature mtic; a=ones(100); feature mtoc
ans =
      TotalAllocated: 84120
          TotalFreed: 82760
    LargestAllocated: 80000
           NumAllocs: 54
            NumFrees: 49
                Peak: 81328
</pre>
<p>This is pretty informative and very handy for debugging memory bottlenecks. Unfortunately, starting in R2008b, features mtic and mtoc are no longer supported <i>&#8220;under the current <a target="_blank" rel="nofollow" href="http://www.mathworks.com/support/tech-notes/1100/1106.html">memory manager</a>&#8221;</i>. Sometime around 2010 the mtic and mtoc features were completely removed. Users of R2008b and newer releases therefore need to use the internal structs returned by the <i><b>memory</b></i> function, and/or use the profiler&#8217;s memory-monitoring feature. If you ask me, using mtic/mtoc was much simpler and easier. I for one miss these features.<br />
<span id="Java"></span><br />
In a related matter, if we wish to monitor Java&#8217;s memory used within Matlab, we are in a bind, because there are no built-in tools to help us. However, there are several JVM switches that can be turned on in the <a target="_blank" rel="nofollow" href="http://www.mathworks.com/support/solutions/en/data/1-18I2C/"><i>java.opts</i></a> file: -Xrunhprof[:help]|[:option=value,&#8230;], -Xprof, -Xrunprof, -XX:+PrintClassHistogram <a target="_blank" rel="nofollow" href="http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html">and so on</a>. There are also several memory-monitoring (so-called &#8220;heap-walking&#8221;) tools: the standard JDK jconsole, jmap, jhat and jvisualvm (with its useful plugins) provide good basic coverage. MathWorks has <a target="_blank" rel="nofollow" href="http://www.mathworks.com/support/solutions/en/data/1-3L4JU7/">posted</a> a tutorial on using jconsole with Matlab. There are a number of other third-party tools such as <a target="_blank" rel="nofollow" href="http://www.khelekore.org/jmp/">JMP</a> (for JVMs 1.5 and earlier) or <a target="_blank" rel="nofollow" href="http://www.khelekore.org/jmp/tijmp/">TIJMP</a> (for JVM 1.6). Within Matlab, we can use utilities such as <a target="_blank" rel="nofollow" href="http://www.javamex.com/classmexer/">Classmexer</a> to estimate a particular object&#8217;s size (both shallow and deep referencing), or use <code>java.lang.Runtime.getRuntime()</code>&#8217;s methods (<i>maxMemory(), freeMemory()</i> and <i>totalMemory()</i>) to monitor overall Java memory (<a target="_blank" rel="nofollow" href="https://www.mathworks.com/matlabcentral/newsreader/view_thread/296813#797410">sample usage</a>).<br />
We can monitor the Java memory (which is part of the overall Matlab process memory) usage using Java&#8217;s built-in <a target="_blank" rel="nofollow" href="http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/Runtime.html"><code>Runtime</code></a> class:</p>
<pre lang='matlab'>
>> r=java.lang.Runtime.getRuntime
r =
java.lang.Runtime@5fb3b54
>> r.freeMemory
ans =
    86147768
>> r.totalMemory
ans =
   268304384
>> usedMemory = r.totalMemory - r.freeMemory;
</pre>
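<p>The three <code>Runtime</code> methods mentioned above can be combined into a one-line heap summary.  This is only a sketch: the variable names and the choice of megabytes as the reporting unit are mine, and the numbers describe the JVM heap only, not Matlab&#8217;s own (non-Java) memory:</p>
<pre lang='matlab'>
rt = java.lang.Runtime.getRuntime;
MB = 2^20;
usedMB  = (rt.totalMemory - rt.freeMemory) / MB;  % heap currently occupied by objects
totalMB = rt.totalMemory / MB;                    % heap currently reserved by the JVM
maxMB   = rt.maxMemory / MB;                      % ceiling up to which the heap may grow
fprintf('Java heap: %.1f MB used of %.1f MB reserved (max %.1f MB)\n', usedMB, totalMB, maxMB);
</pre>
<p>Since the garbage collector runs at unpredictable times, the used figure can drop between two consecutive calls; it is best read as a trend rather than as an exact measurement.</p>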
<p>Specifically in R2011b (but in no other release), we can also use a built-in Java memory monitor. Unfortunately, this simple and yet useful memory monitor was removed in R2012a (or maybe it was just moved to another package and I haven&#8217;t found out where&#8230; <i>yet</i>&#8230;):</p>
<pre lang='matlab'>
com.mathworks.xwidgets.JavaMemoryMonitor.invoke
</pre>
<p><center><figure style="width: 220px" class="wp-caption aligncenter"><img decoding="async" alt="Matlab R2011b's Java memory monitor" src="https://undocumentedmatlab.com/images/Java_Memory_Monitor.png" title="Matlab R2011b's Java memory monitor" width="159" /><figcaption class="wp-caption-text">Matlab R2011b's Java memory monitor</figcaption></figure></center><br />
As I have already noted quite often, using undocumented Matlab features and functions carries the risk that they will not be supported in some future Matlab release. Today&#8217;s article is a case in point.</p>
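<p>For readers on releases where mtic/mtoc are gone, a very rough stand-in can be built from the documented <i><b>memory</b></i> function.  The sketch below assumes a Windows platform (where <i><b>memory</b></i> is available), and it measures the change in memory used by the whole Matlab process rather than individual allocations, so it is far noisier than the mtoc report was:</p>
<pre lang='matlab'>
% Assumption: Windows, where the memory function is supported
before = memory;                    % first output: user-view struct
a = ones(1000);                     % 1000*1000 doubles = 8,000,000 bytes
after = memory;
deltaBytes = after.MemUsedMATLAB - before.MemUsedMATLAB;
fprintf('Matlab process memory changed by about %.0f bytes\n', deltaBytes);
</pre>
<p>A larger array is used here than in the mtic example, since a difference of only 80000 bytes is easily lost in the background activity of the Matlab process.</p>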
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/profiling-matlab-memory-usage">Profiling Matlab memory usage</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations" rel="bookmark" title="Internal Matlab memory optimizations">Internal Matlab memory optimizations </a> <small>Copy-on-write and in-place data manipulations are very useful Matlab performance improvement techniques. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation" rel="bookmark" title="Matlab&#039;s internal memory representation">Matlab&#039;s internal memory representation </a> <small>Matlab's internal memory structure is explored and discussed. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-java-memory-leaks-performance" rel="bookmark" title="Matlab-Java memory leaks, performance">Matlab-Java memory leaks, performance </a> <small>Internal fields of Java objects may leak memory - this article explains how to avoid this without sacrificing performance. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/function-call-timeline-profiling" rel="bookmark" title="Function call timeline profiling">Function call timeline profiling </a> <small>A new utility enables to interactively explore Matlab function call timeline profiling. ...</small></li>
</ol>
</div>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/profiling-matlab-memory-usage/feed</wfw:commentRss>
			<slash:comments>8</slash:comments>
		
		
			</item>
		<item>
		<title>Matlab mex in-place editing</title>
		<link>https://undocumentedmatlab.com/articles/matlab-mex-in-place-editing?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=matlab-mex-in-place-editing</link>
					<comments>https://undocumentedmatlab.com/articles/matlab-mex-in-place-editing#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Wed, 08 Feb 2012 17:00:25 +0000</pubDate>
				<category><![CDATA[Guest bloggers]]></category>
		<category><![CDATA[High risk of breaking in future versions]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Mex]]></category>
		<category><![CDATA[Stock Matlab function]]></category>
		<category><![CDATA[Undocumented feature]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Peter Li]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=2699</guid>

					<description><![CDATA[<p>Editing Matlab arrays in-place can be an important technique for optimizing calculations. This article shows how to do it using Mex.  </p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/matlab-mex-in-place-editing">Matlab mex in-place editing</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/serializing-deserializing-matlab-data" rel="bookmark" title="Serializing/deserializing Matlab data">Serializing/deserializing Matlab data </a> <small>Matlab does not provide a documented manner to serialize data into a byte stream, but we can do this with some undocumented functionality. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/internal-matlab-memory-optimizations" rel="bookmark" title="Internal Matlab memory optimizations">Internal Matlab memory optimizations </a> <small>Copy-on-write and in-place data manipulations are very useful Matlab performance improvement techniques. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/accessing-private-object-properties" rel="bookmark" title="Accessing private object properties">Accessing private object properties </a> <small>Private properties of Matlab class objects can be accessed (read and write) using some undocumented techniques. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlabs-internal-memory-representation" rel="bookmark" title="Matlab&#039;s internal memory representation">Matlab&#039;s internal memory representation </a> <small>Matlab's internal memory structure is explored and discussed. ...</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p><i>I would like to welcome Matlab Mex power-user <a target="_blank" rel="nofollow" href="http://absurdlycertain.blogspot.com/">Peter Li</a> for the first in a short series of articles about undocumented aspects of Mex programming.</i><br />
Editing Matlab arrays in-place can be an important technique for optimizing calculations, especially when handling data that use large blocks of memory.  The Matlab language itself has some <a target="_blank" rel="nofollow" href="http://blogs.mathworks.com/loren/2007/03/22/in-place-operations-on-data/">limited support for in-place editing</a>, but when we are really concerned with speed we often turn to writing C/C++ extensions using the Mex interface.  Unfortunately, editing arrays in-place from Mex extensions is not officially supported in Matlab, and doing it incorrectly can cause data inconsistencies or can even cause Matlab to crash.  In this article, I will introduce the problem and show a simple solution that exhibits the basic implementation details of Matlab&#8217;s internal copy-on-write mechanism.</p>
<h3 id="Motivation">Why edit in-place?</h3>
<p>To demonstrate the techniques in this article, I use the <i>fast_median</i> function, which is part of <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/29453-nthelement">my nth_element package</a> on Matlab&#8217;s File Exchange.  You can download the package and play with the code if you want.  The examples are fairly self-explanatory, so if you do not want to try the code you should be okay just following along.<br />
Let us try a few function calls to see how editing in-place can save time and memory:</p>
<pre lang='matlab'>
>> A = rand(100000000, 1);
>> tic; median(A); toc
Elapsed time is 4.122654 seconds.
>> tic; fast_median(A); toc
Elapsed time is 1.646448 seconds.
>> tic; fast_median_ip(A); toc
Elapsed time is 0.927898 seconds.
</pre>
<p>If you try running this, be careful not to make A too large; tune the example according to the memory available on your system. In terms of the execution time for the different functions, your mileage may vary depending on factors such as: your system, what Matlab version you are running, and whether your test data is arranged in a single vector or a multicolumn array.<br />
This example illustrates a few general points: first, <i>fast_median</i> is significantly faster than Matlab&#8217;s native <i><b>median</b></i> function. This is because <i>fast_median</i> uses a more efficient algorithm; see the nth_element page for more details.  Besides being a shameless plug, this demonstrates why we might want to write a Mex function in the first place: rewriting the median function in pure Matlab would be slow, but using C++ we can significantly improve on the status quo.<br />
The second point is that the in-place version, <i>fast_median_ip</i>, yields an additional speed improvement.  What is the difference?  Let us look behind the scenes; here are the CPU and memory traces from my system monitor after running the above:<br />
<center><figure style="width: 377px" class="wp-caption alignleft"><img loading="lazy" decoding="async" alt="Memory and CPU usage for median() vs. fast_median_ip()" src="https://undocumentedmatlab.com/images/median_vs_fast_median_ip.png" title="Memory and CPU usage for median() vs. fast_median_ip()" width="377" height="425"/><figcaption class="wp-caption-text">Memory and CPU usage for <i><b>median</b></i> vs. <i>fast_median_ip</i></figcaption></figure></center><br />
You can see four spikes in CPU use, and some associated changes in memory allocation:<br />
The first spike in CPU is when we created the test data vector; memory use also steps up at that time.<br />
The second CPU spike is the largest; that is Matlab&#8217;s median function.  You can see that over that period memory use stepped up again, and then stepped back down; the median function makes a copy of the entire input data, and then throws its copy away when it is finished; this is expensive in terms of time and resources.<br />
The <i>fast_median</i> function is the next CPU spike; it has a similar step up and down in memory use, but it is much faster.<br />
Finally, in the case of <i>fast_median_ip</i> we see something different; there is a spike in CPU use, but memory use stays flat; the in-place version is faster and more memory efficient because it does not make a copy of the input data.</p>
<div class="" style="width: 100%; overflow: auto;"></div>
<p>There is another important difference with the in-place version; it modifies its input array.  This can be demonstrated simply:</p>
<pre lang='matlab'>
>> A = randi(100, [10 1]);
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
>> median(A)
ans = 60
>> fast_median(A)
ans = 60
>> A'
ans = 39    42    98    25    64    75     6    56    71    89
>> fast_median_ip(A)
ans = 60
>> A'
ans = 39     6    25    42    56    64    75    71    98    89
</pre>
<p>As you can see, all three methods get the same answer, but <i><b>median</b></i> and <i>fast_median</i> do not modify A in the workspace, whereas after running <i>fast_median_ip</i>, input array A has changed.  This is how the in-place method is able to run without using new memory; it operates on the existing array rather than making a copy.</p>
<h3 id="Pitfalls">Pitfalls with in-place editing</h3>
<p>Modifying a function&#8217;s input is common in many languages, but in Matlab there are only a few special conditions under which this is officially sanctioned.  This is not necessarily a bad thing; many people feel that modifying input data is bad programming practice and makes code harder to maintain.  But as we have shown, it can be an important capability to have if speed and memory use are critical to an application.<br />
Given that in-place editing is not officially supported in Matlab Mex extensions, what do we have to do to make it work?  Let us look at the normal, input-copying <i>fast_median</i> function as a starting point:</p>
<pre lang='cpp'>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   mxArray *incopy = mxDuplicateArray(prhs[0]);
   plhs[0] = run_fast_median(incopy);
}
</pre>
<p>This is a pretty simple function (I have taken out a few lines of boilerplate input checking to keep things clean).  It relies on helper function <i>run_fast_median</i> to do the actual calculation, so the only real logic here is copying the input array <code>prhs[0]</code>.  Importantly, <i>run_fast_median</i> edits its inputs in-place, so the call to <i>mxDuplicateArray</i> ensures that the Mex function is overall well behaved, i.e. that it does not change its inputs.<br />
Who wants to be well behaved though?  Can we save time and memory just by taking out the input duplication step?  Let us try it:</p>
<pre lang='cpp'>
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   plhs[0] = run_fast_median(const_cast<mxArray *>(prhs[0]));  // </mxArray>
}
</pre>
<p>Very bad behavior; note that we cast the original <code>const mxArray*</code> input to a <code>mxArray*</code> so that the compiler will let us mess with it; naughty.<br />
But does this accomplish edit in-place for <i>fast_median</i>?  Be sure to save any work you have open and then try it:</p>
<pre lang='matlab'>
>> mex fast_median_tweaked.cpp
>> A = randi(100,[10 1]);
>> fast_median_tweaked(A)
ans = 43
</pre>
<p>Hmm, it looks like this worked fine.  But in fact there are subtle problems:</p>
<pre lang='matlab'>
>> A = randi(100,[10 1]);
>> A'
ans = 65    92    14    26    41     2    45    85    53     2
>> B = A;
>> B'
ans = 65    92    14    26    41     2    45    85    53     2
>> fast_median_tweaked(A)
ans = 43
>> A'
ans = 2     2    41    26    14    45    65    85    53    92
>> B'
ans = 2     2    41    26    14    45    65    85    53    92
</pre>
<p>Uh-oh, spooky; we expected that running <i>fast_median_tweaked</i> would change input A, but somehow it has also changed B, even though B is supposed to be an independent copy.  Not good.  In fact, under some conditions this kind of uncontrolled editing in-place can crash the entire Matlab environment with a segfault.  What is going on?</p>
<h3 id="COW">Matlab&#8217;s copy-on-write mechanism</h3>
<p>The answer is that our simple attempt to edit in-place conflicts with Matlab&#8217;s internal copy-on-write mechanism.  Copy-on-write is an optimization built into Matlab to help avoid expensive copying of variables in memory (actually similar to what we are trying to accomplish with edit in-place).  We can see copy-on-write in action with some simple tests:<br />
<figure style="width: 393px" class="wp-caption alignright"><img loading="lazy" decoding="async" alt="Matlab's Copy-on-Write memory and CPU usage" src="https://undocumentedmatlab.com/images/copy-on-write.png" title="Matlab's Copy-on-Write memory and CPU usage" width="393" height="466"/><figcaption class="wp-caption-text">Matlab's Copy-on-Write memory and CPU usage</figcaption></figure></p>
<div>
<pre lang='matlab'>
% Test #1: update, then copy
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.000008 seconds.
>> tic; B = A; toc
Elapsed time is 0.000020 seconds.
% Test #2: copy, then update
>> clear
>> tic; A = zeros(100000000, 1); toc
Elapsed time is 0.588937 seconds.
>> tic; B = A; toc
Elapsed time is 0.000020 seconds.
>> tic; A(1) = 0; toc
Elapsed time is 0.678160 seconds.
</pre>
</div>
<p>In the first set of operations, time and memory are used to create A, but updating A and &#8220;copying&#8221; A into B take no memory and essentially no time.  This may come as a surprise since supposedly we have made an independent copy of A in B; why does creating B take no time or memory when A is clearly a large, expensive block?<br />
The second set of operations makes things clearer.  In this case, we again create A and then copy it to B; again this operation is fast and cheap.  But assigning into A at this point takes time and consumes a new block of memory, even though we are only assigning into a single index of A.  This is copy-on-write: Matlab tries to save you from copying large blocks of memory unless you need to.  So when you first set B equal to A, nothing is copied; the variable B is simply set to point to the same memory location already used by A.  Only after you try to change A (or B) does Matlab decide that you really need two copies of the large array.<br />
There are some additional tricks Matlab does with copy-on-write.  Here is another example:</p>
<pre lang='matlab'>
>> clear
>> tic; A{1} = zeros(100000000, 1); toc
Elapsed time is 0.573240 seconds.
>> tic; A{2} = zeros(100000000, 1); toc
Elapsed time is 0.560369 seconds.
>> tic; B = A; toc
Elapsed time is 0.000016 seconds.
>> tic; A{1}(1) = 0; toc
Elapsed time is 0.690690 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.695758 seconds.
>> tic; A{1}(1) = 0; toc
Elapsed time is 0.000011 seconds.
>> tic; A{2}(1) = 0; toc
Elapsed time is 0.000004 seconds.
</pre>
<p>This shows that for the purposes of copy-on-write, different elements of cell array A are treated independently.  When we assign B equal to A, nothing is copied.  Then when we change any part of A{1}, that whole element must be copied over.  When we subsequently change A{2}, that whole element must also be copied over; it was not copied earlier.  At this point, A and B are truly independent of each other, as both elements have experienced copy-on-write, so further assignments into either A or B are fast and require no additional memory.<br />
Try playing with some struct arrays and you will find that copy-on-write also works independently for the elements of structs.</p>
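<p>To make the mechanism concrete, here is a minimal toy model of copy-on-write in standalone C++.  It is only a sketch: <code>CowArray</code> and its methods are hypothetical names invented for this illustration, and a <code>std::shared_ptr</code> reference count stands in for whatever bookkeeping Matlab really uses internally.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy model of a Matlab variable: a handle to a reference-counted buffer.
// Hypothetical type for illustration only; the real mxArray layout differs.
class CowArray {
public:
    explicit CowArray(std::size_t n)
        : data_(std::make_shared<std::vector<double>>(n, 0.0)) {}

    // Copying the handle (like B = A) shares the buffer; no data is duplicated.
    CowArray(const CowArray&) = default;
    CowArray& operator=(const CowArray&) = default;

    double get(std::size_t i) const {
        assert(i < data_->size());
        return (*data_)[i];
    }

    // Writing (like A(1) = 0) first duplicates the buffer if it is shared.
    void set(std::size_t i, double v) {
        assert(i < data_->size());
        if (data_.use_count() > 1) {
            // Someone else still points at this buffer: copy before writing.
            data_ = std::make_shared<std::vector<double>>(*data_);
        }
        (*data_)[i] = v;
    }

    // True while both handles still point at the same underlying buffer.
    bool sharesDataWith(const CowArray& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::vector<double>> data_;
};
```

<p>In this model, <code>CowArray B = A;</code> is cheap no matter how large the buffer is, and the first <code>A.set(...)</code> afterwards pays for the full copy, which mirrors the timings in Test #2 above.</p>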
<h3 id="mxUnshareArray">Reconciling edit in-place with copy-on-write: mxUnshareArray</h3>
<p>Now it is clear why we cannot simply edit arrays in-place from Mex functions; not only is it naughty, it fundamentally conflicts with copy-on-write.  Naively changing an array in-place can inadvertently change other variables that are still waiting for a copy-on-write, as we saw above when <i>fast_median_tweaked</i> inadvertently changed B in the workspace. This is, in the best case, an unmaintainable mess.  Under more aggressive in-place editing, it can cause Matlab to crash with a segfault.<br />
Luckily, there is a simple solution, although it requires calling internal, undocumented Matlab functions.<br />
Essentially what we need is a Mex function that can be run on a Matlab array that will do the following:</p>
<ol>
<li>Check whether the current array is sharing data with any other arrays that are waiting for copy-on-write.</li>
<li>If the array is shared, it must be unshared; the underlying memory must be copied and all the relevant pointers need to be fixed so that the array we want to work on is no longer accessible by anyone else.</li>
<li>If the array is not currently shared, simply proceed; the whole point is to avoid copying memory if we do not need to, so that we can benefit from the efficiencies of edit in-place.</li>
</ol>
<p>If you think about it, this is exactly the operation that Matlab needs to run internally when it is deciding whether an assignment operation requires a copy-on-write.  So it should come as no surprise that such a function already exists, in the form of an undocumented Matlab internal called <i>mxUnshareArray</i>.  Here is how you use it from a Mex function:</p>
<pre lang='cpp'>
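// Undocumented Matlab internal, so we have to declare it ourselves
extern "C" bool mxUnshareArray(mxArray *array_ptr, bool noDeepCopy);
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
   // Give the input its own private data if it is currently shared
   mxUnshareArray(const_cast&lt;mxArray *&gt;(prhs[0]), true);
   // The input is now safe to edit in-place
   plhs[0] = run_fast_median(const_cast&lt;mxArray *&gt;(prhs[0]));
}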
</pre>
<p>This is the method actually used by <i>fast_median_ip</i> to efficiently edit in-place without risking conflicts with copy-on-write.  Of course, if the array turns out to need to be unshared, then you do not get the benefit of edit in-place because the memory ends up getting copied.  But at least things are safe and you get the in-place benefit as long as the input array is not being shared.</p>
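<p>The check-then-unshare logic can also be sketched outside Matlab.  The following standalone C++ toy is an illustration under stated assumptions, not Matlab's implementation: <code>ToyArray</code>, <code>unshare</code> and <code>median_in_place</code> are hypothetical names, a <code>std::shared_ptr</code> use count stands in for Matlab's internal sharing bookkeeping, and <code>std::nth_element</code> is used as a stand-in for the in-place median computation.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a Matlab variable: a handle to a shared buffer.
struct ToyArray {
    std::shared_ptr<std::vector<double>> data;
};

// Toy analogue of the three steps listed earlier.
// Returns true if a private copy had to be made.
bool unshare(ToyArray& a) {
    // Step 1: is anyone else pointing at this buffer?
    if (a.data.use_count() <= 1) {
        return false;  // Step 3: not shared, so proceed without copying
    }
    // Step 2: shared, so copy the data and repoint this handle at the copy
    a.data = std::make_shared<std::vector<double>>(*a.data);
    return true;
}

// Median that reorders the buffer in place, but only after unsharing it.
double median_in_place(ToyArray& a) {
    unshare(a);
    std::vector<double>& v = *a.data;
    assert(!v.empty());
    const std::size_t hi = v.size() / 2;
    // Partial sort: v[hi] ends up holding the value it would have if sorted
    std::nth_element(v.begin(), v.begin() + hi, v.end());
    const double upper = v[hi];
    if (v.size() % 2 == 1) {
        return upper;
    }
    // Even length: average the two middle values
    const double lower = *std::max_element(v.begin(), v.begin() + hi);
    return 0.5 * (lower + upper);
}
```

<p>If two <code>ToyArray</code> handles share one buffer, calling <code>median_in_place</code> on one of them leaves the other untouched, at the cost of a copy; if the handle is the sole owner, the buffer is reordered in place with no copy at all.</p>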
<h3 id="Extra">Further topics</h3>
<p>The method shown here should allow you to edit normal Matlab numerical or character arrays in-place from Mex functions safely.  For a Mex function in C rather than C++, omit the &#8220;C&#8221; in the <code>extern</code> declaration and of course you will have to use C-style casting rather than <code>const_cast</code>.  If you need to edit cell or struct arrays in-place, or especially if you need to edit subsections of shared cell or struct arrays safely and efficiently while leaving the rest of the data shared, then you will need a few more tricks.  A good place to get started is <a target="_blank" rel="nofollow" href="http://www.mk.tu-berlin.de/Members/Benjamin/mex_sharedArrays">this article by Benjamin Schubert</a>.<br />
Unfortunately, over the last few years Mathworks seems to have decided to make it more difficult for users to access these kinds of internal methods to make our code more efficient.  So be aware of the risk that in some future version of Matlab this method will no longer work in its current form.<br />
Ultimately, much of what is known about <i>mxUnshareArray</i>, as well as the internal implementation details of how Matlab keeps track of which arrays are shared, goes back to the work of Peter Boettcher, particularly his <a target="_blank" rel="nofollow" href="http://groups.google.com/group/comp.soft-sys.matlab/browse_thread/thread/c241d8821fb90275/">headerdump.c utility</a>.  Unfortunately, it appears that HeaderDump fails on Matlab R2010a and later, as Mathworks have changed some of the internal memory formats &#8211; perhaps some smart reader can pick up the work and adapt HeaderDump to the new memory format.<br />
In a future article, I hope to discuss headerdump.c and its relevance for copy-on-write and edit in-place, and some other related tools for the latest Matlab releases that do not support HeaderDump.</p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/matlab-mex-in-place-editing">Matlab mex in-place editing</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/matlab-mex-in-place-editing/feed</wfw:commentRss>
			<slash:comments>15</slash:comments>
		
		
			</item>
	</channel>
</rss>
