<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	
	>
<channel>
	<title>
	Comments on: rmfield performance	</title>
	<atom:link href="https://undocumentedmatlab.com/articles/rmfield-performance/feed" rel="self" type="application/rss+xml" />
	<link>https://undocumentedmatlab.com/articles/rmfield-performance?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=rmfield-performance</link>
	<description>Professional Matlab consulting, development and training</description>
	<lastBuildDate>Thu, 16 Jan 2025 15:33:04 +0000</lastBuildDate>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.7.2</generator>
	<item>
		<title>
		By: Yair Altman		</title>
		<link>https://undocumentedmatlab.com/articles/rmfield-performance#comment-526300</link>

		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Thu, 16 Jan 2025 15:29:56 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6427#comment-526300</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://undocumentedmatlab.com/articles/rmfield-performance#comment-526298&quot;&gt;tommsch&lt;/a&gt;.

I agree, you have a good point :-)]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://undocumentedmatlab.com/articles/rmfield-performance#comment-526298">tommsch</a>.</p>
<p>I agree, you have a good point 🙂</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: tommsch		</title>
		<link>https://undocumentedmatlab.com/articles/rmfield-performance#comment-526298</link>

		<dc:creator><![CDATA[tommsch]]></dc:creator>
		<pubDate>Thu, 16 Jan 2025 14:30:49 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6427#comment-526298</guid>

					<description><![CDATA[I suggest the name `rmfield_fast`, because after some time one usually forgets that a fast version of `rmfield` exists. But if &quot;fast&quot; is appended at the end, then the intellisense-thingy of Matlab will show you the function `rmfield_fast` whenever you type `rmfield`.]]></description>
			<content:encoded><![CDATA[<p>I suggest the name `rmfield_fast`, because after some time one usually forgets that a fast version of `rmfield` exists. But if &#8220;fast&#8221; is appended at the end, then the intellisense-thingy of Matlab will show you the function `rmfield_fast` whenever you type `rmfield`.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Hoi Wong		</title>
		<link>https://undocumentedmatlab.com/articles/rmfield-performance#comment-399230</link>

		<dc:creator><![CDATA[Hoi Wong]]></dc:creator>
		<pubDate>Mon, 30 Jan 2017 19:32:09 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6427#comment-399230</guid>

					<description><![CDATA[Thanks for pointing that out. I didn&#039;t even notice (or expect) that rmfield() is not a built-in low-level function.

Since the rmfield() code calls struct2cell() then cell2struct(), it suggests that behind the scenes, struct() is basically a high-level wrapper around cells using hash keys to map the names to indices: a useful piece of information to keep in mind for performance tuning. Actually, table() and dataset() objects deal with cells under the hood; I just wasn&#039;t expecting struct() to be the same given its origins in C.

I found rmfield() so slow that I actually wrote a keepField() a long time ago for the exact same reason as your application scenario: if I need to remove 5000 fields, I might as well keep what I want by adding to an empty (fieldless) struct one field at a time, i.e.

&lt;pre lang=&#039;matlab&#039;&gt;
for k=1:length(fieldsToKeep)
  Y.(fieldsToKeep{k}) = X.(fieldsToKeep{k});
end
&lt;/pre&gt;

It turned out to be much faster too because there are no names to search for. Dynamic field names are implemented with a hash table (checked with TMW, it&#039;s not documented), so each access takes O(1) time on average. It boils down to the same O(n log(n)) time as the proposed setdiff() if you ultimately have to identify the fields to remove instead.

Unfortunately MATLAB has only rmfield(), so I suspect a lot of people have done a set-op (spending O(n log(n)) time) on the list to keep to get the complementary set to remove, and then run through the O(n) algorithm in rmfield(), when they could have done it in average O(1) time per field by just transferring the wanted fields.]]></description>
			<content:encoded><![CDATA[<p>Thanks for pointing that out. I didn&#8217;t even notice (or expect) that rmfield() is not a built-in low-level function.</p>
<p>Since the rmfield() code calls struct2cell() then cell2struct(), it suggests that behind the scenes, struct() is basically a high-level wrapper around cells using hash keys to map the names to indices: a useful piece of information to keep in mind for performance tuning. Actually, table() and dataset() objects deal with cells under the hood; I just wasn&#8217;t expecting struct() to be the same given its origins in C.</p>
<p>I found rmfield() so slow that I actually wrote a keepField() a long time ago for the exact same reason as your application scenario: if I need to remove 5000 fields, I might as well keep what I want by adding to an empty (fieldless) struct one field at a time, i.e.</p>
<pre lang='matlab'>
for k=1:length(fieldsToKeep)
  Y.(fieldsToKeep{k}) = X.(fieldsToKeep{k});
end
</pre>
<p>It turned out to be much faster too because there are no names to search for. Dynamic field names are implemented with a hash table (checked with TMW, it&#8217;s not documented), so each access takes O(1) time on average. It boils down to the same O(n log(n)) time as the proposed setdiff() if you ultimately have to identify the fields to remove instead.</p>
<p>Unfortunately MATLAB has only rmfield(), so I suspect a lot of people have done a set-op (spending O(n log(n)) time) on the list to keep to get the complementary set to remove, and then run through the O(n) algorithm in rmfield(), when they could have done it in average O(1) time per field by just transferring the wanted fields.</p>
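<p>A minimal sketch of how the keep-the-wanted-fields idea could be wrapped in a function, assuming a scalar struct and a cell array of field names. The name keepFields and its signature are hypothetical (this is not a MATLAB built-in, and not necessarily how the original keepField() was written):</p>
<pre lang="matlab">
function Y = keepFields(X, fieldsToKeep)
% KEEPFIELDS  Hypothetical helper: copy only the wanted fields of X.
%   X            - scalar struct
%   fieldsToKeep - cell array of field names that exist in X
%
% Usage:
%   X = struct('a',1, 'b',2, 'c',3);
%   Y = keepFields(X, {'a','c'});                        % keep a and c
%   Y = keepFields(X, setdiff(fieldnames(X), {'b'}));    % remove b instead
    Y = struct();                      % start from a fieldless scalar struct
    for k = 1:numel(fieldsToKeep)
        name = fieldsToKeep{k};
        Y.(name) = X.(name);           % dynamic field name read and write
    end
end
</pre>
<p>Note that setdiff() returns its result sorted, so the second usage form may reorder the fields relative to X.</p>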
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Malcolm Lidierth		</title>
		<link>https://undocumentedmatlab.com/articles/rmfield-performance#comment-379354</link>

		<dc:creator><![CDATA[Malcolm Lidierth]]></dc:creator>
		<pubDate>Mon, 06 Jun 2016 08:34:48 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6427#comment-379354</guid>

					<description><![CDATA[I found this too with a package that was heavily profiled up to R2012a. Maybe things have changed as JIT acceleration has improved, but there were two &#039;tricks&#039; I used often.
Conditional statements were frequently the bottleneck, but served little purpose in the specific context, e.g.
&lt;pre lang=&quot;matlab&quot;&gt;
if ~isa(x,&#039;double&#039;)
   x=double(x);
end
&lt;/pre&gt;
could often safely be replaced with  
&lt;pre lang=&quot;matlab&quot;&gt;x=double(x);&lt;/pre&gt;

Also, a try-catch sequence in place of conditional tests was often faster.

ML]]></description>
			<content:encoded><![CDATA[<p>I found this too with a package that was heavily profiled up to R2012a. Maybe things have changed as JIT acceleration has improved, but there were two &#8216;tricks&#8217; I used often.<br />
Conditional statements were frequently the bottleneck, but served little purpose in the specific context, e.g.</p>
<pre lang="matlab">
if ~isa(x,'double')
   x=double(x);
end
</pre>
<p>could often safely be replaced with  </p>
<pre lang="matlab">x=double(x);</pre>
<p>Also, a try-catch sequence in place of conditional tests was often faster.</p>
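<p>As a rough illustration of the try-catch idea (the struct s and its fields are made up for this sketch), an optional field can be read with a default either by testing first or by attempting the access and falling back on failure:</p>
<pre lang="matlab">
s = struct('gain', 2);        % example data: has no 'offset' field

% Conditional test: check for the field before every access
if isfield(s, 'offset')
    offset1 = s.offset;
else
    offset1 = 0;              % default when the field is missing
end

% try-catch alternative: attempt the access, fall back only on error
try
    offset2 = s.offset;
catch
    offset2 = 0;              % same default, reached via the error path
end
</pre>
<p>Both variants yield the same value here. Which one is faster likely depends on the MATLAB release and on how often the error path is taken, so it is worth profiling both.</p>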
<p>ML</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Fernando		</title>
		<link>https://undocumentedmatlab.com/articles/rmfield-performance#comment-379286</link>

		<dc:creator><![CDATA[Fernando]]></dc:creator>
		<pubDate>Sun, 05 Jun 2016 18:36:37 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6427#comment-379286</guid>

					<description><![CDATA[This is really cool. I remember years ago having to accelerate an algorithm that used Matlab&#039;s built-in Kronecker tensor product. Luckily, I was able to find this: http://www.mathworks.com/matlabcentral/fileexchange/23606-fast-and-efficient-kronecker-multiplication]]></description>
			<content:encoded><![CDATA[<p>This is really cool. I remember years ago having to accelerate an algorithm that used Matlab&#8217;s built-in Kronecker tensor product. Luckily, I was able to find this: <a href="http://www.mathworks.com/matlabcentral/fileexchange/23606-fast-and-efficient-kronecker-multiplication" rel="nofollow ugc">http://www.mathworks.com/matlabcentral/fileexchange/23606-fast-and-efficient-kronecker-multiplication</a></p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Yair Altman		</title>
		<link>https://undocumentedmatlab.com/articles/rmfield-performance#comment-378029</link>

		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Thu, 26 May 2016 10:02:20 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6427#comment-378029</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://undocumentedmatlab.com/articles/rmfield-performance#comment-377970&quot;&gt;Peter&lt;/a&gt;.

@Peter - excellent usage example. Thanks for sharing.]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://undocumentedmatlab.com/articles/rmfield-performance#comment-377970">Peter</a>.</p>
<p>@Peter &#8211; excellent usage example. Thanks for sharing.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Peter		</title>
		<link>https://undocumentedmatlab.com/articles/rmfield-performance#comment-377970</link>

		<dc:creator><![CDATA[Peter]]></dc:creator>
		<pubDate>Thu, 26 May 2016 00:08:37 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6427#comment-377970</guid>

					<description><![CDATA[I recently wrote a function that needed to calculate many many thousands of dot products. When I profiled my function, it was spending a ton of time in the &lt;i&gt;&lt;b&gt;dot&lt;/b&gt;&lt;/i&gt; function. When I opened &lt;i&gt;dot.m&lt;/i&gt;, it was mostly sanity checks I didn&#039;t need, so I just inlined the dot calculation. The function went from minutes to about a second to complete.]]></description>
			<content:encoded><![CDATA[<p>I recently wrote a function that needed to calculate many many thousands of dot products. When I profiled my function, it was spending a ton of time in the <i><b>dot</b></i> function. When I opened <i>dot.m</i>, it was mostly sanity checks I didn&#8217;t need, so I just inlined the dot calculation. The function went from minutes to about a second to complete.</p>
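<p>The inlined replacement is not shown above. One possibility, assuming real column vectors of equal length (so no conjugation or dimension handling is needed), is sketched here with made-up example data:</p>
<pre lang="matlab">
a = [1; 2; 3];
b = [4; 5; 6];

d1 = dot(a, b);       % library call, goes through the checks in dot.m
d2 = a.' * b;         % inlined: row vector times column vector
d3 = sum(a .* b);     % inlined: elementwise product, then sum
</pre>
<p>For complex inputs dot() conjugates its first argument, so these inlined forms are only equivalent for real data.</p>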
]]></content:encoded>
		
			</item>
	</channel>
</rss>
