<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	
	>
<channel>
	<title>
	Comments on: Convolution performance	</title>
	<atom:link href="https://undocumentedmatlab.com/articles/convolution-performance/feed" rel="self" type="application/rss+xml" />
	<link>https://undocumentedmatlab.com/articles/convolution-performance?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=convolution-performance</link>
	<description>Professional Matlab consulting, development and training</description>
	<lastBuildDate>Sun, 25 Dec 2022 16:16:47 +0000</lastBuildDate>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.7.3</generator>
	<item>
		<title>
		By: Coo Coo		</title>
		<link>https://undocumentedmatlab.com/articles/convolution-performance#comment-516677</link>

		<dc:creator><![CDATA[Coo Coo]]></dc:creator>
		<pubDate>Tue, 06 Dec 2022 20:26:46 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6256#comment-516677</guid>

					<description><![CDATA[FFT-based convolution is circular, whereas MATLAB&#039;s conv functions offer several options (&#039;valid&#039;, &#039;same&#039;, &#039;full&#039;) but unfortunately not &#039;circ&#039;. For that you need to write your own wrapper around convn, at the cost of replicating part of the array as padding.

&lt;pre lang=&quot;matlab&quot;&gt;
function C = cconvn(A,B)
% cconvn  N-dimensional circular convolution

sA = size(A);
sB = size(B);

% indices with wrapped endpoints
for k = 1:numel(sA)
    if sA(k)==1 &#124;&#124; k &gt; numel(sB) &#124;&#124; sB(k)==1
        s{k} = &#039;:&#039;;
    else
        s{k} = [sA(k)-ceil(sB(k)/2)+2:sA(k) 1:sA(k) 1:floor(sB(k)/2)];
    end
end

% pad array for convn valid
C = convn(A(s{:}),B,&#039;valid&#039;);
&lt;/pre&gt;
]]></description>
			<content:encoded><![CDATA[<p>FFT-based convolution is circular, whereas MATLAB&#8217;s conv functions offer several options (&#8216;valid&#8217;, &#8216;same&#8217;, &#8216;full&#8217;) but unfortunately not &#8216;circ&#8217;. For that you need to write your own wrapper around convn, at the cost of replicating part of the array as padding.</p>
<pre lang="matlab">
function C = cconvn(A,B)
% cconvn  N-dimensional circular convolution

sA = size(A);
sB = size(B);

% indices with wrapped endpoints
for k = 1:numel(sA)
    if sA(k)==1 || k > numel(sB) || sB(k)==1
        s{k} = ':';
    else
        s{k} = [sA(k)-ceil(sB(k)/2)+2:sA(k) 1:sA(k) 1:floor(sB(k)/2)];
    end
end

% pad array for convn valid
C = convn(A(s{:}),B,'valid');
</pre>
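<p>A quick sanity check, as a sketch rather than part of the function above (the array sizes are arbitrary assumptions): the wrapped padding centers the kernel the way <code>'same'</code> does, so the result should match FFT-based circular convolution shifted by <code>floor(size(B)/2)</code> along each dimension.</p>
<pre lang="matlab">
A = rand(8,9);                                  % arbitrary test data
B = rand(3,4);                                  % kernel, smaller than A in every dimension
C1 = cconvn(A,B);                               % padded-array approach
C2 = real(ifftn(fftn(A) .* fftn(B,size(A))));   % circular convolution via FFT, origin at B(1,1)
C2 = circshift(C2, -floor(size(B)/2));          % re-center to match the 'same'-style alignment
err = max(abs(C1(:) - C2(:)))                   % should be at round-off level
</pre>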
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Yair Altman		</title>
		<link>https://undocumentedmatlab.com/articles/convolution-performance#comment-414271</link>

		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Sun, 01 Oct 2017 14:42:38 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6256#comment-414271</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://undocumentedmatlab.com/articles/convolution-performance#comment-414199&quot;&gt;Alex&lt;/a&gt;.

@Alex - this is due to the &lt;code&gt;&#039;replicate&#039;&lt;/code&gt; option in your call to &lt;i&gt;&lt;b&gt;imfilter&lt;/b&gt;&lt;/i&gt;. You are not doing the same with &lt;i&gt;&lt;b&gt;conv2fft&lt;/b&gt;&lt;/i&gt; or &lt;i&gt;&lt;b&gt;convn&lt;/b&gt;&lt;/i&gt;, which causes the results to look different. Border-pixel replication is especially important in cases such as yours, where the kernel is the same size as the input image.

If you remove the &lt;code&gt;&#039;replicate&#039;&lt;/code&gt; option in your call to &lt;i&gt;&lt;b&gt;imfilter&lt;/b&gt;&lt;/i&gt;, you will see that the results look the same (to the naked eye at least...).

If you want to use &lt;i&gt;&lt;b&gt;conv2fft&lt;/b&gt;&lt;/i&gt; or &lt;i&gt;&lt;b&gt;convn&lt;/b&gt;&lt;/i&gt; rather than the slow &lt;i&gt;&lt;b&gt;imfilter&lt;/b&gt;&lt;/i&gt;, and yet you still want to see a nice-looking image, then you should either reduce the kernel size, or enlarge the input image (so that the original image is at its center) and take care of the boundary pixels. You can either do it the same way as the &lt;code&gt;&#039;replicate&#039;&lt;/code&gt; option, or in a different way. For example, here is a simple implementation that at least in my eyes gives superior results even compared to &lt;i&gt;&lt;b&gt;imfilter&lt;/b&gt;&lt;/i&gt;:

&lt;pre lang=&quot;matlab&quot;&gt;
c2 = repmat(CICcut,3,3);  % c2 is 3072x3072, CICcut is 1024x1024
filteredN = convn (g, c2, &#039;same&#039;);
subplot 155, imshow (filteredN, []);   title ({&#039;Gravitational potential&#039; &#039;convn&#039;})
&lt;/pre&gt;]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://undocumentedmatlab.com/articles/convolution-performance#comment-414199">Alex</a>.</p>
<p>@Alex &#8211; this is due to the <code>'replicate'</code> option in your call to <i><b>imfilter</b></i>. You are not doing the same with <i><b>conv2fft</b></i> or <i><b>convn</b></i>, which causes the results to look different. Border-pixel replication is especially important in cases such as yours, where the kernel is the same size as the input image.</p>
<p>If you remove the <code>'replicate'</code> option in your call to <i><b>imfilter</b></i>, you will see that the results look the same (to the naked eye at least&#8230;).</p>
<p>If you want to use <i><b>conv2fft</b></i> or <i><b>convn</b></i> rather than the slow <i><b>imfilter</b></i>, and yet you still want to see a nice-looking image, then you should either reduce the kernel size, or enlarge the input image (so that the original image is at its center) and take care of the boundary pixels. You can either do it the same way as the <code>'replicate'</code> option, or in a different way. For example, here is a simple implementation that at least in my eyes gives superior results even compared to <i><b>imfilter</b></i>:</p>
<pre lang="matlab">
c2 = repmat(CICcut,3,3);  % c2 is 3072x3072, CICcut is 1024x1024
filteredN = convn (g, c2, 'same');
subplot 155, imshow (filteredN, []);   title ({'Gravitational potential' 'convn'})
</pre>
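<p>For the first route, mimicking <code>'replicate'</code>, a minimal sketch might look like the following. It assumes an odd-sized kernel and the Image Processing Toolbox function <i><b>padarray</b></i>; the variables <code>img</code> and <code>h</code> are illustrative and not taken from the example above.</p>
<pre lang="matlab">
img = rand(64);                              % illustrative image (assumption)
h   = fspecial('gaussian', 7, 1.5);          % illustrative odd-sized kernel (assumption)
r   = (size(h)-1)/2;                         % kernel half-width in each dimension
imgPad = padarray(img, r, 'replicate');      % repeat border pixels on all sides
out = convn(imgPad, h, 'valid');             % same size as img, edges handled by replication
ref = imfilter(img, h, 'replicate', 'conv'); % reference result
err = max(abs(out(:) - ref(:)))              % should be at round-off level for odd-sized h
</pre>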
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Alex		</title>
		<link>https://undocumentedmatlab.com/articles/convolution-performance#comment-414199</link>

		<dc:creator><![CDATA[Alex]]></dc:creator>
		<pubDate>Sat, 30 Sep 2017 13:50:28 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6256#comment-414199</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://undocumentedmatlab.com/articles/convolution-performance#comment-370701&quot;&gt;Alex&lt;/a&gt;.

Hello,

I am having a problem trying to do FFT-based convolution in 2D. &lt;i&gt;&lt;b&gt;convnfft&lt;/b&gt;&lt;/i&gt; is definitely the fastest one, but only &lt;i&gt;&lt;b&gt;imfilter&lt;/b&gt;&lt;/i&gt; produces a valid result. For &lt;i&gt;&lt;b&gt;convnfft&lt;/b&gt;&lt;/i&gt; and &lt;i&gt;&lt;b&gt;convn&lt;/b&gt;&lt;/i&gt; the result is wrong, as can be seen in the minimal working example below:

&lt;pre lang=&quot;matlab&quot;&gt;
% generate image
len = 2^10;
CICcut = zeros (len);
CICcut = imnoise (CICcut, &#039;salt &amp; pepper&#039;, 0.0001);
CICcut = CICcut.*(rand(len)).^2;
gauss = fspecial(&#039;gaussian&#039;, round(sqrt(len)), sqrt(sqrt(len)));
CICcut = imfilter (CICcut, gauss, &#039;replicate&#039;, &#039;conv&#039;);

% generate kernel
g = zeros(len);
lenMone = len-1;
for i = 1:len
    for j = 1:len
        g(i, j) = ((i-1)/lenMone - 0.5)^2 + ((j-1)/lenMone - 0.5)^2;
    end
end
g = -log(sqrt(g));

% convolution
tic
filtered    = imfilter (g, CICcut, &#039;replicate&#039;, &#039;conv&#039;);
toc
tic
filteredFFT = conv2fft (g, CICcut, &#039;same&#039;);
toc
tic
filteredN   = convn (g, CICcut, &#039;same&#039;);
toc

% display
figure(&#039;units&#039;, &#039;normalized&#039;, &#039;outerposition&#039;, [0 0.25 1 0.5])
subplot 151, imshow (CICcut, []);      title (&#039;Mass density&#039;)
subplot 152, imshow (g, []);           title (&#039;Green`s function&#039;)
subplot 153, imshow (filtered, []);    title ({&#039;Gravitational potential&#039; &#039;imfilter&#039;})
subplot 154, imshow (filteredFFT, []); title ({&#039;Gravitational potential&#039; &#039;conv2fft&#039;})
subplot 155, imshow (filteredN, []);   title ({&#039;Gravitational potential&#039; &#039;convn&#039;})
&lt;/pre&gt;

Best regards,
Alex]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://undocumentedmatlab.com/articles/convolution-performance#comment-370701">Alex</a>.</p>
<p>Hello,</p>
<p>I am having a problem trying to do FFT-based convolution in 2D. <i><b>convnfft</b></i> is definitely the fastest one, but only <i><b>imfilter</b></i> produces a valid result. For <i><b>convnfft</b></i> and <i><b>convn</b></i> the result is wrong, as can be seen in the minimal working example below:</p>
<pre lang="matlab">
% generate image
len = 2^10;
CICcut = zeros (len);
CICcut = imnoise (CICcut, 'salt &#038; pepper', 0.0001);
CICcut = CICcut.*(rand(len)).^2;
gauss = fspecial('gaussian', round(sqrt(len)), sqrt(sqrt(len)));
CICcut = imfilter (CICcut, gauss, 'replicate', 'conv');

% generate kernel
g = zeros(len);
lenMone = len-1;
for i = 1:len
    for j = 1:len
        g(i, j) = ((i-1)/lenMone - 0.5)^2 + ((j-1)/lenMone - 0.5)^2;
    end
end
g = -log(sqrt(g));

% convolution
tic
filtered    = imfilter (g, CICcut, 'replicate', 'conv');
toc
tic
filteredFFT = conv2fft (g, CICcut, 'same');
toc
tic
filteredN   = convn (g, CICcut, 'same');
toc

% display
figure('units', 'normalized', 'outerposition', [0 0.25 1 0.5])
subplot 151, imshow (CICcut, []);      title ('Mass density')
subplot 152, imshow (g, []);           title ('Green`s function')
subplot 153, imshow (filtered, []);    title ({'Gravitational potential' 'imfilter'})
subplot 154, imshow (filteredFFT, []); title ({'Gravitational potential' 'conv2fft'})
subplot 155, imshow (filteredN, []);   title ({'Gravitational potential' 'convn'})
</pre>
<p>Best regards,<br />
Alex</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Yair Altman		</title>
		<link>https://undocumentedmatlab.com/articles/convolution-performance#comment-387485</link>

		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Tue, 06 Sep 2016 07:12:22 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6256#comment-387485</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://undocumentedmatlab.com/articles/convolution-performance#comment-387464&quot;&gt;Jackie Shan&lt;/a&gt;.

@Jackie - I believe this is due to a sub-optimal implementation. MathWorks has limited engineering resources and probably decided that 2D convolution is much more common than 3D. I assume that MathWorks focused its engineers on improving the performance of the 2D case and then moved on to more pressing matters, instead of also solving the harder and less-used 3D case. In a world with limited resources this is certainly understandable.]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://undocumentedmatlab.com/articles/convolution-performance#comment-387464">Jackie Shan</a>.</p>
<p>@Jackie &#8211; I believe this is due to a sub-optimal implementation. MathWorks has limited engineering resources and probably decided that 2D convolution is much more common than 3D. I assume that MathWorks focused its engineers on improving the performance of the 2D case and then moved on to more pressing matters, instead of also solving the harder and less-used 3D case. In a world with limited resources this is certainly understandable.</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Jackie Shan		</title>
		<link>https://undocumentedmatlab.com/articles/convolution-performance#comment-387464</link>

		<dc:creator><![CDATA[Jackie Shan]]></dc:creator>
		<pubDate>Mon, 05 Sep 2016 22:27:33 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6256#comment-387464</guid>

					<description><![CDATA[When looking at the CPU utilization, I noticed that the N-D convolution function (convn) does not use multiple cores when operating on arrays with more than 2 dimensions.
&lt;pre lang=&quot;matlab&quot;&gt;
A=randn(500,500);
B=randn(500,500);
C=convn(A,B,&#039;same&#039;); % all 12 CPUs are utilized

A=randn(500,50,10);
B=randn(500,50,10);
C=convn(A,B,&#039;same&#039;); % only 1 CPU is utilized
&lt;/pre&gt;

I was wondering if there&#039;s any reason for this?]]></description>
			<content:encoded><![CDATA[<p>When looking at the CPU utilization, I noticed that the N-D convolution function (convn) does not use multiple cores when operating on arrays with more than 2 dimensions.</p>
<pre lang="matlab">
A=randn(500,500);
B=randn(500,500);
C=convn(A,B,'same'); % all 12 CPUs are utilized

A=randn(500,50,10);
B=randn(500,50,10);
C=convn(A,B,'same'); % only 1 CPU is utilized
</pre>
<p>I was wondering if there&#8217;s any reason for this?</p>
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Yair Altman		</title>
		<link>https://undocumentedmatlab.com/articles/convolution-performance#comment-370705</link>

		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Sun, 28 Feb 2016 19:00:34 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6256#comment-370705</guid>

					<description><![CDATA[In reply to &lt;a href=&quot;https://undocumentedmatlab.com/articles/convolution-performance#comment-370701&quot;&gt;Alex&lt;/a&gt;.

@Alex - you can take a look at the m-code within Bruno&#039;s &lt;i&gt;&lt;b&gt;convnfft&lt;/b&gt;&lt;/i&gt; utility for this. The speedup depends on several factors, including the size of the data, the Matlab release, and your available memory. So it is quite possible that on your specific system with your specific data you do not see a significant speedup, but in many cases Bruno&#039;s &lt;i&gt;&lt;b&gt;convnfft&lt;/b&gt;&lt;/i&gt; does improve the processing speed.]]></description>
			<content:encoded><![CDATA[<p>In reply to <a href="https://undocumentedmatlab.com/articles/convolution-performance#comment-370701">Alex</a>.</p>
<p>@Alex &#8211; you can take a look at the m-code within Bruno&#8217;s <i><b>convnfft</b></i> utility for this. The speedup depends on several factors, including the size of the data, the Matlab release, and your available memory. So it is quite possible that on your specific system with your specific data you do not see a significant speedup, but in many cases Bruno&#8217;s <i><b>convnfft</b></i> does improve the processing speed.</p>
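<p>As a rough illustration of the idea only (this is not Bruno&#8217;s code, and the sizes are arbitrary assumptions), a 2D FFT-based <code>'full'</code> convolution can be sketched with built-in functions: zero-pad both inputs to the full output size, multiply the transforms, and invert.</p>
<pre lang="matlab">
A = rand(200);                           % arbitrary test image (assumption)
B = rand(15);                            % arbitrary kernel (assumption)
sz = size(A) + size(B) - 1;              % size of the 'full' convolution result
FA = fft2(A, sz(1), sz(2));              % zero-padded 2D transforms
FB = fft2(B, sz(1), sz(2));
Cfft = real(ifft2(FA .* FB));            % real() drops round-off imaginary parts (real inputs assumed)
Cref = conv2(A, B);                      % direct 'full' convolution for comparison
err = max(abs(Cfft(:) - Cref(:)))        % should be at round-off level
</pre>
<p>The FFT route tends to pay off for large kernels; for small kernels the direct method is often faster, so it is worth timing both on your own data.</p>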
]]></content:encoded>
		
			</item>
		<item>
		<title>
		By: Alex		</title>
		<link>https://undocumentedmatlab.com/articles/convolution-performance#comment-370701</link>

		<dc:creator><![CDATA[Alex]]></dc:creator>
		<pubDate>Sun, 28 Feb 2016 17:22:21 +0000</pubDate>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=6256#comment-370701</guid>

					<description><![CDATA[Could you please provide code for a 2D version? In my case, the linked .mex is not any faster than standard convolution.]]></description>
			<content:encoded><![CDATA[<p>Could you please provide code for a 2D version? In my case, the linked .mex is not any faster than standard convolution.</p>
]]></content:encoded>
		
			</item>
	</channel>
</rss>
