<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jim Hokanson &#8211; Undocumented Matlab</title>
	<atom:link href="https://undocumentedmatlab.com/articles/tag/jim-hokanson/feed" rel="self" type="application/rss+xml" />
	<link>https://undocumentedmatlab.com</link>
	<description>Professional Matlab consulting, development and training</description>
	<lastBuildDate>Wed, 21 Mar 2012 18:00:01 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.7.3</generator>
	<item>
		<title>Expanding urlread capabilities</title>
		<link>https://undocumentedmatlab.com/articles/expanding-urlreads-capabilities?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=expanding-urlreads-capabilities</link>
					<comments>https://undocumentedmatlab.com/articles/expanding-urlreads-capabilities#comments</comments>
		
		<dc:creator><![CDATA[Yair Altman]]></dc:creator>
		<pubDate>Wed, 21 Mar 2012 18:00:01 +0000</pubDate>
				<category><![CDATA[Guest bloggers]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Low risk of breaking in future versions]]></category>
		<category><![CDATA[Stock Matlab function]]></category>
		<category><![CDATA[Jim Hokanson]]></category>
		<guid isPermaLink="false">http://undocumentedmatlab.com/?p=2808</guid>

					<description><![CDATA[<p>The built-in urlread function has severe limitations. This article explains how to overcome them.</p>
<p>The post <a rel="nofollow" href="https://undocumentedmatlab.com/articles/expanding-urlreads-capabilities">Expanding urlread capabilities</a> appeared first on <a rel="nofollow" href="https://undocumentedmatlab.com">Undocumented Matlab</a>.</p>
<div class='yarpp-related-rss'>
<h3>Related posts:</h3><ol>
<li><a href="https://undocumentedmatlab.com/articles/multi-line-uitable-column-headers" rel="bookmark" title="Multi-line uitable column headers">Multi-line uitable column headers </a> <small>Matlab uitables can present long column headers in multiple lines, for improved readability. ...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/setting-status-bar-components" rel="bookmark" title="Setting status-bar components">Setting status-bar components </a> <small>Matlab status-bars are Java containers in which we can add GUI controls such as progress-bars, not just simple text labels...</small></li>
<li><a href="https://undocumentedmatlab.com/articles/setting-status-bar-text" rel="bookmark" title="Setting status-bar text">Setting status-bar text </a> <small>The Matlab desktop and figure windows have a usable statusbar which can only be set using undocumented methods. This post shows how to set the status-bar text....</small></li>
<li><a href="https://undocumentedmatlab.com/articles/matlab-and-the-event-dispatch-thread-edt" rel="bookmark" title="Matlab and the Event Dispatch Thread (EDT)">Matlab and the Event Dispatch Thread (EDT) </a> <small>The Java Swing Event Dispatch Thread (EDT) is very important for Matlab GUI timings. This article explains the potential pitfalls and their avoidance using undocumented Matlab functionality....</small></li>
</ol>
</div>
]]></description>
										<content:encoded><![CDATA[<p><i>I would like to welcome guest blogger <a target="_blank" rel="nofollow" href="http://sites.google.com/site/jimhokanson/">Jim Hokanson</a>. Today, Jim will explain some of the limitations that at one time or another many of us have encountered with Matlab&#8217;s built-in <b>urlread</b> function. More importantly, Jim has spent a lot of time creating an expanded-capabilities Matlab function, which he explains below. Note that while <b>urlread</b>&#8217;s internals are undocumented, the changes outlined below rely on pretty standard Java and HTTP, and should therefore be quite safe to use on multiple Matlab versions.</i></p>
<h3 id="Abstract">Abstract</h3>
<p>I recently tried to implement the <a target="_blank" rel="nofollow" href="http://dev.mendeley.com">Mendeley</a> API but quickly found that <i><b>urlread</b></i> was not going to be sufficient for my needs. The first indication of this was my inability to send the proper authorization information for POST requests. It became even more obvious with the need to issue DELETE and PUT requests, since <i><b>urlread</b></i> only supports GET and POST. My implementation of <i><b>urlread</b></i>, which I refer to as <i><b>urlread2</b></i>, addresses these and a couple of other issues and can be <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/35693-urlread2">found on the Matlab File Exchange</a>. Other developers have tackled <i><b>urlread</b></i>&#8217;s shortcomings (<a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/8474-rewrites-of-urlread-and-urlwrite">this example</a> added timeout support, and <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/27189-urlreadpost-url-post-method-with-binary-file-uploading">this example</a> added support for binary file upload) &#8211; today&#8217;s article will focus on my solution, but others are obviously possible.</p>
<h3 id="Introduction">Introduction</h3>
<p><a target="_blank" rel="nofollow" href="http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol">HTTP</a> is the underlying computer networking protocol that enables us to read webpages on the Internet. It consists of a request made by the user to an Internet server (typically located via <a target="_blank" rel="nofollow" href="http://en.wikipedia.org/wiki/Uniform_resource_locator">URL</a>), and a response from that server. Importantly, both the request and the response consist of three main parts: a request line (for requests) or status line (for responses), followed by headers, and optionally a message body.<br />
Matlab&#8217;s built-in <i><b>urlread</b></i> function enables Matlab users to easily read the server&#8217;s response text into a Matlab string:</p>
<pre lang='matlab'>text = urlread('http://www.google.com');</pre>
<p>This is done internally using Java code that connects to the specified URL and reads the information sent by the URL&#8217;s server (<a target="_blank" rel="nofollow" href="http://docs.oracle.com/javase/tutorial/networking/urls/index.html">more on this</a>).<br />
<i><b>urlread</b></i> accepts optional additional inputs specifying the request type (&#8216;get&#8217; or &#8216;post&#8217;) and parameter values for the request.<br />
Unfortunately, <i><b>urlread</b></i> has the following limitations:</p>
<ol>
<li>It does not allow specification of request headers</li>
<li>It makes assumptions as to the request headers needed based on the input method</li>
<li>It does not expose the response headers and status line</li>
<li>It assumes the response body contains text, and not a binary payload</li>
<li>It does not enable uploading binary contents to the server</li>
<li>It does not enable specifying a timeout in case the server is not responding</li>
</ol>
<h3 id="urlread2">urlread2</h3>
<p>The <i><b>urlread2</b></i> function addresses all of these problems. The overall design decision for this function was to make it more general: in some cases it requires more work up front, but in return it offers more flexibility.<br />
For reference, the following is the calling format for <i><b>urlread2</b></i> (which is reminiscent of <i><b>urlread</b></i>&#8217;s):</p>
<pre lang='matlab'>urlread2(url,*method,*body,*headersIn, varargin)</pre>
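<p>The asterisks (*) mark optional inputs, which must keep their position in the argument list: to specify a later input, the earlier optional inputs must be passed as well.</p>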
<ul>
<li>url &#8211; (string), url to request</li>
<li>method &#8211; (string, default GET) HTTP request method</li>
<li>body &#8211; (string, default ''), body of the request</li>
<li>headersIn &#8211; (structure, default []), see the following section</li>
<li>varargin &#8211; extra properties, specified as property/value pairs</li>
</ul>
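<p>To make the positional rule concrete, here is a small sketch of three argument lists built from the calling format above. The address is a placeholder, and spelling out the documented defaults ('' and []) to reach the property/value section is an assumption based on the input list, not a statement about <i><b>urlread2</b></i>&#8217;s internals. The CAST_OUTPUT property is the one described later in this article.</p>
<pre lang='matlab'>
url = 'http://www.example.com';     % placeholder address, not from the original example
getArgs    = {url};                 % plain GET, all optional inputs left at their defaults
deleteArgs = {url, 'DELETE'};       % only the method is overridden
% To reach the property/value pairs, the earlier optional inputs are spelled out
binaryArgs = {url, 'GET', '', [], 'CAST_OUTPUT', false};
if exist('urlread2','file') == 2    % only call urlread2 if it is on the Matlab path
    output = urlread2(binaryArgs{:});
end
</pre>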
<h3 id="Headers">Addressing Problem 1 &#8211; Request header</h3>
<p><i><b>urlread</b></i> internally uses a Java object called <code>urlConnection</code> that is generally an instance of the class <a target="_blank" rel="nofollow" href="http://docs.oracle.com/javase/1.5.0/docs/api/java/net/HttpURLConnection.html"><code>sun.net.www.protocol.http.HttpURLConnection</code></a>. The method <i>setRequestProperty()</i> can be used to set headers for the request. This method has two inputs, the header name and the value of that header. A simple example of this can be seen below:</p>
<pre lang='matlab'>urlConnection.setRequestProperty('Content-Type','application/x-www-form-urlencoded');</pre>
<p>Here &#8216;Content-Type&#8217; is the header name and the second input is the value of that property. My function requires passing in nearly all headers as a structure array, with fields for the name and value. The preceding header would be created using a helper function <i><b>http_createHeader.m</b></i>:</p>
<pre lang='matlab'>header = http_createHeader('Content-Type','application/x-www-form-urlencoded');</pre>
<p>Multiple headers can be passed in to the function by concatenating header structures into a structure array.</p>
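<p>As a rough illustration of how such a structure array could be applied to a connection, the sketch below builds two headers by hand and sets them one by one with <i>setRequestProperty()</i>. The field names <code>name</code> and <code>value</code>, the example address and the Accept header are assumptions made for this sketch; the actual layout produced by <i><b>http_createHeader</b></i> may differ.</p>
<pre lang='matlab'>
% Hypothetical header structure array (field names 'name' and 'value' are assumed)
headers = struct('name',  {'Content-Type', 'Accept'}, ...
                 'value', {'application/x-www-form-urlencoded', 'text/html'});
% Create the connection object; nothing is sent to the server at this point
urlObj = java.net.URL('http://www.example.com');
urlConnection = urlObj.openConnection();
% Apply each header in turn
for k = 1:numel(headers)
    urlConnection.setRequestProperty(headers(k).name, headers(k).value);
end
% Request properties can be read back before the connection is opened
acceptValue = char(urlConnection.getRequestProperty('Accept'));
</pre>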
<h3 id="Parameters">Addressing Problem 2 &#8211; Request parameters</h3>
<p>When making a POST request, parameters are generally specified in the message body using the following format:<br />
<code>[property]=[value]&[property]=[value]</code><br />
The properties and values are also encoded in a particular way, generally termed <a target="_blank" rel="nofollow" href="http://en.wikipedia.org/wiki/Percent-encoding">urlencoded</a> (encoding and decoding can be done using Matlab&#8217;s built-in <i><b>urlencode</b></i> and <i><b>urldecode</b></i> functions). For GET requests this string is appended to the url with the &#8220;?&#8221; symbol. Since urlencoding methods can vary, and in the spirit of reducing assumptions, I use separate functions to generate these strings outside of <i><b>urlread2</b></i>, and then pass the result in either as the url (for GET) or as the body input (for POST). As an example, I might <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/?search_submit=fileexchange&#038;term=undocumented+matlab&#038;query=undocumented+matlab">search the Mathworks website</a> using the upper right search bar on its site for &#8220;undocumented matlab&#8221; under file exchange (<i>hmmm&#8230; pretty cute stuff there!</i>). Doing this performs a GET request with the following property/value pairs:</p>
<pre lang='matlab'>params = {'search_submit','fileexchange', 'term','undocumented matlab', 'query','undocumented matlab'};</pre>
<p>These property/value pairs are somewhat obvious from looking at the URL, but could also be determined by using programs such as <a target="_blank" rel="nofollow" href="http://www.fiddler2.com/fiddler2/">Fiddler</a>, <a target="_blank" rel="nofollow" href="http://getfirebug.com/">Firebug</a>, or <a target="_blank" rel="nofollow" href="http://www.httpwatch.com/">HttpWatch</a>.<br />
After urlencoding and concatenating, we would form the following string:<br />
<code>search_submit=fileexchange&term=undocumented+matlab&query=undocumented+matlab</code><br />
This functionality is normally accomplished internally in <i><b>urlread</b></i>, but I use a function <i><b>http_paramsToString</b></i> to produce that result. That function also returns the required header for POST requests. The following is an example of both GET and POST requests:</p>
<pre lang='matlab'>
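% params is the property/value cell array shown above; url holds the base address
[queryString,header] = http_paramsToString(params,1);
% For GET: append the query string to the address
getUrl = [url '?' queryString];
urlread2(getUrl)
% For POST: send the query string as the body, together with the returned header
urlread2(url,'POST',queryString,header)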
</pre>
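<p>The encoding and concatenation step itself can be sketched with Matlab&#8217;s built-in <i><b>urlencode</b></i>. This is only an illustration of the idea, not the code of <i><b>http_paramsToString</b></i>; the variable names are made up for the example.</p>
<pre lang='matlab'>
% Property/value pairs from the search example above
params = {'search_submit','fileexchange', 'term','undocumented matlab', 'query','undocumented matlab'};
nPairs = numel(params)/2;
pairs  = cell(1,nPairs);
for k = 1:nPairs
    % urlencode the property and its value, then join them with '='
    pairs{k} = [urlencode(params{2*k-1}) '=' urlencode(params{2*k})];
end
% Join the pairs with '&' and drop the trailing separator
queryString = sprintf('%s&', pairs{:});
queryString(end) = [];
</pre>
<p>Here <code>queryString</code> ends up as the same string shown earlier, with the spaces in the search terms encoded as "+".</p>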
<h3 id="Response">Addressing Problem 3 &#8211; Response header</h3>
<p>According to the HTTP protocol, each server response starts with a status line that carries a numeric <a target="_blank" rel="nofollow" href="http://en.wikipedia.org/wiki/List_of_HTTP_status_codes">response status</a> code and a short message. The following Matlab code provides access to the status line using the <code>urlConnection</code> object:</p>
<pre lang='matlab'>
status = struct('value',urlConnection.getResponseCode(), 'msg',char(urlConnection.getResponseMessage))
status =
    value: 200
      msg: 'OK'
</pre>
<p><code>urlConnection</code>&#8217;s <i>getHeaderField()</i> and <i>getHeaderFieldKey()</i> methods enable reading the specific parts of the response header:</p>
<pre lang='matlab'>
headerValue = char(urlConnection.getHeaderField(headerIndex));
headerName  = char(urlConnection.getHeaderFieldKey(headerIndex));
</pre>
<p><code>headerIndex</code> starts at 0 and increases by 1 until both <code>headerValue</code> and <code>headerName</code> return empty.<br />
It is important to note that header keys (names) can be repeated for different values. Sometimes this is desired, such as if there are multiple cookies being sent to the user. To generically handle this case, two header structures are returned. In both cases the header names are the field names in the structure, after <a target="_blank" rel="nofollow" href="http://www.mathworks.com/help/techdoc/ref/genvarname.html">replacing hyphens with underscores</a>. In one case, allHeaders, the values are cell arrays of strings containing all values presented with the particular key. The other structure, firstHeaders, contains only the first instance of the header as a string to avoid needing to dereference a cell array.</p>
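<p>The grouping into the two structures can be sketched as follows. The header names and values are made-up sample data standing in for what the loop over <code>headerIndex</code> would collect, and a plain <i><b>strrep</b></i> is used for the hyphen replacement; this is an illustration, not the actual <i><b>urlread2</b></i> code.</p>
<pre lang='matlab'>
% Sample name/value pairs, as they might be collected from a response (made-up values)
names  = {'Content-Type', 'Set-Cookie', 'Set-Cookie'};
values = {'text/html',    'a=1',        'b=2'};
allHeaders   = struct();
firstHeaders = struct();
for k = 1:numel(names)
    field = strrep(names{k}, '-', '_');       % hyphens are not valid in field names
    if isfield(allHeaders, field)
        allHeaders.(field){end+1} = values{k};    % repeated key: append to the cell array
    else
        allHeaders.(field)   = values(k);         % first occurrence: start a 1x1 cell array
        firstHeaders.(field) = values{k};         % keep the first value as a plain string
    end
end
</pre>
<p>With this sample data, <code>allHeaders.Set_Cookie</code> holds both cookie strings in a cell array, while <code>firstHeaders.Set_Cookie</code> holds only 'a=1'.</p>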
<h3 id="Body">Addressing Problem 4 &#8211; Response body</h3>
<p><i><b>urlread</b></i> assumes text output. This is fine for most webpages, which use HTML and are therefore text-based. However, <i><b>urlread</b></i> fails when trying to download any non-text resource such as an image, a ZIP file, or a PDF document. I have added a flag in <i><b>urlread2</b></i> called CAST_OUTPUT, which defaults to true, i.e. a text response, just as <i><b>urlread</b></i> assumes. Using <i><b>varargin</b></i>, this flag can be set to false ({'CAST_OUTPUT',false}) to indicate a binary response.</p>
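<p>The difference between the two modes can be illustrated without any network access. In the sketch below a short byte vector stands in for a response body: converting it to characters gives the text view, while writing the bytes unchanged to a file preserves a binary payload. The byte values, the UTF-8 encoding and the temporary file name are assumptions made for this example only.</p>
<pre lang='matlab'>
rawBytes = uint8([72 101 108 108 111]);        % stand-in for bytes read from a response body
% Text view: interpret the bytes as characters (UTF-8 is assumed here)
asText = native2unicode(rawBytes, 'UTF-8');
% Binary view: keep the bytes as they are and write them to a file unchanged
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, rawBytes, 'uint8');
fclose(fid);
% Read the file back to confirm that the bytes were not altered
fid = fopen(fname, 'r');
readBack = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);
delete(fname);
</pre>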
<h3 id="Summary">Summary</h3>
<p><i><b>urlread2</b></i>&#8217;s functionality has been expanded to also address other limitations of <i><b>urlread</b></i>: it enables binary inputs, better character-set handling of the output, redirection following, and read timeouts.<br />
The modifications described above provide direct access to the key components of the HTTP request and response messages. Its more generic nature lets <i><b>urlread2</b></i> focus on HTTP transmission, and leaves request formation and response interpretation up to the user. I think ultimately this approach is better than providing one-off modifications of the original <i><b>urlread</b></i> function to suit a particular need. <i><b>urlread2</b></i> and supporting files can be <a target="_blank" rel="nofollow" href="http://www.mathworks.com/matlabcentral/fileexchange/35693-urlread2">found</a> on the Matlab File Exchange.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://undocumentedmatlab.com/articles/expanding-urlreads-capabilities/feed</wfw:commentRss>
			<slash:comments>17</slash:comments>
		
		
			</item>
	</channel>
</rss>
