Among the many changes that were introduced in R2012b, the new desktop understandably took the centerpiece, with the new documentation layout coming a close second. I already reported on a change to the online website that has gone largely unnoticed, which broke multiple links, causing much frustration.
Today I report on another such change, which has the potential of frustrating no fewer users. As in the online URLs change, this change was also not documented, only adding to the frustration. To the best of my knowledge, this particular case was first detected by chance and reported by Andreas Pomp in CSSM last week, although the change was apparently already introduced in R2012a:
In case that the first columns of your Excel table have only numeric numbers, then the command
[num,txt] = xlsread(...)
returned in older Matlab releases fortxt
a cell array with blanks in the corresponding columns. In Release 2012b [sic – should be R2012a] these columns are missing at all. This has the consequence that column numbers must be changed if you read from cell array txt.
This caused some trouble in our code and I want to forward this warning to other users.
For example, let’s take the following simple Excel data file:
R2011b and earlier | R2012a and later | ||
---|---|---|---|
|
|
The response from Mathworks (1-K21ZFE) was:
It is certainly true that there is an inconsistency. If you use the syntax:
[num, txt] = xlsread('DeleteMe.xls','Sheet1','','basic');you obtain the same outputs dimension as in the previous releases. Our documentation was not updated with the change of behavior in xlsread.
Note: the sheet-name is optional – you can specify ''
(an empty string) instead of 'Sheet1'
.
Unfortunately, this workaround leaves much to be desired.
The problem here is not only the missing documentation about the change in functionality. This is actually the lesser of the problems. The main issue is backward compatibility. A major function such as xlsread, which is widely used by numerous Matlab users and has remained stable for years, can not – MUST NOT! – change existing functionality, with or without proper documentation. Numerous Matlab programs rely on the existing functionality and behavior of xlsread. MathWorks cannot seriously expect thousands of Matlab users to change all their existing Matlab programs when upgrading to R2012a. And what if the program needs to work on both R2012a AND earlier Matlab releases, which do not accept the ‘basic’ input parameter? Or what if we need to use a range sub-set, which is unavailable in ‘basic’ mode? Adding insult to injury, this change was never documented in the release notes, nor in xlsread‘s doc page.
This tells us not to rely on the official release notes when deciding whether or not to upgrade. It causes such a maintenance headache that many users would simply prefer to remain with their older Matlab release, and not to upgrade. A real shame.
As an alternative to the official workaround above, I suggest modifying the %matlabroot%/toolbox/matlab/iofun/private/xlsreadSplitNumericAndText.m function, which is responsible for the functionality change. Specifically, comment line #45:
43: % Trim the leading and trailing empties from textData 44: emptyTextMask = cellfun('isempty', textData); 45: %textData = filterDataUsingMask(textData, emptyTextMask); % Comment out this line! |
This will restore the previous functionality in one go, without having to modify any user programs. Of course, we would need to redo this fix in every future Matlab release as well. Still, considering the alternatives, this appears to be the best choice in my eyes.
While the case with xlsread seems to break a new record in backward-incompatibility, other similar cases have also been reported. One such recent issue was raised over the [this-time] documented removal of some functionality in the widely-used interp1 function. No fewer than 54 comments (and counting) were added within a few days of the initial CSSM post, which indicates the amount of frustration that such changes cause.
Looks like some people may have forgotten the first rule and top commandment of engineering:
OK, is this an exception? This MathWorks fix didn’t work for an XLSX file looking like:
1,a,b,c
2,7,8,9
3,7,8,9
4,7,8,9
R2009b gives:
>> [a,b] = xlsread('badcol.xlsx','Sheet1','')
a =
1 NaN NaN NaN
2 4 5 6
3 4 5 6
4 4 5 6
b =
'' 'a' 'b' 'c'
R2012b gives:
>> [a,b] = xlsread('badcol.xlsx','Sheet1','','basic')
a =
1 NaN NaN NaN
2 4 5 6
3 4 5 6
4 4 5 6
b =
'a' 'b' 'c'
(CORRECTION: obviously my real file didn’t contain 7,8,9 in the latter columns, but instead contained 4,5,6. Just didn’t want to muddy the water with that difference. The problem is in the “b” output)
Same frustration with importdata as of R2012a that I describe here:
http://blog.developpez.com/matlab4geek/?p=140&page=2
On 16 Dec 2011 , when testing the BETA R2012a I had contacted tech-support since this new behavior was not mentioned anywhere in the doc. Now my simple example 2 lines of 6 numeric values is even used in the doc : http://www.mathworks.fr/help/releases/R2012a/techdoc/rn/bs7oakc-1.html#bteesx_ lol!
As you imagin, we have a lot of codes which use importdata and which now require additional test for backward compatibility.
I definitely prefer MATLAB R2011* versions!!
Yair –
We’ve had a chance to look into this issue some more to try to understand what happened. This behavior change was indeed fallout from the major work we did to greatly enhance the basic mode support in 12b, giving the ability to read xlsx files on Mac and Linux, and Windows machines without Excel (among other goodies). As he was digging through the code, the developer noticed a bug – that the trimming behavior (which wasn’t well specified in the help) was inconsistent. Leading column behavior was different from leading rows and trailing rows and columns.
In my view, the issue here is that the incompatible nature of the bug fix slipped through the cracks, so we didn’t follow our normal process of weighing the benefits and costs to users. More importantly, this meant that we did not document the change in a release note, leaving it to users to stumble across on their own. Thanks, as always, for making sure we were aware of the impact on Andreas and potentially on additional users.
@Aurélien – This is the first I’m noticing your blog post. It seems related, but perhaps a different code change. I haven’t looked into it, so I don’t know what assessment we made about the change.
As I’ve said before, it can be a difficult balancing act between bug fixing and maintaining compatibility, and particularly in figuring out the impact (bad or good) of making changes. We try our best to do the right thing, and always appreciate when our users let us know when our change has impacted them in a way we might not have anticipated.
@Michelle – thank you for taking the time to look into this, and for providing this update.
One has to admit that this sort of situation is indeed quite rare in Matlab. Apparently MathWorks’ standard workflow is able to prevent most similar occurrences.
[…] the past 20+ years of my professional life as engineer and development manager, my number one rule has always been: DON’T FIX IT, IF IT AIN’T BROKE! But sometimes, we should also remember an alternative […]
It looks like there’s been another backwards in-compatibility introduced in R2014a (if not earlier).
used to read sheet 1 of data.xlsx
Now it gives:
[]
Now I have to do:
In order to make it work.
Also undocumented, as the documentation says SHEET can still be a number.
Am I the only one with this problem ?
Also, why does it say this comment was written on 29 October 2013, it is actually 26 July 2014 today.
@Nike – this sounds on the face of it to be a bug, since the behavior is indeed documented. I suggest that you contact MathWorks (support@mathworks.com) to report this issue. Please repost a comment here if and when you learn anything new.