Along with the extensive facelift done to Matlab’s documentation system in R2012b, one important aspect has gone largely unnoticed. Apparently, in the interest of making the doc pages more SEO-friendly, MathWorks have modified the online documentation URLs from something that looks like http://www.mathworks.com/help/techdoc/creating_guis/f16-999606.html to http://www.mathworks.com/help/matlab/creating_guis/writing-code-for-callbacks.html.
This change in itself is actually a good idea, but what is absolutely disastrous (IMHO) is the fact that MathWorks have not bothered to link all the old URLs to the new ones. Most serious websites that have had tons of links floating around for a decade or two, as in this case, take the trouble to automatically redirect old URLs to the new ones on the server side. This was the case, for example, when Oracle took over Sun a couple of years ago and redirected the Java links. So did Microsoft in their online MSDN. Neither Oracle nor Microsoft redirected all the URLs perfectly, but they made a pretty good effort. Not in Matlab’s case, unfortunately.
It seems that only the basic function reference pages have been redirected, not any of the other doc pages. Tons of links, many of them in MathWorks webpages themselves, are now broken and lead to a generic “page not found” webpage. This includes many thousands of links posted by loyal community users in response to queries in the Matlab CSSM newsgroup (over the past ~15 years) and MathWorks’ own Answers forum (past 1.5 years), not to mention numerous references in books, reports, websites and other resources. Users who follow any of these links will end up at a page-not-found page. How user-friendly would this be for them? How likely do you think they would be to use the newsgroup/forum again, as opposed to flooding MathWorks customer support with simple questions that have already been answered online, only for the answers to be rendered unusable by an SEO fix that should have been fully transparent?! How likely do you think it is that loyal CSSM/Answers contributors would post MathWorks links again?! This is about as self-defeating a release as I have seen in a very long time.
In the past, I have highlighted changes to the online documentation center that I found useful. These included the addition of historical documentation archives, and the plot gallery. But today I must say that IMHO the latest change to the doc center should never have gone live, and I fail to understand how it could ever have passed internal QA. Furthermore, I cannot understand why this is still not fixed, a month after the official release.
I don’t believe that I have ever used such strong language in the 3.5 years that this blog has been live, and I do this very reluctantly now. Hopefully my call for action will be heeded (soon!).
So what can be done?
MathWorks can and should correct this problem by adding server-side redirects, each of which is typically accompanied by an HTTP 301 Moved Permanently response to the requesting browser. In fact, doing this is also good for SEO, since all the tons of old links will be counted by search engines as pointing to the new pages, thereby adding incoming page-rank “juice” to them. Here’s Google’s advice on this matter.
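To make the idea concrete, here is a minimal Python sketch of what such a redirect lookup might look like. I have no knowledge of how MathWorks' servers are actually set up, so the `REDIRECTS` table and the `respond_to_missing_page` function are purely hypothetical names of my own; the single mapping entry uses the example URLs from this post.

```python
# Hypothetical old-to-new path table (illustration only); a real one
# would hold many thousands of entries.
REDIRECTS = {
    "/help/techdoc/creating_guis/f16-999606.html":
        "/help/matlab/creating_guis/writing-code-for-callbacks.html",
}


def respond_to_missing_page(path):
    """Decide how to answer a request for a page that no longer exists.

    Returns a (status, location) pair: 301 plus the new path when the old
    path is known, otherwise 404 and no location.
    """
    new_path = REDIRECTS.get(path)
    if new_path is not None:
        return 301, new_path  # Moved Permanently
    return 404, None          # Page Not Found


if __name__ == "__main__":
    print(respond_to_missing_page("/help/techdoc/creating_guis/f16-999606.html"))
```

The point of the table-driven approach is that entries can be added one at a time, without touching anything else.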
Doing these redirects does not require a Matlab release and can be done immediately and incrementally. Granted, it’s not a simple task, given the huge number of URLs involved, but the sooner MathWorks start, the sooner they’ll finish. Frankly, they should have thought about it in advance… I have a hunch that the servers’ HTTP 404 (Page Not Found) list has exploded since the latest release. I would suggest going through this list, sorted in descending order by number of occurrences, so that the most frequently-accessed webpages are fixed first.
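The prioritization I am suggesting is simple enough to sketch in a few lines of Python. The input format here is my own assumption (a sequence of `(path, status)` pairs already parsed out of the server log), and `rank_missing_pages` is just an illustrative name:

```python
from collections import Counter


def rank_missing_pages(requests):
    """Count 404 responses per path and list the most-requested paths first.

    `requests` is an iterable of (path, status) pairs; this shape is an
    assumption, since real server logs need parsing first.
    """
    counts = Counter(path for path, status in requests if status == 404)
    return counts.most_common()  # [(path, hits), ...] in descending order


if __name__ == "__main__":
    sample = [
        ("/help/techdoc/a.html", 404),
        ("/help/techdoc/b.html", 404),
        ("/help/techdoc/a.html", 404),
        ("/help/matlab/c.html", 200),
    ]
    for path, hits in rank_missing_pages(sample):
        print(hits, path)
```

Working down such a list fixes the most painful broken links first, even if the long tail takes a while.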
Until the URLs are restored or redirected, we can try the following workaround:
- First try the URL directly (e.g., http://www.mathworks.com/help/techdoc/creating_guis/f16-999606.html) – who knows, maybe it will work by the time you try using it…
- If the URL doesn’t work, try adding releases/R2012a/ right after the “help/” term in the URL. In our example: http://www.mathworks.com/help/releases/R2012a/techdoc/creating_guis/f16-999606.html. If successful, skip the Way-Back Machine step below.
- If the URL is still not found, enter it into the Way-Back Machine in the following format: http://wayback.archive.org/web/*/<URL>. For example: http://wayback.archive.org/web/*/http://www.mathworks.com/help/techdoc/creating_guis/f16-999606.html
If we’re lucky, the Way-Back archive has at least one archived copy of the requested webpage. This happens to be the case in this specific instance.
Click the highlighted date (when an archive copy was taken) to see our requested webpage in all its past glory. We are not assured that it is the latest version of the webpage, but this is ok for now.
- If all you need is to read the contents of the doc page, then you can stop here. But if you also wish to get the new URL (in its R2012b SEO format), then read on:
- Copy a long distinctive sentence from the webpage content into the clipboard; paste it into Google within “” (quotation marks) and add the term “site:mathworks.com”. In our case, let’s take “A callback is a function that you write and associate with a specific component in the GUI”. Here is the corresponding Google query: site:mathworks.com “A callback is a function that you write and associate with a specific component in the GUI”.
- If we are lucky again, Google should turn up the new webpage and by navigating there we should (finally!) get the new URL. In this case, the requested URL is the top answer: http://www.mathworks.com/help/matlab/creating_guis/writing-code-for-callbacks.html
- Sometimes Google fails to find the sentence in the Mathworks.com website. This can be due to text changes that MathWorks do in their doc pages from time to time. Don’t give up – try a different sentence from the archived webpage, maybe it will show up instead. You can also try to enter the page title as the distinctive sentence. There is a good chance that at least one of these will work. In our specific example, MathWorks changed the title from “Writing Code for Callbacks” to “Write Code for Callbacks” in R2011b (after the Way-Back Machine archived the page). Luckily, the title query still manages to find the new page in this specific case (good ol’ dependable Google!).
- Any of the above may fail, typically because the Way-Back archive did not archive the specific page we need, or because MathWorks changed the page contents too much. In such a case, too bad for us… I suggest you email MathWorks to ask for the new URL.
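For those who need to do this often, the URL manipulations in the steps above can be sketched in Python. This is only a rough helper under a few assumptions: the function names are my own, the URL prefix and the Way-Back format are taken from the examples above, and the Google search URL format (`https://www.google.com/search?q=...`) is not something this post relies on elsewhere. Nothing here actually fetches a page; it only builds the addresses to try.

```python
from urllib.parse import quote_plus

HELP_PREFIX = "http://www.mathworks.com/help/"


def release_url(url, release="R2012a"):
    """Insert releases/<release>/ right after the 'help/' term of a doc URL."""
    if not url.startswith(HELP_PREFIX):
        raise ValueError("not a MathWorks help URL: " + url)
    return HELP_PREFIX + "releases/" + release + "/" + url[len(HELP_PREFIX):]


def wayback_url(url):
    """Build the Way-Back Machine listing URL for all archived copies."""
    return "http://wayback.archive.org/web/*/" + url


def google_query_url(sentence, site="mathworks.com"):
    """Build a site-restricted exact-phrase query (URL format is assumed)."""
    query = 'site:{} "{}"'.format(site, sentence)
    return "https://www.google.com/search?q=" + quote_plus(query)


def candidate_urls(url):
    """The addresses to try, in the order of the steps above."""
    return [url, release_url(url), wayback_url(url)]


if __name__ == "__main__":
    old = "http://www.mathworks.com/help/techdoc/creating_guis/f16-999606.html"
    for candidate in candidate_urls(old):
        print(candidate)
    print(google_query_url("A callback is a function that you write and "
                           "associate with a specific component in the GUI"))
```

Picking the distinctive sentence from the archived page still has to be done by hand.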
Vociferous complaints over the new documentation system in R2012b have been heard loud and clear at MathWorks. Wendy Fullam, Matlab’s documentation product manager, has made a point of making it clear that MathWorks is not ignoring these out-of-the-ordinary complaints and will indeed improve the documentation accordingly. Wendy also responded to my SEO complaint, saying that they will indeed try to fix the URL redirects:
However some pages have not been redirected for various reasons. Given feedback like this, we are looking into creating more redirects as soon as possible.
I’m keeping my fingers crossed that they will indeed fix this issue, not just for some URLs (as the quote above seems to indicate) but for all of them. MathWorks’ quick and serious response to the complaints on R2012b makes me optimistic that there is indeed a good chance that this will be resolved soon. I’ll be following up on this issue in the upcoming months, and will be most happy to post an update here if and when this issue is fully resolved as it should be, on the MathWorks server side.
Addendum Oct 14, 2012 (that was quick!): Wendy Fullam has posted the following comment on the relevant CSSM thread (snipped for brevity):
In response to customer feedback, MathWorks is deploying an additional 700 redirects for documentation URLs.
When combined with the redirects already deployed (for reference pages, landing pages, installation sections, and release notes), we will be redirecting roughly 70% of R2012a documentation URLs to updated page locations.
We will also continue to monitor “page not found” reports, and establish additional redirects as needed.
Additionally, I’d like to note that the discussion has been incredibly insightful in helping the documentation team establish future direction. Customers noted that the doc appears to have undergone a change in organization and have speculated on the motivation. While SEO was part of our motivation (we think it’s a big step in having URLs provide an indication of page focus), we are also working towards serving up information to support common customer workflows. … I want to encourage you all to continue to share experiences and requests for missing information, so our documentation team can continue to expand and improve our content.
Addendum Nov 19, 2012: Looks like the http://www.mathworks.com/help/techdoc/creating_guis/f16-999606.html link is alive again, or rather redirected to the new location, which is great. I hope this means that the redirection effort mentioned by Wendy above has finally occurred. Thanks Wendy (and all the nameless others behind the scenes)!