Mlint, Matlab’s static code-analysis parser, was written by Stephen Johnson (the original developer of the enormously successful lint parser for C/C++ back in 1977), when he was lured by MathWorks in 2002 to develop a similar tool for Matlab. Since its development (in R14 I believe), and especially since its incorporation in Matlab’s Editor in R2006a (Matlab 7.2), mlint has become a very important tool for reporting potential problems in m-files.
Unfortunately, to this day (R2013a), there is no documented way to programmatically separate mlint warnings from errors, nor to access any of the multitude of features that are readily available in mlint. Naturally, there is (and has always been) an undocumented back door.
From its earliest beginnings, mlint has relied on C code (presumably modeled after lint). For many years mlint relied on a mex file (%matlabroot%/toolbox/matlab/codetools/mlintmex.mex*), which is basically just a wrapper for mlint.dll, where the core algorithm resides. In recent releases, mlintmex, just like many other core mex files, was ported into a core Matlab library (libmwbuiltins.dll on Windows). However, the name and interface of the mlintmex function have remained unchanged over the years. Wrapping the core mlintmex function is the mlint m-function (%matlabroot%/toolbox/matlab/codetools/mlint.m), which calls mlintmex internally. In R2011b (Matlab 7.13) its official function name was changed to checkcode, although for some reason this was never documented in the release notes. The old mlint name still works even today. Wrapping all that is the mlintrpt function, which calls mlint/checkcode internally.
The core function mlintmex returns a long string with embedded newlines to separate the messages. For example:
>> str = mlintmex('perfTest.m')
str =
L 3 (C 1): The value assigned to variable 'A' might be unused.
L 4 (C 1): The value assigned to variable 'B' might be unused.
L 5 (C 1-3): Variable 'ops', apparently a structure, is changed but the value seems to be unused.
L 12 (C 9): This statement (and possibly following ones) cannot be reached.
L 53 (C 19-25): The function 'subFunc' might be unused.
L 53 (C 27-35): Input argument 'iteration' might be unused. If this is OK, consider replacing it by ~.
We can parse this long string ourselves, but there is no need since mlint/checkcode do this for us, returning a struct array:
>> results = mlint('perfTest.m')
results =
6x1 struct array with fields:
    message
    line
    column
    fix

>> results(5)
ans =
    message: 'The function 'subFunc' might be unused.'
       line: 53
     column: [19 25]
        fix: 0
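For those who do wish to parse the raw mlintmex text themselves, here is a minimal sketch. The function name parseMlintText is my own invention for illustration (it is not part of Matlab), and the code assumes that every message follows the "L <line> (C <col>[-<col>]): <text>" layout shown above; output produced with other options may look different.

```matlab
function msgs = parseMlintText(str)
% parseMlintText  Hypothetical helper: split raw mlintmex text into a struct array.
%   Assumes each message line looks like:  L <line> (C <col>[-<col>]): <text>
    pattern = '^L\s+(\d+)\s+\(C\s+([\d-]+)\):\s*([^\r\n]*)';
    tokens  = regexp(str, pattern, 'tokens', 'lineanchors');

    % Start from an empty struct array that already has the fields we fill in
    msgs = struct('line', {}, 'column', {}, 'message', {});
    for k = 1:numel(tokens)
        t = tokens{k};
        msgs(k).line    = str2double(t{1});
        % '19-25' becomes [19 25]; a single column such as '9' stays scalar
        msgs(k).column  = str2double(strsplit(t{2}, '-'));
        msgs(k).message = strtrim(t{3});
    end
end
```

Usage would be along the lines of msgs = parseMlintText(mlintmex('perfTest.m')). Note that this sketch keeps a single column as a scalar rather than trying to mimic the exact layout of the column field that mlint returns.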
As can be seen, the message severity (warning/error) does not appear. This severity is obviously available since it is integrated in the Editor and the Code Analyzer report – orange for warnings, red for errors.
In one of my projects I needed to enable the user to dynamically create executable Matlab code that would then be run interactively. This enabled users to create dynamic data-analysis functions without actually needing to know Matlab or to code all the nuts-and-bolts of a regular Matlab function. For this I needed to display warnings and errors on-the-fly (the dynamic cell tooltips used a custom table cell-renderer). Here’s the end-result:
My solution was to use mlintmex, as follows:
% Get the relevant message strings
errMsgs = mlintmex('-m2', srcFileName);
allMsgs = mlintmex('-m0', srcFileName);

% Parse the strings to find newline characters
numErrors = length(strfind(regexprep(errMsgs,'\*\*\*.*',''),char(10)));
numAllMsg = length(strfind(regexprep(allMsgs,'\*\*\*.*',''),char(10)));
numWarns = numAllMsg - numErrors;
(and from the messages themselves [errMsgs, allMsgs] I extracted the actual error/warning location)
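Counting newlines only gives totals. To know which individual messages are errors and which are warnings, the two strings can be compared line by line. The following is a rough sketch rather than the code I actually used: splitBySeverity and nonEmptyLines are hypothetical names, and the approach assumes that a message is printed identically under '-m2' and '-m0', and that any trailing '***' text has already been stripped as in the regexprep calls above.

```matlab
function [errLines, warnLines] = splitBySeverity(errMsgs, allMsgs)
% splitBySeverity  Hypothetical helper: separate error lines from warning lines.
%   errMsgs - text returned with the '-m2' option (errors only)
%   allMsgs - text returned with the '-m0' option (all errors and warnings)
%   Assumption: the same message is formatted identically in both texts.
    errLines  = nonEmptyLines(errMsgs);
    allLines  = nonEmptyLines(allMsgs);
    % Whatever appears in the full list but not in the error list is a warning
    warnLines = allLines(~ismember(allLines, errLines));
end

function lines = nonEmptyLines(str)
% Split text on newlines (LF or CR/LF) and drop blank lines
    lines = strtrim(regexp(str, '[\r\n]+', 'split'));
    lines = lines(~cellfun('isempty', lines));
end
```

Each returned cell element is a complete message line, so its "L ... (C ...)" prefix can then be parsed to obtain the error/warning location.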
Alternatively, I could have used mlint directly, as I have recently explained:
% Note that mlint returns struct arrays, so the following are all structs, not strings
errMsgs = mlint('-m2',srcFileNames);  % m2 = errors only
m1Msgs  = mlint('-m1',srcFileNames);  % m1 = errors and severe warnings only
allMsgs = mlint('-m0',srcFileNames);  % m0 = all errors and warnings
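With the three struct arrays in hand, each message can be labeled with a severity by checking which of the lists it appears in. Here is a sketch under stated assumptions: tagSeverity and containsMsg are hypothetical names of my own, the severity labels are arbitrary, and the inputs are assumed to be struct arrays for a single file with the message, line and column fields shown earlier.

```matlab
function tagged = tagSeverity(allMsgs, m1Msgs, errMsgs)
% tagSeverity  Hypothetical helper: add a 'severity' field to mlint results.
%   allMsgs - struct array from the '-m0' option (all errors and warnings)
%   m1Msgs  - struct array from the '-m1' option (errors and severe warnings)
%   errMsgs - struct array from the '-m2' option (errors only)
    tagged = allMsgs;
    for k = 1:numel(tagged)
        if containsMsg(errMsgs, tagged(k))
            tagged(k).severity = 'error';
        elseif containsMsg(m1Msgs, tagged(k))
            tagged(k).severity = 'severe warning';
        else
            tagged(k).severity = 'warning';
        end
    end
end

function tf = containsMsg(list, item)
% True if 'list' holds a message with the same location and text as 'item'
    tf = false;
    for k = 1:numel(list)
        if isequal(list(k).line, item.line) && ...
           isequal(list(k).column, item.column) && ...
           strcmp(list(k).message, item.message)
            tf = true;
            return;
        end
    end
end
```

The comparison is by location and text rather than by array index, since the three calls return arrays of different lengths.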
The original information about mlintmex and the undocumented -m0/m1/m2 options came from Urs (us) Schwarz, whose contributions are an endless source of such gems. Urs also provided a list of other undocumented mlint options (the comment annotations are mostly mine):
'-all'      % ???
'-allmsg'   % display the full list of possible mlint messages and their codes
'-amb'      % display all possibly-ambiguous identifiers (variable/function)
'-body'     % ???
'-callops'  % display the internal call tree, with nesting levels and function types
'-calls'    % (looks similar to -callops, not sure what the difference is)
'-com'      % ???
'-cyc'      % display McCabe complexity value of all functions in the analyzed file
% '-db'     % == -set + -ud + -tab
'-dty'      % debug info for the mlint parsing tree
'-edit'     % display all encountered identifiers and their assumed types
'-en'       % messages in English
'-id'       % display the mlint code associated with each message
'-ja'       % messages in Japanese
'-lex'      % display the LEX parse-tree for the analyzed file
'-m0'       % + other opt
'-m1'       % + other opt
'-m2'       % + other opt
'-m3'       % + other opt
'-mess'     % debug info for mlint message-reporting (start/end locations etc.)
'-msg'      % (looks similar to -allmsg above, not sure what the difference is)
'-notok'    % disregard %#ok directives and report messages on lines having them
'-pf'       % ???
'-set'      % debug info for the mlint parsing tree
'-spmd'     % ??? (presumably display SPMD-related messages)
'-stmt'     % display the number of statements in each function within the analyzed file
'-tab'      % set-by/used-by table for all identifiers (see -edit)
'-tmtree'   % not valid anymore
'-tmw'      % not valid anymore
'-toks'     % ???
'-tree'     % debug info for the mlint parsing tree
'-ty'       % display the line numbers where each of the file's identifiers are used
'-ud'       % debug info for the mlint parsing tree
'-yacc'     % ONLY: !mlint FILE -yacc -...
to which were added in recent years ‘-eml’, ‘-codegen’ etc. – see the checkcode doc page. Also note that not all Matlab releases support all options. For example, ‘-tmw’ is ignored in R2013a, returning the same data as ‘-all’ plus a warning about the ignored option.
Urs prepared a short utility called doli that accepts an m-file name and returns a struct whose fields are the respective outputs of mlint for each of the corresponding options:
>> results = doli('perfTest.m')
MLINT > C:\Yair\Books\MATLAB Performance Tuning\Code\perfTest.m
OPTION> -all 6
OPTION> -allmsg 501
OPTION> -amb 17
OPTION> -body 6
OPTION> -callops 15
OPTION> -calls 15
OPTION> -com 6
OPTION> -cyc 8
OPTION> -dty 162
OPTION> -edit 92
OPTION> -en 7
...
Some of these options are used by Urs’ farg and fdep utilities. Their use of mlint, rather than direct m-code parsing, is part of the reason that these functions are so lightning-fast.
For example, we can use the ‘-calls’ option to parse an m-file and get the names, types, and code locations of its contained functions:
>> mlint('-calls','perfTest.m')
M0 1 10 perfTest
E0 51 3 perfTest
U1 3 5 randi
U1 4 5 num2cell
U1 4 14 randn
U1 6 1 whos
U1 7 1 tic
U1 7 6 save
U1 7 45 toc
U1 9 6 savefast
S0 53 19 subFunc
E0 60 3 subFunc
U1 55 8 isempty
U1 56 20 load
U1 57 29 sin
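This listing is easy to turn into a struct array. The sketch below is only my reading of the format, not an official description: parseCallsText is a hypothetical name, and I assume that each row holds a one-letter type code immediately followed by a number (apparently the nesting level), then the line number, the column number and the identifier name. In the sample above, the S row corresponds to the subfunction subFunc and the U rows appear to be called functions.

```matlab
function calls = parseCallsText(str)
% parseCallsText  Hypothetical helper: parse '-calls' text into a struct array.
%   Assumed row layout:  <type letter><level> <line> <column> <name>
    pattern = '^\s*([A-Z])(\d+)\s+(\d+)\s+(\d+)\s+(\S+)';
    tokens  = regexp(str, pattern, 'tokens', 'lineanchors');

    calls = struct('type', {}, 'level', {}, 'line', {}, 'column', {}, 'name', {});
    for k = 1:numel(tokens)
        t = tokens{k};
        calls(k).type   = t{1};              % type code, e.g. 'M', 'S', 'U' or 'E'
        calls(k).level  = str2double(t{2});  % number following the type code
        calls(k).line   = str2double(t{3});
        calls(k).column = str2double(t{4});
        calls(k).name   = t{5};
    end
end
```

Assuming that mlintmex('-calls', 'perfTest.m') returns this listing as a string (as it does for regular messages), the rows for subfunctions could then be selected with calls(strcmp({calls.type}, 'S')).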
With so many useful features, I really cannot understand why they were never exposed to the public in a documented manner. After all, they have remained pretty-much unchanged for many years and can provide enormous benefits for developers of unit-tests and interactive analysis frameworks (as I have shown above).
As a side-note, in R2010a (Matlab 7.10), mlint was renamed “Code Analyzer”, but this was really just a name change – its core functionality has changed little in the past decade. Some might argue that new checks were added and the Editor interface has improved by allowing auto-fixes and message suppression. But for a tool that is over a decade old (much more, if you count lint’s development), I contend that these are not much. Don’t get me wrong – I have the utmost respect for Steve. Serious unix C/C++ development relies on his lint and yacc tools on a regular basis. I think they show astonishing ingenuity and intelligence. It’s just that I had expected more after a decade of mlint development (I bet it’s not due to Steve suddenly losing the touch).
Addendum: A little birdie tells me that Steve left MathWorks a few years ago, which does explain things… I apologize to Steve for any misguided snideness on my part. As I said above, I have nothing but the utmost respect for his work. The question of why MathWorks left his mlint work hanging without serious continuation remains open.
Addendum 2: Additional and much more detailed information about the nature of functions can be found using the semi-documented mtree function (or rather, Matlab class: %matlabroot%/toolbox/matlab/codetools/@mtree/mtree.m). This is a huge class-file (3200+ lines of code) that is well worth a dedicated future article, so stay tuned…
Awesome post! What I have been actually wondering is if one can augment the settings of Code Analyzer so that a group can enforce their own programming best practices such as camelBack notation, load() with left side assignment (no magic variables), no eval() etc.
I would basically love to go under Preferences->Code Analyzer-> Default Settings and add completely new tests, not just enable/disable the existing ones ..
Of course, one can write their own parser for MATLAB but it’s rather hard with all the syntactic sugar and weak types …