- Undocumented Matlab - https://undocumentedmatlab.com -

Explicit multi-threading in Matlab part 1

Posted By Yair Altman On February 19, 2014 | 31 Comments

One of the limitations of Matlab already recognized by the community [1] is that it does not give users direct access to threads without the PCT (Parallel Computing Toolbox). For example, we cannot simply let some expensive computation or I/O run in the background without freezing the main application. Instead, Matlab offers either implicit multiprocessing, which relies on built-in threading support in some MATLAB functions, or explicit multiprocessing using PCT (note: PCT workers use heavyweight processes, not lightweight threads). So the only way to achieve true multi-threading in Matlab is via MEX, Java or .Net, or by spawning external standalone processes (yes, there are a few other esoteric variants – don’t nit-pick).
Note that we do not save any CPU cycles by running tasks in parallel. In the overall balance, we actually increase the amount of CPU processing, due to the multi-threading overhead. However, in the vast majority of cases we are more interested in the responsiveness of Matlab’s main processing thread (known as the Main Thread, Matlab Thread, or simply MT) than in reducing the computer’s total energy consumption. In such cases, offloading work to asynchronous C++, Java or .Net threads can remove bottlenecks from Matlab’s main thread, achieving a significant speedup as seen from the MT.
Today’s article is a derivative of a much larger section on explicit multi-threading in Matlab, that will be included in my upcoming book MATLAB Performance Tuning [2], which will be published later this year. It is the first in a series of articles that will be devoted to various alternatives.

Sample problem

In the following example, we compute some data, save it to file on a relatively slow USB/network disk, and then proceed with another calculation. We start with a simple synchronous implementation in plain Matlab:

tic
data = rand(5e6,1);  % pre-processing, 5M elements, ~40MB
fid = fopen('F:\test.data','w');
fwrite(fid,data,'double');
fclose(fid);
data = fft(data);  % post-processing
toc
Elapsed time is 9.922366 seconds.

~10 seconds happens to be too slow for our specific needs. We could perhaps improve it a bit with some fancy tricks for save [3] or fwrite [4]. But let’s take a different approach today, using multi-threading:

Using Java threads

Matlab uses Java for numerous tasks, including networking, data-processing algorithms and graphical user-interface (GUI). In fact, under the hood, even Matlab timers employ Java threads for their internal triggering mechanism. In order to use Java, Matlab launches its own dedicated JVM (Java Virtual Machine) when it starts (unless it’s started with the -nojvm startup option). Once started, Java can be directly used within Matlab as a natural extension of the Matlab language. Today I will only discuss Java multithreading and its potential benefits for Matlab users; readers are assumed to know how to program Java code and how to compile Java classes.
To use Java threads in Matlab, first create a class that implements the Runnable [5] interface or extends java.lang.Thread [6]. In either case we need to implement at least the run() method, which runs the thread’s processing core.
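The two options can be sketched in plain Java as follows. This is only a minimal illustration: the class names (ThreadStyles, CountTask, CountThread) and the shared counter are assumptions made for the example and do not appear elsewhere in this article.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical names, for illustration only.
class ThreadStyles {
    // Option 1: implement Runnable and hand the object to a Thread.
    static class CountTask implements Runnable {
        private final AtomicInteger counter;
        CountTask(AtomicInteger counter) { this.counter = counter; }
        @Override
        public void run() { counter.incrementAndGet(); }
    }

    // Option 2: extend Thread and override run() directly.
    static class CountThread extends Thread {
        private final AtomicInteger counter;
        CountThread(AtomicInteger counter) { this.counter = counter; }
        @Override
        public void run() { counter.incrementAndGet(); }
    }

    // Starts one thread of each kind, waits for both, and returns the count.
    static int runBoth() throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(0);
        Thread t1 = new Thread(new CountTask(counter));
        Thread t2 = new CountThread(counter);
        t1.start();   // start() launches a new thread, which then calls run()
        t2.start();
        t1.join();    // block until each thread has finished
        t2.join();
        return counter.get();
    }
}
```

In both cases it is start(), not run(), that launches the new thread; calling run() directly would simply execute the code on the calling thread.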
Now let us replace the serial I/O with a very simple dedicated Java thread. Our second calculation (fft) will not need to wait for the I/O to complete, enabling much faster responsiveness on Matlab’s MT. In this case, we get a 58x (!) speedup:

tic
data = rand(5e6,1);  % pre-processing (5M elements, ~40MB)
javaaddpath 'C:\Yair\Code\'  % path to MyJavaThread.class
start(MyJavaThread('F:\test.data',data));  % start running in parallel
data = fft(data);  % post-processing (Java I/O runs in parallel)
toc
Elapsed time is 0.170722 seconds.   % 58x speedup !!!

Note that the call to javaaddpath only needs to be done once in the entire Matlab session, not repeatedly. The definition of our Java thread class is very simple (real-life classes would not be as simplistic, but the purpose here is to show the basic concept, not to teach Java threading):

import java.io.DataOutputStream;
import java.io.FileOutputStream;
public class MyJavaThread extends Thread
{
    String filename;
    double[] doubleData;
    public MyJavaThread(String filename, double[] data)
    {
        this.filename = filename;
        this.doubleData = data;
    }
    @Override
    public void run()
    {
        try
        {
            DataOutputStream out = new DataOutputStream(
                                     new FileOutputStream(filename));
            for (int i=0; i < doubleData.length; i++)
            {
                out.writeDouble(doubleData[i]);
            }
            out.close();
        } catch (Exception ex) {
            System.out.println(ex.toString());
        }
    }
}
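
The class above sends every writeDouble() call straight to the FileOutputStream, which can be slow on some drives. A possible variant, sketched here under my own assumptions, wraps the file stream in a BufferedOutputStream and closes it in a finally block. The class name BufferedWriterThread and the helper writeDoubles() are hypothetical, and the class is left package-private only to keep the sketch in a single file; a class meant to be instantiated from Matlab would normally be declared public in its own file.

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical buffered variant of MyJavaThread (name is an assumption).
class BufferedWriterThread extends Thread {
    private final String filename;
    private final double[] doubleData;

    BufferedWriterThread(String filename, double[] data) {
        this.filename = filename;
        this.doubleData = data;
    }

    // Writes all values as 8-byte big-endian doubles through a memory buffer.
    static void writeDoubles(String filename, double[] data) throws IOException {
        DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(filename)));
        try {
            for (int i = 0; i < data.length; i++) {
                out.writeDouble(data[i]);
            }
        } finally {
            out.close();  // flushes the buffer and releases the file handle
        }
    }

    @Override
    public void run() {
        try {
            writeDoubles(filename, doubleData);
        } catch (IOException ex) {
            System.out.println(ex.toString());
        }
    }
}
```

Note that DataOutputStream writes big-endian values, so reading such a file back in Matlab on a little-endian machine would presumably need an explicit byte order, for example fread(fid,inf,'double',0,'b').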

Note: when compiling a Java class that should be used within Matlab, as above, ensure that you are compiling for a JVM version that is equal to, or lower than Matlab's JVM, as reported by Matlab's version function:

% Matlab R2013b uses JVM 1.7, so we can use JVMs up to 7, but not 8
>> version -java
ans =
Java 1.7.0_11-b21 ...

Matlab synchronization

Java (and C++/.Net) threads are very effective when they can run entirely independently from Matlab’s main thread. But what if we need to synchronize the other thread with Matlab’s MT? For example, what if the Java code needs to run some Matlab function, or access some Matlab data? In MEX this could be done using the dedicated and documented MEX functions; in Java this can be done using the undocumented/unsupported JMI (Java-Matlab Interface) package [7]. Note that using standard Java Threads without Matlab synchronization is fully supported; it is only the JMI package that is undocumented and unsupported.
Here is the relevant code snippet for evaluating Matlab code within a Java thread:

import com.mathworks.jmi.Matlab;  //in %matlabroot%/java/jar/jmi.jar
...
Matlab matlabEngine = new Matlab();
...
Matlab.whenMatlabReady(runnableClass);

Where runnableClass is a class whose run() method includes calls to com.mathworks.jmi.Matlab methods such as:

matlabEngine.mtEval("plot(data)");
Object value = matlabEngine.mtFeval("min", new Object[]{a,b}, 1); //2 inputs 1 output

Unfortunately, we cannot directly call matlabEngine’s methods in our Java thread, since this is blocked in order to ensure synchronization: Matlab only enables calling these methods from the MT, which is the reason for the runnableClass. Indeed, synchronizing Java code with MATLAB can be quite tricky, and can easily deadlock MATLAB. To alleviate some of the risk, I advise against using the JMI class directly; instead, use Joshua Kaplan’s MatlabControl [8] class, a user-friendly JMI wrapper.
Note that Java’s native invokeAndWait() method cannot be used to synchronize with Matlab. M-code executes as a single uninterrupted thread (MT). Events are simply queued by Matlab’s interpreter and processed when we relinquish control by requesting drawnow, pause, wait, waitfor etc. Matlab synchronization is robust and predictable, yet forces us to use the whenMatlabReady(runnableClass) mechanism to add to the event queue. The next time drawnow etc. is called in M-code, the event queue is purged and our submitted code will be processed by Matlab’s interpreter.
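The queuing behavior described above can be mimicked in plain Java, purely as an analogy. The sketch below is not the JMI API: MainThreadQueue, whenReady() and drain() are hypothetical names that stand in for the roles played by whenMatlabReady() and by yield points such as drawnow.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

// Plain-Java analogy only; this is NOT the JMI API. All names are hypothetical.
class MainThreadQueue {
    private final ConcurrentLinkedQueue<Runnable> pending =
            new ConcurrentLinkedQueue<Runnable>();

    // May be called from any background thread: the task is queued, not run.
    void whenReady(Runnable task) {
        pending.add(task);
    }

    // Called only by the main thread when it reaches a yield point.
    // Runs every queued task on the caller's thread and returns how many ran.
    int drain() {
        int n = 0;
        Runnable task;
        while ((task = pending.poll()) != null) {
            task.run();
            n++;
        }
        return n;
    }
}
```

A task submitted from a background thread does nothing until the main thread calls drain(), and it then executes on the main thread. This is the same reason why code submitted to Matlab waits until the M-code relinquishes control.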
Java threading can be quite tricky even without the Matlab synchronization complexity. Deadlock, starvation and race conditions are frequent problems with Java threads. Basic Java synchronization is relatively easy, using the synchronized keyword. But getting the synchronization to work correctly is much more difficult and requires Java programming expertise that is beyond most Java programmers. In fact, many Java programmers who use threads are not even aware that their thread synchronization is buggy and that their code is not thread-safe.
My general advice is to use Java threads just for simple independent tasks that require minimal interactions with other threads, the Matlab engine, and/or shared resources.
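As a minimal sketch of the basic synchronized usage mentioned above, consider a counter shared by several threads. The class SharedCounter and its methods are hypothetical names chosen for this example only.

```java
// Hypothetical shared counter; names are illustrative only.
class SharedCounter {
    private long count = 0;

    // synchronized: only one thread at a time may run these methods
    // on a given SharedCounter instance.
    synchronized void increment() { count++; }

    synchronized long get() { return count; }

    // Runs nThreads threads that each call increment() perThread times,
    // waits for all of them, and returns the final count.
    static long countWithThreads(int nThreads, final int perThread)
            throws InterruptedException {
        final SharedCounter counter = new SharedCounter();
        Thread[] threads = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            threads[t] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        counter.increment();
                    }
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return counter.get();
    }
}
```

Without the synchronized keyword, count++ is a non-atomic read-modify-write and concurrent increments could be lost. This is the easy case; coordinating several locks or shared resources is where the real difficulty starts.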

Additional alternatives and musings

In addition to Java threads, we can use other technologies for multi-threading in Matlab: Next week’s article will explore Dot-Net (C#) threads and timers, and that will be followed by a variety of options for C++ threads and spawned-processes IPC. So don’t let anyone complain any longer about not having explicit multi-threading in Matlab. It’s not trivial, but it’s also not rocket science, and there are plenty of alternatives out there.
Still, admittedly MT’s current single-threaded implementation is a pain-in-the-so-and-so, relic of a decades-old design. A likely future improvement to the Matlab M-code interpreter would be to make it thread-safe. This would enable automatic conversion of for loops into multiple threads running on multiple local CPUs/cores, significantly improving Matlab’s standard performance and essentially eliminating the need for a separate parfor in PCT (imagine me drooling here). Then again, this might reduce PCT sales…

Advanced Matlab Programming course – London 10-11 March, 2014

If Matlab performance interests you, consider joining my Advanced Matlab Programming course [9] in London on 10-11 March, 2014. In this course/seminar I will explore numerous other ways by which we can improve Matlab’s performance and create professional code. This is a unique opportunity to take your Matlab skills to a higher level within a couple of days. Registration closes this Friday, so don’t wait too long.

Categories: Java, Low risk of breaking in future versions


31 Comments To "Explicit multi-threading in Matlab part 1"

#1 Comment By Thierry Dalon On February 19, 2014 @ 14:16

Hi Yair
Nice post!
Another possibility you haven’t mentioned for multi-threading is also to run a new Matlab instance from Matlab.

#2 Comment By Yair Altman On February 19, 2014 @ 14:27

@Thierry – I did mention “spawning external standalone processes” in my opening paragraph. Just note that it is not multi-threading but rather multi-processing. There’s a wide variety of things that you can do by spawning external processes, but it will always be less efficient to spawn an external heavyweight process than an in-process thread, not to mention the fact that it is harder to synchronize the data and coordinate execution. Perhaps I’ll dedicate a special post about spawning external processes, but this is a wide topic that opens the way to Matlab parallelization alternatives, and this could take me a full year of posts, so I guess I need to stop somewhere…

#3 Comment By oro77 On February 19, 2014 @ 17:54

I guess it is different but what about the Matlab Parallel Toolbox ? Can it be compared to the Java thread you explain in your article ?

#4 Comment By Yair Altman On February 20, 2014 @ 02:58

@Oro77 – PCT is different in many respects:

  1. PCT costs $$$, multi-threading is free
  2. PCT is much easier to use than creating multi-threaded classes that need to be compiled, debugged etc.
  3. PCT enables easy integration/synchronization with Matlab data & execution; multi-threading does not (at least not easily)
  4. PCT is supported by MathWorks, multithreading is your own code that nobody will support for you
  5. PCT uses spawned Matlab processes (headless workers – Matlab processes that simply have no GUI); multi-threading uses much lighter and more efficient threads

It is not that one is generally better than the other – both are good, for different use-cases. Depending on your specific needs you can select either one or the other (or both).

#5 Comment By oro77 On February 20, 2014 @ 19:11

Thank you for your complete comment on PCT 🙂

#6 Comment By Eric On February 20, 2014 @ 11:59

Very intriguing Yair… 🙂 Have you tried writing .MAT files in a background Java thread? If so what .MAT library did you use. This could be very handy functionality in certain circumstances!

#7 Comment By Yair Altman On February 20, 2014 @ 12:56

@Eric – you can use [16] for MAT-file I/O in Java

#8 Comment By oro77 On February 20, 2014 @ 19:12

I did use this library. I had some issues with big MAT files. Except this problem, it is quite easy to use.

#9 Comment By Amro On February 21, 2014 @ 07:46

You briefly mentioned doing multithreading in MEX-functions. I just wanted to clarify that the MEX API is *not* thread-safe. So while it is possible to spawn threads in your MEX-files and perform independent computations, you should never call any mx*/mex* functions from those threads, and should be restricted to the main running thread of the MEX-function.

Here is an example of multithreaded C/C++ using simple OpenMP compiler directives: [17] .

#10 Comment By Yair Altman On February 21, 2014 @ 07:53

@Amro – thanks for the clarification, but you are providing a spoiler… My MEX C++ multithreading article will appear on March 5, as part 3 of this series.

#11 Comment By Amro On February 21, 2014 @ 11:57

@Yair: sorry for giving it away 🙂 Interesting articles as always, keep up the good work!

#12 Comment By Wolfgang On February 19, 2015 @ 17:07

Great post Yair!
I am sure this is a stupid question (I am rather new to java), but I can’t figure it out:

I’d like to start a java thread from Matlab and continue to execute the Matlab script, analogous to what you show above.
Then, I’d like Matlab at some point in my Matlab script to check whether the java thread is finished and execute some code (for instance, wait until it’s really finished to retrieve some output arguments). Any suggestions on how this could work?

Thanks a lot!
Wolfgang

#13 Comment By Yair Altman On February 20, 2015 @ 01:28

@Wolfgang – if you wish to wait for the Java thread to exit, you could try to use the Thread.join() method. See additional/related information [18].

#14 Comment By Wolfgang On February 24, 2015 @ 12:38

Thanks, Yair!

I can start java threads from matlab, but I am still stuck on how to get Matlab to interact with them somehow. My problem is the following: How do I start a java thread in Matlab and then retrieve any information about it in my Matlab script later, e.g. ask in the Matlab script whether the thread that was just started above is still running? Please, pardon my ignorance….
Any help greatly appreciated!

Wolfgang

#15 Comment By Wolfgang On February 25, 2015 @ 21:55

@Yair: Don’t bother. I figured it out. Btw: Your blog is awesome!
Wolfgang

#16 Comment By Stefano On July 20, 2015 @ 01:11

Hi Yair!

I may be a bit off topic, but while trying to add multithreading to a Matlab script I developed, I have encountered a problem that I think is worth posting and putting in the public limelight. The problem concerns variable transfer from client to workers when the “variable” is a Java object. I have experienced that, without multithreading, passing a Java object from a main script to a function is not a problem, while using (for instance) parfeval to implement multithreading, the object is not passed correctly, resulting in the error “Attempt to reference field of non-structure array”. Something similar also happens to me when trying to add multithreading using a parfor loop, therefore I suspect it may be a general issue with the usage of Matlab multithreading tools. May I kindly ask for some clarification in this regard?

Many thanks in advance
Stefano

#17 Comment By Yair Altman On March 23, 2016 @ 09:58

@Stefano – This may answer your question: [19]

(sorry for the late response, but better late than never I guess…)

#18 Comment By Ben On March 23, 2016 @ 01:28

Hi Yair,

So I am very new to Java and the concept of multithreading so what I might ask may be child’s play, but why does the java thread take so long to complete? I understand that Matlab is now free to do other things, but the test.data file takes a very very long time to complete. Much longer than just creating and saving.

Also, is there a way to stop this thread once it is sent out?

#19 Comment By Yair Altman On March 23, 2016 @ 09:55

@Ben – I don’t know what’s taking so long in your specific case. It is certainly not something general but specific to your particular implementation (perhaps the file is slow to access on some remote network drive for example?).

Anyway, you can temporarily pause a thread via its suspend() method; you can terminate it via stop(). Read [20], or any standard Java textbook. This is a Matlab blog and not a Java one so if you need more information on the Java aspects you should go elsewhere.

#20 Comment By Ben On March 23, 2016 @ 20:19

@Yair – So i am writing to the network and was thinking that might be the issue with the length of time, but if I write out the same file to the same network path using

fid = fopen('w:\test.data','w');
fwrite(fid,data,'double');
fclose(fid);

it only takes 0.06 sec, compared to the Java thread that takes 15 sec.

Do you have any thoughts on this?

Thank you for your post by the way, it’s awesome!

#21 Comment By JoeV On May 18, 2017 @ 21:41

It appears the MatlabControl project has been migrated to GitHub (due to Google Code shutting down): [8]

Documentation at the Wiki: [21]

#22 Comment By Serge On June 23, 2017 @ 20:26

Hello Yair,
I wish to spawn a figure that shows the time while the script continues to run.
I thought figures run in their own thread so I should be able to do this with addlistener, but i am not having luck.
Do i need java? if so can you give any pointers?
Many Thanks,

#23 Comment By Yair Altman On June 25, 2017 @ 00:57

@Serge – you can use a simple Matlab timer for this. Read the documentation for the timer function.

#24 Comment By Leo On January 27, 2018 @ 20:48

Great post, Yair. I was wondering if it is possible to interrupt the java thread from the matlab main thread, given that the java thread can catch it and exit gracefully. Is there a matlab method available to do it? Thank you very much!

#25 Comment By Yair Altman On January 28, 2018 @ 00:36

@Leo – you can stop a Java thread using its stop() method:

jThread = MyJavaThread('F:\test.data',data);
start(jThread); % start running in parallel
...
stop(jThread);  % stop thread

See [22] for more details on Java threads.
Of course, if you create your own version of a Java thread, you can create a custom public method that signals the thread in a more elegant manner than the brute-force stop() method.

#26 Comment By Peyman On February 17, 2018 @ 01:29

Thank you for the post. It seems to be the exact solution I was looking for but I cannot start the Thread like you showed above. I get the error message:

??? Undefined function or method 'start' for input arguments of type 'ParallelPortReaderThread'.
Error in ==> testThread at 8
start(ParallelPortReaderThread(rr_intervals));

This is my Matlab code:

import java.lang.Thread;
import java.util.LinkedList;
rr_intervals = LinkedList();
for i = 1:10
    rr_intervals.add(0);
end
javaaddpath 'C:\Documents and Settings\user\My Documents\MATLAB\Marieke_Experiment';
start(ParallelPortReaderThread(rr_intervals));
%ParallelPortReaderThread pThr = new ParallelPortReaderThread(rr_intervals);
%pThr.start();

And this my java class:

import java.util.Random;
import java.util.LinkedList;
import java.lang.Thread;
        
public class ParallelPortReaderThread extends Thread {

    private LinkedList rrIntervals;

    public ParallelPortReaderThread(LinkedList queue) {
        this.rrIntervals = queue;
    }

    @Override
    public void run() {
        Random random = new Random();
        long x = (long) (800 + 200 * random.nextFloat());
        rrIntervals.remove();
        rrIntervals.add(x);
        try {
            Thread.sleep(x);
        } catch (InterruptedException ie) {
            ie.printStackTrace();
        }
    }

    public LinkedList getRrIntervals() {
        return rrIntervals;
    }
}

I’d be very thankful about any help and/or advice

#27 Comment By Peyman On February 17, 2018 @ 01:34

Maybe important: I’m using matlab R2008a and jdk 1.6.0 (both matlab and compilation)

#28 Comment By Yair Altman On February 17, 2018 @ 19:05

@Peyman – it’s probably due to one of the possible reasons that I listed here: [23]

#29 Comment By Peyman On February 18, 2018 @ 22:33

Thanks Yair. Clearing java solved the problem.

#30 Comment By Muhammad Muaaz On June 19, 2020 @ 14:03

Great post!

I use the example you provided on Matlab 2019b, and I am getting similar results on speed. However, I noticed that if we read the file back (no matter if it is written using Matlab, or Java) they both have 40,000,000 elements in them, whereas we have written 50,000,000 elements. Kindly comment, why it is so?

#31 Comment By Muhammad Muaaz On June 19, 2020 @ 14:05

Kindly ignore the previous msg, I did not mention the data type while reading the file. It works perfectly. Thanks.


Article printed from Undocumented Matlab: https://undocumentedmatlab.com

URL to article: https://undocumentedmatlab.com/articles/explicit-multi-threading-in-matlab-part1

URLs in this post:

[1] recognized by the community: http://stackoverflow.com/questions/2713218/how-to-do-threading-in-matlab

[2] MATLAB Performance Tuning: http://www.crcpress.com/product/isbn/9781482211290

[3] save: http://undocumentedmatlab.com/blog/improving-save-performance/

[4] fwrite: http://undocumentedmatlab.com/blog/improving-fwrite-performance/

[5] Runnable: http://docs.oracle.com/javase/7/docs/api/java/lang/Runnable.html

[6] java.lang.Thread: http://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html

[7] JMI (Java-Matlab Interface) package: http://undocumentedmatlab.com/blog/jmi-java-to-matlab-interface/

[8] MatlabControl: https://github.com/jakaplan/matlabcontrol

[9] Advanced Matlab Programming course: http://undocumentedmatlab.com/training

[10] Explicit multi-threading in Matlab part 2 : https://undocumentedmatlab.com/articles/explicit-multi-threading-in-matlab-part2

[11] Explicit multi-threading in Matlab part 3 : https://undocumentedmatlab.com/articles/explicit-multi-threading-in-matlab-part3

[12] Explicit multi-threading in Matlab part 4 : https://undocumentedmatlab.com/articles/explicit-multi-threading-in-matlab-part4

[13] Multi-threaded Mex : https://undocumentedmatlab.com/articles/multi-threaded-mex

[14] Multi-line uitable column headers : https://undocumentedmatlab.com/articles/multi-line-uitable-column-headers

[15] Multi-line tooltips : https://undocumentedmatlab.com/articles/multi-line-tooltips

[16] : http://sourceforge.net/projects/jmatio/

[17] : http://www.walkingrandomly.com/?p=1795

[18] : http://stackoverflow.com/questions/289434/how-to-make-a-java-thread-wait-for-another-threads-output

[19] : http://fluffynukeit.com/tag/loadobj

[20] : https://docs.oracle.com/javase/tutorial/essential/concurrency/procthread.html

[21] : https://github.com/jakaplan/matlabcontrol/wiki

[22] : https://docs.oracle.com/javase/7/docs/api/java/lang/Thread.html

[23] : https://undocumentedmatlab.com/blog/java-class-access-pitfalls

Copyright © Yair Altman - Undocumented Matlab. All rights reserved.