General-use object copy

May 6, 2015 15 Comments

When using Matlab objects, either a Matlab class (MCOS) or any other (e.g., Java, COM, C# etc.), it is often useful to create a copy of the original object, complete with all internal property values. This enables modification of the new copy without affecting the original object. This is not important for MCOS value-class objects, since value objects use the COW (Copy-on-Write/Update, a.k.a. Lazy Copy) mechanism, which is handled automatically by the Matlab interpreter when it detects that a change is made to the copy reference. However, it is very important for handle objects, where modifying any property of the copied reference also modifies the original object.
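As a quick illustration of the difference, here is a minimal sketch that uses containers.Map (a built-in handle class) next to a plain struct (value semantics). The variable names are only for illustration:

```matlab
% Handle semantics: assignment copies the reference, not the data
m1 = containers.Map();
m1('a') = 1;
m2 = m1;            % m2 refers to the same underlying map as m1
m2('a') = 2;
disp(m1('a'))       % displays 2: the "original" was modified as well

% Value semantics: assignment behaves like an independent copy
s1 = struct('a', 1);
s2 = s1;
s2.a = 2;           % copy-on-write kicks in here
disp(s1.a)          % displays 1: the original is untouched
```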
Most OOP languages include some sort of a copy constructor, which enables programmers to duplicate a handle/reference object, internal properties included, such that it becomes entirely separate from the original object. Unfortunately, Matlab did not include such a copy constructor until R2011a (matlab.mixin.Copyable.copy()).
On Matlab R2010b and older, as well as on newer releases for classes that do not inherit matlab.mixin.Copyable, we do not have a readily-available solution for handle object copy. Until now, that is.

There are several ways by which we can create such a copy function. We might call the main constructor to create a default object and then override its properties by iterating over the original object’s properties. This might work in some cases, but not if there is no default constructor for the object, or if there are side-effects to object property modifications. If we wanted to implement a deep (rather than shallow) copy, we’d need to recursively iterate over all the properties of the internal objects as well.
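The property-iteration idea could look roughly like the sketch below. To keep the snippet runnable without a class file, src is a plain struct stand-in (an assumption for illustration); with a real object the same loop would only work if the class has a no-argument constructor and all listed properties are publicly settable. The names src and dst are hypothetical. This is a shallow copy only:

```matlab
% Shallow copy by default-constructing and then overriding each property
src = struct('a', 1, 'b', 'text');   % stand-in; in practice, the object to copy
dst = feval(class(src));             % call the no-argument constructor
names = fieldnames(src);             % public property (or field) names
for k = 1:numel(names)
    % Errors for read-only/dependent properties; may trigger set-method side effects
    dst.(names{k}) = src.(names{k});
end
```

Any property that holds a handle object would still be shared between src and dst after this loop, which is why a deep copy needs the recursive treatment mentioned above.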
A simpler solution might be to save the object to a temporary file (tempname), then load from that file (which creates a copy), and finally delete the temp file. This is nice and clean, but the extra I/O could be relatively slow compared to in-memory processing.
Which leads us to today’s chosen solution, where we use Matlab’s builtin functions getByteStreamFromArray and getArrayFromByteStream, which I discussed last year as a way to easily serialize and deserialize Matlab data of any type. Specifically, getArrayFromByteStream has the side-effect of creating a duplicate of the serialized data, which is perfect for our needs here (note that this pair of functions is only available on R2010b or newer; on R2010a or older we can still serialize via a temp file):

% Copy function - replacement for matlab.mixin.Copyable.copy() to create object copies
function newObj = copy(obj)
    try
        % R2010b or newer - directly in memory (faster)
        objByteArray = getByteStreamFromArray(obj);
        newObj = getArrayFromByteStream(objByteArray);
    catch
        % R2010a or earlier - serialize via temp file (slower)
        fname = [tempname '.mat'];
        save(fname, 'obj');
        newObj = load(fname);
        newObj = newObj.obj;
        delete(fname);
    end
end

This function can be placed anywhere on the Matlab path and will work on all recent Matlab releases (including R2010b and older), any type of Matlab data (including value or handle objects, UDD objects, structs, arrays etc.), as well as external objects (Java, C#, COM). In short, it works on anything that can be assigned to a Matlab variable:

obj1 = ... % anything really!
obj2 = obj1.copy();  % alternative #1
obj2 = copy(obj1);   % alternative #2

Alternative #1 may look “nicer” to a computer scientist, but alternative #2 is preferable because it also handles the case of non-object data (e.g., [] or ‘abc’ or magic(5) or a struct or cell array), whereas alternative #1 would error in such cases.
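To check that the result really is independent of the original, here is a small sketch that inlines the in-memory branch of the copy function (so it does not depend on where copy.m sits on the path). It uses containers.Map as a convenient built-in handle object; the variable names are illustrative only:

```matlab
% Handle object: the deserialized copy is a separate object
orig = containers.Map({'a'}, {1});
dup  = getArrayFromByteStream(getByteStreamFromArray(orig));
dup('a') = 2;
disp(orig('a'))    % the original still holds 1

% Non-object data round-trips just as well
data     = {[], 'abc', magic(5), struct('x', 1)};
dataCopy = getArrayFromByteStream(getByteStreamFromArray(data));
disp(isequal(data, dataCopy))
```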
In any case, using either alternative, we no longer need to worry about inheriting our MCOS class from matlab.mixin.Copyable, or backward compatibility with R2010b and older (I may possibly be bashed for this statement, but in my eyes future compatibility is less important than backward compatibility). This is not such a wild edge-case. In fact, I came across the idea for this post last week, when I developed an MCOS project for a consulting client that uses both R2010a and R2012a, and the same code needed to run on both Matlab releases.
Using the serialization functions also solves the case of creating copies for Java/C#/COM objects, which currently have no other solution, except if these objects happen to contain their own copy method.
In summary, using Matlab’s undocumented builtin serialization functions enables easy implementation of a very efficient (in-memory) copy constructor, which is expected to work across all Matlab types and many releases, without requiring any changes to existing code – just placing the above copy function on the Matlab path. This is expected to continue working properly until Matlab decides to remove the serialization functions (which should hopefully never happen, as they are so useful).
Sometimes, the best solutions lie not in sophisticated new features (e.g., matlab.mixin.Copyable), but in using plain ol’ existing building blocks. There’s a good lesson to be learned here I think.
p.s. – I do realize that matlab.mixin.Copyable provides the nice feature of enabling users to control the copy process, including implementing deep or shallow or selective copy. If that’s your need and you have R2011a or newer then good for you, go ahead and inherit Copyable. Today’s post was meant for the regular Joe who doesn’t need this fancy feature, but does need to support R2010b, and/or a simple way to clone Java/C#/COM objects.

15 Responses

Bill May 6, 2015 at 08:42 Reply
The call
getByteStreamFromArray(object)
fails for me in r2014a when used with object from my older (non-classdef) classes. Instead I get this error:
```
Temporary object of class :all: is missing a constructor or
an error occurred in calling constructor with no inputs.
```
So in this case it would revert to serializing via the temp file. I’m not sure if it is a problem with all older non-classdef defined classes, or if perhaps my constructor isn’t supporting some expected syntax.

Bill

Yair Altman May 6, 2015 at 08:47 Reply

@Bill – try to add a default (no-args) constructor function to your old class.

For reference, on new (MCOS classdef) objects, the constructor-invocation feature can be controlled via the ConstructOnLoad classdef attribute (this doesn’t answer your specific question but I thought to mention it here for reference to readers).

Bill May 6, 2015 at 14:20
The old-style non-classdef classes I’m using do have default constructors. For example the simple test class below gives the same error:

@foo/foo.m
function [obj] = foo(varargin)
    obj = class(struct('x',1), 'foo');

>> getArrayFromByteStream(getByteStreamFromArray(foo))
Temporary object of class :all: is missing a constructor or
an error occurred in calling constructor with no inputs.

Scott Koch July 7, 2017 at 21:15 Reply
Hi Bill – You might try creating a simple wrapper class to hold your old object. It’s a little hacky but seemed to work for me.
classdef wrapper
    properties
        oldobj
    end
end
Then use the property to store your old object:
a = wrapper;
a.oldobj = oldobject;
b = getByteStreamFromArray(a);
c = getArrayFromByteStream(b);
c.oldobj

Yair Altman July 8, 2017 at 20:53 Reply

@Scott – nice idea!
Thanks for sharing

Sam Roberts May 6, 2015 at 09:09 Reply

One other difference (unimportant for most people) between your copy method and matlab.mixin.Copyable.copy is that matlab.mixin.Copyable.copy will copy properties that are Transient. Saving to a temp file and then reloading will (intentionally, by design) lose them – and interestingly, this also appears to be the case when serializing with getByteStreamFromArray.

Yair Altman May 6, 2015 at 09:20 Reply

@Sam – Thanks for this. I contend that it is actually a design error for matlab.mixin.Copyable to copy transient properties, since transient values should never be relied on, and should typically be regenerated whenever needed. Still, I agree with you that users should at least be aware of this inconsistent behavior.

The consistency between getByteStreamFromArray and save is not surprising – I strongly suspect that save uses getByteStreamFromArray under the hood, and vice versa for getArrayFromByteStream and load.
Sam Roberts May 6, 2015 at 09:32 Reply

@Yair – this time I agree with you: I think it’s a design error for matlab.mixin.Copyable to copy Transient properties.
Martin May 6, 2015 at 23:46 Reply

@Yair – To my knowledge, the relationship between load/getArrayFromByteStream and save/getByteStreamFromArray is a bit more complicated than just one using the other. But they both rely on some common internal infrastructure.

The best way to see the similarity is to compare the bytes coming out of getByteStreamFromArray(data) and save(filename, 'data', '-v6'). The latter has a header that the former doesn’t have, but this is not the only difference …

Yair Altman May 7, 2015 at 02:58

@Martin –

“the relationship between load/getArrayFromByteStream and save/getByteStreamFromArray is a bit more complicated than just one using the other. But they both rely on some common internal infrastructure.”

I assume you mean they all use mxSerialize/mxDeserialize under the hood. That would be pretty close…

Balint June 16, 2015 at 10:57 Reply

I am trying to fix containers.Map’s crazy choice of handle semantics by wrapping it into some auto-copy class.

So far it appears that both this approach and the matlab.mixin.Copyable facility require an explicit call to the copy() method to actually do something. copy() is not called for an A = B call if A inherits from it. Thus to name Copyable as a copy constructor equivalent is a bit of a joke I think, as the copy constructor’s most important feature is that it gets called automatically whenever a new object is created from another object.

Am I missing something here? If not, do you have an idea how I could make containers.Map follow value semantics, apart from reimplementing it completely in a custom variant?

Yair Altman June 16, 2015 at 11:05 Reply

@Balint – You could perhaps create a new user class that inherits from containers.Map. However, containers.Map also has poor performance compared to java.util.Hashtable (for example). So my advice is to implement your own user class based on an underlying Java object, using serialization/deserialization (as shown in the article above) as needed to store the Matlab objects in the Java object.

Mario Koddenbrock August 16, 2017 at 12:41 Reply
Hi Yair,

from time to time I get the following error:
"Error using getArrayFromByteStream Unable to read data stream because the data contains a bad version or endian-key"
It seems not to be reproducible.
Do you know reasons or fixes for that?

Best regards,
Mario

Yair Altman August 16, 2017 at 12:45 Reply

@Mario – I suspect that you are trying to move data between different computer types and/or different Matlab releases, and this is apparently not supported.
Mario Koddenbrock August 16, 2017 at 13:44 Reply

Thanks for your fast reply @Yair.

I’m really just trying to copy an object inside my workspace. There is no file IO involved…
