In the spirit of the Jewish New Year that begins tonight, I would like to share a workaround that I received from blog reader Ro’ee Gilron of Tel-Aviv University:
Matlab users who use non-Latin computer Locales are aware of the issues that the Matlab Command Window has had with such languages for many years. I am not sure whether these problems are due to the RTL (right-to-left) nature of Hebrew/Arabic, or their use of a non-supported code-page, or some other reason. To this day (R2011b), I am not aware of any fix or workaround for these issues.
But it seems that in addition, Matlab has a problem reading files that contain text in these languages, even when the computer’s Locale is set correctly, to a Locale that supports the non-Latin text. This is where Ro’ee’s workaround helps. In his words:
To give some more background, this used to work with a 32bit system, and an older version of Matlab (7.1). Now it doesn’t. Saving the file in UTF-8 and using fopen and textscan instead of importdata gives me this:
nowords =
‘שלבק’
‘התלכב’
‘× ×™×›×˜×¨’
‘תלפורש’
‘×œ×§×˜× ‘
‘מזוחש’
‘שלטיק’
‘טיבר’
‘עולג’
‘סלבוחד’
‘משוחגות’
‘מלוגסות’
‘סבק’
‘צמשר’
‘הכריב’
‘תמציל’
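The "×"-prefixed lines in this listing look like UTF-8 bytes that were decoded as windows-1252 (the default encoding reported later in this post): every Hebrew letter is stored in UTF-8 as two bytes, the first of which is 0xD7, and 0xD7 is "×" in windows-1252. The following is only a minimal sketch of that diagnosis, using the documented native2unicode and unicode2native functions; the variable names are illustrative, and the word is the first one in the listing, written as Unicode code points so that the snippet does not depend on the editor's own encoding.

```matlab
% First word of the listing, given as Unicode code points (shin, lamed, bet, qof)
word = char([1513 1500 1489 1511]);

% The bytes that a UTF-8 text file would contain for this word
utf8Bytes = unicode2native(word, 'UTF-8');

% Decoding those bytes with the wrong encoding reproduces the garbling
garbled = native2unicode(utf8Bytes, 'windows-1252');
disp(garbled)

% Reversing the wrong step recovers the word, as long as every byte has a
% windows-1252 mapping (assumption: a few byte values, e.g. 0x90, may not
% survive this round trip)
repaired = native2unicode(unicode2native(garbled, 'windows-1252'), 'UTF-8');
disp(isequal(repaired, word))
```

If this reading is right, the data in the file is intact and only the decoding step is wrong, which is consistent with a fix that changes the encoding Matlab uses rather than the file itself.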
The solution is as follows (requires Simulink):
1) Change system Locale to Hebrew: http://windows.microsoft.com/en-US/windows7/Change-the-system-locale
(this doesn’t change the language of the OS etc.).
2) Change the encoding that Matlab uses:
http://www.mathworks.com/help/toolbox/simulink/slref/slcharacterencoding.html
They tell you not to, but I did… You must change it to an encoding that works for Hebrew: http://www.iana.org/assignments/character-sets
Any other language should work as well (I hope…). For Hebrew, the encoding that works for me is ISO_8859-8.
3) You should now be able to read TXT files that have Hebrew characters in them.
>> a='הצלחה!'
a =
!

>> currentCharacterEncoding = slCharacterEncoding();
>> currentCharacterEncoding = get_param(0, 'CharacterEncoding')  % equivalent alternative
currentCharacterEncoding =
windows-1252

% Now modify the default encoding to something more useful
>> slCharacterEncoding('ISO_8859-8')
>> set_param(0, 'CharacterEncoding', 'ISO_8859-8');  % equivalent alternative
>> currentCharacterEncoding = slCharacterEncoding()
currentCharacterEncoding =
ISO-8859-8

>> a='הצלחה!'
a =
!
% still no good in the Command Window...

% Let's try to read a file with some Hebrew words:
>> neutral = importdata('neutral.txt')
neutral =
    'שולחן'
    'כסא'
    'מנורה'
    'צלחת'
    'סיר'
    'מזלג'
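For readers without Simulink, one alternative that may be worth trying is to tell fopen the file's encoding explicitly, using its optional fourth argument, and then call textscan as usual. This is a hedged sketch only: it assumes the file really is UTF-8, the temporary file name is made up for the demo, and it has not been checked against R2011b or against the setup described above.

```matlab
% Write a small UTF-8 test file so that the sketch is self-contained.
% The file name is hypothetical; words are given as Unicode code points.
fname = fullfile(tempdir, 'hebrew_utf8_demo.txt');
words = {char([1513 1500 1489 1511]), char([1496 1497 1489 1512])};
fid = fopen(fname, 'w');
for k = 1:numel(words)
    % Store each word as raw UTF-8 bytes followed by a newline (byte 10)
    fwrite(fid, [unicode2native(words{k}, 'UTF-8') 10], 'uint8');
end
fclose(fid);

% Read the file back, stating explicitly that its bytes are UTF-8
% ('n' keeps the native byte ordering)
fid = fopen(fname, 'r', 'n', 'UTF-8');
scanned = textscan(fid, '%s');
fclose(fid);
readWords = scanned{1};   % column cell array, one string per line

delete(fname);            % clean up the demo file
```

Unlike slCharacterEncoding, this affects only the one file handle, so the session-wide default stays untouched. As in the session above, the strings may still be displayed incorrectly in the Command Window even when they are read correctly.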
So, it appears that while we did not solve the problems with the Command Window, at least we can now read the prayer book for our New Year prayers…
Let this be a year of fulfillment, prosperity, health and happiness to all. Shana Tova everybody!
Do you know how I can change the character encoding from within compiled code?
I.e., set_param(0, 'CharacterEncoding', 'ISO_8859-8') could not be added to the Matlab compiled exe file.
Thanks
Try to place this command in a startup.m file in your code folder, and then recompile your application. I’m not sure it will help, but it’s worth a try.