Unicode character to decimal and hexadecimal conversions, and rendering decimal using JavaScript

Thursday, February 19, 2009

If you need to develop web pages in languages that cannot be represented by ASCII characters alone (such as Arabic), you have to use an encoding like UTF-8. The correct encoding can be set for the HTML page using the META tag -

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

If the charset is set correctly, advanced editors let you type the content directly in the language you want. Otherwise the content will appear as question marks or squares. If for some reason you cannot change the charset of your HTML, you will have to enter the content as decimal or hexadecimal numeric character references (for example, &#1575; and &#x627; both denote the same Arabic letter). When rendering the page, the browser will display the content correctly.

Unicode to Decimal/Hexadecimal

The conversion of a string in such a language to the decimal or hexadecimal format can be done in a few ways -

1) Create/Paste the language contents in Microsoft Word or Excel. Save the page as "Web page (*.htm, *.html)". Now view the saved page in Notepad. You can see the language contents in decimal format. (A sample decimal format is &#1575;&#1604;&#1605;&#1585;&#1575;&#1576;&#1593;)

2) Use Dreamweaver for editing the page. It automatically does the conversion to decimal.

3) Use this page - http://code.cside.com/3rdpage/us/unicode/converter.html. (This page can do the reverse also)
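
The same conversion can also be scripted. The following is only a minimal sketch: the function name toNumericEntities and its useHex flag are made up for illustration and are not part of any of the tools listed above. It leaves ASCII characters alone and turns every other character into its decimal or hexadecimal reference.

```javascript
// Hypothetical helper (name and parameters are illustrative assumptions):
// converts every non-ASCII character of "text" into a numeric character
// reference, decimal (&#1575;) by default or hexadecimal (&#x627;) if useHex is true.
function toNumericEntities(text, useHex) {
  var out = "";
  // for...of walks the string by code point, so characters outside the
  // Basic Multilingual Plane are not split into surrogate halves.
  for (var ch of text) {
    var codePoint = ch.codePointAt(0);
    if (codePoint < 128) {
      out += ch; // plain ASCII is copied through unchanged
    } else if (useHex) {
      out += "&#x" + codePoint.toString(16).toUpperCase() + ";";
    } else {
      out += "&#" + codePoint + ";";
    }
  }
  return out;
}
```

For example, calling toNumericEntities with the Arabic letters alef and lam ("\u0627\u0644") should give &#1575;&#1604;, which matches the start of the sample string shown in option 1.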

Decimal/Hex rendering problem while using javascript

In some scenarios, the browser will not render the decimal/hexadecimal notation as the characters of the language we want. Instead it displays the decimal/hexadecimal notation itself.

E.g. You want to populate a drop down using a javascript function. The function looks like this -

function populateCombo(values){
  // myCombo refers to the drop-down (select) element being filled
  for (var i = 0; i < values.length; i++) {
    myCombo.options[i] = new Option(values[i], values[i]);
  }
}

where "values" is a string array. If "values" contains decimal/hexadecimal references, the browser displays the references themselves instead of the actual string, because the Option constructor treats its arguments as plain text and does not parse them as HTML.

e.g.
values= new Array('&#1575;&#1604;&#1605;&#1585;&#1575;&#1576;&#1593;',
'&#1575;&#1604;&#1593;&#1585;&#1576;&#1610;&#1577;');

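To overcome this problem we need a small JavaScript function -

function decimal2Unicode(decimalValue){
  // Let the browser's HTML parser decode the references
  var elem = document.createElement("div");
  elem.innerHTML = decimalValue;
  // The first child is the text node holding the decoded characters
  return elem.firstChild.data;
}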

Now, the drop down can be populated like -

myCombo.options[i] = new Option(decimal2Unicode(values[i]), decimal2Unicode(values[i]));


The drop down now displays the strings correctly.
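
decimal2Unicode relies on the browser's HTML parser, so it needs a document object. Where none is available (for example when trying the code outside a browser), the numeric references can be decoded with a regular expression instead. This is only a sketch under that assumption: decodeNumericEntities is a made-up name, and unlike the DOM approach it handles only decimal and hexadecimal references, not named entities such as &amp;amp;.

```javascript
// Hypothetical DOM-free alternative (name is an illustrative assumption):
// decodes decimal (&#1575;) and hexadecimal (&#x627;) references only.
function decodeNumericEntities(text) {
  return text.replace(/&#(x[0-9a-f]+|[0-9]+);/gi, function (match, body) {
    var first = body.charAt(0);
    var isHex = first === "x" || first === "X";
    var codePoint = isHex ? parseInt(body.slice(1), 16) : parseInt(body, 10);
    // Values beyond the Unicode range are left exactly as they were written.
    return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
  });
}
```

It could be dropped into the loop in place of decimal2Unicode, e.g. new Option(decodeNumericEntities(values[i]), decodeNumericEntities(values[i])).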
