What is Unicode? Judging from the name alone, we can guess that “uni” refers to “single” or possibly “unique”, and that “code” most likely refers to coding.

Whether accurate or not, these guesses basically summarise what Unicode is. Unicode gives a unique number to every character in the scripts it covers, unifying the use of scripts globally by standardising how characters are encoded. The Unicode Consortium manages this data by dividing it into character code sets. The number each character is given is called a code point and is usually written as “U+” (minus the quotation marks) followed by four to six hexadecimal digits. An example would be U+0023, the code point for #.

Screenshot from Endmemo
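
If you would like to see the notation in action, here is a minimal Python sketch. The function name format_code_point is my own invention for illustration; only the built-in ord() comes from Python itself. It turns a character into its "U+" form, padding to at least four hexadecimal digits.

```python
def format_code_point(char: str) -> str:
    """Return the U+ notation for a single character."""
    # ord() gives the code point as an integer.
    # :04X pads to at least four uppercase hex digits.
    return f"U+{ord(char):04X}"


print(format_code_point("#"))   # U+0023
print(format_code_point("漢"))  # U+6F22
print(format_code_point("😀"))  # U+1F600
```

Note that the last example needs five digits: code points above U+FFFF are simply written with more digits.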

There are also websites that let you search for Unicode code points.
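
You can also do a similar lookup offline. The sketch below uses Python's standard unicodedata module; the helper name describe is just something I made up for this example. It goes from a character to its code point and official name, and back from a name to the character.

```python
import unicodedata


def describe(char: str) -> str:
    """Return the code point and official Unicode name of a character."""
    return f"U+{ord(char):04X} {unicodedata.name(char)}"


print(describe("#"))                      # U+0023 NUMBER SIGN
print(unicodedata.lookup("NUMBER SIGN"))  # #
```

The names available depend on the version of the Unicode database bundled with your Python installation, so very new characters may not be found.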

The Unicode database is ever-expanding as it tries to accommodate the many scripts that exist in the world. A particular challenge for Unicode is accounting for all the variations of Han characters used across East Asian countries and regions.
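
To give a rough feel for how Han characters are spread out, here is a small sketch. It assumes we only care about the main CJK Unified Ideographs block (U+4E00 to U+9FFF); the function name is hypothetical, and a real check would need to cover the extension blocks as well.

```python
import unicodedata


def is_basic_cjk_ideograph(char: str) -> bool:
    """True if char is in the main CJK Unified Ideographs block."""
    # Assumption: only the U+4E00 to U+9FFF block is checked here.
    return 0x4E00 <= ord(char) <= 0x9FFF


print(is_basic_cjk_ideograph("漢"))  # True
print(is_basic_cjk_ideograph("#"))   # False
print(unicodedata.name("漢"))        # CJK UNIFIED IDEOGRAPH-6F22

# U+20000 is a Han character too, but it lives in an extension block.
print(is_basic_cjk_ideograph(chr(0x20000)))  # False
```

The last line shows why a single range is not enough: Han characters added later sit in separate extension blocks, some with five-digit code points.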

To learn more about this, I have a page containing a presentation on Unihan pinned to the top of this website's homepage. Alternatively, you can find it by clicking here (o˘◡˘o).

