Friday, February 24, 2006

utf-8 is an encoding for the unicode standard

just put a pot full of cold water on my head & you will soon get a boiling pot of water, for tea or for ramen.

UCS, unicode, ISO 10646, RFC 3629, CJK, Jamo, Plane, encoding, decoding, ..., too much vocabulary and too many acronyms.

After reading about 10 dozen different articles over the past 2 weeks, and re-reading some, I think I finally understand how ISO 10646 and Unicode differ, and what the utf-8 and UCS encodings/decodings are.

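unicode and ISO 10646 are standards, and they are kept in sync so that both define the same character set (ISO 10646 calls it UCS, the Universal Character Set). think of driving standards: brits, australians and japanese drive on the opposite side of the road from the US, whereas many other countries drive on the same side as the US. utf-8, utf-16, UCS-2 and UCS-4 are encoding methods/rules that implement the standards; the UTF names come from the unicode side and the UCS-2/UCS-4 names from the ISO 10646 side, though utf-8 is defined by both.
Thus utf-8 and UCS-4 would be vehicles with the steering wheel and driver's seat configured to comply with the driving standard.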
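to convince myself of the standard-versus-encoding split, here is a small python sketch. the standard assigns one number (a code point) to a character, and each encoding turns that number into a different run of bytes. the sample character and the helper name `hex_bytes` are my own picks for illustration, not anything from the standards.

```python
# One code point (assigned by the standard), several byte sequences (one per encoding).
ch = "\uac00"  # HANGUL SYLLABLE GA, chosen only as an example character


def hex_bytes(data: bytes) -> str:
    """Hypothetical helper: show bytes as space-separated hex pairs."""
    return " ".join(f"{b:02x}" for b in data)


code_point = ord(ch)                  # the number the standard assigns
utf8_bytes = ch.encode("utf-8")       # 3 bytes for this character
utf16_bytes = ch.encode("utf-16-be")  # 2 bytes, big-endian
utf32_bytes = ch.encode("utf-32-be")  # 4 bytes, big-endian

print(f"code point: U+{code_point:04X}")
print("utf-8 :", hex_bytes(utf8_bytes))
print("utf-16:", hex_bytes(utf16_bytes))
print("utf-32:", hex_bytes(utf32_bytes))
```

all three byte sequences decode back to the same character, which is the whole point: same standard, different vehicles.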

So far so good.

Next, the number of local encodings/decodings different countries came up with, and the number of possible combinations for encoding certain characters, are overwhelming. utf-8 seems to be good enough for many cases in current use, but there are also utf-16 and utf-32, each with little-endian and big-endian variants (utf-16le/utf-16be, utf-32le/utf-32be), to support wide characters and the extended set of all existing/possible languages on earth.
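the utf-16 and utf-32 families each come in little-endian and big-endian flavors, which differ only in byte order. a minimal python sketch of that, using the codec names python ships with; the sample character is arbitrary, and the plain "utf-16" codec prepends a byte order mark in whatever order the machine prefers, so i only check that it is one of the two known marks.

```python
import codecs

ch = "A"  # U+0041, an arbitrary sample character

utf16_le = ch.encode("utf-16-le")  # low byte first
utf16_be = ch.encode("utf-16-be")  # high byte first
utf32_le = ch.encode("utf-32-le")
utf32_be = ch.encode("utf-32-be")

# Without an explicit byte order, Python writes a byte order mark (BOM) first
# so that a decoder can tell which order follows.
utf16_with_bom = ch.encode("utf-16")
bom = utf16_with_bom[:2]

print(utf16_le, utf16_be)
print(utf32_le, utf32_be)
print("BOM is little-endian:", bom == codecs.BOM_UTF16_LE)
```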
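utf-8 was originally designed to handle up to 2^31 code points, using up to 6 bytes per character. RFC 3629 now limits it to U+0000 through U+10FFFF, i.e. 17 planes of 2^16 code points each, using at most 4 bytes per character.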
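as far as i can tell from RFC 3629, utf-8 stops at U+10FFFF and never needs more than 4 bytes per character. a quick python check of the byte counts; the four sample characters are my own picks, one per length.

```python
import sys

# One sample character per UTF-8 length (my own picks, for illustration).
samples = {
    "A": 1,           # U+0041, ASCII range
    "\u00e9": 2,      # U+00E9, LATIN SMALL LETTER E WITH ACUTE
    "\uac00": 3,      # U+AC00, HANGUL SYLLABLE GA
    "\U0001F600": 4,  # U+1F600, outside the first plane
}

for ch, expected_len in samples.items():
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s)")

# The highest code point Python (and RFC 3629) allows.
highest = chr(sys.maxunicode)
highest_bytes = highest.encode("utf-8")
planes = (sys.maxunicode + 1) // 2**16
print(f"highest code point: U+{sys.maxunicode:X}, planes: {planes}")
```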

no wonder computer hardware and memory had to evolve so fast. with old-day computer hardware, we couldn't even implement half the language set.

we're not using unicode to its fullest extent yet, and the hardware it requires is already pretty big as is.

it's good i can relate hardware history to unicode history.

encoding/decoding details of unicode also look about as obscure as assembly code, unlike 4th-generation languages, which are very easy for humans to read.

slowly, i'm making progress at understanding bits of utf-8 and its usage.
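to see the bits for myself, here is a hand-rolled utf-8 encoder in python. it is only a sketch of the bit layout described in RFC 3629, not how any real library does it, and the function name `utf8_encode` is my own. python's built-in `str.encode` is the thing to use in practice; i only compare against it to check my understanding.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point by hand, following the RFC 3629 bit layout."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        # Out of range, or a surrogate, which UTF-8 must not encode.
        raise ValueError(f"not an encodable code point: {cp:#x}")
    if cp < 0x80:
        # 0xxxxxxx
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


# Compare the hand-rolled result with the built-in encoder.
for cp in (0x41, 0xE9, 0xAC00, 0x1F600):
    print(f"U+{cp:04X}", utf8_encode(cp) == chr(cp).encode("utf-8"))
```

the leading bits of the first byte say how many bytes follow, and every continuation byte starts with 10, so a decoder can always find where a character begins.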

hmm.... maybe i'll change to frying pan from pot on my head, it's time to fry some eggs for breakfast.

