
Chinese Character Sets

Under Construction

Chinese characters may appear on Web pages as images (GIF or JPEG) or as text in a special character set. When they appear as text, you must have the matching fonts installed on your computer for them to display. The language and character set names appear under Character Set or Encoding in the View menu of your browser even if the fonts have not been downloaded. See an example page with the Traditional Chinese Big5 character set.

The character set is selected automatically when the document header contains the following META tag:
<meta http-equiv="Content-Type" content="text/html; charset=big5">
For Unicode:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta http-equiv="Content-Language" content="zh">
(Select View > Source to see the META tags.)
This should work if you have downloaded the character set and selected it in preferences (see below).
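
As a rough illustration of what the charset META tag does, the sketch below reads the charset name out of a page's header and uses it to decode the bytes. The function names sniff_charset and decode_page are hypothetical, the regular expression is a simplification rather than a full HTML parser, and real browsers follow a more involved detection procedure.

```python
import re

# Simplified pattern: finds charset=NAME inside a META tag (not a full HTML parser).
META_CHARSET = re.compile(rb'''<meta[^>]*charset\s*=\s*["']?([\w-]+)''', re.IGNORECASE)


def sniff_charset(raw: bytes, default: str = "latin-1") -> str:
    """Return the charset named in the first META tag, or `default` if none is found."""
    match = META_CHARSET.search(raw[:1024])
    return match.group(1).decode("ascii").lower() if match else default


def decode_page(raw: bytes) -> str:
    """Decode the page bytes using the sniffed charset."""
    return raw.decode(sniff_charset(raw), errors="replace")


# A tiny Big5 page: the body holds the two raw bytes for the character "East".
page = (b'<html><head><meta http-equiv="Content-Type" '
        b'content="text/html; charset=big5"></head>'
        b'<body>\xaa\x46</body></html>')

print(sniff_charset(page))
print(decode_page(page))
```

Without the META tag the sketch falls back to a default encoding, which is the situation where a reader has to pick the character set by hand from the View menu.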

Chinese characters may also be activated by other tags such as HTML, P and FONT. In this case, and when there is no META tag in the header, you have to select the character set from the menu bar: View > Encoding/Character Set.

You may also use something like:
Traditional Chinese: <span lang="zh-Hant" xml:lang="zh-Hant">農曆</span>
Simplified Chinese: <span lang="zh-Hans" xml:lang="zh-Hans">农历</span>

The character for "East", written as raw Big5 bytes in an element with STYLE="font-family:MingLiU;" (the bytes show up as ™F when the page is not read as Big5), should appear as 東 if you select Big5/Traditional Chinese in the View > Encoding/Character Set menu and have the Big5 character set installed.
Another way: "<FONT LANG="ZH-TW">™F</FONT>"
or "<P LANG="zh">™F</P>"

Or use a numeric character entity with the Unicode character number, e.g. "&#x6771;" (東).
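
To make the entity form concrete, here is a minimal Python sketch that converts between a character and its hexadecimal character entity. It uses the standard html module; the helper name to_hex_entity is made up for this example.

```python
import html


def to_hex_entity(ch: str) -> str:
    """Build a hexadecimal numeric character entity for a single character."""
    return f"&#x{ord(ch):X};"


# Decode the entity from the text above back into the character for "East".
east = html.unescape("&#x6771;")

print(east, to_hex_entity(east))
```

Because the entity names the Unicode character number directly, it does not depend on which character set the page itself is saved in.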

Viewing Chinese Character Sets in Netscape and Internet Explorer

View Menu (Encoding or Character Sets)
View menu encoding and font files:

Language             Where used                 Character Set  Download      Font Name
Chinese Traditional  Taiwan, Hong Kong etc.     Big5           ie3lpktw.exe  MingLiU
                                                EUC-TW
                                                UTF-8 *                      MingLiU
Chinese Simplified   China mainland, Singapore  GB2312         ie3lpkcn.exe  MS Song
                     and Malaysia               GBK/GBX                      MS Hei
                                                UTF-8 *                      MS Song

* UTF - Unicode uses "Han Radical-Stroke" for Chinese, Japanese, Korean languages.
PMingLiU is a replacement for MingLiU.
SimSun is a replacement for MS Song.

Preferences

Fonts
Encoding/Language Script  Windows Font  Mac Font (Serif)    Mac Font (MonoSpace)
Chinese Traditional       MingLiU       Apple LiSung Light  Taipei
Chinese Simplified        MS Song       Song                Beijing

Languages
Language                           Code *
Chinese (Taiwan)                   zh-tw
Chinese (PRC)                      zh-cn
Chinese (Hong Kong, S.A.R. China)  zh-hk
Chinese (Singapore)                zh-sg

* Note: The two-letter code zh is from the ISO 639-1 specification. New three-letter codes (chi/zho) were specified in ISO 639-2 (1998), but most programs still use the old codes.
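
The language codes above can be mapped to a script and a likely legacy character set with a small lookup. The sketch below is only an illustration: the function script_for and both tables are hypothetical, and the assignment of regions to scripts simply follows the "Where used" information on this page.

```python
# Region codes from the Languages table, mapped to the script used there
# (an assumption based on the "Where used" column on this page).
REGION_TO_SCRIPT = {
    "tw": "Traditional",
    "hk": "Traditional",
    "cn": "Simplified",
    "sg": "Simplified",
}

# Legacy character set usually paired with each script on this page.
SCRIPT_TO_CHARSET = {"Traditional": "big5", "Simplified": "gb2312"}


def script_for(tag: str):
    """Return 'Traditional' or 'Simplified' for a Chinese language tag, else None."""
    parts = tag.lower().replace("_", "-").split("-")
    if parts[0] != "zh":
        return None
    for sub in parts[1:]:
        if sub == "hant":
            return "Traditional"
        if sub == "hans":
            return "Simplified"
        if sub in REGION_TO_SCRIPT:
            return REGION_TO_SCRIPT[sub]
    return None  # a bare "zh" does not say which script is meant


for tag in ("zh-tw", "zh-cn", "zh-Hant", "zh"):
    script = script_for(tag)
    print(tag, script, SCRIPT_TO_CHARSET.get(script))
```

A bare zh tag leaves the script undecided, which is one reason the more specific codes exist.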

Display Chinese Character Sets in your browser

Yahoo Chinese Page
Viewing Chinese using Netscape

PC Operating System Support

Links:
Windows:

Macintosh OS X

  • Apple LiSung Included with Mac OS X 10.2
  • Taipei Included with Mac OS X 10.2
  • PMingLiU in Office 2004 for OS X
Test Pages:
Traditional Chinese www.microsoft.com/Taiwan/
Simplified Chinese www.microsoft.com/China/

Written Chinese

Written Chinese can take several forms, similar to script/cursive or printing for western languages. In 1956 and in 1964 China simplified several thousand characters to make learning Chinese less difficult.
Note: The characters below will not display correctly until you install the character set and select it from the View > Encoding/Character Set menu.
Only one character set (plus English) can be used with a document, so only one of the examples below will display correctly at any time.
Examples for East
Each hex-coded cell holds the raw bytes for that encoding, so only one column will be correct at a time. Change it with View > Encoding/Char. Set.

             Big5      GB        UTF-8         Pinyin
Traditional  東                  東            dong1
  Hex Code   \xAA\x46            \xE6\x9D\xB1
Simplified             东                      dong1
  Hex Code             \xB6\xAB

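
The hex codes for "East" can be checked with Python's built-in codecs. This is a small sketch, not part of the original page; hex_bytes is a hypothetical helper, and the last step shows what happens when Big5 bytes are read with the wrong character set (Mac Roman is used here as one example of a wrong choice).

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Encode text and format each byte as \\xNN, as in the table above."""
    return "".join(f"\\x{b:02X}" for b in text.encode(encoding))


examples = [
    ("Traditional", "東", "big5"),
    ("Traditional", "東", "utf-8"),
    ("Simplified", "东", "gb2312"),
    ("Simplified", "东", "utf-8"),
]

for label, ch, enc in examples:
    print(f"{label:12} {enc:7} {hex_bytes(ch, enc)}")

# The same two Big5 bytes interpreted as Mac Roman give unrelated symbols.
wrong = "東".encode("big5").decode("mac_roman")
print(wrong)
```

The bytes never change; only the table used to interpret them does, which is why selecting the right entry in the View > Encoding menu fixes the display.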
Simplified characters are now used in China and Singapore.
Traditional characters are used in Taiwan, Hong Kong, and most overseas communities.
Some other forms are: pinyin romanization, or zhuyin symbols.

Han unification is the process used by the authors of Unicode and the Universal Character Set to map multiple character sets of the CJK languages into a single set of unified characters. The Chinese characters are common to Chinese (where they are called hanzi), Japanese (where they are called kanji), and Korean (where they are called hanja).

Unicode CJK characters are grouped into several sets: CJK Compatibility, CJK Unified Ideographs, CJK Compatibility Ideographs, CJK Compatibility Forms, and CJK Miscellaneous.
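
One way to see which group a character falls in is to look at its Unicode character name. The sketch below uses Python's standard unicodedata module; the function cjk_kind is hypothetical, and sorting by name prefix is a rough heuristic rather than an official classification.

```python
import unicodedata


def cjk_kind(ch: str) -> str:
    """Classify a character by the prefix of its Unicode name (a rough heuristic)."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "unified"
    if name.startswith("CJK COMPATIBILITY IDEOGRAPH"):
        return "compatibility"
    return "other"


for ch in ("東", "东", "\uF900", "A"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "?"), cjk_kind(ch))
```

Both the Traditional and the Simplified form of "East" are unified ideographs, each with its own code point.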

Romanization of Chinese Characters
There are several different systems for transliterating Chinese names into the Roman alphabet (Romanization or Chinese Phonetic Alphabet).
Wade-Giles (1912) was developed by Thomas Francis Wade, a British ambassador in China and Chinese scholar who was the first professor of Chinese at Cambridge University.
Taiwanese (Hō-ló-oē) has its own century-old romanization, Pe̍h-ōe-jī (POJ), which was developed by Presbyterian missionaries in Taiwan in the 19th century and is similar to Wade-Giles.

Pinyin (1949) developed by
It is based on the pronunciation of the Beijing dialect of Mandarin Chinese.
In 1978 mainland China officially adopted Pinyin.
The U.S. Library of Congress adopted Pinyin in 1999.
Taiwan stuck with Wade-Giles.

Spoken Chinese
Although the written languages started with basically the same characters (even though they have evolved differently over time), the pronunciation of the same character differs among the Mandarin, Xiang, Gan, Hakka, Wu, Fujianese, and Cantonese dialect groups. Dialects of the Mandarin group are spoken in three-quarters of the country by roughly two-thirds of the population, which is one of the reasons why Mandarin was chosen as the national language. Cantonese is the main dialect of Guangdong Province, Hong Kong, and many overseas Chinese communities. The major dialects spoken in Taiwan are Southern Fujianese (or Taiwanese) and Hakka, spoken by 75 percent and 15 percent of the natives respectively. Mandarin is the primary language used in schools, government, and most business offices....
Mandarin is the de-facto standard spoken language.
See Also: Chinese Language at Taiwan Govt. Info. Office.

References:
Fonts at WorldLanguage.com
Displaying Chinese On a Macintosh
Displaying Chinese On the Web
Reading Chinese in the Internet
Displaying Chinese Text
Unicode
Learning Chinese Reference List
Pinyin Luc Devroye's Chinese Font page
Chinese character encoding standards - Big 5, GB code, GB2312, GBK, Unicode :: Pinyin Joe
Displaying Foreign Languages

Chinese Dictionaries
Dictionaries at: Chinese-English Online Dictionary, Chinese Char. & Culture, Chinese Encyclopedia, Yahoo, Mandarin Tools, Sun Rain

Chinese writing software

Twin Bridge
Unionway
Mandarin Tools

Chinese Pronunciations

ocrat.com

Other Language Codes

Language Codes

Terms

CJK (Chinese/Japanese/Korean)
GB (GuoBiao) - Character Set used for Simplified Characters
HZ - a 7-bit data format proposed for arbitrarily mixed GB and ASCII text file exchange.
Big5 - Character Set used for Traditional Characters
EUC - Extended UNIX Code
Glyph - In typography, the shape given in a particular typeface to a specific symbol.
UTF - Unicode Transformation Format - A method for converting 16-bit Unicode
      characters into 7- or 8-bit characters
ISO-2022-CN - yet another new standard being quietly developed by
            Chinese software engineers in China and Taiwan
SIP - Supplementary Ideographic Plane
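
As an illustration of the HZ entry above, the sketch below encodes mixed ASCII and GB text with Python's built-in "hz" codec. It is only a sketch; the helper name to_hz is made up, and the comparison with the GB2312 bytes is included to show how the 7-bit form relates to the 8-bit one.

```python
def to_hz(text: str) -> bytes:
    """Encode mixed ASCII and Simplified Chinese text as 7-bit HZ."""
    return text.encode("hz")


gb = "东".encode("gb2312")                 # two 8-bit GB2312 bytes
seven_bit = bytes(b - 0x80 for b in gb)    # the same bytes with the high bit cleared

hz = to_hz("East 东")

print(gb.hex(), seven_bit, hz)
print(hz.decode("hz"))
```

The GB characters travel between the ~{ and ~} markers as plain 7-bit bytes, so the whole text stays within ASCII.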

Links:
Character styles at transname.com

last updated 16 Feb 2002