General Chinese: romanization chaos in early 20th century China
In the early 1900s, the Chinese intelligentsia decided that China needed modernization. This was a common position among non-Western nations at the time (Japan from 1868 onward, Turkey in the 1920s), which saw that their prosperity and military power were categorically inferior to those of industrialized European nations, and sought to do something about it.
China became a Republic in 1912. The previous regime, the Qing empire, did not see itself as a Westphalian nation-state. Rather, it was the centre of all civilization, from which civilization radiated outwards towards tributaries (e.g. Vietnam), until gradually it decayed into barbarians. The Republic sought to become a Westphalian nation-state, with well-defined borders and a list of citizens who belonged there, and foreigners who did not.
A modern nation-state requires a shared national culture, one that invests the common people in the success of the state. This requires mass media like radio and newspapers, and the latter require literacy for the masses. Literacy posed a massive challenge to China: its writing system is infamously difficult, and the intelligentsia of the time knew it. How would it be possible to make random peasants learn the 2000 characters required for basic newspaper comprehension?
Of course, in our timeline, we know this worked out. Chinese schoolchildren are required to learn 3500 characters by graduation, and many of them surely do. Chinese still uses its ancient and beautiful characters, in some cases simplified to require fewer pencil-strokes to write, but no less numerous in variety.
Nevertheless, reformers of the 1920s were very concerned with massifying Chinese. Some factions actually wanted to do away with Chinese characters in mass media altogether, and leave them for specialized academics who study the classics. One such reformer was Yuen Ren Chao (1892 Tianjin, China–1982 Cambridge, Massachusetts, USA) who came up with the delightfully unhinged system we’ll talk about today.
Gwoyeu Romatzyh
Chao’s main proposal was Gwoyeu Romatzyh, a system that encoded the Beijing dialect, including its tones, entirely in Roman letters. It was adopted as the official system by the Republic of China in 1928. It is the same type of thing as Pinyin, which is used today, though unlike Pinyin, it might have been intended as the writing system for print media (Pinyin is only an educational and computer input tool).
Gwoyeu Romatzyh corresponds to Guóyǔ Luómǎzì in Pinyin (国语罗马字, lit. national language Latin alphabet), to give you an idea of how it encodes sounds. It indicates tones using different spellings: for example, an -r suffix, when it appears, is silent and indicates tone 2, while doubling a letter indicates tone 3. But tone 2 is sometimes indicated in a different way (e.g. writing gwo instead of tone-1 guo). It’s a mess (Bernhard Karlgren 1928, pg. 18). But it gets even better.
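Before moving on, here is the tonal-spelling idea as a tiny sketch. It is only a lookup table for two syllables, not the real rule set: the names `GR_SPELLINGS` and `gr_spell` are made up for this post, and the tone-4 forms with -h go beyond the examples above (they are the usual Gwoyeu Romatzyh spellings as far as I know).

```python
# Gwoyeu Romatzyh spellings for two syllables, keyed by tone number (1-4).
# Illustrative lookup only; the actual system has many more rules.
GR_SPELLINGS = {
    "guo": {1: "guo", 2: "gwo", 3: "guoo", 4: "guoh"},   # tone 2 changes u to w
    "cha": {1: "cha", 2: "char", 3: "chaa", 4: "chah"},  # tone 2 adds a silent -r
}


def gr_spell(base: str, tone: int) -> str:
    """Return the tonal spelling for a base syllable (hypothetical helper)."""
    return GR_SPELLINGS[base][tone]


if __name__ == "__main__":
    for base in GR_SPELLINGS:
        print(base, [gr_spell(base, tone) for tone in (1, 2, 3, 4)])
```

Note how the same tone 2 shows up as a changed letter in one syllable and as an added -r in the other, which is exactly the inconsistency complained about above.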
General Chinese
Chinese dialects aren’t mutually intelligible when spoken, but they use very similar grammar and almost all words are cognates, which are written the same way in Chinese characters. One cool feature about written Chinese, therefore, is that it is universal among dialects: a Shanghainese text can kinda be read in Beijingese (Mandarin) through the characters on the page, and vice versa.
Note: they wouldn’t write the exact same sentence, they just share lots of cognates. It’s as if we wrote European Romance languages using ideograms, so that, for example, Spanish “me gusta oler las flores”, Italian “mi piace odorare i fiori”, French “J’aime sentir les fleurs”, Romanian “îmi place să miros florile” and Catalan “m’agrada olorar les flors” were all written with common characters for “me” and “flower”.
For example, see this Shanghainese phrase.
If you read it in Mandarin, it’s kind of weird, but understandable. You have to learn that 侬 (nóng) is 你 (nǐ), that is “you”. But then 吃过饭了 is straightforwardly “have eaten (a meal)” in both dialects, so you get where the whole thing is going.
Yuen Ren Chao liked this feature of Chinese characters so much that he came up with a romanization system which would encode several dialects simultaneously.
How can this be possible? Well, each word is written so as to preserve the distinctions that any of the dialects make, and when you read it, you just drop the distinctions that your own dialect doesn’t make. For example, Yuen Ren Chao’s name would be spelled “Dhyao Qiuan Remm”, and each dialect ignores some distinctions of vowels. The list of “dialects” includes Japanese, Korean and Vietnamese, since they also have cognate Chinese words that were borrowed at some point.
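The mechanism can be sketched with a toy model. Everything here is invented for illustration: the dialect names, the spellings and the merger rule are not taken from General Chinese, and `MERGERS` and `read_final` are hypothetical names. The only point is to show how one spelling can serve readers who merge different sounds.

```python
# Toy model: one shared spelling, with per-dialect rules saying which
# written distinctions a reader collapses. NOT actual General Chinese data.
MERGERS = {
    "dialect_a": {"m": "n"},  # hypothetical: final -m is read as -n
    "dialect_b": {},          # hypothetical: both finals stay distinct
}


def read_final(spelling: str, dialect: str) -> str:
    """Apply a dialect's merger to the last letter of a shared spelling."""
    rules = MERGERS[dialect]
    final = spelling[-1]
    return spelling[:-1] + rules.get(final, final)


if __name__ == "__main__":
    for word in ("sam", "san"):
        for dialect in MERGERS:
            print(word, dialect, read_final(word, dialect))
```

In this toy, a reader of dialect_a reads "sam" and "san" the same way, while a reader of dialect_b keeps them apart, even though both are looking at the same text.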
I don’t yet understand which consonants are merged, but here is the Wikipedia table in its full glory.
Unhinged.



