Guide for Canto Font v2 Early Access

What Problem This Solves

Some people like videos, some like to read.

People say “Cantonese is a hard language”. I disagree. There are 86,000,000 native speakers; half would be below average in whatever measure you choose, and they can nonetheless speak Cantonese. What Cantonese is: it is a frustrating language to learn and teach.

Cantonese is a frustrating language to learn. Modern polyglots agree that certain materials are indispensable. If I were to learn Japanese today, I would go out and get myself:

  • a frequency dictionary
  • subtitled movies / shows / song-with-lyrics
  • some grammar workbooks
  • an Anki deck for vocabulary
  • computer games

I would take a few days to memorize hiragana (the phonetic system, which is used to annotate the Hanzi); within a week I would be well on my way.

For Cantonese? You’ll have a hard time finding comprehensive, quality versions of the above for colloquial Cantonese that are compatible with one another. The few materials with romanized pronunciation that exist come in one of two not-quite-compatible forms (some have numbers and others have squiggles), may be in Simplified or Traditional Chinese (but you can’t tell), and then your friends tell you “we don’t talk the way it’s written anyway”. It is frustrating trying to learn with such a paucity of support.


Cantonese is a frustrating language to teach. You may be a teacher (or parent) trying to follow best practices, and to start teaching with phonetics. After all, English reading instruction converged on phonics, Japanese uses hiragana/katakana, Korean uses (phonetic) Hangul, Taiwan uses bopomofo and PRC uses pinyin for Mandarin… and Vietnamese replaced the written script altogether. So you start looking for a romanization system (which you weren’t taught in school).

Quickly you run into the Yale/Jyutping split, and note that works from the last 15 years seem to be mostly Jyutping-focussed; but choosing Jyutping means you forgo Matthews and Yip, the only comprehensive grammar / workbook for the language, which was written with Yale. (Are you supposed to learn both just to get started?) You run into the same issues as our self-directed learner above.

You try to make your own stuff for your kid / student, and find that “Cantonese to Jyutping converters” just don’t seem to work:

Others gave up the fight altogether:

Yet another that isn’t even up for the job wants to charge you:

You spend 5 minutes putting together the Jyutping that you think is right (while having nagging doubts about the tones, the c/j, the –yu– and the aa/a), and spend another 5 minutes trying to finesse them over the characters (but they just never line up right):

…and your student looks at those tone numbers like alien hieroglyphics. Kiddo doesn’t hear the difference and you have no way to show them. They keep forgetting what the numbers mean. You keep forgetting what the numbers mean and repeat 思史試時市事 with increasing despair.

This is a basic task. Why is it so hard? Why should annotating a sentence take 10 minutes? Why can’t it be just like this 👇?


In my mind, all languages start out frustrating and were tamed through decades (or centuries) of deliberate cultivation, with investment from the host nation-states, and integrated as part of education through concerted efforts. People came together; they believed that their language and culture were worth something. This faith moved mountains.

Hong Kong is the closest to a “host nation-state” for Cantonese. Colonizing powers view local languages with ambivalence, running at various times along the gamut of curiosity, indifference, and hostility. For locals, “home” is a borrowed place on borrowed time. We live at the mercy of great powers, and the future surely must align with the more useful, more prestigious languages of English and Mandarin. Cantonese needed focussed, deliberate cultivation fifty years ago, but only a small band of brothers and sisters believed that taming the language was doable and worth doing.

And so Cantonese remains a frustrating language, and this frustration feeds on itself. Outside of high-density immersion, and/or high-intensity study, acquisition is “too hard”. Reputation as a “hard language” blocks new learners, assertion as “dying language” removes incentives for learning, and indifference makes it impossible to fund and sustain the long, patient efforts needed to cultivate language pedagogy.


I believe we need to modernize Cantonese pedagogy. Hong Kong remains a cosmopolitan city (at least for now), and making the language accessible to non-native Chinese is important. With our recent (unwilling) diaspora, some parents want to pass on some cultural identity but find it hard to do so, and some children, in time, wish to find where they came from.

The key piece is to make “pedagogy with phonetics” accessible to learners, teacher/parents, and for commercial entities. We must fulfill these four FUNCTIONAL REQUIREMENTS:

  • reduce the 10-minute task to 10 milliseconds
  • with annotations of 99.9% accuracy
  • encompasses 100% readings of 100% readable characters
  • presented in an attractive, inviting form

The right solution should return this 👇 the moment you’ve finished input, with clearly marked placement and inflections of tones, and suitable colors to guide the eye onto the Jyutping or Chinese text as necessary.

Why a Font Is the Right Solution

The conventional expectation of the 2020s is to prepare an app, be it a desktop application, a mobile app, or a browser-based tool. If we could accomplish the four functional requirements with a (web) app, it would be an excellent contribution; and it would be dreaming too small.

Web stuff survives only as long as the creator expends time and money tending to it. On the whole, web-based materials are frighteningly ephemeral, with optimistic lifetimes of 5-15 years. Mobile and desktop applications are equally transient: everyone has had the experience of a Win/MacOS/Linux/Android/iOS upgrade, after which some applications just don’t work any more. (Who knows why.)

Applications also have the issue of being sensitive to the device and operating system. We need to build and maintain the app, at a minimum, for current and future versions of Windows and Mac and Linux. We cannot build apps for devices and platforms that do not yet exist.

Lastly, standalone applications may fulfill the requirements, for example, by giving the users an image to copy from. However, static images are not useful to typesetters preparing published works; nor would they be useful for, say, writing lyrics in sheet music. The ideal tool would compose with and extend the capacities of existing applications, and anticipate future uses that we cannot yet imagine. It should be able to export / represent its data in a way that other programmers can build on top of. Our ideal tool thus has five more TECHNICAL REQUIREMENTS:

  • Composable into all that the user wishes to do, present and future,
  • Works on all platforms and devices, present and future,
  • Import/use/export a transparent data structure that is accessible programmatically
  • Require no internet connection
  • Require no maintenance once delivered

A solution that fulfills both the Functional Requirements and the Technical Requirements would set a solid, universal foundation for the next thirty years. A new genre of technology needed to be invented, and it would be found in the old, classic genre of fonts. I am relieved and gratified to let you know that the Cantonese Font v2 (“Pokfield”) has mostly met all the Requirements on MacOS/iOS, and has additional helpful features for teaching and learning built-in.

Beyond the functional and technical requirements, as a unique, foundational piece of culture-technology, the artefact must also fulfill PUBLIC DUTIES:

  • The tool must be permissively licensed.
  • Everyone must be able to use the central features, even for commercial purposes, with no need to seek permission.
  • The core edition must be available on a pay-if-you-can basis.

These societal obligations need to be carefully balanced with financial needs for future projects. With the v1 release, 5% of users provided a > $5 donation and another 5% provided a $0.5 – $5 donation (mostly eaten by the payment provider’s fixed charges). Overall, pay-if-you-can on its own returned 1% of what is needed to support a single developer. In the next months, I’ll seek to build and provide extra, add-on material as incentives for financially supporting the project (you can let me know of your suggestions).

Requirement | MacOS | Windows | Linux | iOS | Android | Kindle/Kobo

  • Instantaneous conversion
  • Accuracy: 99.7% / 99.9% with segmentation (see benchmarks here); 99.7% / 99.9%; 99.7% / 99.9%; 99.7% / 99.9%
  • 100% readings of 100% readable characters
  • Colored presentation with tone marks: ✅ Vector Styler ⚠️ Usually tone marks only; n/a
  • Composable into applications: ✅ Vector Styler, Affinity Designer; n/a
  • Version compatibility: 2019 onwards; depends on applications; 2022.10 onwards; 2019 onwards
  • Transparent data structure: n/a
  • Require no internet connection
  • Does not “rot”
  • Override pronunciation: ✅ Vector Styler, Affinity Designer 🚫 in general; n/a
  • Bi-directional translation, idioms, grammar: ✅ Vector Styler ⚠️ most apps translation En -> Zh only; n/a
  • Notes: All applications that use CoreText enjoy the full features. Applications rolling their own font renderer often do not follow the specifications with full care, and support only partial features (Adobe!); ✅ Full support in Vector Styler ⚠️ support often partial in other applications; Tested on Ubuntu; needs to be wrapped into an app for App Store; eReaders are monochrome, and not for editing.

Fulfilment of Functional and Technical Requirements

Credits

Eight people / groups instrumental to the development of Pokfield:

  • Words.hk 粵典. The decade-long effort of this group, and their willingness to make their data open-usage, make it possible to identify the contexts of Cantonese usage.
  • Prof Qin Lu’s group at Poly U, who had assembled dictionary readings for characters
  • KT Shek 石見田 (pseudonym) who not only digitized every dictionary with Cantonese pronunciations, but also added recommendations for other usages, and provided a very convenient source of usage at jyut.net.
  • Nathan Hammond, who provided the performant draw code for individual glyphs, reducing what would have taken weeks of processing down to seconds; and for his insights and enthusiasm.
  • Simon Cozens, whose handbook on OpenType programming was indispensable in my ab initio learning of the arcane arts of fonts.
  • Georg / Rainer from Glyphs, who entertained many of my eccentric asks.
  • v1 supporters, whose feedback and encouragements kept up morale; and morale is essential for the long lonely slog.
  • Eli Sanchez Arteaga, the Lilo that kept the grumbly eccentric Stitch from self-destructing; Eli Sanchez Tango which funded the open-ended (but thankfully “only” two years) development.

How to

How to Install the Font

Installing the font on MacOS
  1. Select the font files you wish to install. For example, omit the monochrome (black and white) variants if you only want to see your Jyutping in color.
  2. Right-click (ctrl-click) and open in Font Book. Font Book is the MacOS font management software. When opening font files, it will first validate the content to see if it is correct and safe. If all checks out, the Install button becomes available.
  3. Click Install. Depending on how many font variants you are installing, this may take up to 10 seconds.

That’s it! You should now be able to choose the font (and its variants) in any application. The whole process takes 4 clicks and 30 seconds.

(Are you a Windows / Linux user? Can you post some screenshots / steps for installation, or record a raw video for me to edit?)

First Steps

First take (Mac/Keynote)

If you are a MacOS user, I recommend starting with Keynote.

If you are a Windows user, I recommend starting with VectorStyler.

  1. Begin with a plain text box, and enter some Chinese text. You can do this in any way you please; the video shows hand-writing, typing (the fancy phrase recognition is courtesy of TypeDuck), voice, and copy-pasted prose.
  2. With the Chinese text selected, change the font to VF Cantonese. (During the test period, the name contains the version number; the public release will be just VF Cantonese.)
  3. Change to the Standard Jyutping variant for color. (黑白粵拼 for Win / Affinity Designer.)

What used to be uncertain, tedious, and time-consuming has now been automated for you.

Over-riding Suggestions

The suggestions provided by the font are benchmarked at 99.7% accuracy for normal text, but you may wish to correct the remaining words. There are also cases where you want to show a special standalone sound. The following is how you can override the font’s default suggestions.

Mechanism 1: Identifying Word Boundaries

Canto Font suggests the reading based on the context it is in. When multiple contexts are plausible, it can be confused. For example, the phrase 天上落雪 is read by the font as 天 上落 雪, and thus 上 is assigned as soeng5. In this case, all you need to do is to add a vertical bar | (usually found above the Enter key on US/UK keyboards) to indicate word boundaries: 天上|落雪. The bar is subsumed. You should always attempt to do this first before trying the next step.
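If your text already comes out of a word segmenter, adding the bars can be automated. Here is a minimal sketch in Python; the function name `mark_boundaries` is my own invention for illustration, and the only thing it relies on is the | convention described above.

```python
def mark_boundaries(words):
    """Join pre-segmented words with the | bar that marks a word boundary."""
    # Skip empty pieces so that no doubled bar ends up in the text.
    return "|".join(word for word in words if word)


print(mark_boundaries(["天上", "落雪"]))
```

In practice you would only add a bar where the font guesses wrong, so this is more useful for batch-preparing text than for everyday typing.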

Mechanism 2: Providing the Jyutping Yourself

Support for this is almost universal on Mac; on Windows it is available only in Affinity Designer.

We will show this manual override with 區議員 as an example. The font recognizes 區議員 as “district councillor”, but you meant 區 as a surname followed by 議員, as in “Councillor Au”. What you do is append a `.jyutping` to the character you wish to modify; in this case, we append .au1 to 區, and our entire typing is now 區.au1議員.

Note that you need to provide a jyutping that is permissible for that character: 區.siu2議.zyu1員.zyu1 will not work!
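If you prepare text with a script, the override markup is easy to generate. The following Python sketch is only an illustration: `override` and `JYUTPING_SHAPE` are hypothetical names, and the check is a loose shape test (lowercase letters plus a tone digit from 1 to 6). It cannot know which readings the font permits for a given character, so a string that passes here may still be ignored by the font.

```python
import re

# Loose shape check only: lowercase letters followed by one tone digit 1-6.
# Assumption: this does NOT know which readings the font allows per character.
JYUTPING_SHAPE = re.compile(r"[a-z]+[1-6]")


def override(char, jyutping):
    """Return one character with a manual reading appended, e.g. 區.au1"""
    if len(char) != 1:
        raise ValueError("an override applies to exactly one character")
    if not JYUTPING_SHAPE.fullmatch(jyutping):
        raise ValueError(f"not shaped like Jyutping: {jyutping!r}")
    return f"{char}.{jyutping}"


print(override("區", "au1") + "議員")
```

The shape test catches slips such as a missing tone number early, which is the kind of mistake that is hard to spot once the text is on the page.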

“Translation”

The font lets you “translate” English text to Chinese, by tagging a word in mustache braces { }. Try for yourself: type the following text, then tag any (or all!) of the words:

Monkeys ate bananas in the forest, before one monkey took one banana to the Admiralty MTR station.

The “dictionary” contains about 1,000 proper nouns (covering countries, major cities, religions, and locations in Hong Kong), and about 3,000 general terms and their variations (capitalized or not, pluralized, and conjugated / tenses).

It also “translates” in the other direction by tagging Chinese words. Try

滿地可 人 愛 民主 政府
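Tagging by hand is fine for a sentence or two; for longer passages, a few lines of Python can wrap the chosen words in braces. This is a sketch under assumptions: `tag_words` is a name made up for this example, it only tags whole words, and it says nothing about whether the font’s dictionary actually contains the word you tag.

```python
import re


def tag_words(text, words):
    """Wrap each listed word in { } wherever it appears as a whole word."""
    unique = sorted(set(words), key=len, reverse=True)
    if not unique:
        return text
    # One pass over the text, so a word is never wrapped twice.
    pattern = r"\b(" + "|".join(re.escape(word) for word in unique) + r")\b"
    return re.sub(pattern, r"{\1}", text)


print(tag_words("Monkeys ate bananas in the forest.", ["bananas", "forest"]))
print(tag_words("滿地可 人 愛 民主 政府", ["民主"]))
```

Because matching is by whole word, tagging "bananas" leaves a later "banana" untouched, which keeps the choice of what to translate in your hands.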

Toggling Jyutping

Often you want to show Jyutping on some but not all characters. If you have installed a No Jyutping font variant earlier, you can now toggle the Jyutping by changing the font variant.

Direct selection and the Italics keyboard shortcut both work

For your convenience, the No Jyutping and Standard Jyutping variants are defined with an “Italics”-like relationship. This means you can select some text, and hit ctrl/cmd-I to turn on and off the Jyutping. (If you installed the Bold variant, you can also hit ctrl/cmd-B to toggle bold. Useful for headings.)

Aligning the metrics is surprisingly difficult, and there is a small shift encountered with characters annotated with long Jyutping. This may be fixed in Summer 2024.

Progressions for All Levels

Beyond just “on” and “off”, Canto Font v2 comes with other variants, serving the very early beginner who relies heavily on Jyutping, to the advanced learner, or even traditional scholars.

The Large Jyutping font variant gives users who currently favor romanization, or who are just interested in speaking/listening, something more tailored to their needs.

The Heritage variant uses 1.Ming as the base-font, which contains classical print components (see 請,飲), along with classic LSHK monochrome Jyutping-as-superscript notation. With the ease of toggling between the variety of styles, there is always a gentle ramp to go up and down wherever your students (or you) find yourself.

Cultural/Convenience Content

The font embeds a range of cultural/convenience content for the benefit of teachers and learners. These can be accessed using the {keyword: number} syntax. At the moment only the idiom 成語 and classifier 量詞 libraries are implemented, but more will follow from now to May 15.

Idioms

To access the idioms, you type {idiom: 321}, where the number can be anything from 1 to 1080. The idioms are arranged by difficulty of the characters. The difficulty is scored by a combination of the characters’ grade level and stroke count.

Prefer writing in Chinese? {成語: 321} would give you the equivalent function.

Classifiers / Measure Words

Nouns in Chinese are preceded by measure words 量詞, such as 一粒糖 (one <small, round count of> candy). The choice of measure word must be related to the noun, and there is a bewilderingly large number of them. Pokfield contains over a hundred classifiers, with example usage and explanation; the most commonly used classifiers are ranked first.

The measure words can be accessed in one of the three ways:

  1. {measure: n} (where n is a number from 1-110)
  2. {classifier: n}
  3. {量詞: n}

The examples can be accessed by appending an ex before closing the braces, for example, {measure: 23 ex}, whereas the brief explanations (does not replace consulting a book or a teacher!) can be gotten to by appending a ? before closing the braces, for example {measure: 23 ?}.

Tone-marks

If you are teaching Cantonese / Jyutping, there will be occasions where you wish to access the tone-marks or jyutping without the characters. You can do so by typing {tone: n}, where the number can be 1 to 6.
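If you are producing worksheets in bulk, the {keyword: number} tags from the sections above can be generated rather than typed. The sketch below is an assumption-laden illustration: `content_tag`, `RANGES`, and `MEASURE_KEYWORDS` are hypothetical names, the ranges are the ones quoted in this guide, and I am assuming that the `ex` and `?` suffixes work with all three measure-word keywords (the guide only shows them with `measure`).

```python
# Ranges as quoted in this guide; the keyword aliases are the ones listed above.
RANGES = {
    "idiom": (1, 1080),
    "成語": (1, 1080),
    "measure": (1, 110),
    "classifier": (1, 110),
    "量詞": (1, 110),
    "tone": (1, 6),
}
# Assumption: the "ex" and "?" suffixes apply to every measure-word alias.
MEASURE_KEYWORDS = {"measure", "classifier", "量詞"}


def content_tag(keyword, number, suffix=""):
    """Build a {keyword: number} tag, optionally with an "ex" or "?" suffix."""
    if keyword not in RANGES:
        raise ValueError(f"unknown keyword: {keyword!r}")
    low, high = RANGES[keyword]
    if not low <= number <= high:
        raise ValueError(f"{keyword} takes a number from {low} to {high}")
    if not suffix:
        return f"{{{keyword}: {number}}}"
    if keyword not in MEASURE_KEYWORDS or suffix not in ("ex", "?"):
        raise ValueError(f"suffix {suffix!r} is not valid for {keyword!r}")
    return f"{{{keyword}: {number} {suffix}}}"


print(content_tag("idiom", 321))
print(content_tag("measure", 23, "ex"))
print(content_tag("tone", 4))
```

Checking the range before emitting the tag saves you from discovering an unrendered tag only after the handout is printed.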

Usage in Browsers

General

The Cantonese Font gives you Jyutping for all the Chinese text on the Internet. This includes regular websites, Spotify lyrics, and YouTube / Vimeo subtitles. We do this by asking the browser to show text using Pokfield instead of the font the page originally specified. Google Chrome works well for this purpose.

To over-ride a website’s default styling,

  1. Install the Stylebot Chrome extension,
  2. Navigate to the website you wish to read, and add the CSS rules to change the text to use the Canto Font.

A selection of these stylings is provided here, and you are encouraged to add your usage / styling to the forums.

Styles

Wikipedia

The Yue Wikipedia and Chinese (Zh) Wikipedia contain a large amount of written Cantonese and standard Chinese text. Text in these can be targeted with the following styling:

div div p {
  font-family: "VF Canto v2-3-0";
  font-variant: "Standard Jyutping";
  font-size: 42pt;
}

div ul li {
  font-family: "VF Canto v2-3-0";
  font-variant: "Standard Jyutping";
  font-size: 34pt;
}

Spotify Lyrics

[data-testid="lyrics-container"] p {
  font-family: "VF Canto v2-3-0";
  font-variant: "Standard Jyutping";
  font-size: 42pt;
}

YouTube subtitles

.captions-text * {
  font-family: "VF Canto v2-3-0" !important;
  font-variant: "Standard Jyutping";
  font-size: 50pt;
  color: gray;
}

Vimeo subtitles

The MuiBox rule lets you use the Dual Subtitles for Vimeo extension (paid) to see subtitles in two languages at once. This can be very helpful for language learning.

.vp-captions-line {
  font-family: "VF Canto v2-3-0" !important;
  font-variant: "Standard Jyutping";
  color: gray;
  font-size: 52pt;
}

.MuiBox-root {
  font-family: "VF Canto v2-3-0 syntax" !important;
  font-variant: "Standard Jyutping";
}

Usage in eReaders

Canto Font on Kobo Libra 2

Pokfield is known to work on Kindles and Kobos. The OTF monochrome format should be what you load into the device.

Feedback

Please register an account at https://visualfonts.discoursehosting.net/ to read more in-depth documentation, changelog/roadmap, report errors, and provide suggestions. (This will move to forums.visual-fonts.com in time.)

Our Discord is here: https://discord.com/invite/vgHmfBV2V7 but the preference is for asynchronous interactions on the forums.

How You Can Help

While the font is largely “done,” there remains much supporting work to do. Some of it I need help with, for lack of resources or technical capacity; some of it simply needs fresh eyes and ideas, because I have been immersed in this for too long.

1. Compatibility findings

It is important to document the capacities and limitations. I only have a Mac machine, and will need help finding out the behaviour of different applications on different platforms.

You may have wild ideas you want to try. Give that a shot!

2. Jyutping errors / benchmark

The core competence of the font is accurate Jyutping assignments. If you find errors in your regular usage, I’ll be happy to know. What is preferred is doing specific pieces; this adds documents to the benchmark, and the corrected versions become part of the repository.

3a. Repository building

Documents that have gone through error correction (see Over-riding Suggestions above) are exceedingly valuable to the Cantonese community as a whole. They provide prose where the pronunciation and characters are linked and vetted. Others can, for example, take the marked-up prose and, by simple Italics, create a grade-suitable version for their own teaching / learning.

A specific section of the forum is dedicated to archiving this repo.

3b. Working with Subtitles

I think dual-subtitled multimedia is exceedingly valuable for learning. For Cantonese, this needs to be Jyutping-En or Jyutping-Es subtitles, and with the Font, this becomes a reality.

I have worked out a pipeline for automatic transcription into spoken Cantonese and simultaneous quality translation into other languages. If you:

  • want to edit subtitles (this is generally reduced to watching and minor editing; 1.5 min editing / 1 min video)
  • work with a channel that uses spoken Cantonese and wants subtitles

You can get in touch through the forums.

4. Editing Dictionaries

The dictionary has been half manually translated by me, and half Google Translated. It may be surprising if you don’t know me, but my Chinese / Cantonese is… not so good.

The En -> Zh version needs attention in:

  • checking the Chinese translations / rewriting to be more Canto-oriented
  • suggesting additional terms
  • checking the inflections / conjugations

Right now the Zh -> En is simply the inverse of the En -> Zh. I think a custom dictionary is needed here, and should be one that hits all of the needs of the early-intermediate learner.

I have the skeleton of an Es (Spanish) -> Zh dictionary. My Spanish is quite non-existent, so a native speaker would be much needed here. The status is rough and lots of tender loving care will be needed.

If your favorite language is written in Latin, Greek, Cyrillic, or Hebrew script, you want to have a dictionary feature for YFL -> Zh, and you are willing to spearhead the effort, I’ll be happy to work with you. (I do not know enough about Devanagari, Arabic, or Thai scripts to even get started; if there is )

5. Ideas for Monetizing “Extras”

If you have ideas for what “extras” would be good incentives for users to pay $30-50 to support the project, I am all ears. Currently I am thinking about producing a series of 12 x 10 min beginner lessons + notes + Anki deck.

After the font, I have three Cantonese projects in mind. All three require hiring / tooling, and two require additional heavy investments. Pokfield ought to be accessible to all who need it, but income from Pokfield also needs to fund these next projects.

6. Spread the Word!

If you have a good experience with Canto Font, you can help:

  • tell your friends and family,
  • show it to your students (or teacher),
  • write a review on the forums (I’ll pick them out for the main website),
  • if you have a blog / channel / podcast / magazine / … write about the Canto Font

If you are in a HK community that wants me to come talk about the font, or Cantonese sounds, I’ll be happy to try find time.

7. Technical help

Technical guys are often bewildered at how little basic tech stuff I know. (I was a chemistry teacher.) These are all very specialised asks, but hey, can’t hurt putting my todo-list out there?

Elixir / Rust

My primary language is Elixir, and the Cantonese Font is backed by homegrown packages. I would like to

  1. bring in some Rust NIF (notably a PR on the ExJieba module for custom dictionaries and ZhT / HK segmentation)
  2. fork / PR Pegasus, a library that converts PEG grammar into NimbleParsec (Elixir’s “default” parser-combinator). I want to work with PEG because the grammar can then be ported out to (most) other languages. Pegasus currently does neither Unicode nor byte-matching, which makes it impossible to do CJK stuff with.
    • When Pegasus works, write parsers for all the things
  3. re-write the whole Python set of tooling into Elixir (this, I think, is as much for my learning)
  4. build and deploy some CI/CD system that automatically produces grade-level output from the community repository
  5. …I have so many on the Elixir wishlist. The eventual goal for all of the above is to provide comprehensive, easy access tooling for doing Cantonese-related work through an API and through web interface (Ash 3.0).
  6. The other goal is real-time subtitling of videos with Jyutping/Zh and an additional language.

Machine learning

On the machine-learning side:

  1. I’m interested in {char, jyutping} -> audio, and audio -> {char, jyutping}, especially in low-resource contexts.
    • For Speech-to-Text, I want to build
      • one of those 50 MB VOSK models, and
      • a Whisper-v3 large that somehow involves Jyutping
    • For text-to-speech, also low-compute/low-resource context:
      • I want to build a system that takes in a List of [ {char, jyutping}, ... ], segments / assigns phonemes, and adds prosody to stitched together mp3s. Maybe something with Tacotron 2. Besides the machine learning parts, what I don’t know is how to map Jyutping into phonemes, and how tones work in the tokenization process.
      • Probably help to know what mp3s to record, and how (plan, record, post-process, split into phonemes etc)
      • The first voice would probably be mine (since my time is “free”), and eventually need a female voice.
  2. Babysit me deploying transformer models onto Replicate (lol so I can call them from Elixir…)
    • 書面語 轉 白話 models, and
    • Surya (multilingual OCR)

Mainstream languages

JS / Python:

  1. converting the PEG parser for the markup into an internal structure, then to various outputs. This removes the reliance on Elixir. (The BEAM is excellent, Elixir is extraordinarily productive and joyful to use, but I know functional programming isn’t everyone’s cup of tea.)

Typst:

  1. Improve the furigana (ruby-text) package. Typst had a furigana package, but it did little more than center-aligning.
  2. some start of parallel typesetting

Animation / design

  1. I am interested in what Rive can do with the fonts, and with out-of-band loading of SVG and audio.

Sound / audio

  1. Recording sets of jyutping / char / words mp3s suitable as input to TTS (see machine learning above)

General stuff

  1. I think it is important and timely to have medical dictionaries accessible. The books for that are largely out of print. The only copies I know of are on the reference (no-take-out) shelf of HKPL. In the future I’ll be looking to track down copies (know anyone with them?), and scanning the books; or find some device that makes it feasible to do reasonably good scans of 300 pages. Then OCR (see Surya above) of this largely tabular data.

Hardware

If you have experience with development / manufacture of toy electronics, please get in touch.

