At 13:37, Friday 2nd Feb 2024, the Cantonese Font v2 (Pokfield 蒲飛) was first built correctly, with the version number 2.0.20.2. This was such a monumental moment for me, and such a relief. The project started in mid-2022, and most of the last 21 months is best characterised as stumbling in a dark maze that probably had no exit.
My vision of the Cantonese Font is
- to make access to accurate, attractive Jyutping zero-cost, thus
- facilitating the creation of a large and diverse body of learning material
v1, named Boundary 界限, tested what was plausible. First technically completed in March 2023 and released in May, it showed that there are, indeed, New Territories beyond.

Boundary received many patches after release. The final version accumulated 8,500 characters; it is 99.5% sufficient for day-to-day usage, but when a user asks for a missing character, it can offer no recourse. Boundary showed that a font can be programmed with a large number of sophisticated rules (~10,000); using these, the font can recognize an additional 5,300 contexts (sifted from a data bank of the ~30,000 most frequent words) and substitute the correct reading. What Boundary cannot do is recognize that it has provided a wrong sound, and it will insist that what it knows is right (even when it is not).
As a product, Boundary was incomplete and far too opinionated. Boundary was the stereotypical Young Genius: cocksure and full of promise, but both ignorant of its deficiencies and incapable of correcting them.
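
To make the idea of context substitution concrete, here is a minimal Python sketch. It is not how the font is built (the font's rules live inside the font file, and their exact form is not described here); the tables `DEFAULT_READING` and `CONTEXT_READINGS` and the function `annotate` are hypothetical names with a few hand-picked entries, assumed purely for illustration. Each character gets a default reading, and a known multi-character context overrides it.

```python
# Hypothetical data for illustration only: a default reading per character,
# plus readings for known multi-character contexts.
DEFAULT_READING = {"銀": "ngan4", "行": "haang4", "我": "ngo5"}
CONTEXT_READINGS = {"銀行": ["ngan4", "hong4"]}  # "bank": 行 reads hong4 here


def annotate(text):
    """Return (character, reading) pairs, preferring the longest known context."""
    max_len = max(len(context) for context in CONTEXT_READINGS)
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate context first, down to two characters.
        for size in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + size]
            if chunk in CONTEXT_READINGS:
                result.extend(zip(chunk, CONTEXT_READINGS[chunk]))
                i += size
                break
        else:
            # No context matched: fall back to the default reading.
            # An unknown character yields None, i.e. no recourse.
            ch = text[i]
            result.append((ch, DEFAULT_READING.get(ch)))
            i += 1
    return result
```

In this sketch, `annotate("我行")` gives 行 its default reading, while `annotate("銀行")` swaps in the contextual one. A character missing from both tables comes back with `None`, and a context the tables get wrong is never questioned, which mirrors the two weaknesses described above.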

Pokfield 蒲飛 is the Young Genius grown older, wiser, and more gracious, its sharp edges smoothed round. The font (offline, no-compute, any application, cross-platform, durable for the next decades)
- contains 29,146 characters,
- contains 39,419 readings for these characters,
- swaps in 10,000+ unique contexts (from a data bank of 58,000+ known contexts),
- provides clean and exhaustive mechanisms for user override,
- lets users choose between jyutping sizes (or mix with no-jyutping), and
- lets users choose between fluid and fixed width.
In other words, 100% of readings, for 100% of readable characters, in 100% of known contexts, with full flexibility. It’s a buffet where everything is here; users fulfil their unique needs in a time and manner of their choosing.
Getting here was really, really hard. Soul-crushing, suicidal hard. People who know about Cantonese linguistics, font making, and business each think this is foolish, and each of them is justified in thinking so.
The central assumption of this project, language-wise, is that we can tell the sound (and meaning) from the script. This is obviously not true, and we can easily see this with the following examples:
- 好平 (hou2 peng4: very cheap, or hou2 ping4: very flat)
- 區議員 (kui1 ji5 jyun4: district councillor, or au1 ji5 jyun4: councillor with surname Au)
- 背書 (背 as bui6: to recite by rote, 背 as bui3: to underwrite (financial term))
- 學生會好好 (會 as wui3: “students will be in a good spot”; wui6: “Student union is really good”)
- 小明話佢做晒功課喎 (喎 as wo3: “But Ming said he finished his homework”; wo2: “Ming said he finished his homework but I don’t believe it”)
In all these cases, the written script is identical. The script and the sound are both needed to convey the semantics, so it is clear that the central assumption of the project is wrong. The long ancestry of modern Cantonese, influenced by a multitude of other languages, also creates many other exceptions, neither predictable nor explainable, and often the sound in a particular context is unknown even to native speakers (or dictionary writers!). PyCantonese, the state of the art at that time, with full Turing-complete access, had an error rate of 1-in-28; this is unusable. How can a no-compute environment do better?
The font situation is even more dire. A typical “complete” modern font may contain shape-only outlines for ~1,000 characters, each having a nice name, and behaviour augmented by 300 rules. In other words, it is atypical to
- include colors,
- work with 50,000+ glyphs,
- place them in odd Unicode Extension planes, and
- drive them with 100,000+ rules
It is atypical to do even one of the above: attempting any one of them often requires special tooling and/or expertise. Pokfield required all of the above; most of the special tools and processes that handled one of them could not accommodate the rest, or collapsed under the sheer magnitude.
Business people have broad and usually conflicting advice, but exactly zero of them would suggest taking on a multi-year sui generis R&D project, incurring significant risks, and then offering the product as pay-if-you-can, take-if-you-need. And certainly not going for broke on this non-business model.
I have exceptional confidence in my skills and judgment. I went for broke, on the font… and simultaneously on several other fronts. Through these 21 months (really, five years of crazies now in Hong Kong), I toiled away in obscurity; watched civil society crushed, clowns exalted while noble characters were persecuted; I was legally discriminated against, fired with no justification and no expectation of returning to teach, excommunicated from what I thought was my community, evicted from where we lived; and yes, broke. Working with the font was, for the most part, endless disappointment: the build is 2.0.20.2 because every build from 2.0.1.0 through 2.0.20.1, each revision often requiring days of exacting tedium, failed.
And that explains my relief. The maze has an exit! Not only does it have an exit, but it leads to a secret location whose scenery is stunning, and now this scenery is for all of the current and future Cantonese community to enjoy.
Weeks turning into months turning into years, of failing and being unemployed and destitute and forsaken with no reprieve in sight, is enough to grind down any man. Eli, mom, and Nathan had more faith in me than I did, and without their trust, understanding, and enthusiasm, my drift into insanity would have been complete.
I’ll be spending the next three months cleaning up, testing, writing documentation, and preparing exemplars. Pokfield / 蒲飛 (buffet 😎 🙈) will be available 1st May 2024. I can’t wait to see what you create with it.
Ah yes, the business side: Pokfield will be so good, so magical, so beautiful, so useful every single day, that you want to pay for it.