Cantonese Font 粵語字體

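Automatically adds accurate, full-color Cantonese romanization to Traditional Chinese characters

No internet connection is needed, and it enhances the apps you already use. It covers more than 8,500 Traditional Chinese characters and picks the pronunciation and tone of each one from its context. An ideal choice for teaching Cantonese and for advanced reading. It works in documents and presentation slides in Microsoft Office, Apple iWork, and LibreOffice, and also when browsing the web in Chrome.

Supports Mac / Ubuntu (Kinetic Kudu+)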


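How can Cantonese Font be used?

Cantonese Font is the first member of our Visual Fonts (「是像字體」) family. Like the emoji 🥳 that are already in wide use, Visual Fonts add color images to traditional black-and-white glyphs. No special learning or training is needed.

Cantonese Font uses these color images to support teaching and learning Cantonese. Enter any Traditional Chinese text by handwriting, Cangjie, voice input, or copy and paste, switch the typeface to Cantonese Font, and Jyutping (粵拼) is added to every character automatically. It works across all kinds of software, with no internet connection.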

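Where beauty meets intelligence

An abundant character set

We use roughly three to four thousand characters in daily life. Cantonese Font includes more than 8,500, fully covering colloquial Cantonese, Hong Kong variant characters, and the less common characters that appear in classical texts.

From children just learning to read to adults returning to the Three Hundred Tang Poems (唐詩三百首) or the Four Books (四書) after many years, no one should run into a missing character.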


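Multiple readings, no problem

A single character in Cantonese can have several readings and tone changes, and sometimes even native speakers are unsure of the correct one. (Example: 從 in 從容 is read like 鬆, sung1.)

When designing Cantonese Font, we analyzed more than 100,000 words and phrases so that the font can automatically choose the most suitable pronunciation from context.

In two tests, one on a Cantonese translation of The Wizard of Oz (綠野仙蹤) and one on excerpts of everyday conversation, Cantonese Font was more than 99% accurate, better than a typical native speaker.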


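Dressing up a classic

Jyutping (粵拼) is the romanization system that the Linguistic Society of Hong Kong designed specifically for Cantonese; it covers the sounds of Hong Kong Cantonese accurately and completely. Cantonese Font builds on this classic, using color to mark the initial and the final of each syllable and adding a helpful cue for the tone, so you can see at a glance whether a tone is high or low.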
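To make the parts of a Jyutping syllable concrete, here is a minimal Python sketch that splits a syllable into its onset (initial), final, and tone number. The function name `split_jyutping` is hypothetical, the sketch does not validate finals, and it says nothing about which colors or tone cues the font actually uses.

```python
import re

# Jyutping onsets, with two-letter onsets first so "ng", "gw", "kw" win over "n", "g", "k".
ONSETS = ["ng", "gw", "kw", "b", "p", "m", "f", "d", "t", "n", "l",
          "g", "k", "h", "w", "z", "c", "s", "j"]

SYLLABLE = re.compile(r"^([a-z]+)([1-6])$")


def split_jyutping(syllable):
    """Split one Jyutping syllable into (onset, final, tone). Hypothetical helper."""
    match = SYLLABLE.match(syllable)
    if match is None:
        raise ValueError(f"not a Jyutping syllable: {syllable!r}")
    letters, tone = match.group(1), int(match.group(2))
    # Syllabic nasals such as "m4" and "ng5" have no onset: the nasal is the final.
    if letters in ("m", "ng"):
        return "", letters, tone
    for onset in ONSETS:
        rest = letters[len(onset):]
        if letters.startswith(onset) and rest:
            return onset, rest, tone
    return "", letters, tone  # vowel-initial syllable such as "aa3"


print(split_jyutping("ngaang6"))
print(split_jyutping("gam1"))
```

A renderer could then color the onset and the final separately and place a tone cue from the third element.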


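Extends the apps you already have

Because it is a font rather than a standalone app, Cantonese Font fits seamlessly into the software you already use. You can keep using Word for documents, Keynote for slides, Sketch.app for drawing, MuseScore for writing song lyrics, and Chrome for browsing the web, with no new system to learn, install, or buy.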


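At your own pace

Children and students who are new to Cantonese may not yet know how to read or write the Chinese word 朋友. Cantonese Font includes an English-to-Chinese feature: type friend on the keyboard and the font automatically converts it to the Traditional Chinese 朋友, with Jyutping added. The built-in phrasebook contains more than 2,000 common words.

Cantonese Font was born in Hong Kong and is made for the world. From every MTR station (Admiralty becomes 金鐘 gam1 zung1) to 250 countries and more than 250 cities worldwide, names are converted automatically.

If you teach Cantonese, you may want to use words from the same category. A special pattern, emotion(1), emotion(2)…, lets you list such words systematically.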


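Downloads

Two fonts

One with Traditional Chinese only, and one that includes the English-to-Chinese translation feature. Both are provided in .ttf format. Installing on a Mac is easy: it takes three mouse clicks and less than 10 seconds.

Sample documents

Includes Keynote/PowerPoint slides, Word/Pages documents, a Numbers spreadsheet, and sample documents in LibreOffice open formats.

Cantonese Font (English-to-Chinese translation edition)

8,500 Traditional Chinese characters, each with full-color Jyutping annotation that handles multiple readings. The English-to-Chinese font contains more than 2,000 terms, including every Hong Kong MTR station, 250 countries, and more than 250 major world cities.

Cantonese Font took a year of design, research, and dedicated production. It is now released under the SIL Open Font License, so everyone can use it for study and for creative work of all kinds, including commercial projects. Payments go toward the font's future development; you may also set the price to zero at download and get it for free.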


How can I support Cantonese Font?

The project can continue only with the support of its users.

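  • As an individual user, you can
    • donate when you download the font,
    • buy our other Visual Fonts,
    • write a review, and
    • tell your friends (this matters a lot: something useful that nobody uses might as well never have been made);
  • As an organization, you can
    • fund a specific extension of the font (see the Roadmap below), or
    • set up a recurring donation
  • If you are a designer or a publisher, you can contact us and use the set of techniques we have developed to typeset your product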

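Our team members take on consulting work. Our technical expertise includes in-depth data processing and automation pipelines (Elixir, Python, and JavaScript), technical typesetting (LaTeX), and vector graphic design and scripting (Adobe Illustrator). Naturally, we also know a thing or two about font design, font engineering, and Cantonese linguistics.

Cantonese Font is funded by our savings and our day-job income. Our day job is… teaching dance 🙃. If you are in Hong Kong and want to learn Argentine Tango, come to our studio in Causeway Bay. Eliana is the only dancer in Hong Kong who comes from Argentina: world-class, a repeated representative of Argentina at world expos, [1,000 words omitted 💖]. Beginner classes come with Cantonese translation. See www.eli.dance for details 🙂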


User reviews

Cantonese Visual Fonts is a game-changer for anyone learning Cantonese and exploring traditional characters. The added Jyutping pronunciations on top of the characters make it incredibly convenient to use on a daily basis to read. ⭐⭐⭐⭐⭐

PTK (beginner)

FAQs

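Can I use it on Windows?

Not fully. Visual Fonts use a technology called OpenType-SVG, and Windows does not support it.

Because many users are on Windows, we added a monochrome version of every glyph. On Windows the font displays in black and white, and every other feature still works.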
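If you are curious whether a given font file carries color glyphs of this kind, the sketch below reads the table directory at the start of a TrueType/OpenType file and looks for an 'SVG ' table. The function names are hypothetical, and the sketch handles only single-font files, not .ttc collections or WOFF.

```python
import struct


def sfnt_table_tags(data):
    """Return the table tags listed in the table directory of a TrueType/OpenType file."""
    if len(data) < 12:
        raise ValueError("too short to be a font file")
    # Header: sfntVersion (uint32), numTables (uint16), then three uint16 search hints.
    _version, num_tables = struct.unpack(">IH", data[:6])
    tags = []
    for i in range(num_tables):
        start = 12 + 16 * i  # each record is 16 bytes: tag, checksum, offset, length
        record = data[start:start + 16]
        if len(record) < 16:
            raise ValueError("truncated table directory")
        tags.append(record[:4].decode("latin-1"))
    return tags


def has_color_svg(data):
    """True if the font lists an 'SVG ' table, the container for OpenType-SVG glyphs."""
    return "SVG " in sfnt_table_tags(data)
```

Usage would look like `has_color_svg(open("SomeFont.ttf", "rb").read())`, where the file name is a placeholder. A font that has the table still needs a renderer that supports it, which is the point of the monochrome fallback described above.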

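Can I use it on [Linux distro]?

Possibly. Linux renders text with FreeType, and FreeType added OpenType-SVG support in April 2022. Each distribution updates on its own schedule, but in the long run all Linux systems should be able to use the font within a year or two.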

In [Adobe app], the characters look squashed.

This is a bug introduced in Adobe's 2022 updates. I have already reported it to them.

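The Jyutping is too small!

Every Chinese character must have the same width (otherwise five-character and seven-character classical verse would not line up, and everything else would look slightly off). So the Jyutping can be at most as wide as the character. The Jyutping for 硬 is ngaang6, which was designed to be exactly as wide as a Chinese character.

I am looking into building a web app that shows the same full-color Jyutping with tone cues, but for now it has barely started.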

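Can I use it to publish a book?

Yes! You are more than welcome to. If you have special requirements, such as changing the Jyutping size or the typeface, contact us about a commercial collaboration.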

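The Jyutping for […] is wrong

There are four possibilities behind "the Jyutping is wrong":

(1) It really is wrong. Every character goes through many processing steps, and slips of the hand are inevitable. Please report it.

(2) A difference of opinion. Cantonese pronunciation has never had an official standard and it shifts over time, so you and I may disagree about a reading. Between 2000 and 2010 Hong Kong saw a "proper pronunciation" (正讀) movement, which proposed, for example, reading 洱 as 耳 ji2 (as in drinking 普洱 tea at yum cha), or 構 as gau3 (rather than kau3). My guiding principle in building Cantonese Font is to reflect the pronunciation ordinary people use every day.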

(3) A fixable contextual error. For example, an early tester discovered that 三 in 三思而行 was assigned saam1 instead of saam3. Please file a report.

(4) An unfixable contextual error. There are two sub-classes. The first is that the font cannot know (especially for a word in isolation) whether it should be read in the literary style (文讀) or the vernacular (白讀), and it defaults to the vernacular. The second is incorrect segmentation. Take 香港地方潮濕: 地 is incorrectly assigned dei2 when it should be dei6, because the font has a context for “香港地” and parses the sentence as 香港地.方.潮濕 instead of 香港.地方.潮濕. Unfortunately, as long as the font cannot do proper word segmentation, this will remain a limitation.
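The segmentation failure can be reproduced with a small Python model. This is only an illustration of left-to-right longest matching without look-ahead; it is not the font's actual lookup code, and the function name and the four-entry phrase table are made up for the example.

```python
def greedy_segment(text, phrases):
    """Left-to-right longest-match segmentation with no look-ahead."""
    longest = max(map(len, phrases), default=1)
    pieces, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in phrases:
                pieces.append(candidate)
                i += size
                break
    return pieces


# Hypothetical phrase table containing the context that causes the problem.
phrases = {"香港", "香港地", "地方", "潮濕"}
print(".".join(greedy_segment("香港地方潮濕", phrases)))
print(".".join(greedy_segment("香港地方潮濕", phrases - {"香港地"})))
```

With 香港地 in the table the matcher commits to it before it can see that 地方 follows. Removing the entry fixes this sentence but loses the context wherever 香港地 really is the intended phrase, which is why the error cannot be fixed by a patch alone.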

It is probably hard to know which case a given error falls under, so you can report them all with this form. I tend to fix fixable things in the next patch, but be forewarned that some fixes are not technically possible in an offline, no-computation package.


Deep Dives


Versioning, Changelog, and Roadmap

Versioning

The font software adopts semantic versioning. The version number has three parts: x.y.z, standing for:

x: major version. This is reserved for a complete font re-build, where an unknown number of pronunciation changes may happen. If this happens, the last major version will remain available for users who prefer it.

y: minor version. This increments when significant new features are added or changed. Examples include an entirely new class of categories in the Phrasebook, or a new language becoming available for the Phrasebook.

z: patches. This increments with bug-fixes, added chars, added words, or added categories and terms in the Phrasebook.

I personally dislike getting subscription emails, so patches are definitely not announced anywhere except in the changelog.
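For readers who script their font updates, the x.y.z scheme is easy to compare in code. The following is a minimal sketch with hypothetical helper names, not a tool shipped with the font.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text):
    """Parse 'x.y.z' into a Version; tuples compare in major, minor, patch order."""
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected x.y.z, got {text!r}")
    return Version(*map(int, parts))


def change_kind(old, new):
    """Name the most significant part that differs between two versions."""
    a, b = parse_version(old), parse_version(new)
    if a.major != b.major:
        return "major"
    if a.minor != b.minor:
        return "minor"
    if a.patch != b.patch:
        return "patch"
    return "same"


print(change_kind("1.1.4", "1.1.5"))
```

Comparing the parsed tuples rather than the raw strings keeps 1.10.0 ordered after 1.9.9.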

Changelog

1.1.5 (2023-05-18)

Minor patch.

Fix: monochrome render of 碟 showed character twice in overlapping ways.

Add: +104 chars (8,471 total).

1.1.4 (2023-05-09)

* Colored layers for words now share the same side-bearings as monochrome layers (i.e., colored layers are center-aligned horizontally). This improves the spacing in a way that is particularly noticeable on words containing narrow chars such as 革命

* Chars 字: +283 (8,367 total). New selections are mostly from 詩經.大雅+小雅, with minor additions from recent political / science articles on Wikipedia

* Words 詞: +4

* Sound corrections: 咸碟, 鹹碟, 對話, 聿 (now with 歪讀 leot6 instead of 正讀 jyut6🙂 )

* Non-phrasebook mode no longer erroneously replaces Cantonese and Jyutping

Known problems: 好學生, 曾孫女, despite addition of new word ligatures, are incorrectly segmented as 好學.生, 曾.孫女. This is reflective of a larger problem: longer features must be placed in a higher precedence lookup, instead of randomly as now. This impacts the whole feature set (!)
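The precedence problem can be modelled in a few lines of Python. This is a simplified picture of ordered substitution rules, in which each rule is applied across the whole text before the next one and earlier rules claim characters first. The function names are hypothetical and this is not the font's feature code.

```python
def apply_in_order(text, rules):
    """Apply each rule to the whole text in list order; earlier rules claim characters first."""
    owner = [None] * len(text)  # owner[i] = start index of the piece that claimed character i
    for rule in rules:
        if not rule:
            continue
        start = text.find(rule)
        while start != -1:
            span = range(start, start + len(rule))
            if all(owner[i] is None for i in span):
                for i in span:
                    owner[i] = start
                start = text.find(rule, start + len(rule))
            else:
                start = text.find(rule, start + 1)
    pieces, i = [], 0
    while i < len(text):
        j = i + 1
        while owner[i] is not None and j < len(text) and owner[j] == owner[i]:
            j += 1
        pieces.append(text[i:j])
        i = j
    return pieces


def longest_first(rules):
    """Reorder rules so that longer sequences are tried before shorter ones."""
    return sorted(rules, key=len, reverse=True)


rules = ["好學", "孫女", "好學生", "曾孫女"]  # shorter rules happen to come first
print(".".join(apply_in_order("好學生", rules)))
print(".".join(apply_in_order("曾孫女", rules)))
print(".".join(apply_in_order("好學生", longest_first(rules))))
```

In this model the arbitrary order reproduces 好學.生 and 曾.孫女, and sorting by length lets the three-character entries match first.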

1.1.2

First public release.

Roadmap

Cantonese Font is feature-complete and stable. The following is my wishlist / todo-list (not in any order) around this project:

  1. PDF manual. This would be bilingual, including screenshots and instructions for using the font. Instructions would include (0) authoritative compatibilities, (1) install/uninstall on different platforms, (2) StyleBot/Chrome setup for web browsing on different OSes, (3) mixing with Latin / Zh fonts, (4) enabling ligatures in apps that do not turn them on by default (Office, I think), and (5) a listing of chars and Phrasebook terms/categories.
  2. Cantonese linguistic categories. Using a syntax of Canto.___(n), exhaustively tally linguistic aspects of Cantonese. An example is Canto.measure(n) which shows all the measure words (個, 隻, 支, …); another example is Canto.particles(n) for 啦, 喎, 啩. These should be helpful for systematic study or teachers preparing teaching material.
  3. 100% traditional characters coverage. CJK (Chinese-Japanese-Korean) characters are complicated in their encoding and in their usage across variants. In the first steps of this project, I started with commonly used characters (to ensure they are pronounceable Zh-T entries) and expanded by patching upwards. A complementary approach is to run through the full list of Unicode CJK codepoints and keep those that are Zh-T and have one or more Cantonese pronunciations. This requires fast access to UniHan and Rime, and will have to wait for the completion of the Elixir libraries UniHan (Kip Cole) and ExCantonese.
  4. Re-compute pronunciations. v1 of the font was constructed when I knew far less about the idiosyncrasies of Cantonese NLP. Knowing what I know now, I think there are better approaches once we have user feedback.
  5. Website revamp. The WordPress landing page would be replaced by an Elixir-Phoenix-Ash setup. This opens up possibilities for much more interesting real-time interactive experiences that are exposed by UniHan / ExCantonese. (A little teaser: with these libraries, we are able to answer questions like, “which jyutping has the most characters mapping to it?”, “how many characters are there for each radical?”, or “what characters have an onset of f and a tone of 2?”) The text-editor workflow would also mean a lower barrier to writing Canto / font blog posts. (Not to leave you hanging: the sound that maps to the most characters is “jyu4”, with over 30 characters. There are a great many low-usage characters that contain 俞 with different radicals.) It may also be possible to enable dynamic processing / SVG generation, using user-supplied styles.
  6. Chinese / Spanish / Italian website internationalization. Awaits new website architecture.
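As a rough illustration of the kind of question mentioned in item 5, here is a Python sketch over a tiny hand-picked sample. It does not use UniHan or ExCantonese, whose APIs are not shown here; the table, the function names, and the one-reading-per-character simplification are all assumptions made for the example.

```python
from collections import defaultdict

# Toy sample only: ten characters with one common reading each.
# A real answer needs a full dictionary, which this sketch does not include.
READINGS = {
    "花": "faa1", "火": "fo2", "苦": "fu2", "虎": "fu2", "粉": "fan2",
    "魚": "jyu4", "如": "jyu4", "餘": "jyu4", "金": "gam1", "鐘": "zung1",
}

ONSETS = ("ng", "gw", "kw", "b", "p", "m", "f", "d", "t", "n", "l",
          "g", "k", "h", "w", "z", "c", "s", "j")


def onset_of(sound):
    """Return the onset of a Jyutping syllable, or '' if it has none."""
    body = sound[:-1]  # drop the tone digit
    if body in ("m", "ng"):  # syllabic nasals have no onset
        return ""
    return next((o for o in ONSETS if body.startswith(o) and len(body) > len(o)), "")


def most_shared_reading(readings):
    """Return (jyutping, characters) for the reading shared by the most characters."""
    by_sound = defaultdict(list)
    for char, sound in readings.items():
        by_sound[sound].append(char)
    return max(by_sound.items(), key=lambda item: len(item[1]))


def with_onset_and_tone(readings, onset, tone):
    """List characters whose reading has the given onset and tone number."""
    return [c for c, s in readings.items()
            if onset_of(s) == onset and s.endswith(str(tone))]


print(most_shared_reading(READINGS))
print(with_onset_and_tone(READINGS, "f", 2))
```

Because the sample was chosen by hand, its output only shows the shape of the queries and confirms nothing about the full character set.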