Tone derivation
Thai has five phonemically distinct tones: mid, low, falling, high, and rising. The tone of a syllable is derived from four inputs: the effective consonant class, the syllable type (live or dead), the vowel length, and the tone mark (if any). These four inputs index into a tone matrix that produces the output tone deterministically.
Inputs to the tone matrix
| Input | Values |
|---|---|
| Effective class | HIGH (HC), MID (MC), LOW (LC) |
| Syllable type | LIVE, DEAD |
| Vowel length | SHORT, LONG |
| Tone mark | NONE, MAI_EK (◌่), MAI_THO (◌้), MAI_TRI (◌๊), MAI_JATTAWA (◌๋) |
Live vs. dead syllables
A syllable is live if it ends in a sonorant coda (/m n ŋ j w/) or has a long vowel in open position. A syllable is dead if it ends in an unreleased stop (/p̚ t̚ k̚/) or has a short vowel in open position. Live and dead syllables receive different tones even when the consonant class and tone mark are the same.
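The live/dead rule above can be sketched as a small function. This is a minimal illustration, not the pipeline's actual code: the names `SyllableType`, `syllable_type`, and the coda sets are assumptions, and codas are passed as plain phoneme strings with the unreleased-stop diacritic omitted.

```python
from enum import Enum
from typing import Optional


class SyllableType(Enum):  # hypothetical name, mirrors the LIVE/DEAD input values
    LIVE = "LIVE"
    DEAD = "DEAD"


SONORANT_CODAS = {"m", "n", "ŋ", "j", "w"}
STOP_CODAS = {"p", "t", "k"}  # unreleased stops, written without the diacritic


def syllable_type(coda: Optional[str], vowel_is_long: bool) -> SyllableType:
    """Classify a syllable from its phonemic coda (None = open syllable)."""
    if coda is None:
        # Open syllable: vowel length decides.
        return SyllableType.LIVE if vowel_is_long else SyllableType.DEAD
    if coda in SONORANT_CODAS:
        return SyllableType.LIVE
    if coda in STOP_CODAS:
        return SyllableType.DEAD
    raise ValueError(f"unexpected coda: {coda!r}")
```

Note that vowel length only matters for open syllables here; a closed syllable is classified by its coda alone.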
The tone matrix
This table gives the tone for every productive input combination. "—" marks combinations that do not occur in Standard Thai orthography (specifically, ◌๊ and ◌๋ are only productive on MC):
| Effective class | Syllable type | Vowel length | No mark | ◌่ mai ek | ◌้ mai tho | ◌๊ mai tri | ◌๋ mai jattawa |
|---|---|---|---|---|---|---|---|
| MC | LIVE | LONG | MID | LOW | FALLING | HIGH | RISING |
| MC | LIVE | SHORT | MID | LOW | FALLING | HIGH | RISING |
| MC | DEAD | LONG | LOW | LOW | FALLING | HIGH | RISING |
| MC | DEAD | SHORT | LOW | LOW | FALLING | HIGH | RISING |
| HC | LIVE | LONG | RISING | LOW | FALLING | — | — |
| HC | LIVE | SHORT | RISING | LOW | FALLING | — | — |
| HC | DEAD | LONG | LOW | LOW | FALLING | — | — |
| HC | DEAD | SHORT | LOW | LOW | FALLING | — | — |
| LC | LIVE | LONG | MID | FALLING | HIGH | — | — |
| LC | LIVE | SHORT | MID | FALLING | HIGH | — | — |
| LC | DEAD | LONG | FALLING | FALLING | HIGH | — | — |
| LC | DEAD | SHORT | HIGH | FALLING | HIGH | — | — |
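One way to encode the table is a dictionary keyed by the first three inputs, with the tone mark selecting a column. The sketch below transcribes the rows above; the names `TONE_MATRIX`, `MARKS`, and `lookup_tone` are illustrative assumptions, and plain strings stand in for whatever enum types the real code uses. Cells marked as non-occurring are stored as `None`.

```python
from typing import Optional

# Column order matches the table header.
MARKS = ("NONE", "MAI_EK", "MAI_THO", "MAI_TRI", "MAI_JATTAWA")

# (effective class, syllable type, vowel length) -> tone per mark column
TONE_MATRIX = {
    ("MC", "LIVE", "LONG"):  ("MID", "LOW", "FALLING", "HIGH", "RISING"),
    ("MC", "LIVE", "SHORT"): ("MID", "LOW", "FALLING", "HIGH", "RISING"),
    ("MC", "DEAD", "LONG"):  ("LOW", "LOW", "FALLING", "HIGH", "RISING"),
    ("MC", "DEAD", "SHORT"): ("LOW", "LOW", "FALLING", "HIGH", "RISING"),
    ("HC", "LIVE", "LONG"):  ("RISING", "LOW", "FALLING", None, None),
    ("HC", "LIVE", "SHORT"): ("RISING", "LOW", "FALLING", None, None),
    ("HC", "DEAD", "LONG"):  ("LOW", "LOW", "FALLING", None, None),
    ("HC", "DEAD", "SHORT"): ("LOW", "LOW", "FALLING", None, None),
    ("LC", "LIVE", "LONG"):  ("MID", "FALLING", "HIGH", None, None),
    ("LC", "LIVE", "SHORT"): ("MID", "FALLING", "HIGH", None, None),
    ("LC", "DEAD", "LONG"):  ("FALLING", "FALLING", "HIGH", None, None),
    ("LC", "DEAD", "SHORT"): ("HIGH", "FALLING", "HIGH", None, None),
}


def lookup_tone(effective_class: str, syllable_type: str,
                vowel_length: str, mark: str = "NONE") -> Optional[str]:
    """Return the tone name, or None for a combination that does not occur."""
    row = TONE_MATRIX[(effective_class, syllable_type, vowel_length)]
    return row[MARKS.index(mark)]
```

In this layout, vowel length changes the result in only one place: unmarked LC dead syllables (FALLING when long, HIGH when short).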
Mai tri and mai jattawa
The two rarer tone marks (◌๊ mai tri and ◌๋ mai jattawa) appear only on MC consonants in Standard Thai orthography. On HC or LC onsets they do not occur in native vocabulary, and the derivation raises a DerivationError if it encounters one.
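The restriction can be expressed as a guard that runs before the matrix lookup. This is only a sketch: the text names `DerivationError` but not its definition, so the class below is a hypothetical stand-in, and `check_mark` and `MC_ONLY_MARKS` are assumed names. Whether the real check uses the intrinsic or the effective class is not stated in the text, so the parameter is simply called `consonant_class`.

```python
class DerivationError(ValueError):
    """Hypothetical stand-in for the project's own error type."""


MC_ONLY_MARKS = {"MAI_TRI", "MAI_JATTAWA"}


def check_mark(consonant_class: str, mark: str) -> None:
    """Raise DerivationError if a mid-class-only mark sits on an HC or LC onset."""
    if mark in MC_ONLY_MARKS and consonant_class != "MC":
        raise DerivationError(
            f"{mark} is only valid on MC onsets, got {consonant_class}"
        )
```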
Worked examples
กา (crow): ก is MC, open syllable with long /aː/ (so the syllable is live), no tone mark → MID tone. IPA: /kaː˧/
ข้าว (rice): ข is HC, ◌้ (mai tho), live syllable (ends in /w/) → FALLING tone. IPA: /kʰaːw˥˩/
น้ำ (water): น is LC, ◌้ (mai tho), live syllable → HIGH tone. IPA: /naːm˦˥/
รัก (love): ร is LC, dead syllable (ends in /k̚/), short vowel, no mark → HIGH tone. IPA: /rak̚˦˥/
สวัสดี (hello): in สวัส, ส is an HC leader and ว is an LC sonorant onset, so the syllable takes HC effective class; with short /a/ and a stop coda it is a dead syllable → LOW tone. The final ดี: ด is MC, long /iː/, live syllable, no mark → MID tone.
Tone marks in the orthography
Four tone marks can appear in Thai writing. A tone mark is placed above the onset consonant, or stacked above the vowel sign when the onset already carries an above-base vowel:
| Mark | Thai name | Unicode |
|---|---|---|
| ◌่ | mai ek | U+0E48 |
| ◌้ | mai tho | U+0E49 |
| ◌๊ | mai tri | U+0E4A |
| ◌๋ | mai jattawa | U+0E4B |
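A syllable carries at most one tone mark. When a mark is present, it selects the tone in place of the "natural" (unmarked) tone that the class and syllable type would otherwise produce. In a few cells of the matrix the two coincide: an MC dead syllable is LOW with or without ◌่.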
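Because each mark is a single code point, finding the tone mark in a syllable string is a direct scan. The following sketch assumes the syllable has already been segmented; `TONE_MARKS` and `tone_mark_of` are illustrative names, and raising `ValueError` on a second mark is an assumption about how a violation might be reported.

```python
# Code points from the table above.
TONE_MARKS = {
    "\u0E48": "MAI_EK",
    "\u0E49": "MAI_THO",
    "\u0E4A": "MAI_TRI",
    "\u0E4B": "MAI_JATTAWA",
}


def tone_mark_of(syllable: str) -> str:
    """Return the tone mark name for one syllable, or "NONE" if unmarked."""
    found = [TONE_MARKS[ch] for ch in syllable if ch in TONE_MARKS]
    if len(found) > 1:
        # A syllable carries at most one tone mark.
        raise ValueError(f"multiple tone marks in syllable: {found}")
    return found[0] if found else "NONE"
```

For example, `tone_mark_of("ข้าว")` yields `"MAI_THO"`, and an unmarked syllable such as `"กา"` yields `"NONE"`.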
The effective class and tone
The tone matrix uses the effective class, not the intrinsic class of the onset letter. Two mechanisms can shift a syllable's effective class:
Leading ห: When a bare ห (HC) precedes an LC sonorant letter without any vowel between them, the sonorant's syllable is promoted to HC effective class. This explains why หน, หม, หว, หย, หง, หล, and หร all produce rising tones on live syllables (the HC unmarked-live pattern) rather than mid tones (the LC pattern).
Aksornam propagation: A bare HC or MC leader consonant (one character, no vowel) before an LC sonorant onset causes the following syllable to inherit the leader's class. This is how words like สมาน are handled: ส (HC leader) + มาน (LC sonorant onset /m/) → มาน takes HC effective class → rising tone.
Both mechanisms are implemented in the pipeline runner and recorded in the effective_class field of each Syllable.
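Both shifts follow the same shape: a bare HC or MC consonant in front of an LC sonorant onset passes its class on. The sketch below illustrates that rule only and is not the pipeline runner's code. `INTRINSIC_CLASS` is a deliberately partial, hand-written table, `LC_SONORANTS` is limited to the letters named in this section, and the function signature is an assumption.

```python
from typing import Optional

# Partial intrinsic-class table, for illustration only.
INTRINSIC_CLASS = {
    "ก": "MC", "ด": "MC",
    "ข": "HC", "ส": "HC", "ห": "HC",
    "ง": "LC", "น": "LC", "ม": "LC", "ย": "LC",
    "ร": "LC", "ล": "LC", "ว": "LC",
}

# LC sonorant letters named in this section.
LC_SONORANTS = set("งนมยรลว")


def effective_class(onset: str, leader: Optional[str] = None) -> str:
    """Class used for the tone lookup.

    onset:  the onset consonant letter of the syllable.
    leader: a bare consonant (no vowel) directly before the onset, if any.
    """
    own = INTRINSIC_CLASS[onset]
    if leader is None or onset not in LC_SONORANTS:
        return own
    leader_class = INTRINSIC_CLASS[leader]
    # Leading ห and aksornam leaders both hand their class to the sonorant.
    return leader_class if leader_class in ("HC", "MC") else own
```

With this table, `effective_class("ม", leader="ส")` gives `"HC"`, matching the สมาน example, while an onset that is not an LC sonorant keeps its own class regardless of the leader.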