Tone derivation

Thai has five phonemically distinct tones: mid, low, falling, high, and rising. The tone of a syllable is derived from four inputs: the effective consonant class, the syllable type (live or dead), the vowel length, and the tone mark (if any). These four inputs index into a tone matrix that produces the output tone deterministically.


Inputs to the tone matrix

| Input | Values |
| --- | --- |
| Effective class | HIGH (HC), MID (MC), LOW (LC) |
| Syllable type | LIVE, DEAD |
| Vowel length | SHORT, LONG |
| Tone mark | NONE, MAI_EK (◌่), MAI_THO (◌้), MAI_TRI (◌๊), MAI_JATTAWA (◌๋) |

Live vs. dead syllables

A syllable is live if it ends in a sonorant coda (/m n ŋ j w/) or has a long vowel in open position. A syllable is dead if it ends in an unreleased stop (/p̚ t̚ k̚/) or has a short vowel in open position. Live and dead syllables receive different tones even when the consonant class and tone mark are the same.
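The live/dead rule above can be sketched as a small classifier. This is only an illustration: the function name `syllable_type`, the `SyllableType` enum, and the convention of passing the coda as a plain phoneme string (without the unreleased diacritic, `None` for an open syllable) are assumptions, not the project's actual API.

```python
from enum import Enum
from typing import Optional


class SyllableType(Enum):
    LIVE = "LIVE"
    DEAD = "DEAD"


# Coda inventories taken from the rule above (hypothetical names).
SONORANT_CODAS = {"m", "n", "ŋ", "j", "w"}
STOP_CODAS = {"p", "t", "k"}  # written without the unreleased diacritic


def syllable_type(coda: Optional[str], vowel_is_long: bool) -> SyllableType:
    """Classify a syllable as live or dead from its coda and vowel length."""
    if coda is None:
        # Open syllable: the vowel length decides.
        return SyllableType.LIVE if vowel_is_long else SyllableType.DEAD
    if coda in SONORANT_CODAS:
        return SyllableType.LIVE
    if coda in STOP_CODAS:
        return SyllableType.DEAD
    raise ValueError(f"unexpected coda: {coda!r}")
```

For example, `syllable_type("k", False)` classifies รัก as dead, and `syllable_type(None, True)` classifies กา as live.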


The tone matrix

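This table gives the tone for every input combination. "—" marks combinations that do not occur in Standard Thai orthography: ◌๊ and ◌๋ are productive only on MC. Vowel length changes the result in only one place: unmarked LC dead syllables, which are FALLING when long and HIGH when short.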

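| Effective class | Syllable type | Vowel length | No mark | ◌่ mai ek | ◌้ mai tho | ◌๊ mai tri | ◌๋ mai jattawa |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MC | LIVE | LONG | MID | LOW | FALLING | HIGH | RISING |
| MC | LIVE | SHORT | MID | LOW | FALLING | HIGH | RISING |
| MC | DEAD | LONG | LOW | LOW | FALLING | HIGH | RISING |
| MC | DEAD | SHORT | LOW | LOW | FALLING | HIGH | RISING |
| HC | LIVE | LONG | RISING | LOW | FALLING | — | — |
| HC | LIVE | SHORT | RISING | LOW | FALLING | — | — |
| HC | DEAD | LONG | LOW | LOW | FALLING | — | — |
| HC | DEAD | SHORT | LOW | LOW | FALLING | — | — |
| LC | LIVE | LONG | MID | FALLING | HIGH | — | — |
| LC | LIVE | SHORT | MID | FALLING | HIGH | — | — |
| LC | DEAD | LONG | FALLING | FALLING | HIGH | — | — |
| LC | DEAD | SHORT | HIGH | FALLING | HIGH | — | — |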
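One way to encode the matrix is a dictionary keyed by class, syllable type, and vowel length, with one tone per tone mark. The sketch below transcribes the table; the names `MARKS`, `TONE_MATRIX`, and `lookup_tone`, the use of plain strings for the values, and the choice to return `None` for the "—" cells are all assumptions made for illustration.

```python
from typing import Optional

# Column order of the matrix (hypothetical constant names).
MARKS = ("NONE", "MAI_EK", "MAI_THO", "MAI_TRI", "MAI_JATTAWA")

_MC_LIVE = ("MID", "LOW", "FALLING", "HIGH", "RISING")
_MC_DEAD = ("LOW", "LOW", "FALLING", "HIGH", "RISING")
_HC_LIVE = ("RISING", "LOW", "FALLING", None, None)
_HC_DEAD = ("LOW", "LOW", "FALLING", None, None)
_LC_LIVE = ("MID", "FALLING", "HIGH", None, None)

# (effective class, syllable type, vowel length) -> tones in MARKS order.
# None stands for a "—" cell (combination that does not occur).
TONE_MATRIX = {
    ("MC", "LIVE", "LONG"): _MC_LIVE,
    ("MC", "LIVE", "SHORT"): _MC_LIVE,
    ("MC", "DEAD", "LONG"): _MC_DEAD,
    ("MC", "DEAD", "SHORT"): _MC_DEAD,
    ("HC", "LIVE", "LONG"): _HC_LIVE,
    ("HC", "LIVE", "SHORT"): _HC_LIVE,
    ("HC", "DEAD", "LONG"): _HC_DEAD,
    ("HC", "DEAD", "SHORT"): _HC_DEAD,
    ("LC", "LIVE", "LONG"): _LC_LIVE,
    ("LC", "LIVE", "SHORT"): _LC_LIVE,
    ("LC", "DEAD", "LONG"): ("FALLING", "FALLING", "HIGH", None, None),
    ("LC", "DEAD", "SHORT"): ("HIGH", "FALLING", "HIGH", None, None),
}


def lookup_tone(effective_class: str, syllable_type: str,
                vowel_length: str, tone_mark: str = "NONE") -> Optional[str]:
    """Return the tone for one matrix cell, or None for a "—" cell."""
    row = TONE_MATRIX[(effective_class, syllable_type, vowel_length)]
    return row[MARKS.index(tone_mark)]
```

With this encoding, `lookup_tone("LC", "DEAD", "SHORT")` reads the cell for an unmarked short dead LC syllable such as รัก.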

Mai tri and mai jattawa

The two rarer tone marks (◌๊ mai tri and ◌๋ mai jattawa) appear only on MC consonants in Standard Thai orthography. They do not occur on HC or LC onsets in native vocabulary, and the derivation raises a DerivationError if it encounters one there.
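The restriction can be pictured as a guard that runs before the matrix lookup. In this sketch the `DerivationError` class is a stand-in defined locally (the real class may have a different base or constructor), `check_tone_mark` is a hypothetical helper, and testing the effective class rather than the intrinsic class of the onset letter is an assumption.

```python
class DerivationError(Exception):
    """Local stand-in for the error named above; the real class may differ."""


# Marks that are only productive on MC (hypothetical constant name).
MC_ONLY_MARKS = {"MAI_TRI", "MAI_JATTAWA"}


def check_tone_mark(effective_class: str, tone_mark: str) -> None:
    """Raise DerivationError for mai tri or mai jattawa on a non-MC class."""
    if tone_mark in MC_ONLY_MARKS and effective_class != "MC":
        raise DerivationError(
            f"{tone_mark} does not occur on {effective_class} onsets"
        )
```

Calling `check_tone_mark("MC", "MAI_TRI")` returns silently, while `check_tone_mark("LC", "MAI_TRI")` raises.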


Worked examples

กา (crow): ก is MC, open syllable with long /aː/ (live), no tone mark → MID tone. IPA: /kaː˧/

ข้าว (rice): ข is HC, ◌้ (mai tho), live syllable (ends in /w/) → FALLING tone. IPA: /kʰaːw˥˩/

น้ำ (water): น is LC, ◌้ (mai tho), live syllable → HIGH tone. IPA: /naːm˦˥/

รัก (love): ร is LC, dead syllable (ends in /k̚/), short vowel, no mark → HIGH tone. IPA: /rak̚˦˥/

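สวัสดี (hello): สวัส is read as two syllables, ส and วัส. ส is an HC leader written with no vowel before the LC sonorant ว, so วัส takes HC effective class; with short /a/ and a stop coda it is a dead syllable → LOW tone. The final ดี: ด is MC, long /iː/, live syllable, no mark → MID tone.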


Tone marks in the orthography

Four tone marks can appear in Thai writing. Each is placed above the onset consonant; when the onset also carries an above vowel mark, the tone mark stacks on top of that vowel mark.

| Mark | Thai name | Unicode |
| --- | --- | --- |
| ◌่ | mai ek | U+0E48 |
| ◌้ | mai tho | U+0E49 |
| ◌๊ | mai tri | U+0E4A |
| ◌๋ | mai jattawa | U+0E4B |

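A syllable carries at most one tone mark. The mark replaces the "natural" tone that the class and syllable type would otherwise produce, but the result still depends on the effective class: mai ek, for example, gives LOW on MC and HC but FALLING on LC.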
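Because each tone mark is a single code point, detecting it reduces to a character scan. The following is a minimal sketch: `TONE_MARKS` and `find_tone_mark` are hypothetical names, and raising `ValueError` on a second mark is an assumption about how a violation of the one-mark rule might be reported.

```python
# Code points from the table above, mapped to the tone-mark input values.
TONE_MARKS = {
    "\u0E48": "MAI_EK",
    "\u0E49": "MAI_THO",
    "\u0E4A": "MAI_TRI",
    "\u0E4B": "MAI_JATTAWA",
}


def find_tone_mark(syllable: str) -> str:
    """Return the tone-mark value for one written syllable, or "NONE"."""
    found = [TONE_MARKS[ch] for ch in syllable if ch in TONE_MARKS]
    if len(found) > 1:
        raise ValueError("a syllable carries at most one tone mark")
    return found[0] if found else "NONE"
```

Applied to the worked examples, `find_tone_mark("ข้าว")` yields `"MAI_THO"` and `find_tone_mark("กา")` yields `"NONE"`.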


The effective class and tone

The tone matrix uses the effective class, not the intrinsic class of the onset letter. Two mechanisms can shift a syllable's effective class:

Leading ห: When a bare ห (HC) precedes an LC sonorant letter without any vowel between them, the sonorant's syllable is promoted to HC effective class. This explains why หง, หน, หม, หย, หร, หล, and หว all produce rising tones on unmarked live syllables (the HC pattern) rather than mid tones (the LC pattern).

Aksornam propagation: A bare HC or MC leader consonant (one character, no vowel) before an LC sonorant onset causes the following syllable to inherit the leader's class. This is how words like สมาน are handled: ส (HC leader) + มาน (LC sonorant onset /m/) → มาน takes HC effective class → rising tone.

Both mechanisms are implemented in the pipeline runner and recorded in the effective_class field of each Syllable.
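The two mechanisms can be sketched with one rule, since leading ห behaves like an HC leader. This is not the pipeline runner's code: the function names, the letter sets, and the assumption that the caller has already established that the leader is bare (one consonant, no vowel) are all illustrative.

```python
from typing import Optional

# Standard consonant classes; every other Thai consonant is LC.
HIGH_CLASS = set("ขฃฉฐถผฝศษสห")
MID_CLASS = set("กจฎฏดตบปอ")
# LC sonorant letters that can inherit a leader's class.
LC_SONORANTS = set("งญณนมยรลวฬ")


def intrinsic_class(consonant: str) -> str:
    """Return "HC", "MC", or "LC" for a single Thai consonant letter."""
    if consonant in HIGH_CLASS:
        return "HC"
    if consonant in MID_CLASS:
        return "MC"
    return "LC"


def effective_class(onset: str, leader: Optional[str] = None) -> str:
    """Return the class the tone matrix should use for this syllable.

    `leader` is the bare consonant immediately before the onset, if any;
    deciding that it is bare is left to the caller.
    """
    base = intrinsic_class(onset)
    if leader is None or onset not in LC_SONORANTS:
        return base
    leader_class = intrinsic_class(leader)
    # Only HC and MC leaders propagate; an LC leader changes nothing.
    return leader_class if leader_class in ("HC", "MC") else base
```

For สมาน, `effective_class("ม", leader="ส")` gives `"HC"`, which the matrix maps to a rising tone on the unmarked live syllable มาน.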