Vowels¶
Thai has a rich vowel system with a phonemic length distinction (short vs. long) for every vowel quality, and three centring diphthongs. In writing, vowel symbols are placed above, below, before, and after the consonant they belong to — sometimes in combination.
The vowel inventory¶
thaiphon uses IPA vowel quality labels as internal phoneme symbols. The vowel map below shows both the IPA representation and the typical orthographic pattern.
| Quality | IPA | Short Thai spelling | Long Thai spelling | Notes |
|---|---|---|---|---|
| /a/ | a | ◌ั | า | Short /a/ before a coda uses ◌ั (mai han-akat); long uses า (sara aa) |
| /i/ | i | ◌ิ | ◌ี | Short sara i; long sara ii |
| /u/ | u | ◌ุ | ◌ู | Short sara u; long sara uu |
| /e/ | e | เ◌็ | เ◌ | Short requires mai tai khu ◌็; long uses just เ |
| /ɛ/ | ɛ | แ◌็ | แ◌ | Short /ɛ/ before a coda uses แ with ◌็; long uses just แ |
| /o/ | o | โ◌ะ | โ◌ | Short /o/ is written โ◌ะ in open syllables and is unwritten before a coda (as in คน); long uses โ◌ |
| /ɔ/ | ɔ | เ◌าะ | ◌อ | Short uses the เ◌าะ frame; long uses ◌อ |
| /ɯ/ | ɯ | ◌ึ | ◌ื | Short sara ue; long sara uee |
| /ɤ/ | ɤ | เ◌อะ | เ◌อ | The "ambiguous" vowel; short is เ◌อะ, long is เ◌อ |
| /ia/ | iə | เ◌ียะ | เ◌ีย | Centring diphthong |
| /ɯa/ | ɯə | เ◌ือะ | เ◌ือ | Centring diphthong |
| /ua/ | uə | ◌ัวะ | ◌ัว | Centring diphthong |
Vowel length¶
Vowel length is phonemically distinctive in Thai. Minimal pairs based on length alone are common:
| Short | Long |
|---|---|
| เขิน /kʰɤn˩˩˦/ | เขิน /kʰɤːn˧/ (illustrative only: these are the same word, so the length difference is orthographic) |
| คน /kʰon˧/ (person) | โคน /kʰoːn˧/ (base, stump) |
The vowel_length field on a Syllable records VowelLength.SHORT or VowelLength.LONG.
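As a rough illustration of how a length field like this can be modelled and queried, the sketch below defines stand-in `VowelLength` and `Syllable` types. These are simplified, hypothetical definitions written for this example, not thaiphon's actual classes, whose fields and constructors may differ. The `broad_ipa` helper is likewise an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class VowelLength(Enum):
    """Stand-in for the length enum described above (hypothetical definition)."""
    SHORT = "short"
    LONG = "long"


@dataclass(frozen=True)
class Syllable:
    """Minimal stand-in: only the fields needed for this illustration."""
    onset: str
    vowel: str
    vowel_length: VowelLength
    coda: str = ""


def broad_ipa(syl: Syllable) -> str:
    """Render onset + vowel + coda, marking long vowels with the IPA length sign."""
    length_mark = "ː" if syl.vowel_length is VowelLength.LONG else ""
    return f"{syl.onset}{syl.vowel}{length_mark}{syl.coda}"


khon = Syllable("kʰ", "o", VowelLength.SHORT, "n")   # คน (person)
khoon = Syllable("kʰ", "o", VowelLength.LONG, "n")   # โคน (base, stump)

print(broad_ipa(khon))
print(broad_ipa(khoon))
```

The two syllables differ only in `vowel_length`, which is what makes คน and โคน a length-based pair (tone is left out of this sketch).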
Sara Am: a special case¶
Sara Am (◌ำ, U+0E33) is not a simple vowel mark. It decomposes as:
- A long /aː/ vowel nucleus.
- A nasal /m/ coda.
thaiphon expands Sara Am in the normalisation phase. The word น้ำ (water) consists of น + ◌้ + ◌ำ and is analysed as onset /n/, long vowel /aː/, coda /m/, and high tone (from ◌้ on a low-class (LC) onset).
See Special cases for the full expansion logic.
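The decomposition described in this section can be sketched as a small helper. The function name `split_sara_am` and its return shape are assumptions made for this example; thaiphon's normalisation phase may represent the expansion differently.

```python
from typing import Optional, Tuple

SARA_AM = "\u0e33"  # ◌ำ


def split_sara_am(syllable: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Strip Sara Am and return (remaining text, vowel, coda).

    Hypothetical helper: follows the decomposition described above,
    i.e. a long /aː/ nucleus plus an /m/ coda.
    """
    if SARA_AM not in syllable:
        return syllable, None, None
    return syllable.replace(SARA_AM, ""), "aː", "m"


# น้ำ (water) = น (U+0E19) + ◌้ (U+0E49) + ◌ำ (U+0E33)
rest, vowel, coda = split_sara_am("\u0e19\u0e49\u0e33")
print(rest == "\u0e19\u0e49", vowel, coda)
```

The onset consonant and tone mark are left in `rest` so that later stages (onset lookup, tone assignment) can still see them.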
Orthographic vowel frames¶
Thai vowel notation is positional. Some vowels are written before the consonant (pre-vowels), some above, some below, and some after. A single vowel phoneme may involve characters in multiple positions around the onset consonant.
Pre-vowels (written before the onset in text, but phonemically part of the nucleus):
- เ, แ, โ: used in เ◌ (long /eː/), เ◌็ (short /e/), แ◌ (long /ɛː/), โ◌ (long /oː/), เ◌าะ (short /ɔ/), เ◌อ (long /ɤː/)
Post-base vowel marks (written above or below the onset):
- ◌ั ◌ิ ◌ี ◌ึ ◌ื ◌ุ ◌ู ◌็
thaiphon identifies these marks during syllabification and uses them alongside the presence/absence of a coda and the pre-vowel to determine the vowel phoneme.
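One way to picture this step is a lookup keyed on the pre-vowel, the above/below mark, and whether a coda is present. The table below is a small hypothetical subset built from the inventory on this page; the names `FRAMES` and `classify_vowel` are illustrative and are not thaiphon's API.

```python
from typing import Dict, Optional, Tuple

# (pre-vowel, above/below mark, has_coda) -> (vowel quality, length)
# Illustrative subset only; a real table would cover every frame.
FRAMES: Dict[Tuple[str, str, bool], Tuple[str, str]] = {
    ("", "\u0e31", True): ("a", "SHORT"),        # ◌ั + coda
    ("", "\u0e34", True): ("i", "SHORT"),        # ◌ิ + coda
    ("", "\u0e35", True): ("i", "LONG"),         # ◌ี + coda
    ("\u0e40", "\u0e47", True): ("e", "SHORT"),  # เ◌็ + coda
    ("\u0e40", "", True): ("e", "LONG"),         # เ◌ + coda
    ("\u0e41", "", True): ("ɛ", "LONG"),         # แ◌ + coda
    ("\u0e42", "", True): ("o", "LONG"),         # โ◌ + coda
    ("", "", True): ("o", "SHORT"),              # no vowel sign + coda, as in คน
}


def classify_vowel(pre: str, mark: str, has_coda: bool) -> Optional[Tuple[str, str]]:
    """Return (quality, length) for a known frame, or None if it is not in the table."""
    return FRAMES.get((pre, mark, has_coda))


print(classify_vowel("\u0e40", "\u0e47", True))  # frame of เป็น
print(classify_vowel("\u0e42", "", True))        # frame of โคน
```

Keying on all three features at once keeps frames such as เ◌็ and เ◌ apart even though they share the same pre-vowel.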
Centring diphthongs¶
The three centring diphthongs — /ia/, /ɯa/, /ua/ — glide from a front or back position toward the central schwa. Orthographically they are written as two-part frames:
| Diphthong | Short form | Long form |
|---|---|---|
| /ia/ (เ◌ีย) | เ◌ียะ | เ◌ีย |
| /ɯa/ (เ◌ือ) | เ◌ือะ | เ◌ือ |
| /ua/ (◌ัว) | ◌ัวะ | ◌ัว |
In broad IPA, the long and short centring diphthongs are not distinguished — both surface as iə, ɯə, uə. The length distinction is preserved in the internal representation but does not appear in the IPA output.
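A minimal sketch of that behaviour, assuming a hypothetical `render_diphthong` helper and plain string length labels (this is not thaiphon's real renderer): the length is accepted as an input but does not change the broad IPA output.

```python
# Internal diphthong label -> broad IPA, following the table above.
BROAD_IPA = {"ia": "iə", "ɯa": "ɯə", "ua": "uə"}


def render_diphthong(quality: str, length: str) -> str:
    """Render a centring diphthong in broad IPA.

    `length` ("SHORT" or "LONG") is validated but deliberately ignored in
    the output, mirroring the behaviour described above.
    """
    if length not in ("SHORT", "LONG"):
        raise ValueError(f"unknown length: {length!r}")
    return BROAD_IPA[quality]


# Short เ◌ียะ and long เ◌ีย render identically.
print(render_diphthong("ia", "SHORT") == render_diphthong("ia", "LONG"))
```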
Offglides¶
Thai has two semi-vowel offglides that function as coda-like elements at the end of diphthongs:
- /w/: from ว, or from the เ◌า frame
- /j/: from ย, or from the ไ◌ and ใ◌ frames
These are classified as Phoneme objects with is_sonorant=True in the coda position, and they make the syllable live (not dead). Schemes render them as letters or diacritics appropriate to their notation.
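The live/dead consequence can be sketched as follows. The `Phoneme` dataclass here is a simplified stand-in carrying only the `is_sonorant` flag mentioned above, and `is_live` is a hypothetical helper. The handling of syllables without a coda (long vowel live, short vowel dead) follows the usual Thai rule and is an assumption about how thaiphon treats that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Phoneme:
    """Simplified stand-in carrying only what this example needs."""
    symbol: str
    is_sonorant: bool


def is_live(coda: Optional[Phoneme], vowel_is_long: bool) -> bool:
    """Classify a syllable as live (True) or dead (False).

    A sonorant coda, including the offglides /w/ and /j/, makes the
    syllable live; an obstruent coda makes it dead. With no coda, the
    usual Thai rule applies: long vowel is live, short vowel is dead.
    """
    if coda is not None:
        return coda.is_sonorant
    return vowel_is_long


offglide_j = Phoneme("j", is_sonorant=True)
stop_k = Phoneme("k", is_sonorant=False)

print(is_live(offglide_j, vowel_is_long=False))  # True: offglide coda
print(is_live(stop_k, vowel_is_long=True))       # False: stop coda
```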