IPA scheme

The ipa scheme produces broad phonemic IPA transcription in the format used by Wiktionary's Thai pronunciation entries. It is the phonological baseline — the scheme used for accuracy benchmarking.

from thaiphon import transcribe

transcribe("น้ำ", scheme="ipa")
# '/naːm˦˥/'

transcribe("สวัสดี", scheme="ipa")
# '/sa˨˩.wat̚˨˩.diː˧/'

Format conventions

| Convention | Detail |
| --- | --- |
| Word delimiters | /…/ (phonemic slashes wrap the entire word) |
| Syllable separator | . (full stop) |
| Long vowel | ː (IPA length mark, U+02D0) |
| Unreleased stops | ◌̚ (U+031A combining left angle above) |
| Tones | Chao tone letters after each syllable |
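
The conventions above are regular enough to take a transcription apart with the standard library alone. The following is a minimal sketch and not part of thaiphon: split_ipa and TONE_RUN are hypothetical names, and the code assumes the output always follows the slash, full-stop and trailing-tone layout described in this section.

```python
import re

# Chao tone letters occupy U+02E5 (extra high) through U+02E9 (extra low).
# Hypothetical helper pattern: a run of tone letters at the end of a syllable.
TONE_RUN = re.compile(r"[\u02E5-\u02E9]+$")


def split_ipa(word: str) -> list[tuple[str, str]]:
    """Split an '/…/' transcription into (segments, tone letters) pairs."""
    if len(word) < 2 or not (word.startswith("/") and word.endswith("/")):
        raise ValueError(f"expected phonemic slashes around {word!r}")
    pairs = []
    # Drop the slashes, then cut at the syllable separator.
    for syllable in word[1:-1].split("."):
        match = TONE_RUN.search(syllable)
        tone = match.group(0) if match else ""
        segments = syllable[: len(syllable) - len(tone)]
        pairs.append((segments, tone))
    return pairs


print(split_ipa("/sa˨˩.wat̚˨˩.diː˧/"))
```

For the สวัสดี example this yields three pairs, one per syllable, with the unreleased-stop mark kept on the final t of the second syllable.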

Tone notation

Thai has five lexical tones. The IPA scheme uses Chao tone letters:

| Tone | Chao letters | Example | TLC equivalent |
| --- | --- | --- | --- |
| Mid | ˧ | /kaː˧/ | {M} |
| Low | ˨˩ | /kaː˨˩/ | {L} |
| Falling | ˥˩ | /kaː˥˩/ | {F} |
| High | ˦˥ | /kaː˦˥/ | {H} |
| Rising | ˩˩˦ | /kaː˩˩˦/ | {R} |
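
When post-processing IPA output it can help to turn the Chao letters back into a tone name or a TLC tag. The lookup below is a small sketch built directly from the five tones listed in this section; TONES and tone_of are hypothetical names and are not provided by thaiphon.

```python
# Chao-letter sequences mapped to (tone name, TLC tag).
TONES = {
    "˧": ("Mid", "{M}"),
    "˨˩": ("Low", "{L}"),
    "˥˩": ("Falling", "{F}"),
    "˦˥": ("High", "{H}"),
    "˩˩˦": ("Rising", "{R}"),
}


def tone_of(syllable: str) -> tuple[str, str]:
    """Return (tone name, TLC tag) for one syllable of IPA output."""
    # Check longer sequences first so a short sequence can never
    # shadow a longer one that ends in the same letter.
    for letters in sorted(TONES, key=len, reverse=True):
        if syllable.endswith(letters):
            return TONES[letters]
    raise ValueError(f"no known tone at the end of {syllable!r}")


print(tone_of("naːm˦˥"))
```

The function expects a single syllable without the surrounding slashes, so a full transcription needs to be split on the full stop first.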

Onset inventory

All onset phonemes in broad IPA as produced by thaiphon:

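| IPA onset | Example Thai | Gloss |
| --- | --- | --- |
| k | กา | crow |
| kʰ | ขา | leg |
| tɕ | จาน | plate |
| tɕʰ | ชา | tea |
| d | ดา | to be covered |
| t | ตา | eye / grandfather |
| tʰ | ทา | to apply |
| b | บา | to be thin |
| p | ปา | to throw |
| pʰ | พา | to bring |
| f | ฝา | lid, cover |
| s | สา | to name |
| h | หา | to find |
| ʔ | อา | uncle/aunt (glottal onset, rendered as empty in IPA output) |
| m | มา | to come |
| n | นา | rice field |
| ŋ | งา | sesame |
| j | ยา | medicine |
| r | รา | mould, fungus |
| l | ลา | donkey |
| w | วา | fathom (unit) |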

Clusters: pr pl pʰr pʰl tr tʰr kr kl kʰr kʰl kʰw — written as the two phonemes concatenated with no joiner.
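
Because aspirated stops and clusters begin with the same symbol as the plain stops, an onset is easiest to recognise by trying the longest candidates first. The sketch below does that for one syllable of IPA output. It is an illustration only: SINGLE, CLUSTERS, ONSETS and split_onset are hypothetical names, and the inventory is typed in by hand from this section (plain and aspirated stops plus the listed clusters), not read from thaiphon.

```python
# Single onsets; the glottal stop is left out because the IPA output
# renders it as an empty onset.
SINGLE = [
    "k", "kʰ", "tɕ", "tɕʰ", "d", "t", "tʰ", "b", "p", "pʰ",
    "f", "s", "h", "m", "n", "ŋ", "j", "r", "l", "w",
]
# Clusters are the two phonemes concatenated with no joiner.
CLUSTERS = [
    "pr", "pl", "pʰr", "pʰl", "tr", "tʰr",
    "kr", "kl", "kʰr", "kʰl", "kʰw",
]
# Longest first, so 'kʰw' wins over 'kʰ', which wins over 'k'.
ONSETS = sorted(SINGLE + CLUSTERS, key=len, reverse=True)


def split_onset(syllable: str) -> tuple[str, str]:
    """Return (onset, remainder); the onset is '' for a glottal onset."""
    for onset in ONSETS:
        if syllable.startswith(onset):
            return onset, syllable[len(onset):]
    return "", syllable


print(split_onset("kʰwaːm˧"))
```

A vowel-initial syllable such as aː˧ comes back with an empty onset, which matches the way the glottal onset is rendered in the output.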


Vowel inventory

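| Quality | Short IPA | Long IPA |
| --- | --- | --- |
| /a/ | a | aː |
| /i/ | i | iː |
| /u/ | u | uː |
| /e/ | e | eː |
| /ɛ/ | ɛ | ɛː |
| /o/ | o | oː |
| /ɔ/ | ɔ | ɔː |
| /ɯ/ | ɯ | ɯː |
| /ɤ/ | ɤ | ɤː |
| /ia/ (centring diphthong) | iə | iə |
| /ɯa/ (centring diphthong) | ɯə | ɯə |
| /ua/ (centring diphthong) | uə | uə |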

The centring diphthongs do not distinguish short/long in broad IPA; both length values surface as the same symbol.
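
One practical consequence is that vowel length can be read off a syllable by looking for the length mark, except in centring diphthongs, where the output carries no length information. The helper below is a rough sketch of that rule. vowel_length is a hypothetical name, and treating any syllable that contains ə as a centring diphthong is an assumption based on the ɯə spelling used in this section.

```python
LENGTH_MARK = "\u02d0"  # ː, the IPA length mark


def vowel_length(syllable: str) -> str:
    """Classify one syllable of IPA output as 'long', 'short' or 'diphthong'."""
    # Assumption: ə only occurs as the second element of a centring
    # diphthong, whose length is not distinguished in broad IPA.
    if "ə" in syllable:
        return "diphthong"
    return "long" if LENGTH_MARK in syllable else "short"


for syllable in ("naːm˦˥", "wat̚˨˩", "rɯə˧"):
    print(syllable, vowel_length(syllable))
```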


Coda inventory

Native Thai phonotactics allows six coda consonants:

| Coda IPA | Orthographic sources | Example |
| --- | --- | --- |
| m | ม | กาม → /kaːm˧/ |
| n | น ณ ญ ร ล ฬ | กาน → /kaːn˧/ |
| ŋ | ง | กาง → /kaːŋ˧/ |
| p̚ | บ ป พ ภ ฟ | กาบ → /kaːp̚˨˩/ |
| t̚ | จ ช ฌ ด ต ถ ท ธ ฎ ฏ ฐ ฑ ฒ ซ ศ ษ ส | กาด → /kaːt̚˨˩/ |
| k̚ | ก ข ค ฆ | กาก → /kaːk̚˨˩/ |

Semi-vowel offglides /w/ and /j/ also appear as codas in diphthongs.
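
The many-to-one mapping from final consonant letters to native codas can be written as a plain lookup table. The following is only an illustrative sketch and does not reflect how thaiphon stores this data: CODA_SOURCES, LETTER_TO_CODA and native_coda are hypothetical names. The entries for /m/ (ม) and /ŋ/ (ง) use the regular Thai letters for those sounds.

```python
UNRELEASED = "\u031a"  # combining left angle above, marks an unreleased stop

# Coda phoneme -> final consonant letters that spell it.
CODA_SOURCES = {
    "m": "ม",
    "n": "น ณ ญ ร ล ฬ",
    "ŋ": "ง",
    "p" + UNRELEASED: "บ ป พ ภ ฟ",
    "t" + UNRELEASED: "จ ช ฌ ด ต ถ ท ธ ฎ ฏ ฐ ฑ ฒ ซ ศ ษ ส",
    "k" + UNRELEASED: "ก ข ค ฆ",
}

# Invert the table so that each letter points at its coda phoneme.
LETTER_TO_CODA = {
    letter: coda
    for coda, letters in CODA_SOURCES.items()
    for letter in letters.split()
}


def native_coda(letter: str) -> str:
    """Return the native coda phoneme spelled by a final consonant letter."""
    try:
        return LETTER_TO_CODA[letter]
    except KeyError:
        raise ValueError(f"{letter!r} is not a listed coda letter") from None


print(native_coda("ส"), native_coda("ล"), native_coda("ก"))
```

Inverting the table once keeps each lookup a single dictionary access, and an unlisted letter fails loudly instead of silently passing through.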

Loanword codas

Modern loanwords may preserve foreign phonemes in coda position under the everyday and careful_educated profiles:

| Preserved coda | Source letters | Example |
| --- | --- | --- |
| f | ฟ | ลิฟต์ → /lif˦˥/ |
| s | ส ศ ษ | (in applicable loanwords) |
| l | ล | (in applicable loanwords) |

Under etalon_compat, all foreign codas collapse to their native equivalents (f→p̚, s→t̚, l→n).
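
The collapse rule can be imitated on an already transcribed string, which is handy for comparing output produced under different profiles. This is a sketch of the stated mapping only, not of thaiphon's internals: COLLAPSE, FOREIGN_CODA and collapse_foreign_codas are hypothetical names, and the code assumes that a coda is the consonant sitting directly before a syllable's tone letters.

```python
import re

UNRELEASED = "\u031a"  # combining left angle above, marks an unreleased stop

# Foreign coda -> native equivalent, as stated for etalon_compat.
COLLAPSE = {"f": "p" + UNRELEASED, "s": "t" + UNRELEASED, "l": "n"}

# An f, s or l immediately followed by a Chao tone letter (U+02E5..U+02E9)
# is in coda position; onsets are followed by a vowel and are left alone.
FOREIGN_CODA = re.compile(r"([fsl])(?=[\u02E5-\u02E9])")


def collapse_foreign_codas(ipa: str) -> str:
    """Rewrite preserved foreign codas to their native equivalents."""
    return FOREIGN_CODA.sub(lambda m: COLLAPSE[m.group(1)], ipa)


print(collapse_foreign_codas("/lif˦˥/"))
```

The lookahead is what keeps the onset l of ลิฟต์ untouched while its coda f is rewritten.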


Profile sensitivity

# everyday (default): preserve /f/ in well-integrated loans
transcribe("ลิฟต์", scheme="ipa", profile="everyday")
# '/lif˦˥/'

# etalon_compat: collapse to native /p̚/
transcribe("ลิฟต์", scheme="ipa", profile="etalon_compat")
# '/lip̚˦˥/'

See Reading profiles for full details.