TLC scheme (Enhanced Phonemic)

The tlc scheme produces the "Enhanced Phonemic" notation used by thai-language.com. It uses plain ASCII Latin characters, making it safe for contexts that do not support Unicode diacritics. Tones are represented as bracketed tags appended to each syllable.

from thaiphon import transcribe

# Default plain-text output: each syllable ends in a bracketed tone tag.
transcribe("น้ำ", scheme="tlc")
# 'naam{H}'

transcribe("สวัสดี", scheme="tlc")
# 'sa{L} wat{L} dee{M}'

Format conventions

| Convention | Detail |
| --- | --- |
| Syllable separator | a single space |
| Tone | bracketed tag at end of syllable: {M} {L} {H} {F} {R} |
| Long vowel | doubled or extended vowel letter (e.g. aa, ee, uu) |
| No slashes | unlike IPA, no /…/ wrapping |
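The conventions above make plain-text TLC output easy to take apart again. The following is a minimal sketch, not part of thaiphon: `split_tlc` is a hypothetical helper, and the assumption that a syllable consists only of ASCII letters and `:` (as in the spelling `o:h`) is mine.

```python
import re

# One TLC syllable: ASCII letters (plus ':' as in "o:h"), then one tone tag.
# The allowed character set is an assumption made for this sketch.
SYLLABLE_RE = re.compile(r"([A-Za-z:]+)\{([MLHFR])\}")


def split_tlc(text):
    """Split plain-text TLC output into (segments, tone_tag) pairs."""
    pairs = []
    for chunk in text.split(" "):  # syllables are separated by one space
        match = SYLLABLE_RE.fullmatch(chunk)
        if match is None:
            raise ValueError(f"not a TLC syllable: {chunk!r}")
        pairs.append((match.group(1), match.group(2)))
    return pairs


print(split_tlc("sa{L} wat{L} dee{M}"))
# [('sa', 'L'), ('wat', 'L'), ('dee', 'M')]
```

Raising on an unexpected chunk, rather than skipping it, keeps malformed input visible to the caller.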

Tone tags

| Tone | Tag | IPA Chao equivalent |
| --- | --- | --- |
| Mid | {M} | ˧ |
| Low | {L} | ˨˩ |
| Falling | {F} | ˥˩ |
| High | {H} | ˦˥ |
| Rising | {R} | ˩˩˦ |
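If a downstream tool needs tone letters instead of tags, the table can be applied mechanically. This is an illustrative sketch only: `TONE_TAGS` and `tags_to_chao` are hypothetical names, not thaiphon API, and the function simply post-processes a string that is already in plain-text TLC form.

```python
import re

# Tag -> (tone name, Chao tone letters), copied from the tone-tag table.
TONE_TAGS = {
    "M": ("Mid", "˧"),
    "L": ("Low", "˨˩"),
    "F": ("Falling", "˥˩"),
    "H": ("High", "˦˥"),
    "R": ("Rising", "˩˩˦"),
}


def tags_to_chao(text):
    """Replace each bracketed tone tag with its Chao tone letters."""
    return re.sub(r"\{([MLFHR])\}", lambda m: TONE_TAGS[m.group(1)][1], text)


print(tags_to_chao("naam{H}"))
# naam˦˥
```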

Onset map (IPA → TLC)

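| IPA onset | TLC | Note |
| --- | --- | --- |
| k | g | unaspirated |
| kʰ | kh | aspirated |
| tɕ | j | unaspirated palatal affricate |
| tɕʰ | ch | aspirated palatal affricate |
| d | d | |
| t | dt | unaspirated alveolar stop |
| tʰ | th | aspirated |
| b | b | |
| p | bp | unaspirated bilabial stop |
| pʰ | ph | aspirated |
| f | f | |
| s | s | |
| h | h | |
| ʔ | (empty) | glottal onset is not written |
| m | m | |
| n | n | |
| ŋ | ng | |
| j | y | |
| r | r | |
| l | l | |
| w | w | |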

Note on notation: TLC writes g for unaspirated /k/ and j for unaspirated /tɕ/ because English readers tend to pronounce k and ch with aspiration. The spellings kh and ch are reserved for the aspirated sounds.
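For quick experiments, the onset map can be written as a lookup table. The sketch below is an assumption-laden illustration, not thaiphon's internal data: `ONSET_MAP` and `tlc_onset` are hypothetical names, the aspirated keys use the standard IPA symbols kʰ, tʰ, and pʰ, and only single-consonant onsets are covered.

```python
# IPA onset -> TLC spelling, following the onset map.
# Hypothetical table for illustration; clusters are not handled.
ONSET_MAP = {
    "k": "g", "kʰ": "kh",
    "tɕ": "j", "tɕʰ": "ch",
    "d": "d", "t": "dt", "tʰ": "th",
    "b": "b", "p": "bp", "pʰ": "ph",
    "f": "f", "s": "s", "h": "h",
    "ʔ": "",  # the glottal onset is not written
    "m": "m", "n": "n", "ŋ": "ng",
    "j": "y", "r": "r", "l": "l", "w": "w",
}


def tlc_onset(ipa_onset):
    """Return the TLC spelling for a single IPA onset consonant."""
    return ONSET_MAP[ipa_onset]


print(tlc_onset("k"), tlc_onset("kʰ"), tlc_onset("t"), tlc_onset("tɕ"))
# g kh dt j
```

Every TLC spelling in this table is distinct, so an onset spelling can be mapped back to its IPA value without ambiguity.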


Vowel map (IPA → TLC)

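| IPA quality | Short | Long |
| --- | --- | --- |
| /a/ | a | aa |
| /i/ | i | ee |
| /u/ | oo | uu |
| /e/ | e | aeh |
| /ɛ/ | ae | aae |
| /o/ | oh | o:h |
| /ɔ/ | aw | aaw |
| /ɯ/ | eu | euu |
| /ɤ/ | eer | uuhr |
| /ia/ | ia | iia |
| /ɯa/ | euua | euua |
| /ua/ | uaa | uaa |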

Context-dependent vowel spellings

The vowel surface form sometimes depends on what follows it. TLC has a few specific conventions:

  • /ɤ/ (short) before /j/ offglide: the vowel renders as eeu (rather than eer), so the syllable appears as eeuy. Example: เมย → meeuy{M}.
  • /ia/ before /w/ offglide: the vowel renders as iaa and the coda /w/ renders as o, yielding iaao. Example: เขียว → khiaao{R}.

Coda map (IPA → TLC)

| IPA coda | TLC |
| --- | --- |
| m | m |
| n | n |
| ŋ | ng |
| p̚ | p |
| t̚ | t |
| k̚ | k |
| w (offglide) | o |
| j (offglide) | i |

Context-dependent coda spellings: The /j/ coda renders as y (rather than i) in specific vowel environments: after /ɔː/, after /ɤ/, after /uː/, after /ua/, and after /eː/.
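The /j/ rule can be read as a simple membership test on the preceding vowel. The following is a sketch under stated assumptions, not thaiphon's implementation: `j_coda_spelling` is a hypothetical helper, vowels are passed as IPA strings, and /ɤ/ is assumed to cover both the short and the long vowel.

```python
# Vowels after which the /j/ coda is spelled "y" instead of "i".
# Listing both "ɤ" and "ɤː" is an assumption; the text only says /ɤ/.
Y_CONTEXT_VOWELS = {"ɔː", "ɤ", "ɤː", "uː", "ua", "eː"}


def j_coda_spelling(vowel_ipa):
    """Return the TLC spelling of a /j/ coda after the given vowel."""
    return "y" if vowel_ipa in Y_CONTEXT_VOWELS else "i"


print(j_coda_spelling("ɤ"))  # y, as in meeuy{M}
print(j_coda_spelling("a"))  # i, as in thai{M}
```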


Loanword coda preservation

Under the everyday profile, thaiphon preserves foreign codas in lexicon-listed loanwords:

transcribe("ลิฟต์", scheme="tlc", profile="everyday")
# 'lif{H}'

transcribe("ลิฟต์", scheme="tlc", profile="etalon_compat")
# 'lip{H}'

Additionally, an out-of-lexicon heuristic applies to words containing ฟ (fo fan): if the word scores as likely foreign and the coda is orthographic ฟ, TLC preserves f rather than collapsing to p. This covers unknown English loanwords without requiring them to be pre-listed.
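The decision described above can be pictured as a threshold check. This is purely illustrative: `f_coda_spelling`, `foreign_score`, and `preserve_threshold` are hypothetical names and values, and how thaiphon actually scores a word as foreign is not specified here.

```python
def f_coda_spelling(foreign_score, preserve_threshold=0.5):
    """Spell a syllable-final orthographic ฟ in TLC.

    foreign_score and preserve_threshold are hypothetical stand-ins for
    whatever scoring the library applies to out-of-lexicon words.
    """
    # Likely foreign: keep the f. Otherwise collapse to p.
    return "f" if foreign_score >= preserve_threshold else "p"


print(f_coda_spelling(0.9))  # f
print(f_coda_spelling(0.1))  # p
```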

See Reading profiles for details.


HTML output

Pass format="html" to receive tone markers as <sup> tags instead of the bracketed {…} form. This matches the display convention used on thai-language.com, where tones appear as small superscript letters.

transcribe("น้ำ", scheme="tlc", format="html")
# 'naam<sup>H</sup>'

transcribe("สวัสดี", scheme="tlc", format="html")
# 'sa<sup>L</sup> wat<sup>L</sup> dee<sup>M</sup>'

transcribe("ภาษาไทย", scheme="tlc", format="html")
# 'phaa<sup>M</sup> saa<sup>R</sup> thai<sup>M</sup>'

With the default format="text", tones keep the bracketed form. HTML output adds no wrapper elements around the syllable or word; the <sup> tone tag is simply appended to the syllable string.

The five superscript values correspond to the same tones as the bracketed form:

| Tone | Text output | HTML output |
| --- | --- | --- |
| Mid | {M} | <sup>M</sup> |
| Low | {L} | <sup>L</sup> |
| Falling | {F} | <sup>F</sup> |
| High | {H} | <sup>H</sup> |
| Rising | {R} | <sup>R</sup> |
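Because the two forms differ only in the tone marker, text that was already generated in plain-text form can be converted afterwards. The snippet below is a minimal post-processing sketch; `tlc_text_to_html` is a hypothetical helper and says nothing about how thaiphon produces its HTML output internally.

```python
import re


def tlc_text_to_html(text):
    """Rewrite bracketed tone tags such as {H} as <sup>H</sup>."""
    return re.sub(r"\{([MLFHR])\}", r"<sup>\1</sup>", text)


print(tlc_text_to_html("sa{L} wat{L} dee{M}"))
# sa<sup>L</sup> wat<sup>L</sup> dee<sup>M</sup>
```

When calling the library directly, passing format="html" avoids the extra step.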