thaiphon¶

You give it Thai text. You get back a pronunciation guide.

thaiphon is a zero-dependency Python library for Thai phonetic transcription and romanisation. It ships with eight built-in output schemes: IPA, the thai-language.com "Enhanced Phonemic" notation (TLC), two Cyrillic transliterations (Morev and LMT), the official RTGS romanisation, the RTL School romanisation, and two Paiboon variants. You can add your own with one data declaration.

from thaiphon import transcribe

transcribe("สวัสดี", scheme="ipa")                     # '/sa˨˩.wat̚˨˩.diː˧/'
transcribe("น้ำ",    scheme="tlc",   format="html")  # 'naam<sup>H</sup>'
transcribe("รัก",    scheme="morev", format="html")  # 'ракˇ'
transcribe("ข้าว",   scheme="rtl")          # 'khâaw'
transcribe("ปลา",    scheme="paiboon")      # 'bplaa'
transcribe("เสือ",   scheme="paiboon_plus") # 'sʉ̌ʉa'

Try it in your browser

No install needed

The online Thai Transliteration Tool runs thaiphon directly. Paste Thai text to get IPA, TLC, and Morev output instantly. Install the Python package only if you need thaiphon in your own code or in an offline workflow.

Pick a thread¶

You want to produce phonetic guides for your students or personal study — and you'd rather not wrestle with a command line.

What thaiphon does — plain-English explanation
Install without Python experience — step-by-step, including installing Python itself
Your first transcription — copy-paste and run
Reading profiles — everyday vs. formal register

At a glance¶


Version	0.6.3
Python	3.10+
License	Apache-2.0
Runtime dependencies	Zero
Accuracy	~75% exact-match vs. Wiktionary IPA (17,014 words) with `thaiphon-data-volubilis`; ~57% base engine alone
Built-in schemes	`ipa`, `tlc`, `morev`, `lmt`, `rtl`, `paiboon`, `paiboon_plus`, `rtgs`
Reading profiles	`everyday`, `careful_educated`, `learned_full`, `etalon_compat`
Source	github.com/5w0rdf15h/thaiphon
Package	`pip install thaiphon thaiphon-data-volubilis` (recommended)

Install¶

# Recommended — engine + lexicon package (~57% → ~75% accuracy):
pip install thaiphon thaiphon-data-volubilis
# or
uv add thaiphon thaiphon-data-volubilis

The lexicon package (thaiphon-data-volubilis) ships a ~35,000-entry Thai lexicon derived from the VOLUBILIS Mundo Dictionary (CC-BY-SA 4.0). The engine picks it up on import if it's installed. Nothing to configure.

The base engine alone (pip install thaiphon) works without it; the lexicon package is what raises exact-match accuracy from ~57% to ~75% on the Wiktionary IPA benchmark. See Install for full details.
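
To confirm which of the two packages are present in an environment, the standard library's importlib.metadata can look up the distribution names used in the pip command above. This is a minimal sketch that queries packaging metadata only; it does not call any thaiphon API, and the helper name installed_version is made up for illustration.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str) -> str | None:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Distribution names are taken from the pip command above.
for name in ("thaiphon", "thaiphon-data-volubilis"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```

If the second line reports "not installed", the engine still runs, but at the base-engine accuracy quoted above.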

Quick example¶

from thaiphon import transcribe, transcribe_sentence, analyze, list_schemes

# Which schemes are available?
list_schemes()
# ('ipa', 'lmt', 'morev', 'paiboon', 'paiboon_plus', 'rtgs', 'rtl', 'tlc')

# Transcribe a single word. The default scheme is 'tlc'; format="html" renders tones as superscripts.
transcribe("ข้าว", format="html")
# 'khaao<sup>F</sup>'

# Choose a scheme explicitly.
transcribe("ข้าว", scheme="ipa")
# '/kʰaːw˥˩/'

# Inspect the phonological structure directly.
result = analyze("รัก")
for syl in result.best.syllables:
    print(syl.onset.symbol, syl.vowel.symbol, syl.vowel_length.name, syl.tone.name)
# r a SHORT HIGH
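
When comparing notations for a word list, it can help to collect every scheme's output in one structure. The sketch below assumes only the transcribe(text, scheme=...) call and the list_schemes() function shown above. The helpers compare_schemes and print_comparison are hypothetical, and the transcriber is passed in as a parameter so the code can be tried with a stand-in function before thaiphon is installed.

```python
from __future__ import annotations

from typing import Callable, Iterable


def compare_schemes(
    words: Iterable[str],
    schemes: Iterable[str],
    transcribe_fn: Callable[..., str],
) -> dict[str, dict[str, str]]:
    """Map each word to {scheme: transcription} using the given transcriber."""
    scheme_list = list(schemes)  # materialise once so it can be reused per word
    return {
        word: {scheme: transcribe_fn(word, scheme=scheme) for scheme in scheme_list}
        for word in words
    }


def print_comparison(table: dict[str, dict[str, str]]) -> None:
    """Print one indented line per scheme under each word."""
    for word, by_scheme in table.items():
        print(word)
        for scheme, text in by_scheme.items():
            print(f"  {scheme:<14}{text}")
```

With thaiphon installed, the call would look like compare_schemes(["ข้าว", "รัก"], list_schemes(), transcribe), followed by print_comparison on the result.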

How it works¶

thaiphon converts Thai text through four deterministic stages:

Thai text
   ↓
Unicode normalisation + expansion (Sara Am → /aː/ + /m/, ๆ repetition, digits)
   ↓
Syllabification → candidate ranking
   ↓
Rule-based derivation (onset class → tone matrix → phoneme assignment)
   ↓
PhonologicalWord  ←  universal intermediate, scheme-independent
   ↓
SchemeMapping → surface form (IPA / TLC / Morev / your own)
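
The "onset class → tone matrix" step in the diagram presumably corresponds to the standard Thai tone rules, in which the consonant class of the onset, the syllable type, and any tone mark together determine the tone. The lookup below is a textbook-level sketch of those rules, not thaiphon's internal code: the names TONE_MATRIX and derive_tone are hypothetical, and the table ignores leading consonants (อักษรนำ), irregular words, and the rarer mark and class combinations.

```python
# Textbook Thai tone rules as a lookup table (illustrative; not thaiphon internals).
# Key: (onset consonant class, tone mark or syllable type) -> tone name.
# For unmarked syllables the second element is "live", "dead_short" or "dead_long".
TONE_MATRIX = {
    ("mid", "live"): "MID",
    ("mid", "dead_short"): "LOW",
    ("mid", "dead_long"): "LOW",
    ("high", "live"): "RISING",
    ("high", "dead_short"): "LOW",
    ("high", "dead_long"): "LOW",
    ("low", "live"): "MID",
    ("low", "dead_short"): "HIGH",
    ("low", "dead_long"): "FALLING",
    # A tone mark overrides the syllable type.
    ("mid", "mai_ek"): "LOW",
    ("high", "mai_ek"): "LOW",
    ("low", "mai_ek"): "FALLING",
    ("mid", "mai_tho"): "FALLING",
    ("high", "mai_tho"): "FALLING",
    ("low", "mai_tho"): "HIGH",
    ("mid", "mai_tri"): "HIGH",
    ("mid", "mai_chattawa"): "RISING",
}


def derive_tone(onset_class, syllable_type, tone_mark=None):
    """Look up the tone; a tone mark, when present, takes precedence."""
    key = (onset_class, tone_mark if tone_mark is not None else syllable_type)
    return TONE_MATRIX[key]


print(derive_tone("low", "dead_short"))        # รัก  -> HIGH
print(derive_tone("high", "live", "mai_tho"))  # ข้าว -> FALLING
print(derive_tone("high", "live"))             # เสือ -> RISING
```

These three results agree with the tones shown for the same words in the examples earlier on this page.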

Every output scheme is a pure transformation of the same PhonologicalWord. Fix a derivation bug once and all schemes benefit. See Architecture for the full picture.
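
The "one intermediate, many renderers" design can be sketched in a few lines. Everything below is a toy model under stated assumptions: Syllable, Scheme, and render are hypothetical names, not thaiphon's PhonologicalWord or SchemeMapping classes, and the two symbol tables cover only what the single example needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Syllable:
    """Scheme-independent description of one syllable (toy stand-in)."""
    onset: str
    vowel: str
    long: bool
    coda: str
    tone: str


@dataclass(frozen=True)
class Scheme:
    """A scheme is pure data: symbol tables plus a length convention."""
    onsets: dict
    vowels: dict
    codas: dict
    tones: dict
    length_mark: str  # appended to long vowels; empty means "double the vowel"


def render(syl: Syllable, scheme: Scheme) -> str:
    """Pure function: the same Syllable yields each scheme's surface form."""
    vowel = scheme.vowels[syl.vowel]
    if syl.long:
        vowel = vowel + (scheme.length_mark or vowel)
    return scheme.onsets[syl.onset] + vowel + scheme.codas[syl.coda] + scheme.tones[syl.tone]


IPA_LIKE = Scheme(
    onsets={"kh": "kʰ"}, vowels={"a": "a"}, codas={"w": "w"},
    tones={"FALLING": "˥˩"}, length_mark="ː",
)
ASCII_LIKE = Scheme(
    onsets={"kh": "kh"}, vowels={"a": "a"}, codas={"w": "o"},
    tones={"FALLING": "(F)"}, length_mark="",
)

khaao = Syllable(onset="kh", vowel="a", long=True, coda="w", tone="FALLING")
print(render(khaao, IPA_LIKE))    # kʰaːw˥˩
print(render(khaao, ASCII_LIKE))  # khaao(F)
```

Adding a third notation in this toy model means adding one more Scheme value; render and Syllable stay untouched, which is the property the paragraph above describes.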