Install¶
Just want to transliterate some text?
You don't need to install anything. The online tool at rianthai.pro/thai-transliteration runs thaiphon directly in your browser. Install the package here only if you want to call it from Python code, integrate it into your own app, or work offline.
thaiphon is a pure-Python package with zero runtime dependencies. It works on Windows, macOS, and Linux, and requires Python 3.10 or later.
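A script that depends on thaiphon can check the interpreter version before importing it. The following is a minimal sketch; the helper name `meets_minimum` is hypothetical and is not part of thaiphon.

```python
import sys

# Minimum version stated on this page: Python 3.10.
MIN_PYTHON = (3, 10)


def meets_minimum(version=None, minimum=MIN_PYTHON) -> bool:
    """Return True if `version` (default: the running interpreter) is new enough."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    print("OK" if meets_minimum() else "Python 3.10 or later is required")
```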
Standard install¶
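Install the package with pip, for example `pip install thaiphon`. After installation, verify it works by running the snippet under "Verify the installation" below.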
Recommended: add the lexicon data package¶
Strongly recommended — raises accuracy from ~57% to ~75%
The thaiphon-data-volubilis package ships a ~35,000-entry Thai lexicon derived from the VOLUBILIS Mundo Dictionary (CC-BY-SA 4.0). Installing it takes one extra command (with pip, `pip install thaiphon-data-volubilis`) and makes a large difference to output quality, especially for compound words.
thaiphon detects and uses this package automatically when it is importable — no configuration is required. The jump from ~57% to ~75% exact-match accuracy on the Wiktionary IPA benchmark comes from the lexicon's word-boundary and variant coverage, which the rule-based engine alone cannot match for multi-syllable compound words.
Without the lexicon, common words break in hard-to-detect ways. For example, ส้ม ("orange") comes out as /sa˥˩.ma˦˥/ — two syllables with an inserted /a/ — when the correct reading is /som˥˩/, one closed syllable. The engine always returns a result; there is no error or warning to tell you which outputs are unreliable.
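Because nothing signals a missing lexicon, a caller may want to check for it once at startup. The sketch below uses only the standard library. The import name `thaiphon_data_volubilis` is an assumption (the distribution name with hyphens replaced by underscores), and both helper functions are hypothetical rather than part of thaiphon.

```python
import importlib.util
import warnings

# Assumption: the import name mirrors the distribution name with
# underscores. Confirm the real module name before relying on this.
LEXICON_MODULE = "thaiphon_data_volubilis"


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


def warn_if_missing(name: str = LEXICON_MODULE) -> bool:
    """Emit a warning when `name` is not importable; return availability."""
    available = module_available(name)
    if not available:
        warnings.warn(
            f"{name} is not importable; transcriptions may be less accurate.",
            stacklevel=2,
        )
    return available
```

Calling `warn_if_missing()` before the first transcription makes the lower-accuracy mode visible in logs instead of silent.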
The data package carries its own CC-BY-SA 4.0 license, separate from the engine's Apache-2.0 license. See Accuracy & open problems for the full benchmark details.
Optional: faster word segmentation¶
For sentence-level input, thaiphon uses a built-in longest-match segmenter by default. If pythainlp is importable (for example, after `pip install pythainlp`), thaiphon uses it automatically for improved segmentation quality.
No code changes are needed; thaiphon checks for pythainlp at runtime.
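To make "longest-match" concrete, here is a generic sketch of greedy longest-match segmentation over a toy lexicon. It illustrates the general technique only; it is not thaiphon's actual segmenter, and the function name and word list are hypothetical.

```python
def longest_match_segment(text: str, lexicon: set[str]) -> list[str]:
    """Greedily split `text` into the longest lexicon words, left to right."""
    max_len = max((len(word) for word in lexicon), default=1)
    tokens: list[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a lexicon hit.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No lexicon word starts here: emit one character and move on.
            tokens.append(text[i])
            i += 1
    return tokens


# Toy lexicon: with only the two short words, the text splits in two.
print(longest_match_segment("น้ำส้ม", {"น้ำ", "ส้ม"}))
# ['น้ำ', 'ส้ม']
```

A greedy matcher never backtracks, so an early long match can force a poor split later in the sentence. That limitation is one reason a dedicated tokenizer can give better segmentation.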
Verify the installation¶
```python
from thaiphon import transcribe, list_schemes

# Check which schemes are available.
print(list_schemes())
# ('ipa', 'morev', 'tlc')

# Run a quick transcription.
print(transcribe("น้ำ", scheme="ipa"))
# /naːm˦˥/
```
System requirements¶
| Requirement | Minimum |
|---|---|
| Python | 3.10 |
| Operating system | Any (Windows / macOS / Linux) |
| Runtime dependencies | None |
| Optional (strongly recommended) | thaiphon-data-volubilis (lexicon — raises accuracy ~57%→75%), pythainlp (sentence segmentation) |
If you don't have Python yet¶
See Install without Python experience for a step-by-step walkthrough that starts from scratch — including downloading Python itself.
Next steps¶
- Your first transcription: run your first `transcribe()` call.
- Reading profiles: choose everyday vs. formal register.
- Schemes: IPA, TLC, Morev, and how to add your own.