Install

Just want to transliterate some text?

You don't need to install anything. The online tool at rianthai.pro/thai-transliteration runs thaiphon directly in your browser. Install the package here only if you want to call it from Python code, integrate it into your own app, or work offline.

thaiphon is a pure-Python package with zero runtime dependencies. It works on Windows, macOS, and Linux, and requires Python 3.10 or later.

Standard install

# pip
pip install thaiphon

# uv (adds thaiphon to the current uv-managed project)
uv add thaiphon
# or, to install into an existing virtual environment:
uv pip install thaiphon

# Poetry
poetry add thaiphon

# conda: thaiphon is not on conda-forge yet; use pip inside the conda environment
pip install thaiphon

After installation, verify it works:

import thaiphon
print(thaiphon.__version__)   # 0.2.0
print(thaiphon.transcribe("สวัสดี"))

Strongly recommended — raises accuracy from ~57% to ~75%

The thaiphon-data-volubilis package ships a ~35,000-entry Thai lexicon derived from the VOLUBILIS Mundo Dictionary (CC-BY-SA 4.0). Installing it takes one extra command and makes a large difference to output quality, especially for compound words.

# pip
pip install thaiphon thaiphon-data-volubilis

# uv
uv add thaiphon thaiphon-data-volubilis

# Poetry
poetry add thaiphon thaiphon-data-volubilis

thaiphon detects and uses this package automatically when it is importable — no configuration is required. The jump from ~57% to ~75% exact-match accuracy on the Wiktionary IPA benchmark comes from the lexicon's word-boundary and variant coverage, which the rule-based engine alone cannot match for multi-syllable compound words.
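Because the engine falls back silently when the lexicon is absent, it can be worth checking at startup whether the data package is actually installed. The following is a minimal sketch using only the standard library. The helper names `is_installed` and `lexicon_status` are hypothetical, and the check looks at installed distribution metadata; it is not thaiphon's own detection logic, which this page does not describe.

```python
from importlib import metadata


def is_installed(distribution: str) -> bool:
    """Return True if the named distribution is installed in this environment."""
    try:
        metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return False
    return True


def lexicon_status() -> str:
    """Describe whether the optional lexicon distribution is present."""
    if is_installed("thaiphon-data-volubilis"):
        return "lexicon installed"
    return "lexicon missing: rule-based output only"


print(lexicon_status())
```

Running a check like this once at application start gives you the warning that the engine itself does not emit.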

Without the lexicon, common words break in hard-to-detect ways. For example, ส้ม ("orange") comes out as /sa˥˩.ma˦˥/ — two syllables with an inserted /a/ — when the correct reading is /som˥˩/, one closed syllable. The engine always returns a result; there is no error or warning to tell you which outputs are unreliable.

The data package carries its own CC-BY-SA 4.0 license, separate from the engine's Apache-2.0 license. See Accuracy & open problems for the full benchmark details.

Optional: better word segmentation

For sentence-level input, thaiphon uses a built-in longest-match segmenter by default. If pythainlp is importable, thaiphon will use it automatically for improved segmentation quality:

pip install pythainlp

No code changes are needed; thaiphon checks for pythainlp at runtime.
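To make the two strategies concrete, here is a rough sketch of greedy longest-match segmentation with an optional pythainlp path. It illustrates the general technique only and is not thaiphon's internal segmenter: the function names `longest_match_segment` and `segment` and the toy vocabulary are assumptions made for this example. `pythainlp.tokenize.word_tokenize` is the real pythainlp entry point.

```python
try:
    # Used only if pythainlp happens to be installed.
    from pythainlp.tokenize import word_tokenize
except ImportError:
    word_tokenize = None


def longest_match_segment(text: str, vocab: set[str]) -> list[str]:
    """Greedy left-to-right segmentation: always take the longest vocab entry."""
    max_len = max((len(word) for word in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a vocab word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in vocab:
                break
        else:
            # No vocab entry starts here: emit one character and move on.
            candidate = text[i]
        tokens.append(candidate)
        i += len(candidate)
    return tokens


def segment(text: str, vocab: set[str]) -> list[str]:
    """Prefer pythainlp when available, otherwise fall back to longest match."""
    if word_tokenize is not None:
        return word_tokenize(text)
    return longest_match_segment(text, vocab)


toy_vocab = {"ฉัน", "กิน", "ข้าว", "ข้าวผัด"}
print(longest_match_segment("ฉันกินข้าวผัด", toy_vocab))
```

The greedy approach picks ข้าวผัด over the shorter ข้าว because it tries the longest candidate first. That same greediness is why longest-match can mis-segment ambiguous input, which is where a dedicated tokenizer helps.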

Verify the installation

from thaiphon import transcribe, list_schemes

# Check which schemes are available.
print(list_schemes())
# ('ipa', 'morev', 'tlc')

# Run a quick transcription.
print(transcribe("น้ำ", scheme="ipa"))
# /naːm˦˥/

System requirements

Requirement | Minimum
--- | ---
Python | 3.10
Operating system | Any (Windows / macOS / Linux)
Runtime dependencies | None
Optional (strongly recommended) | thaiphon-data-volubilis (lexicon; raises accuracy from ~57% to ~75%), pythainlp (sentence segmentation)
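If you are unsure whether your interpreter meets the Python minimum, a short standard-library check along these lines can confirm it before you install. The helper name `python_is_supported` is hypothetical; the `(3, 10)` threshold comes from the requirements above.

```python
import sys

MINIMUM_PYTHON = (3, 10)  # minimum version stated in the requirements


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM_PYTHON) -> bool:
    """Compare the (major, minor) pair against the required minimum."""
    return tuple(version_info[:2]) >= minimum


if python_is_supported():
    print("Python version is new enough for thaiphon.")
else:
    print("Python 3.10 or later is required; found", sys.version.split()[0])
```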

If you don't have Python yet

See Install without Python experience for a step-by-step walkthrough that starts from scratch — including downloading Python itself.

Next steps