Development guide

This guide covers setting up a development environment, running the tools, and the project's code conventions.


Prerequisites

  • Python 3.10 or newer
  • uv, the recommended package manager (fast installs, and it also manages Python versions)
  • git

Clone and install

git clone https://github.com/5w0rdf15h/thaiphon
cd thaiphon
make dev

make dev runs uv sync --extra dev, which installs thaiphon in editable mode along with the development dependencies (pytest, ruff, mypy).

If you prefer plain pip:

pip install -e ".[dev]"

Make targets

Target           Command                   Description
make dev         uv sync --extra dev       Install with dev dependencies
make test        pytest -q                 Run the test suite
make lint        ruff check src/ tests/    Lint with ruff
make typecheck   mypy src/thaiphon         Type-check with mypy
make check       test + lint + typecheck   All quality checks
make format      black + isort             Auto-format code

Running tests

# All tests, quiet output.
pytest -q

# All tests, verbose.
pytest -v

# A specific file.
pytest tests/test_api.py -v

# A specific test by keyword.
pytest -k "test_sara_am" -v

# All tests in a specific directory.
pytest tests/rules/ -v

Code style

thaiphon uses ruff for linting and black + isort for formatting. The configuration lives in pyproject.toml.

Key style conventions:

  • Line length: 100 characters.
  • Quotes: double quotes (black default).
  • Imports: isort-style, with from __future__ import annotations at the top of every module.
  • Type annotations: full annotations on all public functions; mypy --strict must pass.
  • Docstrings: brief module-level docstrings; per-function docstrings for public API only.

Run make format to auto-apply formatting, then make lint to check for remaining issues.


Type checking

make typecheck
# or
mypy src/thaiphon

mypy runs in strict mode (strict = true in pyproject.toml). All public API functions and data structures must be fully typed. Internal helper functions should be typed unless the typing is prohibitively complex.
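
To give a feel for what "fully typed" means for data structures, the following sketch shows a frozen dataclass and an enum with complete annotations. DemoTone, DemoSyllable, and describe are invented for this example and are assumptions, not thaiphon's actual model classes.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class DemoTone(Enum):
    """Hypothetical enum, for illustration only."""

    MID = "mid"
    LOW = "low"


@dataclass(frozen=True)
class DemoSyllable:
    """Hypothetical frozen dataclass; every field carries an annotation."""

    onset: str
    vowel: str
    coda: str | None = None
    tone: DemoTone = DemoTone.MID


def describe(syllable: DemoSyllable) -> str:
    # Strict mode requires handling the None case explicitly before
    # treating coda as a str.
    coda = syllable.coda if syllable.coda is not None else ""
    return f"{syllable.onset}{syllable.vowel}{coda} ({syllable.tone.value})"
```

Because the dataclass is frozen, assigning to a field after construction raises dataclasses.FrozenInstanceError.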


Project layout

thaiphon/
├── src/
│   └── thaiphon/
│       ├── api.py                 # Public API entry points
│       ├── registry.py            # Renderer registry
│       ├── errors.py              # Exception hierarchy
│       ├── model/                 # Frozen dataclasses + enums
│       ├── normalization/         # Unicode normalisation + expansion
│       ├── pipeline/              # PipelineRunner
│       ├── syllabification/       # Candidate generator + ranker
│       ├── tokenization/          # TCC tokenizer
│       ├── derivation/            # Phonological derivation steps
│       ├── tables/                # Static lookup tables
│       ├── lexicons/              # Lexicon dicts + frozensets
│       ├── renderers/             # Scheme implementations
│       └── segmentation/          # Word segmenter
├── tests/
│   ├── conftest.py                # Shared fixtures
│   ├── fixtures/                  # README on provenance policy
│   ├── rules/                     # Internal phonological-rule tests
│   └── test_*.py                  # Public API + renderer tests
├── pyproject.toml
├── Makefile
└── README.md

Commit style

  • Write commit messages in the imperative: "Fix tone for LC dead short syllable", not "Fixed" or "Fixing".
  • The first line should be a clear summary under 72 characters.
  • If the commit fixes a bug, mention the word or pattern affected.
  • No AI attribution in commit messages.
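
The first two rules above can be checked mechanically. The sketch below is not part of the project's tooling; check_summary is a hypothetical helper, and its test for the imperative mood is only a rough heuristic (it flags a first word ending in "ed" or "ing", which will misjudge words such as "Embed").

```python
from __future__ import annotations

MAX_SUMMARY_LENGTH = 72  # the summary must be shorter than this


def check_summary(message: str) -> list[str]:
    """Return a list of problems found in the first line of a commit message."""
    lines = message.splitlines()
    summary = lines[0].strip() if lines else ""
    if not summary:
        return ["summary line is empty"]

    problems: list[str] = []
    if len(summary) >= MAX_SUMMARY_LENGTH:
        problems.append(f"summary is {len(summary)} characters long")

    # Rough heuristic only: "Fixed" and "Fixing" are caught, "Fix" passes.
    first_word = summary.split()[0].lower()
    if first_word.endswith(("ed", "ing")):
        problems.append(f"first word {first_word!r} does not look imperative")
    return problems
```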

Adding a new feature

  1. Write a test first (or alongside). Place it in tests/ for API-level behaviour, or in tests/rules/ for a new phonological-rule regression.
  2. Implement the change in the appropriate subpackage.
  3. Run make check to ensure tests, linting, and type-checking all pass.
  4. Open a pull request. See Pull requests.
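
For step 1, a test file might be shaped roughly like the sketch below. normalize_text is a stand-in defined inside the example so that it runs on its own; it is an assumption, not thaiphon's API. In a real test you would import the function under test from the package instead.

```python
from __future__ import annotations

import unicodedata


def normalize_text(text: str) -> str:
    """Stand-in for the code under test (hypothetical, not thaiphon's API)."""
    return unicodedata.normalize("NFC", text.strip())


def test_normalize_strips_surrounding_whitespace() -> None:
    assert normalize_text("  กา  ") == "กา"


def test_normalize_is_idempotent() -> None:
    # Applying the function twice must give the same result as applying it once.
    once = normalize_text(" น้ำ ")
    assert normalize_text(once) == once
```

pytest collects any function whose name starts with test_, so a file like this needs no extra registration.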

Versioning

thaiphon follows semantic versioning. The version string is recorded in both pyproject.toml and src/thaiphon/__init__.py, and the two must be updated together. The current release is v0.2.0.