Development guide¶
This guide covers setting up a development environment, running the tools, and the project's code conventions.
Prerequisites¶
- Python 3.10 or newer
- uv, the recommended package manager (it installs packages quickly and manages Python versions)
- git
Clone and install¶
make dev runs uv sync --extra dev, which installs thaiphon in editable mode along with the development dependencies (pytest, ruff, mypy).
If you prefer plain pip, an editable install with the dev extra should give the same result:
pip install -e ".[dev]"
Make targets¶
| Target | Command | Description |
|---|---|---|
| make dev | uv sync --extra dev | Install with dev dependencies |
| make test | pytest -q | Run the test suite |
| make lint | ruff check src/ tests/ | Lint with ruff |
| make typecheck | mypy src/thaiphon | Type-check with mypy |
| make check | test + lint + typecheck | All quality checks |
| make format | black + isort | Auto-format code |
Running tests¶
# All tests, quiet output.
pytest -q
# All tests, verbose.
pytest -v
# A specific file.
pytest tests/test_api.py -v
# A specific test by keyword.
pytest -k "test_sara_am" -v
# All tests in a specific directory.
pytest tests/rules/ -v
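To illustrate how the -k keyword filter picks tests by name, here is a minimal sketch of a test module. The helper decompose_sara_am is a hypothetical stand-in defined inside the example. It is not part of thaiphon's API, and the real tests may be organised differently.

```python
"""Sketch of a test module; the helper below is a stand-in, not thaiphon's API."""

from __future__ import annotations

SARA_AM = "\u0e33"  # THAI CHARACTER SARA AM
NIKHAHIT = "\u0e4d"  # THAI CHARACTER NIKHAHIT
SARA_AA = "\u0e32"  # THAI CHARACTER SARA AA


def decompose_sara_am(text: str) -> str:
    """Hypothetical helper: split SARA AM into NIKHAHIT + SARA AA."""
    return text.replace(SARA_AM, NIKHAHIT + SARA_AA)


def test_sara_am_is_decomposed() -> None:
    # Selected by: pytest -k "test_sara_am" -v
    assert decompose_sara_am("\u0e19\u0e33") == "\u0e19\u0e4d\u0e32"


def test_plain_text_is_unchanged() -> None:
    # Skipped by the -k filter above, because the name does not match.
    assert decompose_sara_am("abc") == "abc"
```

With this file, pytest -k "test_sara_am" -v should run only the first test, provided the file name does not itself contain test_sara_am (the keyword is also matched against module names).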
Code style¶
thaiphon uses ruff for linting and black + isort for formatting. The configuration lives in pyproject.toml.
Key style conventions:
- Line length: 100 characters.
- Quotes: double quotes (black default).
- Imports: isort-style, with from __future__ import annotations at the top of every module.
- Type annotations: full annotations on all public functions; mypy --strict must pass.
- Docstrings: brief module-level docstrings; per-function docstrings for public API only.
Run make format to auto-apply formatting, then make lint to check for remaining issues.
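As a rough illustration of these conventions taken together, the sketch below shows a small module laid out the way the list describes. The module and its names (count_thai_chars, _is_thai) are invented for this example and do not exist in thaiphon.

```python
"""Thai-script helpers (illustrative module, not part of thaiphon)."""

from __future__ import annotations

from collections.abc import Iterable

THAI_BLOCK_START = 0x0E00
THAI_BLOCK_END = 0x0E7F


def count_thai_chars(words: Iterable[str]) -> int:
    """Return how many characters in ``words`` fall in the Thai Unicode block."""
    return sum(1 for word in words for char in word if _is_thai(char))


def _is_thai(char: str) -> bool:
    # Internal helper: annotated, but no docstring is required.
    return THAI_BLOCK_START <= ord(char) <= THAI_BLOCK_END
```

The public function carries a docstring and full annotations, while the private helper is only annotated. Every line stays under the 100-character limit and all strings use double quotes.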
Type checking¶
mypy runs in strict mode (strict = true in pyproject.toml). All public API functions and data structures must be fully typed. Internal helper functions should be typed unless the typing is prohibitively complex.
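A minimal sketch of code that should pass under strict mode is shown below. The Tone, Syllable, and find_by_tone names are hypothetical and chosen only for illustration; thaiphon's real model classes may look different.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Tone(Enum):
    MID = "mid"
    LOW = "low"
    FALLING = "falling"
    HIGH = "high"
    RISING = "rising"


@dataclass(frozen=True)
class Syllable:
    onset: str
    vowel: str
    tone: Tone


def find_by_tone(syllables: list[Syllable], tone: Tone) -> Syllable | None:
    """Return the first syllable with the given tone, or None."""
    for syllable in syllables:
        if syllable.tone is tone:
            return syllable
    return None
```

Strict mode rejects unannotated definitions and implicit Optional, so the possible None result is spelled out in the return type and callers have to handle it explicitly.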
Project layout¶
thaiphon/
├── src/
│   └── thaiphon/
│       ├── api.py              # Public API entry points
│       ├── registry.py         # Renderer registry
│       ├── errors.py           # Exception hierarchy
│       ├── model/              # Frozen dataclasses + enums
│       ├── normalization/      # Unicode normalisation + expansion
│       ├── pipeline/           # PipelineRunner
│       ├── syllabification/    # Candidate generator + ranker
│       ├── tokenization/       # TCC tokenizer
│       ├── derivation/         # Phonological derivation steps
│       ├── tables/             # Static lookup tables
│       ├── lexicons/           # Lexicon dicts + frozensets
│       ├── renderers/          # Scheme implementations
│       └── segmentation/       # Word segmenter
├── tests/
│   ├── conftest.py             # Shared fixtures
│   ├── fixtures/               # README on provenance policy
│   ├── rules/                  # Internal phonological-rule tests
│   └── test_*.py               # Public API + renderer tests
├── pyproject.toml
├── Makefile
└── README.md
Commit style¶
- Write commit messages in the imperative: "Fix tone for LC dead short syllable", not "Fixed" or "Fixing".
- The first line should be a clear summary under 72 characters.
- If the commit fixes a bug, mention the word or pattern affected.
- No AI attribution in commit messages.
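The first two rules are mechanical enough to check automatically. The sketch below is one possible checker, not something the project ships: the function name and the short list of non-imperative first words are assumptions made for this example, and such a heuristic will miss many cases.

```python
from __future__ import annotations

MAX_SUMMARY_LENGTH = 72
# Assumed heuristic: flag a few common non-imperative first words.
NON_IMPERATIVE_PREFIXES = ("Fixed", "Fixing", "Added", "Adding")


def check_commit_message(message: str) -> list[str]:
    """Return a list of problems found in a commit message (empty if none)."""
    problems: list[str] = []
    lines = message.splitlines()
    summary = lines[0] if lines else ""
    if not summary.strip():
        problems.append("summary line is empty")
    # "Under 72 characters" means a length of 71 or less.
    if len(summary) >= MAX_SUMMARY_LENGTH:
        problems.append(
            f"summary is {len(summary)} characters; keep it under {MAX_SUMMARY_LENGTH}"
        )
    first_word = summary.split(" ", 1)[0]
    if first_word in NON_IMPERATIVE_PREFIXES:
        problems.append(f"summary starts with {first_word!r}; use the imperative mood")
    return problems
```

Calling check_commit_message("Fix tone for LC dead short syllable") returns an empty list, while the same message starting with "Fixed" yields one problem.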
Adding a new feature¶
- Write a test first (or alongside). Place it in tests/ for API-level behaviour, or in tests/rules/ for a new phonological-rule regression.
- Implement the change in the appropriate subpackage.
- Run make check to ensure tests, linting, and type-checking all pass.
- Open a pull request. See Pull requests.
Versioning¶
thaiphon follows semantic versioning. The version is in pyproject.toml and src/thaiphon/__init__.py. Both must be updated together. The current release is v0.2.0.
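Because the version lives in two files, a small consistency check can catch a release where only one was bumped. The sketch below assumes the version appears as version = "X.Y.Z" in pyproject.toml and as __version__ = "X.Y.Z" in src/thaiphon/__init__.py. Both formats are assumptions, and the function names are invented for this example.

```python
from __future__ import annotations

import re

# Assumed formats: `version = "X.Y.Z"` in pyproject.toml and
# `__version__ = "X.Y.Z"` in src/thaiphon/__init__.py.
PYPROJECT_VERSION_RE = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)
INIT_VERSION_RE = re.compile(r'^__version__\s*=\s*"([^"]+)"', re.MULTILINE)


def extract_version(pattern: re.Pattern[str], text: str) -> str | None:
    """Return the first version string matched by ``pattern``, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


def versions_agree(pyproject_text: str, init_text: str) -> bool:
    """Return True when both files declare the same, non-missing version."""
    pyproject_version = extract_version(PYPROJECT_VERSION_RE, pyproject_text)
    init_version = extract_version(INIT_VERSION_RE, init_text)
    return pyproject_version is not None and pyproject_version == init_version
```

To use it, read both files (for example with pathlib.Path.read_text) and pass their contents to versions_agree. The regular expression is deliberately naive and takes the first version key it finds, so a real TOML parser such as tomllib on Python 3.11+ would be more robust.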