Types

Public types and data structures in the thaiphon API.


AnalysisResult

from thaiphon.model.candidate import AnalysisResult

Returned by analyze() and analyze_word().

@dataclass(frozen=True, slots=True)
class AnalysisResult:
    best: PhonologicalWord
    alternatives: tuple[PhonologicalWord, ...] = ()
    source: str = "derivation"
    raw: str = ""
    loan_analysis: LoanAnalysis | None = None

| Field | Type | Description |
| --- | --- | --- |
| best | PhonologicalWord | Top-ranked phonological word |
| alternatives | tuple[PhonologicalWord, ...] | Lower-ranked candidates (often empty) |
| source | str | "lexicon", "derivation", or "derivation+lexicon" |
| raw | str | Normalised input string |
| loan_analysis | LoanAnalysis \| None | Foreignness detector result (observational) |

PhonologicalWord

from thaiphon.model.word import PhonologicalWord

An immutable container for a word's syllables, in order, plus metadata such as confidence and source.

@dataclass(frozen=True, slots=True)
class PhonologicalWord:
    syllables: tuple[Syllable, ...]
    morpheme_boundaries: tuple[int, ...] = ()
    confidence: float = 1.0
    source: str = "derivation"
    raw: str = ""

| Field | Description |
| --- | --- |
| syllables | The word's syllables, in order |
| morpheme_boundaries | Syllable indices where morpheme breaks occur (may be empty) |
| confidence | Syllabification confidence (1.0 = lexicon; < 1.0 = ranked derivation) |
| source | "lexicon", "derivation", or "derivation+lexicon" |
| raw | Original Thai orthographic string |

Supports len() and iteration:

word = analyze("กรุงเทพ").best
len(word)          # 2
list(word)         # [Syllable(...), Syllable(...)]
word.syllables[0]  # Syllable for กรุง

Syllable

from thaiphon.model.syllable import Syllable

One syllable in the phonological representation.

@dataclass(frozen=True, slots=True)
class Syllable:
    onset: Phoneme | Cluster | None
    vowel: Phoneme
    vowel_length: VowelLength
    coda: Phoneme | None
    tone: Tone
    tone_mark: ToneMark = ToneMark.NONE
    effective_class: EffectiveClass = EffectiveClass.MID
    syllable_type: SyllableType = SyllableType.LIVE
    raw: str = ""
    inserted_vowel: bool = False
    notes: tuple[str, ...] = ()

| Field | Type | Description |
| --- | --- | --- |
| onset | Phoneme \| Cluster \| None | Initial consonant(s) |
| vowel | Phoneme | Nucleus vowel (IPA symbol) |
| vowel_length | VowelLength | SHORT or LONG |
| coda | Phoneme \| None | Final consonant, or None for open syllables |
| tone | Tone | Derived tone |
| tone_mark | ToneMark | Written tone mark (NONE if absent) |
| effective_class | EffectiveClass | Class used for tone lookup |
| syllable_type | SyllableType | LIVE or DEAD |
| raw | str | Orthographic slice for this syllable |
| inserted_vowel | bool | True when an inherent vowel was inserted |

Phoneme

from thaiphon.model.phoneme import Phoneme

A single IPA phoneme.

@dataclass(frozen=True, slots=True)
class Phoneme:
    symbol: str
    is_aspirated: bool = False
    is_sonorant: bool = False

| Field | Description |
| --- | --- |
| symbol | IPA symbol (e.g. "kʰ", "aː", "m", "p̚") |
| is_aspirated | True for aspirated stop onsets |
| is_sonorant | True for sonorants (/m n ŋ j w r l/) |

Cluster

from thaiphon.model.phoneme import Cluster

A two-phoneme onset cluster.

@dataclass(frozen=True, slots=True)
class Cluster:
    first: Phoneme
    second: Phoneme

Example — the onset of ปลา:

result = analyze("ปลา")
onset = result.best.syllables[0].onset
isinstance(onset, Cluster)    # True
onset.first.symbol            # 'p'
onset.second.symbol           # 'l'

Enumerations

All enums subclass str, so each member compares equal to its string value: Tone.MID == "MID" is True.

Tone

from thaiphon.model.enums import Tone

class Tone(str, Enum):
    MID     = "MID"
    LOW     = "LOW"
    FALLING = "FALLING"
    HIGH    = "HIGH"
    RISING  = "RISING"
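
The str-enum behaviour can be checked with a local copy of Tone, as in the short sketch below. It relies only on standard-library enum and json behaviour: members compare equal to plain strings, can be looked up by value, and serialise as ordinary JSON strings.

```python
import json
from enum import Enum


# Local copy of the documented definition, for illustration only.
class Tone(str, Enum):
    MID = "MID"
    LOW = "LOW"
    FALLING = "FALLING"
    HIGH = "HIGH"
    RISING = "RISING"


# Members compare equal to their string values.
matches_plain_string = Tone.MID == "MID"

# Calling the enum with a value returns the matching member.
parsed = Tone("FALLING")

# Because members are str instances, json.dumps needs no custom encoder.
payload = json.dumps({"tone": Tone.RISING})
```

Use .value when a plain string is needed for display, since str() and format() on mixed-in enums have varied between Python versions.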

VowelLength

from thaiphon.model.enums import VowelLength

class VowelLength(str, Enum):
    SHORT = "SHORT"
    LONG  = "LONG"

SyllableType

from thaiphon.model.enums import SyllableType

class SyllableType(str, Enum):
    LIVE = "LIVE"
    DEAD = "DEAD"

ToneMark

from thaiphon.model.enums import ToneMark

class ToneMark(str, Enum):
    NONE         = "NONE"
    MAI_EK       = "MAI_EK"       # ◌่
    MAI_THO      = "MAI_THO"      # ◌้
    MAI_TRI      = "MAI_TRI"      # ◌๊
    MAI_JATTAWA  = "MAI_JATTAWA"  # ◌๋

EffectiveClass

from thaiphon.model.enums import EffectiveClass

class EffectiveClass(str, Enum):
    HIGH = "HIGH"
    MID  = "MID"
    LOW  = "LOW"
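
To show how these enums fit together, here is a sketch of the standard textbook Thai tone rules, keyed on effective class, syllable type, tone mark, and vowel length. It is an illustration under that assumption, not thaiphon's internal tone matrix; textbook_tone and _MARKED are hypothetical names, and the enums are re-declared locally.

```python
from enum import Enum


class Tone(str, Enum):
    MID = "MID"
    LOW = "LOW"
    FALLING = "FALLING"
    HIGH = "HIGH"
    RISING = "RISING"


class ToneMark(str, Enum):
    NONE = "NONE"
    MAI_EK = "MAI_EK"
    MAI_THO = "MAI_THO"
    MAI_TRI = "MAI_TRI"
    MAI_JATTAWA = "MAI_JATTAWA"


class EffectiveClass(str, Enum):
    HIGH = "HIGH"
    MID = "MID"
    LOW = "LOW"


class SyllableType(str, Enum):
    LIVE = "LIVE"
    DEAD = "DEAD"


class VowelLength(str, Enum):
    SHORT = "SHORT"
    LONG = "LONG"


# Textbook outcomes for syllables that carry a written tone mark.
_MARKED = {
    (EffectiveClass.MID, ToneMark.MAI_EK): Tone.LOW,
    (EffectiveClass.MID, ToneMark.MAI_THO): Tone.FALLING,
    (EffectiveClass.MID, ToneMark.MAI_TRI): Tone.HIGH,
    (EffectiveClass.MID, ToneMark.MAI_JATTAWA): Tone.RISING,
    (EffectiveClass.HIGH, ToneMark.MAI_EK): Tone.LOW,
    (EffectiveClass.HIGH, ToneMark.MAI_THO): Tone.FALLING,
    (EffectiveClass.LOW, ToneMark.MAI_EK): Tone.FALLING,
    (EffectiveClass.LOW, ToneMark.MAI_THO): Tone.HIGH,
}


def textbook_tone(cls, syllable_type, mark=ToneMark.NONE,
                  vowel_length=VowelLength.LONG):
    """Hypothetical lookup; raises KeyError for non-standard mark combinations."""
    if mark is not ToneMark.NONE:
        return _MARKED[(cls, mark)]
    if syllable_type is SyllableType.LIVE:
        return Tone.RISING if cls is EffectiveClass.HIGH else Tone.MID
    # Unmarked dead syllables: only low class depends on vowel length.
    if cls is EffectiveClass.LOW:
        return Tone.HIGH if vowel_length is VowelLength.SHORT else Tone.FALLING
    return Tone.LOW


# A combination with no table entry is rejected rather than guessed.
try:
    textbook_tone(EffectiveClass.HIGH, SyllableType.LIVE, ToneMark.MAI_TRI)
    non_standard_rejected = False
except KeyError:
    non_standard_rejected = True
```

The missing-entry case is the kind of situation the Errors table describes for DerivationError; this sketch simply lets the KeyError propagate.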

ConsonantClass

from thaiphon.model.enums import ConsonantClass

class ConsonantClass(str, Enum):
    HIGH         = "HIGH"
    MID          = "MID"
    LOW_PAIRED   = "LOW_PAIRED"    # has a high-class counterpart
    LOW_SONORANT = "LOW_SONORANT"  # no high-class counterpart

SchemeMapping

from thaiphon.renderers.mapping import SchemeMapping

See the "Write your own scheme" guide for the full field reference.


Errors

from thaiphon.errors import (
    ThaiphonError,
    ParseError,
    NormalizationError,
    DerivationError,
    UnsupportedSchemeError,
    AmbiguousAnalysisError,
)

| Exception | When raised |
| --- | --- |
| ThaiphonError | Base class for all thaiphon exceptions |
| ParseError | Tokenization or orthography rejects input |
| NormalizationError | A combining mark appears at string start without a base character |
| DerivationError | A tone matrix lookup has no entry (non-standard mark combination) |
| UnsupportedSchemeError | The requested scheme is not registered |
| AmbiguousAnalysisError | Strict mode: multiple candidates tie after ranking |
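
Because every exception derives from ThaiphonError, callers can handle specific failures first and fall back to the base class. The sketch below illustrates that ordering with locally declared stand-in exception classes; run_safely and fake_analyze are hypothetical, and fake_analyze only imitates a failing call so that the example runs without thaiphon.

```python
# Local stand-ins mirroring the documented hierarchy (not the real classes).
class ThaiphonError(Exception): ...
class ParseError(ThaiphonError): ...
class NormalizationError(ThaiphonError): ...
class DerivationError(ThaiphonError): ...
class UnsupportedSchemeError(ThaiphonError): ...
class AmbiguousAnalysisError(ThaiphonError): ...


def run_safely(analyze_fn, text):
    """Hypothetical wrapper: return (result, None) or (None, error message)."""
    try:
        return analyze_fn(text), None
    except (ParseError, NormalizationError) as exc:
        # Problems with the input string itself.
        return None, f"bad input: {exc}"
    except AmbiguousAnalysisError as exc:
        return None, f"ambiguous: {exc}"
    except ThaiphonError as exc:
        # Catch-all for the remaining library errors; keep this clause last.
        return None, f"thaiphon error: {exc}"


def fake_analyze(text):
    """Imitation analyzer used only to trigger the handlers above."""
    if not text:
        raise ParseError("empty input")
    if text == "?":
        raise DerivationError("no tone entry")
    return f"analysis of {text}"


ok = run_safely(fake_analyze, "มา")
bad_input = run_safely(fake_analyze, "")
other = run_safely(fake_analyze, "?")
```

Exceptions that are not ThaiphonError subclasses are deliberately left uncaught, so unrelated bugs are not reported as analysis failures.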