
Phonotactic Legality for Brazilian Portuguese

or, PhonoLeg-BR. A tripartite phonotactic legality scorer combining orthographic familiarity, TINA neural repair/fragility, and G2P consistency, with dynamic reliability-weighted fusion.

No uploaded files. Single-word scoring requests are transient.

Score a Word

Scores are reported out of 100.

High ≥ 65

The phoneme sequence is familiar to the phonological grammar of Brazilian Portuguese. The model decodes with high confidence at every position. Suitable candidate for psycholinguistic stimuli or neologisms.

Medium 40–64

Marginally licit sequence. May contain atypical consonant clusters, infrequent vowel patterns, or unusual syllable structures. Useful for studying the gradient zone of BP phonotactics.

Low < 40

The sequence violates core phonotactic patterns of BP. High internal entropy indicates no familiar phonological mapping was activated. Likely perceived as a non-word by native speakers.
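
The three bands above can be expressed as a small lookup. This is only an illustrative sketch, not part of PhonoLeg-BR itself: the function name `score_band` is hypothetical, the 0 to 100 range check is an assumption based on the "/ 100" scale, and fractional scores between 64 and 65 are assumed to fall in the Medium band.

```python
def score_band(score: float) -> str:
    """Map a legality score to the band names listed above.

    Assumptions (not confirmed by the tool): scores lie in [0, 100],
    and any score below 65 but at least 40 counts as Medium.
    """
    if not 0 <= score <= 100:
        raise ValueError("score is expected to lie between 0 and 100")
    if score >= 65:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"
```

For example, `score_band(72)` returns `"High"` and `score_band(39.9)` returns `"Low"`.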

Technical Note

PhonoLeg-BR scores how phonotactically native a string sounds in Brazilian Portuguese. The key insight is that phonotactically well-formed words are easy for TINA, a neural archiphonemic transcriber, to process: the decoder is stable, low-entropy, and consistent across multiple stochastic passes. Ill-formed strings force TINA to work harder, producing fragile, high-repair outputs. This inference difficulty is the core legality signal.
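
To make "low-entropy" and "consistent across multiple stochastic passes" concrete, here is a minimal sketch of two generic measurements one could take from any stochastic decoder. It does not reproduce TINA's actual signal: the function names, the input shapes (one probability distribution per output position, one transcription string per pass), and the choice of Shannon entropy in bits are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Sequence


def mean_entropy(distributions: Sequence[Sequence[float]]) -> float:
    """Average Shannon entropy (in bits) over decoder positions.

    Each inner sequence is assumed to be a probability distribution
    over output symbols at one position. Lower means more confident.
    """
    total = 0.0
    for dist in distributions:
        total += -sum(p * math.log2(p) for p in dist if p > 0)
    return total / len(distributions)


def pass_agreement(transcriptions: Sequence[str]) -> float:
    """Fraction of stochastic passes that produced the most common output."""
    counts = Counter(transcriptions)
    return counts.most_common(1)[0][1] / len(transcriptions)
```

Under these assumptions, a well-formed word would tend to show a mean entropy near zero and an agreement near 1.0, while an ill-formed string would show higher entropy and more disagreement between passes.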

Three layers are combined: the TINA repair/fragility signal (dominant, prior weight 0.62), a G2P consistency layer that cross-checks TINA's transcription against an independent grapheme-to-phoneme model (prior weight 0.24), and an orthographic familiarity layer that scores the written string's letter patterns before TINA runs (prior weight 0.14). These weights are priors, not fixed values: each layer's influence is adjusted per input according to its estimated reliability. An agreement bonus rewards convergence among the layers; a conflict penalty discounts divergence. A positional cap prevents the auxiliary layers from overriding a clearly low TINA score. When the optional calibration model is active, a three-component neural corrector (illegality gate → region classifier → bounded residual) refines the raw score.
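
The reliability-weighted fusion described above might look roughly like the following sketch. Only the three prior weights come from this note. Everything else is an assumption for illustration: the function name `fuse`, the rule that each effective weight is the prior multiplied by a reliability in [0, 1] and then renormalized, and the parameters `low_tina` and `cap_margin`, which stand in for the positional cap with a much simpler rule. The agreement bonus, conflict penalty, and calibration model are left out.

```python
from typing import Mapping

# Prior weights as stated in the technical note.
PRIOR_WEIGHTS = {"tina": 0.62, "g2p": 0.24, "ortho": 0.14}


def fuse(
    scores: Mapping[str, float],
    reliability: Mapping[str, float],
    low_tina: float = 40.0,    # hypothetical threshold for "clearly low"
    cap_margin: float = 10.0,  # hypothetical headroom above the TINA score
) -> float:
    """Combine per-layer scores (0-100) with reliability-adjusted weights."""
    # Assumed rule: effective weight = prior weight * estimated reliability.
    raw = {name: PRIOR_WEIGHTS[name] * reliability[name] for name in PRIOR_WEIGHTS}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("at least one layer must have non-zero reliability")
    fused = sum(raw[name] / total * scores[name] for name in raw)
    # Simplified cap: auxiliary layers cannot lift a clearly low TINA score
    # by more than cap_margin points.
    if scores["tina"] < low_tina:
        fused = min(fused, scores["tina"] + cap_margin)
    return fused
```

With all reliabilities set to 1.0, the effective weights equal the priors. If the TINA score is 20 and both auxiliary scores are 90, the weighted mean would be 46.6, but the simplified cap in this sketch holds the result at 30.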

This tool is still under active development and should be used with caution.

How to cite

PhonoLeg-BR is part of the registered TINA software; cite the shared TINA registration.

APA

Barroso, A. M. (2026). TINA (Transcrição Inteligente, Notação Automática) (Registration No. BR5120260028949) [Computer software]. Instituto Nacional da Propriedade Industrial.

ABNT

BARROSO, A. M. TINA (Transcrição Inteligente, Notação Automática). 2026. Patente: Programa de computador. Registro n. BR5120260028949, registrado em 09 abr. 2026. Instituto Nacional da Propriedade Industrial (INPI).