UR-BERT: New Text Encoder Scales Text-to-Speech to 495 Languages Using Universal Romanization
Researchers have developed UR-BERT, a text encoder that lets text-to-speech (TTS) systems cover 495 languages by converting diverse writing systems into a shared Romanized representation. Traditional grapheme-to-phoneme approaches are limited to around 100 languages because reliable linguistic resources are scarce for most of the rest. The work could significantly expand access to synthetic speech technology across far more of the world's languages.
UR-BERT addresses a fundamental limitation in multilingual text-to-speech technology by replacing conventional grapheme-to-phoneme (G2P) approaches with a universal Romanization strategy that unifies writing systems across languages. The system incorporates a speech token prediction objective during training, which helps the encoder learn phonetically accurate representations efficiently without requiring extensive labeled data for each language. Experimental results demonstrate that TTS systems built on UR-BERT consistently outperform existing text encoder baselines across diverse languages and resource conditions, including strong generalization to previously unseen languages. The approach scales from the typical 100-language ceiling of G2P methods to 495 languages, representing a nearly five-fold expansion in coverage. This work was accepted to Interspeech 2026, a major conference in speech processing research.
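The core idea of a shared Romanized representation can be illustrated with a toy sketch. This is not UR-BERT's romanizer, which the summary does not describe in detail; the mapping table `TOY_TABLE` and the function `toy_romanize` are hypothetical names invented here, and the table covers only a handful of Cyrillic and Greek letters. The sketch only shows how text in different scripts can end up in one small Latin character inventory that a single encoder vocabulary could cover.

```python
import unicodedata

# Hypothetical toy table (an assumption for illustration): a few Cyrillic and
# Greek letters mapped to Latin. A real universal romanizer covers far more.
TOY_TABLE = {
    "п": "p", "р": "r", "и": "i", "в": "v", "е": "e", "т": "t",
    "γ": "g", "ε": "e", "ι": "i", "α": "a",
}


def toy_romanize(text: str) -> str:
    """Map text to lowercase Latin letters using the toy table and diacritic stripping."""
    out = []
    # NFKD normalization splits accented letters into a base letter plus combining marks.
    for ch in unicodedata.normalize("NFKD", text.lower()):
        if unicodedata.combining(ch):
            continue  # drop the mark; note that this discards accent and tone information
        out.append(TOY_TABLE.get(ch, ch))
    return "".join(out)


for sample in ["привет", "γειά", "Tiếng Việt"]:
    print(sample, "->", toy_romanize(sample))
```

Russian, Greek, and Vietnamese inputs all come out in the same lowercase Latin alphabet. The Vietnamese example also loses its tone marks, which is the kind of information loss a Romanized representation can incur.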
What's missing
The paper does not discuss potential limitations of Romanization as a universal representation (e.g., loss of tonal or diacritical information critical to certain languages), computational costs or inference speed compared to baselines, or specific performance metrics (e.g., mean opinion scores for speech quality) that would quantify the claimed improvements.
What different sources said
- arXiv cs.CL (Center)
UR-BERT: Scaling Text Encoders for Massively Multilingual TTS Through Universal Romanization and Speech Token Prediction