New Multilingual Word-Level Forced Alignment Method Outperforms Existing Approaches
Researchers have developed a new method for word-level forced alignment in speech that combines representations from the Massively Multilingual Speech model and a self-supervised phoneme boundary detector. The approach uses a learned dynamic programming decoder and was trained on TIMIT and Buckeye datasets. The method shows potential to scale across 1100+ languages without requiring additional training.
A new multilingual word-level forced alignment method has been presented that integrates two key representations: one from the Massively Multilingual Speech (MMS) model and another from a self-supervised phoneme boundary detector called UnSupSeg. The system consists of an alignment encoder that learns to fuse these representations and estimate word-boundary probabilities over long temporal contexts, combined with a learned dynamic programming decoder that infers final word boundaries. When tested on TIMIT and Buckeye datasets, the proposed approach outperformed existing methods including Montreal Forced Aligner (MFA) and MMS-based alignment. On unseen languages including Dutch, German, and Hebrew, the model achieved performance consistently better than or comparable to existing alignment approaches, suggesting it could scale effectively to the 1100+ languages supported by MMS without requiring additional training.
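The decoding step described above, turning per-frame word-boundary probabilities into a final set of boundaries, can be illustrated with a small dynamic programming sketch. This is not the paper's decoder, which is learned; it is a fixed-score, Viterbi-style stand-in that picks a known number of boundary frames maximizing the summed log-probability while keeping boundaries a minimum distance apart. The function name `decode_boundaries`, the `min_gap` parameter, and the probability values are all hypothetical and chosen only for illustration.

```python
import math


def decode_boundaries(probs, num_boundaries, min_gap=1):
    """Pick `num_boundaries` frame indices that maximise the summed
    log-probability, with consecutive picks at least `min_gap` frames apart.

    Illustrative only: a fixed-score DP, not the learned decoder from the paper.
    """
    n = len(probs)
    if num_boundaries == 0:
        return []
    if n == 0:
        raise ValueError("no frames to place boundaries on")

    neg_inf = float("-inf")
    # Clamp to avoid log(0) for frames with zero probability.
    logp = [math.log(max(p, 1e-12)) for p in probs]

    # best[k][t]: best score with the k-th boundary (1-based) at frame t.
    best = [[neg_inf] * n for _ in range(num_boundaries + 1)]
    back = [[-1] * n for _ in range(num_boundaries + 1)]
    for t in range(n):
        best[1][t] = logp[t]

    for k in range(2, num_boundaries + 1):
        # Running maximum of best[k-1][s] over all s <= t - min_gap.
        run_best, run_arg = neg_inf, -1
        for t in range(n):
            s = t - min_gap
            if s >= 0 and best[k - 1][s] > run_best:
                run_best, run_arg = best[k - 1][s], s
            if run_arg >= 0:
                best[k][t] = run_best + logp[t]
                back[k][t] = run_arg

    # Choose the best position for the last boundary, then backtrack.
    t = max(range(n), key=lambda i: best[num_boundaries][i])
    if best[num_boundaries][t] == neg_inf:
        raise ValueError("cannot fit that many boundaries with this min_gap")
    path = []
    for k in range(num_boundaries, 0, -1):
        path.append(t)
        t = back[k][t]
    return path[::-1]


if __name__ == "__main__":
    # Made-up boundary probabilities for eight frames.
    frame_probs = [0.1, 0.9, 0.2, 0.1, 0.8, 0.3, 0.1, 0.7]
    print(decode_boundaries(frame_probs, num_boundaries=3, min_gap=2))
```

In forced alignment the transcript is known, so the number of word boundaries is fixed in advance; that is why the sketch takes `num_boundaries` as an input rather than thresholding the probabilities.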
What different sources said
- arXiv cs.CL (Center)
Multilingual Word-Level Forced Alignment with Self-Supervised Representations and Learned Dynamic Programming
Related

Chinese EV Makers BYD and Xpeng Accelerate Humanoid Robot Development to Compete with Tesla
Chinese electric vehicle manufacturers including BYD and Xpeng are expanding beyond automobiles to develop and commercialize humanoid robots, viewing AI advances as a path to a new market. This represents a strategic shift for major EV makers who have traditionally focused on electric cars and autonomous driving technology. The move signals intensifying competition in robotics as Chinese firms seek to diversify revenue streams and compete globally in emerging AI-driven sectors.
Bill Gates warns tech giants that data center expansion cannot raise household power costs
Bill Gates told major tech companies on CNBC that they lack permission to increase residential electricity bills through data center construction, despite the economic and competitive pressures driving expansion. The warning comes as 48 data center projects worth $156 billion were blocked or stalled in 2025, and public opposition has reached unprecedented levels with 70% of Americans opposing data centers near their homes. Gates's message underscores that tech companies must secure genuine community support and absorb infrastructure costs themselves, not pass them to ratepayers.

Major Delhi Data Centre Fire Destroys Equipment Worth Hundreds of Crores, Disrupts Internet Services
A fire broke out on the third floor of ST Telemedia GDC's data centre facility in Delhi's Greater Kailash on June 5, 2026, destroying equipment and causing significant service disruptions for Google, Netflix, and multiple local internet service providers. The fire, categorized as a massive blaze, started in the battery room and was extinguished after several hours, with two firefighters injured but no loss of life reported. The incident highlights vulnerabilities in data centre fire safety protocols and raises questions about whether inert gas suppression systems were adequately stocked.