TellWell
Publications, 3d ago. Confidence 88%: the share of independent, credible sources corroborating the core facts.

Swivuriso: New 3000-Hour Multilingual Speech Dataset for South African Languages

Center 100%
1 source

Researchers have released Swivuriso, a 3000-hour multilingual speech dataset covering seven South African languages, to advance the development of automatic speech recognition (ASR) technology. Created as part of the African Next Voices project, the dataset includes content from the agriculture, healthcare, and general domains and addresses significant gaps in existing ASR resources. The work matters for improving speech recognition in underrepresented African languages and for supporting technology access across diverse linguistic communities.

Swivuriso is a newly introduced multilingual speech dataset containing 3000 hours of audio, designed to support the development and benchmarking of automatic speech recognition technologies across seven South African languages. Developed by Vukosi Marivate and colleagues as part of the African Next Voices project, the dataset spans the agriculture, healthcare, and general domains. The paper describes the design principles, ethical considerations, and data collection procedures that guided the dataset's creation. The authors present baseline results from training and fine-tuning ASR models on this data and compare it against existing ASR datasets for the same languages. The work addresses a critical gap in speech recognition resources for South African languages, which have historically been underrepresented in machine learning datasets.

What's missing

The paper is marked as 'Work in Progress', with the most recent update in June 2026. The abstract does not name the seven South African languages, give the distribution of hours across domains, or report the baseline model performance metrics.

What different sources said

  • Swivuriso: The South African Next Voices Multilingual Speech Dataset
