TellWell
Tech | 2h ago | Confidence 75%: the share of independent, credible sources corroborating the core facts.

Anthropic Releases Guardrailed Version of Mythos Model for Public Use

1 source

Anthropic launched Fable 5, a publicly available version of its powerful Mythos model with safety guardrails that restrict answers on cybersecurity and biology topics. The company says extensive testing with hackers produced no successful jailbreaks, though the underlying model retains dangerous capabilities that only the safeguards keep out of reach. The release raises the question of whether those guardrails can withstand determined adversaries trying to unlock the model's unrestricted capabilities.

Anthropic released Fable 5 on Tuesday, a guardrailed version of its Mythos model, which has not been released to the general public. Fable 5 includes safeguards that prevent it from answering questions about cybersecurity and biology, the areas where Anthropic determined that unrestricted access to Mythos posed too great a risk. According to the company, hackers tested Fable 5 extensively in attempts to bypass its safeguards, and none reportedly succeeded. Anthropic acknowledged that without these guardrails, Fable 5 would be exceptionally capable at finding and exploiting software vulnerabilities, potentially lowering the cost of cyberattacks. At the same time, the company upgraded Mythos to version 5 for select existing customers, claiming it now has "the strongest cybersecurity capabilities of any model in the world." Early customer testing indicated that Fable 5 significantly reduced software publication timelines and performed well on reasoning tasks.

What's missing

The article does not specify what types of jailbreak attempts were tested, how many hackers took part or how experienced they were, or whether anyone independently verified the guardrails' effectiveness. It also gives no information about regulatory oversight or industry standards for evaluating AI safety measures.

What different sources said

  • Semafor (Center)

    Anthropic releases guardrailed version of Mythos for public use
