Publications · Jun 11 · 83% confidence

New Self-Supervised Method Enables AI Agents to Improve Without External Validation Data

Center 100%

1 source

Researchers have introduced Retrospective Harness Optimization (RHO), a self-supervised technique that allows AI agents to improve their own tools and workflows using only past task trajectories, without requiring human-labeled validation data. The method picks challenging past tasks, re-solves them in parallel, and uses the agent's own self-consistency and pairwise preference judgments to choose the best harness updates. In testing, a single optimization round raised the pass rate on the SWE-Bench Pro software engineering benchmark from 59% to 78%, a gain of 19 percentage points, which suggests practical value for deploying adaptive AI agents.

Retrospective Harness Optimization (RHO) addresses a core bottleneck in deploying AI agents: the difficulty of obtaining the ground-truth labeled data that most existing optimization methods need. Instead of relying on external validation, RHO mines the agent's own historical trajectories to identify a diverse coreset of challenging tasks, re-solves them in parallel, and uses self-validation and self-consistency checks to evaluate candidate harness updates. The agent then selects the best update through pairwise self-preference, so no human grading is required at any stage. The approach was evaluated across three domains (software engineering, technical work, and knowledge work), which the summary presents as evidence of broad applicability. On SWE-Bench Pro, a single RHO round improved the pass rate from 59% to 78%. Analysis showed that RHO specifically targets prior failure modes, altering the agent's behavioral patterns and sustaining higher accuracy over long-horizon task sessions. The work is accompanied by released code and a project website.
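The loop described above can be illustrated with a minimal Python sketch. This is not the authors' released code. Every function name, the task dictionary layout (`task_id`, `category`, `answers`), and the use of task category as a diversity proxy are assumptions made for illustration. The `solve` and `prefer` callables are stand-ins for the agent itself.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def self_consistency(answers):
    """Fraction of sampled answers that agree with the most common answer."""
    if not answers:
        return 0.0
    top_count = Counter(answers).most_common(1)[0][1]
    return top_count / len(answers)


def select_coreset(history, k):
    """Pick up to k hard tasks, treating low self-consistency as 'challenging'.

    Assumption: 'category' is a crude proxy for diversity.
    """
    ranked = sorted(history, key=lambda t: self_consistency(t["answers"]))
    chosen, seen = [], set()
    # First pass: the hardest task from each category not yet covered.
    for task in ranked:
        if len(chosen) < k and task["category"] not in seen:
            chosen.append(task)
            seen.add(task["category"])
    # Second pass: fill any remaining slots with the hardest leftovers.
    for task in ranked:
        if len(chosen) < k and task not in chosen:
            chosen.append(task)
    return chosen


def resolve_in_parallel(tasks, solve, harness, n_samples=3):
    """Re-solve each task n_samples times; returns {task_id: [answers]}.

    `solve(task, harness)` is a hypothetical stand-in for running the agent.
    """
    jobs = [(task, i) for task in tasks for i in range(n_samples)]
    with ThreadPoolExecutor() as pool:
        answers = list(pool.map(lambda job: solve(job[0], harness), jobs))
    results = {}
    for (task, _), answer in zip(jobs, answers):
        results.setdefault(task["task_id"], []).append(answer)
    return results


def pick_by_pairwise_preference(candidates, prefer):
    """Return the candidate with the most pairwise wins.

    `prefer(a, b)` is a hypothetical stand-in for the agent's own judgment
    and must return either a or b. Candidates must be hashable.
    """
    wins = Counter({c: 0 for c in candidates})
    for a, b in combinations(candidates, 2):
        wins[prefer(a, b)] += 1
    return max(candidates, key=lambda c: wins[c])
```

In a real deployment, `prefer` would be a call to the agent comparing two candidate harness updates, so the selection inherits whatever blind spots the model has.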

What's missing

The paper does not report results from multiple independent optimization rounds, leaving open whether RHO's gains compound, plateau, or degrade over successive iterations. It is also unclear how sensitive the method is to the quality and diversity of the initial trajectory history, which may be limited in early deployment. The self-preference mechanism relies on the agent's own judgment, raising the question of whether systematic blind spots in the model could cause it to consistently prefer suboptimal updates. The computational cost of re-solving coresets in parallel is also not discussed in the abstract.

What different sources said

arXiv cs.AI · Center
Evolving Agents in the Dark: Retrospective Harness Optimization via Self-Preference

Publications

Gut Bacteria Enzyme Found to Break Down Heat-Processed Food Compounds, Producing Novel Biogenic Amines

Researchers have discovered that an enzyme in common gut bacteria can degrade N-epsilon-carboxymethyllysine (CML), a compound formed during thermal food processing, producing previously unknown biogenic amines. The enzyme, ornithine decarboxylase SpeC from enterobacteria, acts on CML and related modified lysine derivatives through a low-level 'underground' catalytic activity. This finding suggests a previously unrecognized communication axis between thermally processed dietary compounds and gut microbial physiology, with potential implications for host health.

1 source · Jun 13

Publications

Full-Length Gene Sequencing Reveals Two Distinct Bacterial Communities in Black-Legged Ticks Expanding Into Canada

Researchers used Oxford Nanopore full-length 16S rRNA gene sequencing to characterize the microbiome of Ixodes scapularis black-legged ticks collected in Nova Scotia, Canada, distinguishing between tick-adapted bacteria and environmentally acquired bacteria. The study comes as I. scapularis — the primary vector of Lyme disease — is rapidly expanding northward into Canada due to climate change. The findings suggest that environmentally derived bacteria in tick microbiomes are not mere contamination, which has implications for how tick microbiome data is collected and interpreted across surveillance studies.

1 source · Jun 13

Publications

Study Identifies Metabolic Link Between Cell Envelope Stress and Biofilm Formation in Bacteria

Researchers have discovered that the metabolite acetyl-CoA directly inhibits enzymes that degrade the bacterial signaling molecule c-di-GMP, connecting cell envelope biosynthesis stress to biofilm formation in Pseudomonas aeruginosa. The study found that sub-inhibitory concentrations of antibiotics targeting early peptidoglycan biosynthesis — but not other antibiotic classes — elevate c-di-GMP levels by reducing phosphodiesterase activity, with acetyl-CoA competing for the enzyme active site. Because the relevant enzyme domain is broadly conserved across bacterial species, this checkpoint mechanism may be widespread and could have implications for understanding antibiotic-induced biofilm responses.

1 source · Jun 13
