Publications | Jun 11 | 83% confidence

Study Finds Neural Chess Engine Overrides Its Own Correct Solutions Due to Learned Safety Preferences

Researchers studying Leela Chess Zero, a leading neural chess engine, found that the model correctly computes solutions to chess puzzles in intermediate layers but systematically suppresses those answers in its final output. The phenomenon, dubbed 'forgotten puzzles,' occurs because late layers in the network shift toward prioritizing cautious, safe play over aggressive but correct moves. The findings challenge a core assumption in AI interpretability: that identifying an algorithm inside a neural network guarantees the network will actually use it.

A new study posted to arXiv examines Leela Chess Zero, often regarded as the strongest neural chess engine, and reports a striking disconnect between internal computation and final behavior. Using an extended 'logit lens' technique applied to the model's policy network, the researchers found that correct puzzle solutions, including immediate checkmates, frequently emerge in intermediate layers but are overridden by the time the model selects its final move.

The authors replicated prior mechanistic analyses and confirmed that the look-ahead algorithm itself functions normally: future moves of the correct continuation are represented, causally active, and linearly decodable within the network. The override is instead driven by late-layer biases toward safe, non-aggressive play, which the researchers characterize as learned 'safety priors.' To establish causality, they steered the model against these preferences and recovered 61.7% of the forgotten puzzle solutions.

The study concludes that algorithmic structure does not guarantee algorithmic behavior: a model can internally arrive at the correct answer and still output the wrong one. The finding has broad implications for AI interpretability and alignment research.
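The layer-by-layer decoding and the steering test described above can be illustrated with a toy sketch. It is not the authors' code and does not touch Leela Chess Zero: the three-move vocabulary, the hand-written residual states, the identity readout matrix, and the names `logit_lens`, `is_forgotten`, and `steer` are all assumptions made for illustration. A real logit lens would typically also apply the model's final normalization, and the paper's extended variant may differ in ways this summary does not specify.

```python
from typing import List, Sequence

Vector = List[float]


def matvec(weights: Sequence[Sequence[float]], h: Sequence[float]) -> Vector:
    """Multiply a (moves x hidden) readout matrix by a hidden state."""
    return [sum(w * x for w, x in zip(row, h)) for row in weights]


def argmax(xs: Sequence[float]) -> int:
    """Index of the largest value."""
    return max(range(len(xs)), key=lambda i: xs[i])


def logit_lens(residuals: Sequence[Vector], unembed: Sequence[Vector]) -> List[int]:
    """Decode every layer's state with the final readout; return the top move per layer."""
    return [argmax(matvec(unembed, h)) for h in residuals]


def is_forgotten(top_moves: Sequence[int], correct: int) -> bool:
    """Correct move is the top choice at some intermediate layer but not at the output."""
    return correct in top_moves[:-1] and top_moves[-1] != correct


def steer(h: Sequence[float], direction: Sequence[float], alpha: float) -> Vector:
    """Shift a hidden state against a direction (here, the hypothetical safety update)."""
    return [x - alpha * d for x, d in zip(h, direction)]


# Hypothetical move vocabulary: index 0 is the puzzle solution.
MOVES = ["Qh7# (mate)", "Kg1 (quiet)", "a3 (waiting)"]
CORRECT = 0

# Assumed identity readout, so each hidden dimension scores one move.
UNEMBED = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Hand-written residual states for three layers (illustrative values only).
layers = [[0.1, 0.3, 0.2], [1.0, 0.3, 0.2], [1.2, 0.4, 0.1]]

# A late-layer update that favours the quiet move: the toy stand-in for a safety prior.
late_update = [-0.5, 0.9, 0.0]
final_state = [x + d for x, d in zip(layers[-1], late_update)]

top_moves = logit_lens(layers + [final_state], UNEMBED)
forgotten = is_forgotten(top_moves, CORRECT)

# Steering against the late update and decoding again.
steered = steer(final_state, late_update, alpha=1.0)
recovered = argmax(matvec(UNEMBED, steered)) == CORRECT

print([MOVES[i] for i in top_moves], forgotten, recovered)
```

In this toy the mating move leads in the middle layers, loses to the quiet move after the late update, and returns once that update is subtracted. That mirrors the pattern the study reports, but only qualitatively; the 61.7% recovery figure comes from the paper's experiments, not from anything this sketch computes.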

What's missing

The paper does not clarify whether the 'safety priors' are an emergent artifact of the training data distribution (for example, training games that favor solid play) or a more deliberate structural feature of the network. It also does not address whether similar override phenomena occur in other game-playing or general-purpose neural networks, so how widespread the behavior is remains an open question. The steering intervention recovers 61.7% of forgotten puzzles, and the paper does not fully account for what drives the failures in the remaining 38.3%.

What different sources said

arXiv cs.AI (Center)
The Algorithm Is Not the Behavior: Learned Priors Override Look-Ahead in a Chess-Playing Neural Network

Publications

Gut Bacteria Enzyme Found to Break Down Heat-Processed Food Compounds, Producing Novel Biogenic Amines

Researchers have discovered that an enzyme in common gut bacteria can degrade N-epsilon-carboxymethyllysine (CML), a compound formed during thermal food processing, producing previously unknown biogenic amines. The enzyme, ornithine decarboxylase SpeC from enterobacteria, acts on CML and related modified lysine derivatives through a low-level 'underground' catalytic activity. This finding suggests a previously unrecognized communication axis between thermally processed dietary compounds and gut microbial physiology, with potential implications for host health.

1 source, Jun 13

Publications

Full-Length Gene Sequencing Reveals Two Distinct Bacterial Communities in Black-Legged Ticks Expanding Into Canada

Researchers used Oxford Nanopore full-length 16S rRNA gene sequencing to characterize the microbiome of Ixodes scapularis black-legged ticks collected in Nova Scotia, Canada, distinguishing between tick-adapted bacteria and environmentally acquired bacteria. The study comes as I. scapularis — the primary vector of Lyme disease — is rapidly expanding northward into Canada due to climate change. The findings suggest that environmentally derived bacteria in tick microbiomes are not mere contamination, which has implications for how tick microbiome data is collected and interpreted across surveillance studies.

1 source, Jun 13

Publications

Study Identifies Metabolic Link Between Cell Envelope Stress and Biofilm Formation in Bacteria

Researchers have discovered that the metabolite acetyl-CoA directly inhibits enzymes that degrade the bacterial signaling molecule c-di-GMP, connecting cell envelope biosynthesis stress to biofilm formation in Pseudomonas aeruginosa. The study found that sub-inhibitory concentrations of antibiotics targeting early peptidoglycan biosynthesis — but not other antibiotic classes — elevate c-di-GMP levels by reducing phosphodiesterase activity, with acetyl-CoA competing for the enzyme active site. Because the relevant enzyme domain is broadly conserved across bacterial species, this checkpoint mechanism may be widespread and could have implications for understanding antibiotic-induced biofilm responses.

1 source, Jun 13
