New Kernel-Based Method Improves Protein Property Prediction from Limited Data
Researchers have developed a class of sequence kernels that use evolutionary substitution matrices to predict protein properties like binding affinity and thermostability more efficiently than existing methods. The approach uses Gaussian processes and can incorporate structural information from foundation models, showing particular strength in multi-task learning scenarios. This advancement could accelerate protein design applications by enabling accurate predictions with sparse experimental data.
The new approach predicts protein properties using sequence kernels built from evolutionary substitution matrices and local linearity assumptions. The kernels are used in Gaussian processes to build data-efficient models of protein property landscapes, which outperformed alternatives that rely on foundation model embeddings alone. The researchers then extended the approach by learning structure-aware substitution matrices that incorporate structural information from foundation models. These structure-conditioned kernels proved particularly effective for multi-task learning across multiple protein property landscapes, consistently outperforming local supervised learning methods. The work targets a central challenge in computational protein design: experimental data is typically sparse and expensive to obtain.
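To make the idea of a substitution-matrix kernel inside a Gaussian process more concrete, here is a minimal sketch in Python with NumPy. It is not the paper's kernel. It assumes a toy 4-letter alphabet, hypothetical substitution scores, equal-length sequences, and a simple additive per-site kernel as one possible reading of "local linearity". The names `S_RAW`, `make_psd`, `seq_kernel`, and `gp_posterior_mean` are illustrative, and the property values are made up.

```python
import numpy as np

ALPHABET = "ACDE"  # toy alphabet (assumption), not the 20 amino acids
IDX = {a: i for i, a in enumerate(ALPHABET)}

# Hypothetical symmetric substitution scores; illustrative values only.
S_RAW = np.array([
    [ 4.0,  0.0, -2.0, -1.0],
    [ 0.0,  9.0, -3.0, -4.0],
    [-2.0, -3.0,  6.0,  2.0],
    [-1.0, -4.0,  2.0,  5.0],
])

def make_psd(m, eps=1e-8):
    """Clip eigenvalues below eps so the matrix is a valid kernel over residues."""
    w, v = np.linalg.eigh((m + m.T) / 2.0)
    return (v * np.clip(w, eps, None)) @ v.T

def encode(seqs):
    """Map equal-length sequences to an integer array of shape (n, L)."""
    return np.array([[IDX[c] for c in s] for s in seqs])

def seq_kernel(xa, xb, s):
    """Additive site kernel: k(x, y) = sum_i s[x_i, y_i]."""
    # Broadcasting the two index arrays gives shape (n_a, n_b, L).
    return s[xa[:, None, :], xb[None, :, :]].sum(axis=-1)

def gp_posterior_mean(k_train, k_cross, y, noise=1e-2):
    """Standard GP regression mean: K_* (K + noise * I)^-1 y."""
    n = k_train.shape[0]
    chol = np.linalg.cholesky(k_train + noise * np.eye(n))
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    return k_cross @ alpha

# Usage with made-up sequences and property values.
S = make_psd(S_RAW)
train = encode(["ACDE", "ACDD", "CCDE", "AADE"])
y = np.array([1.0, 0.8, 0.2, 0.9])
test = encode(["ACDA"])

K = seq_kernel(train, train, S)
K_star = seq_kernel(test, train, S)
pred = gp_posterior_mean(K, K_star, y)
```

Because a sum of positive semidefinite kernels is itself positive semidefinite, the per-site sum stays a valid Gaussian process covariance once the residue matrix has been made positive semidefinite. A learned, structure-aware matrix like the one the article describes would presumably take the place of `S` here, though the summary does not say how.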
Limitations & open questions
The paper does not discuss computational complexity or runtime comparisons with baseline methods, nor does it provide details on the specific protein properties tested beyond binding affinity and thermostability. The limitations of the approach when applied to proteins with limited evolutionary information or novel protein sequences are not explicitly addressed.
What different sources said
- arXiv cs.LG
Flexible Kernels for Protein Property Prediction