Feb 11, 2026

Bayesian Optimization Augmented with Text

LLM-generated text can add speed and diversity to high-entropy alloy search.

Bayesian Optimization (BO) is a powerful technique for managing the search for novel high-entropy alloys when experiments are expensive. But what happens when your search space isn't just numbers, but the complex world of material compositions?

We've developed two intriguing approaches that use Large Language Model (LLM) embeddings to guide Bayesian Optimization in alloy design. By encoding expert knowledge about how elements affect material properties, we can search more intelligently and discover a wider range of useful alloys.

Key Takeaway: LLM embeddings can encode domain knowledge about materials, enabling more efficient and diverse Bayesian Optimization in compositional spaces.

Our Approach

We explored two complementary strategies for incorporating LLM knowledge into the optimization loop.

Element-Based Optimization (EBO)

The first approach focuses on individual elements (a code sketch follows the list):

  • Generate text descriptions of how each element influences yield strength and ductility
  • Embed these descriptions into 512-dimensional vectors
  • Represent compositions as weighted averages of element embeddings
  • Use PCA to project to a 15-D latent space suitable for BO
  • Optimize in this space, then map back to valid compositions
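A minimal sketch of the weighted-average embedding and PCA projection, with random vectors standing in for the real LLM element embeddings (all names and sizes below are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-ins for the 512-D LLM embeddings of each element description;
# in the real pipeline these come from embedding text about how each
# element influences yield strength and ductility.
elements = ["Fe", "Ni", "Cr", "Co", "Mn"]
rng = np.random.default_rng(0)
E = rng.normal(size=(len(elements), 512))            # (n_elements, 512)

# Candidate compositions: rows are element weight vectors that sum to 1.
W = rng.dirichlet(np.ones(len(elements)), size=200)  # (200, n_elements)

# A composition's embedding is the weighted average of its elements' embeddings.
X = W @ E                                            # (200, 512)

# Project to a 15-D latent space; BO proposes points in this space, which are
# then mapped back to valid compositions.
latent = PCA(n_components=15).fit_transform(X)       # (200, 15)
```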

Composition-Based Optimization (CBO)

The second approach works at the composition level (see the sketch after this list):

  • Generate descriptions for 15,000 compositions describing their properties
  • Train an autoencoder to map between embeddings and weight vectors
  • Run BO in the learned latent space
  • Constrained optimization recovers valid compositions from suggestions
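A minimal PyTorch sketch of such an autoencoder under assumed layer sizes (the post does not specify the architecture); the softmax output keeps decoded element weights non-negative and summing to one:

```python
import torch
import torch.nn as nn

class CompositionAutoencoder(nn.Module):
    """Maps 512-D description embeddings to a latent space and back to element weights."""
    def __init__(self, embed_dim=512, latent_dim=15, n_elements=5):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(embed_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, n_elements),
            nn.Softmax(dim=-1),  # decoded weights are non-negative and sum to 1
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

# Dummy training data standing in for the embedded composition descriptions.
embeddings = torch.randn(1024, 512)
weights = torch.distributions.Dirichlet(torch.ones(5)).sample((1024,))

model = CompositionAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    recon, _ = model(embeddings)
    loss = nn.functional.mse_loss(recon, weights)  # reconstruct weight vectors
    opt.zero_grad(); loss.backward(); opt.step()
```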

Results

We compared our LLM-guided approaches (EBO and CBO) against standard Bayesian Optimization across multiple runs.

The following figure charts the performance of these methods over iterations as we optimize for materials with improved yield strength and ductility. The hypervolume collapses progress on multiple objectives into a single number: a larger hypervolume means the discovered set offers better options across all objectives.

Figure 1: Comparison between standard BO, Element-Based BO (EBO), and Composition-Based BO (CBO). EBO consistently discovers more diverse, high-performing solutions.
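For readers unfamiliar with the metric, a minimal two-objective hypervolume computation (maximization relative to a reference point; the numbers are illustrative, not our data) looks like this:

```python
import numpy as np

def hypervolume_2d(points: np.ndarray, ref: np.ndarray) -> float:
    """Area dominated by a set of 2-D points (maximization) above a reference point."""
    pts = points[(points > ref).all(axis=1)]   # keep points that dominate the reference
    pts = pts[np.argsort(-pts[:, 0])]          # sweep from best first objective down
    hv, best_y = 0.0, ref[1]
    for x, y in pts:
        if y > best_y:                         # each point adds a rectangle of new area
            hv += (x - ref[0]) * (y - best_y)
            best_y = y
    return hv

# Illustrative Pareto front: (yield strength in MPa, ductility in %).
front = np.array([[900.0, 10.0], [700.0, 25.0], [500.0, 40.0]])
print(hypervolume_2d(front, ref=np.array([0.0, 0.0])))  # 27000.0
```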

In addition to the hypervolume, we measured the diversity of the returned compositions as the maximum pairwise distance between them. Table 1 summarizes the key metrics:

Table 1: Performance comparison across optimization methods. EBO achieves the highest hypervolume and solution diversity.
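As a concrete reference, the diversity metric is a short computation over the returned weight vectors; Euclidean distance is assumed here:

```python
import numpy as np
from scipy.spatial.distance import pdist

# Rows are element weight vectors of the compositions returned by a run (illustrative).
compositions = np.array([[0.2, 0.2, 0.2, 0.2, 0.2],
                         [0.5, 0.1, 0.1, 0.2, 0.1],
                         [0.1, 0.4, 0.3, 0.1, 0.1]])
diversity = pdist(compositions).max()  # largest pairwise (Euclidean) distance
```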

The results are intriguing. In addition to a significant improvement in hypervolume over standard BO, the text-enhanced methods also discover more diverse solutions. This suggests that LLM embeddings effectively encode useful prior knowledge about element interactions.

Why It Helps

Our hypothesis is that LLMs trained on scientific literature have implicitly learned relationships between elements and material properties. When we embed descriptions like "chromium increases hardness and corrosion resistance," the resulting vector captures semantic relationships that correlate with actual physical behavior.
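To make this concrete, here is a sketch of the embedding step using sentence-transformers as a stand-in model (not the embedding model used in our experiments):

```python
from sentence_transformers import SentenceTransformer

# Stand-in embedding model; the real pipeline produced 512-D vectors.
model = SentenceTransformer("all-MiniLM-L6-v2")

descriptions = [
    "Chromium increases hardness and corrosion resistance.",
    "Nickel improves ductility and toughness.",
    "Manganese stabilizes the austenitic phase and aids work hardening.",
]
vecs = model.encode(descriptions, normalize_embeddings=True)
print(vecs @ vecs.T)  # cosine similarities between element descriptions
```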

By optimizing in this semantically informed space rather than raw compositional coordinates, we bias the search toward physically meaningful regions, leading to faster convergence and better solutions.
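To make the loop concrete, here is a minimal single-objective BO sketch in a 15-D latent space with a Gaussian-process surrogate and expected-improvement acquisition. These are standard illustrative choices, not necessarily the ones used in our experiments, and the objective below is a dummy stand-in for an expensive property measurement:

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(X):
    """Dummy stand-in for an expensive property evaluation of decoded compositions."""
    return -np.linalg.norm(X, axis=1)

def expected_improvement(gp, X_cand, y_best):
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - y_best) / sigma
    return (mu - y_best) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 15))                # initial latent points
y = objective(X)
for _ in range(20):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    X_cand = rng.normal(size=(500, 15))      # random candidates in latent space
    x_next = X_cand[np.argmax(expected_improvement(gp, X_cand, y.max()))]
    X = np.vstack([X, x_next])               # in practice: decode, synthesize, measure
    y = np.append(y, objective(x_next[None]))
```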

What's Next

Several directions look promising for future work:

  • More objectives: Currently we optimize for yield strength and ductility. Adding further objectives could discover alloys with better overall performance.
  • Tunable constraints: Our current filters (cost < $1000/kg, melting point > 1500 K) could be adjusted for different applications.
  • Higher-dimensional embeddings: We use 512-D embeddings, but could scale up to 3000-D for richer representations.
  • Longer optimization runs: At 10 iterations/hour, we can afford to run longer campaigns.

