Practical applications of Bayesian optimization in industrial experiments

When every run counts, Bayesian optimization can help.

Jonas Rinne
March 3, 2026
6 min. read

Prospecting Blog - 7

Let us start with a very brief introduction to Bayesian optimization. Unlike design of experiments (DOE), where you can start without any prior knowledge (although this might not be recommended), Bayesian optimization needs some initial experimental or historical data and a definition of goals, such as hitting a target or maximizing or minimizing the response(s), to get started. It uses this initial data to build a model of the design space. Depending on the quality of the model fit, Bayesian optimization autonomously decides whether to explore unknown areas of the design space with high uncertainty or to exploit the areas where it assumes a high probability of reaching the experimental goals.

Based on that decision, one or more new experimental runs are recommended, carried out, and added to the data, so that the process eventually converges on the areas where the goal is reached. Because new information is incorporated live at every iteration, Bayesian optimization is known for being very efficient with experimental resources. Figure 1 below shows the complete process.
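The loop just described can be sketched in a few lines of code. The example below is a minimal, illustrative implementation, assuming a toy one-dimensional objective, using scikit-learn's Gaussian process and an expected-improvement rule; it is not any particular product's implementation.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Stand-in for the expensive experiment (unknown in practice);
    # its maximum is at x = 0.6.
    return -(x - 0.6) ** 2

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 1))            # initial experimental data
y = objective(X).ravel()

for _ in range(10):                           # the iterative loop
    # 1. Fit a model to all data gathered so far.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                  normalize_y=True, alpha=1e-6)
    gp.fit(X, y)
    # 2. Score candidate settings by expected improvement, which trades
    #    off exploring uncertain regions against exploiting good ones.
    cand = np.linspace(0, 1, 200).reshape(-1, 1)
    mu, sigma = gp.predict(cand, return_std=True)
    z = (mu - y.max()) / np.maximum(sigma, 1e-9)
    ei = (mu - y.max()) * norm.cdf(z) + sigma * norm.pdf(z)
    # 3. "Run" the recommended experiment and add the result.
    x_next = cand[np.argmax(ei)]
    X = np.vstack([X, [x_next]])
    y = np.append(y, objective(x_next)[0])

print(X[np.argmax(y)][0])   # best setting found so far, close to 0.6
```

In practice, dedicated Bayesian optimization tools add batching, constraints, and multiple responses on top of this; the point here is only the model, propose, run, update cycle.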

Apart from conducting the experiments themselves, Bayesian optimization can do all of this on its own, relieving the user from making decisions about experimental design and statistical analysis. Nevertheless, humans can be kept in the loop to incorporate expert knowledge at every iteration.

I like to think of it as a very smart autopilot with a hands-on option if desired. However, you do need to be able to run each experiment and measure the outcomes iteratively.

Fig 1. High level overview of the Bayesian optimization loop

DOE screening + Bayesian optimization

One of the many strengths of classical DOE is the interpretability of its linear regression analysis, which offers p-values, factor estimates, and standard errors. It is the go-to method in early-stage development, when the impact of the important factors must be identified from many possible ones. These screening experiments are usually followed by optimization designs or augmentation of the existing design. Why not apply Bayesian optimization directly to these initial screening data sets instead, fixing the screened-out factors? You only need to vary the important ones that have been identified to potentially reach the goal faster with less effort, saving the organization time and resources while speeding up innovation.
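As a concrete sketch of that hand-off, suppose screening has flagged two of six factors as important. The snippet below (factor names and baseline settings are made up for illustration) builds full candidate runs by varying only those two and holding the screened-out factors at their baselines.

```python
import numpy as np

# Illustrative factor list and baseline settings (not from the post).
factors = ["temp", "time", "pH", "stir", "conc", "pressure"]
baseline = np.array([180.0, 30.0, 7.0, 200.0, 0.5, 1.0])
important = [0, 4]   # e.g. temp and conc survived screening

def make_candidate(values):
    """Build a full run: vary the important factors, fix the rest."""
    run = baseline.copy()
    run[important] = values
    return run

run = make_candidate([195.0, 0.8])
print(run)   # only temp and conc differ from baseline
```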

Expensive production experiments

Improving or fixing an active production line is always a delicate, high-stakes task that can cost considerable amounts of lost revenue and material. It can be difficult to explain why a production manager should greenlight a major experimental plan. A leaner, iterative Bayesian optimization approach, one that incorporates legacy data and begins with only a couple of production runs, is more likely to be approved, since it offers a real chance of efficiency gains or improved production quality at low risk. Bayesian optimization just might help you get leadership support for low-risk, high-impact improvements in manufacturing environments, where experimentation is typically less common than in R&D.

Mixtures and formulations

Because the components of a mixture must sum to a fixed total, changing the amount of one component necessarily changes the amounts of the others. Standard regression modelling has always been a bit flawed in describing these problems, and adding non-mixture process factors makes it even more complicated. I still struggle to fully grasp Scheffé cubic models, which were devised to handle these correlated and often heavily constrained mixture factors within linear regression.

By their very nature, even perfectly executed mixture designs do not offer the advantageous interpretability that we are used to getting from linear regression: determining the impact of mixture factors from estimates or p-values, as in classical factor screening, is hard to do.

As a result, mixture experiments often boil down to optimization. If the advantage of easy interpretation disappears anyway, why not tackle these problems directly with Bayesian optimization, which is untroubled by factor correlations, hard-to-interpret reports, or constraints, and can potentially save experimental resources and headaches for the staff involved?
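One illustration of why the mixture constraint is painless here: candidate formulations can simply be generated on the simplex, where every candidate sums to one by construction, and then filtered by any additional constraints before the optimizer scores them. The component roles and bounds below are made-up examples.

```python
import numpy as np

rng = np.random.default_rng(42)

# Dirichlet samples lie on the simplex: each row (say binder, solvent,
# filler) sums to exactly 1, so the mixture constraint holds by
# construction rather than being fought in a regression model.
candidates = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=1000)

# Apply an illustrative formulation constraint: binder between 10% and 40%.
binder = candidates[:, 0]
feasible = candidates[(binder >= 0.10) & (binder <= 0.40)]

print(len(feasible), "feasible candidates, all summing to 1")
```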

Failed or incomplete DOEs

Let’s be honest: not every experiment is a success right away, especially in the adaptation phase when getting started with structured experimentation, and that is fine. I have had several situations (and I have seen it happen with others) where the experimental plan could not be finished for a variety of reasons: incomplete knowledge in the planning phase, wrong initial assumptions, safety concerns, process instabilities, human error, or material shortages. Recovering failed or incomplete designs requires careful consideration and design augmentation, since incomplete designs carry interpretation risks when analysed.

Again, Bayesian optimization can relieve the user of these design-recovery decisions by taking every kind of available data and working toward the goals efficiently. The same holds true for all the hidden potential in the unstructured experimental data sets currently sleeping in R&D organisations’ data lakes.

‘Rugged’ response surfaces

We don’t always face simple linear relationships between factors and the responses we care about. Sometimes the physics are too messy for a traditional regression model, yet data and resources are too scarce, or the system too complex, for a full physics simulation. In these rugged modelling landscapes, Bayesian optimization’s Gaussian process models can really shine, since they can handle that complexity when enough runs and iterations are allocated.
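As a small illustration (toy data, not from the post), a Gaussian process with a Matérn kernel can track a rugged response, here a sine wave with a step in it, that a straight-line regression cannot.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 1, size=(40, 1)), axis=0)
# A "rugged" response: oscillation plus a discontinuous jump at x = 0.5.
y = np.sin(12 * X).ravel() + 0.3 * np.sign(X.ravel() - 0.5)

gp = GaussianProcessRegressor(kernel=Matern(nu=1.5), normalize_y=True,
                              alpha=1e-6).fit(X, y)
lin = LinearRegression().fit(X, y)

# The GP tracks the ruggedness; the line cannot.
print(f"GP R^2: {gp.score(X, y):.3f}, linear R^2: {lin.score(X, y):.3f}")
```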

Summary

Although not a new concept, Bayesian optimization’s newly user-friendly accessibility adds another layer to the toolbox of structured experimental approaches. Its low threshold of entry, thanks to built-in decision making and automatic statistical analysis in the background, makes it easy for the user to interact with the process whenever needed.

While it can be used for any optimization problem that can be tackled iteratively, Bayesian optimization offers great new opportunities for dealing with expensive production experiments. It may be just what you are looking for when every run counts, since it can handle mixture/formulation optimization, rugged non-linear response surfaces, and any unstructured experimental data or incomplete DOE designs sleeping unused in various databases.

See it in practice.

Tune in this Thursday for a live webinar to learn how Bayesian optimization can help optimize your experimentation.