**************************************************
Optimization via Infill Criteria using Surrogates
**************************************************
When dealing with a non linear problem (NLP), such as in :eq:`optproblem`,
typically it is resorted to classical solvers (e.g. SQP, trust-region-dogleg,
genetic algorithms, simulated annealing, etc.) to obtain its solution,
depending on the nature of the NLP (e.g. presence of discontinuities, whether
or not the function is differentiable, etc.).
There is a entire field of study dedicated to find these NLP solutions with
*Kriging* surrogates. In the works of
:cite:`jones2001,sasena2002,forrester2008,alexandrov2000`, there are entire
discussions and frameworks on how to solve non linear problems and
comparisons of several metrics involved in the optimization process with
metamodels.
The premise of performing a optimization using surrogates is that the model
to be optimized is too time consuming or computationally expensive to be
solved with classical solvers. To circumvent this, the following steps are
proposed:
#. Build an approximation model with *Kriging* surrogates using a
limited number of initial samples. This approximation is a "generalist"
enough representation of the real model;
#. Perform an optimization of the approximation model using classical NLP
solvers and an infill criteria. The surrogate model reduces the
"search area" needed by the solver;
#. Compare the surrogate optimum found in step 2 with the
result from original model. In other words: feed the results from the
*Kriging* metamodel optimum into the original model and see if they are
close enough;
#. If the optimum from the metamodel is close enough (based on a
chosen metric) to the original model, then this may be the true optimum.
Otherwise, update the *Kriging* model by introducing the value found and
return to step 2;
This process is basically "filling holes" (hence the name *infill*)
in our *Kriging* metamodel until original model optimum is found. To illustrate
this in the simplest way, we are going to use the same function :eq:`complex`.
Assuming that we only have three initial points sampled from this model
function, we build our *Kriging* model. As can be seen in :numref:`infill_init`.
.. figure:: ../images/infill_init.svg
:name: infill_init
:align: center
Initial plot of the example function. The solid blue line represents the
function behavior. The orange dashed line is the *Kriging* metamodel of
the three sampled points (red circles) available.
When applying an optimization solver on the *Kriging* model, we get a new
optimal value for :math:`x` near 7.8 (3.47 for :math:`f(x)`$ when we consult
the original model). Now, we include these values of (:math:`x, f(x)`) in the
sample and rebuild the *Kriging* metamodel. The result is shown
:numref:`infill_1`. We keep repeating this procedure until we get the result
in :numref:`infill_2`.
.. figure:: ../images/infill_1.svg
:name: infill_1
:align: center
The *Kriging* model after one update.
.. figure:: ../images/infill_2.svg
:name: infill_2
:align: center
The *Kriging* model after four updates. Notice how the *Kriging* model
adjusts to the true function near the optimal point.
This is the entire process animated in :numref:`infill_animated`:
.. figure:: ../images/animation_infill_t.gif
:name: infill_animated
:align: center
The main steps of the infill criteria procedure as an example.
This example is a trivial one because the problem involves a single input
variable and infill criteria is the own *Kriging* prediction of the model. As
discussed in :cite:`jones2001`, this criteria has its pitfalls if used
without other precautions.
:cite:`caballero2008` presented an algorithm, based on the "method 2"
in the work of :cite:`jones2001`, referred as a gradient matching
technique where the gradient of the surrogate is forced to match with
the true function gradient, this is done through trust-region approach to
ensure local convergence which was proven in the work of
:cite:`alexandrov2000`.
.. IMPORTANT::
The basic idea of this approach is:
#. Minimize the NLP problem metamodel.
#. Consult the original function at the minimum found in the metamodel.
#. Update the sample matrix used to build the surrogate.
#. Repeat this until a convergence criteria is met.
The flowchart depicting the whole procedure is defined in
:numref:`caballeroflowchart`. For detailed explanation of each step of the
proposed algorithm, one must refer to :cite:`caballero2008` and
:cite:`alves2018`.
.. figure:: ../images/caballero_flowchart_final.svg
:name: caballeroflowchart
:align: center
Flowchart of :cite:`caballero2008` algorithm, translated to Python and
implemented within *Metacontrol*.