Optimization Problems#
From: https://docs.sciml.ai/ModelingToolkit/stable/tutorials/optimization/
2D Rosenbrock Function#
Wikipedia: https://en.wikipedia.org/wiki/Rosenbrock_function
Find \((x, y)\) that minimizes the loss function \((a - x)^2 + b(y - x^2)^2\)
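As a quick sanity check (not part of the original tutorial), the unconstrained minimum can be found analytically by setting the gradient of the loss to zero:
\[
\frac{\partial L}{\partial x} = -2(a - x) - 4 b x (y - x^2) = 0, \qquad \frac{\partial L}{\partial y} = 2 b (y - x^2) = 0,
\]
which gives \(x = a\), \(y = a^2\). With the values \(a = 1\) and \(b = 100\) used below, the minimum is at \((1, 1)\).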
using ModelingToolkit
using Optimization
using OptimizationOptimJL
@variables begin
x, [bounds = (-2.0, 2.0)]
y, [bounds = (-1.0, 3.0)]
end
@parameters a b
Define the target (loss) function. The optimization algorithm will try to minimize its value.
loss = (a - x)^2 + b * (y - x^2)^2
Build the OptimizationSystem
@mtkbuild sys = OptimizationSystem(loss, [x, y], [a, b])
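Optionally, inspect what went into the built system. A small sketch, assuming the standard ModelingToolkit accessors unknowns() and parameters():
unknowns(sys)   ## symbolic optimization variables: x and y
parameters(sys) ## symbolic parameters: a and b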
Initial guess
u0 = [x => 1.0, y => 2.0]
2-element Vector{Pair{Symbolics.Num, Float64}}:
x => 1.0
y => 2.0
Parameters
p = [a => 1.0, b => 100.0]
2-element Vector{Pair{Symbolics.Num, Float64}}:
a => 1.0
b => 100.0
ModelingToolkit can generate the gradient and Hessian of the loss function to solve the problem more efficiently.
prob = OptimizationProblem(sys, u0, p, grad=true, hess=true)
OptimizationProblem. In-place: true
u0: 2-element Vector{Float64}:
1.0
2.0
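The symbolic derivatives used for grad/hess can also be inspected directly. A minimal sketch, assuming Symbolics.jl (a dependency of ModelingToolkit) is available in the environment:
using Symbolics
Symbolics.gradient(loss, [x, y]) ## 2-element vector of symbolic partial derivatives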
Solve the problem. The true solution is (1.0, 1.0).
u_opt = solve(prob, GradientDescent())
retcode: Success
u: 2-element Vector{Float64}:
1.0000000135463598
1.0000000271355158
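The solution object carries more than the minimizer. A small usage sketch with the fields provided by Optimization.jl solutions:
u_opt.u          ## minimizer, ≈ [1.0, 1.0]
u_opt.objective  ## loss value at the minimizer, ≈ 0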
Adding constraints#
OptimizationSystem(..., constraints = cons)
@variables begin
x, [bounds = (-2.0, 2.0)]
y, [bounds = (-1.0, 3.0)]
end
@parameters a = 1 b = 100
loss = (a - x)^2 + b * (y - x^2)^2
Constraints are defined using ≲ (\lesssim) or ≳ (\gtrsim).
cons = [
x^2 + y^2 ≲ 1,
]
@mtkbuild sys = OptimizationSystem(loss, [x, y], [a, b], constraints=cons)
u0 = [x => 0.14, y => 0.14]
prob = OptimizationProblem(sys, u0, grad=true, hess=true, cons_j=true, cons_h=true)
OptimizationProblem. In-place: true
u0: 2-element Vector{Float64}:
0.14
0.14
Use the interior point Newton method (IPNewton) for constrained optimization.
solve(prob, IPNewton())
┌ Warning: The selected optimization algorithm requires second order derivatives, but `SecondOrder` ADtype was not provided.
│ So a `SecondOrder` with SciMLBase.NoAD() for both inner and outer will be created, this can be suboptimal and not work in some cases so
│ an explicit `SecondOrder` ADtype is recommended.
└ @ OptimizationBase ~/.julia/packages/OptimizationBase/gvXsf/src/cache.jl:49
retcode: Success
u: 2-element Vector{Float64}:
0.7864151541684254
0.6176983125233897
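To verify that the inequality constraint holds (and is essentially active) at this optimum, evaluate it at the solution. A small sketch, where u_con is a hypothetical name for the result above:
u_con = solve(prob, IPNewton())
sum(abs2, u_con.u) ## x^2 + y^2 at the optimum, ≈ 1.0, i.e. on the constraint boundary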
Parameter estimation#
From: https://docs.sciml.ai/DiffEqParamEstim/stable/getting_started/
DiffEqParamEstim.jl is not installed with DifferentialEquations.jl. You need to install it manually:
using Pkg
Pkg.add("DiffEqParamEstim")
using DiffEqParamEstim
The key function is DiffEqParamEstim.build_loss_objective(), which builds a loss (objective) function for the problem against the data. We can then use optimization packages to find the parameters that minimize it.
Estimate a single parameter from the data and the ODE model#
Let’s optimize the parameters of the Lotka-Volterra equation.
using DifferentialEquations
using Plots
using DiffEqParamEstim
using ForwardDiff
using Optimization
using OptimizationOptimJL
Example model
function lotka_volterra!(du, u, p, t)
du[1] = dx = p[1] * u[1] - u[1] * u[2]
du[2] = dy = -3 * u[2] + u[1] * u[2]
end
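Written out, the model is
\[
\frac{du_1}{dt} = p_1 u_1 - u_1 u_2, \qquad \frac{du_2}{dt} = -3 u_2 + u_1 u_2,
\]
with only the prey growth rate \(p_1\) left as a free parameter.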
u0 = [1.0; 1.0]
tspan = (0.0, 10.0)
p = [1.5] ## The true parameter value
prob = ODEProblem(lotka_volterra!, u0, tspan, p)
ODEProblem with uType Vector{Float64} and tType Float64. In-place: true
timespan: (0.0, 10.0)
u0: 2-element Vector{Float64}:
1.0
1.0
True solution
sol = solve(prob, Tsit5())
retcode: Success
Interpolation: specialized 4th order "free" interpolation
t: 34-element Vector{Float64}:
0.0
0.0776084743154256
0.2326451370670694
0.42911851563726466
0.679082199936808
0.9444046279774128
1.2674601918628516
1.61929140093895
1.9869755481702074
2.2640903679981617
⋮
7.5848624442719235
7.978067891667038
8.483164641366145
8.719247691882519
8.949206449510513
9.200184762926114
9.438028551201125
9.711807820573478
10.0
u: 34-element Vector{Vector{Float64}}:
[1.0, 1.0]
[1.0454942346944578, 0.8576684823217128]
[1.1758715885890039, 0.6394595702308831]
[1.419680958026516, 0.45699626144050703]
[1.8767193976262215, 0.3247334288460738]
[2.5882501035146133, 0.26336255403957304]
[3.860709084797009, 0.27944581878759106]
[5.750813064347339, 0.5220073551361045]
[6.814978696356636, 1.917783405671627]
[4.392997771045279, 4.194671543390719]
⋮
[2.6142510825026886, 0.2641695435004172]
[4.241070648057757, 0.30512326533052475]
[6.79112182569163, 1.13452538354883]
[6.265374940295053, 2.7416885955953294]
[3.7807688120520893, 4.431164521488331]
[1.8164214705302744, 4.064057991958618]
[1.146502825635759, 2.791173034823897]
[0.955798652853089, 1.623563316340748]
[1.0337581330572414, 0.9063703732075853]
Create a sample dataset with some noise.
ts = range(tspan[begin], tspan[end], 200)
data = [sol.(ts, idxs=1) sol.(ts, idxs=2)] .* (1 .+ 0.03 .* randn(length(ts), 2))
200×2 Matrix{Float64}:
0.955417 1.05159
1.02627 0.891215
1.06935 0.812458
1.13775 0.707015
1.1508 0.681625
1.19604 0.621972
1.25164 0.573472
1.28989 0.531952
1.42772 0.467311
1.53631 0.451357
⋮
1.00213 2.02603
0.986264 1.90245
0.990831 1.71463
0.947535 1.48271
0.94817 1.36011
0.989919 1.20776
0.931644 1.11739
1.02695 0.969842
1.07089 0.879465
Plotting the sample dataset and the true solution.
plot(sol)
scatter!(ts, data, label=["u1 data" "u2 data"])
DiffEqParamEstim.build_loss_objective() builds a loss function for the ODE problem against the data. We will minimize the mean squared error using L2Loss(). Note that the data should be transposed. We use Optimization.AutoForwardDiff() as the automatic differentiation (AD) method since the number of parameters plus states is small (< 100). For larger problems, one can use Optimization.AutoZygote().
alg = Tsit5()
cost_function = build_loss_objective(
prob, alg,
L2Loss(collect(ts), transpose(data)),
Optimization.AutoForwardDiff(),
maxiters=10000, verbose=false
)
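Conceptually, the generated objective solves the ODE with the candidate parameter and sums the squared errors against the data. A minimal hand-rolled sketch of the same idea (manual_l2_cost and θ are hypothetical names; this is a simplification, not the library's actual implementation):
function manual_l2_cost(θ)
    _prob = remake(prob, p = θ)           ## ODE problem with the candidate parameter
    _sol = solve(_prob, alg, saveat = ts) ## solve on the same time grid as the data
    return sum(abs2, Array(_sol) .- transpose(data)) ## squared-error (L2) loss
end
manual_l2_cost([1.5]) ## near the true parameter this should sit close to the dip of the cost curve below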
plot(
cost_function, 0.0, 10.0,
linewidth=3, label=false, yscale=:log10,
xaxis="Parameter", yaxis="Cost", title="1-Parameter Cost Function"
)
There is a dip (minimum) in the cost function at the true parameter value (1.5). We can use an optimizer, e.g., Optimization.jl, to find the parameter value that minimizes the cost.
optprob = Optimization.OptimizationProblem(cost_function, [1.42])
optsol = solve(optprob, BFGS())
retcode: Success
u: 1-element Vector{Float64}:
1.5004662608266077
The fitting result:
newprob = remake(prob, p=optsol.u)
newsol = solve(newprob, Tsit5())
plot(sol)
plot!(newsol)
Estimate multiple parameters#
Let’s use the Lotka-Volterra (Fox-rabbit) equations with all 4 parameters free.
function f2(du, u, p, t)
du[1] = dx = p[1] * u[1] - p[2] * u[1] * u[2]
du[2] = dy = -p[3] * u[2] + p[4] * u[1] * u[2]
end
u0 = [1.0; 1.0]
tspan = (0.0, 10.0)
p = [1.5, 1.0, 3.0, 1.0] ## True parameters
alg = Tsit5()
prob = ODEProblem(f2, u0, tspan, p)
sol = solve(prob, alg)
retcode: Success
Interpolation: specialized 4th order "free" interpolation
t: 34-element Vector{Float64}:
0.0
0.0776084743154256
0.2326451370670694
0.42911851563726466
0.679082199936808
0.9444046279774128
1.2674601918628516
1.61929140093895
1.9869755481702074
2.2640903679981617
⋮
7.5848624442719235
7.978067891667038
8.483164641366145
8.719247691882519
8.949206449510513
9.200184762926114
9.438028551201125
9.711807820573478
10.0
u: 34-element Vector{Vector{Float64}}:
[1.0, 1.0]
[1.0454942346944578, 0.8576684823217128]
[1.1758715885890039, 0.6394595702308831]
[1.419680958026516, 0.45699626144050703]
[1.8767193976262215, 0.3247334288460738]
[2.5882501035146133, 0.26336255403957304]
[3.860709084797009, 0.27944581878759106]
[5.750813064347339, 0.5220073551361045]
[6.814978696356636, 1.917783405671627]
[4.392997771045279, 4.194671543390719]
⋮
[2.6142510825026886, 0.2641695435004172]
[4.241070648057757, 0.30512326533052475]
[6.79112182569163, 1.13452538354883]
[6.265374940295053, 2.7416885955953294]
[3.7807688120520893, 4.431164521488331]
[1.8164214705302744, 4.064057991958618]
[1.146502825635759, 2.791173034823897]
[0.955798652853089, 1.623563316340748]
[1.0337581330572414, 0.9063703732075853]
ts = range(tspan[begin], tspan[end], 200)
data = [sol.(ts, idxs=1) sol.(ts, idxs=2)] .* (1 .+ 0.01 .* randn(length(ts), 2))
200×2 Matrix{Float64}:
0.995837 1.01476
1.03802 0.905018
1.06787 0.800802
1.09508 0.736222
1.11997 0.687929
1.18872 0.613837
1.23411 0.583538
1.31033 0.525488
1.38862 0.478373
1.44308 0.43572
⋮
0.984307 2.03974
0.97665 1.84733
0.948471 1.65733
0.958485 1.49855
0.9465 1.33009
0.972328 1.23191
0.963384 1.10477
1.00992 1.00652
1.04136 0.901841
Then we can find multiple parameters at once using the same steps. The true parameters are [1.5, 1.0, 3.0, 1.0].
cost_function = build_loss_objective(
prob, alg, L2Loss(collect(ts), transpose(data)),
Optimization.AutoForwardDiff(),
maxiters=10000, verbose=false
)
optprob = Optimization.OptimizationProblem(cost_function, [1.3, 0.8, 2.8, 1.2])
result_bfgs = solve(optprob, BFGS())
retcode: Success
u: 4-element Vector{Float64}:
1.4974913041398208
1.0003298682048294
3.0092126826216257
1.0035207536865165
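As a quick check (p still holds the true parameters defined above), the estimates land within roughly 1% of the true values:
abs.(result_bfgs.u .- p) ## elementwise absolute error of the estimates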
This notebook was generated using Literate.jl.