Commit 927c58b

[Nonlinear] add SymbolicAD submodule (#2624)
1 parent a05b55d · commit 927c58b

File tree

9 files changed: +2642 -0 lines changed

docs/make.jl
+1

@@ -70,6 +70,7 @@ const _PAGES = [
     ],
     "Nonlinear" => [
         "Overview" => "submodules/Nonlinear/overview.md",
+        "SymbolicAD" => "submodules/Nonlinear/SymbolicAD.md",
         "API Reference" => "submodules/Nonlinear/reference.md",
     ],
     "Utilities" => [

docs/src/submodules/Nonlinear/SymbolicAD.md
+259

@@ -0,0 +1,259 @@
```@meta
CurrentModule = MathOptInterface
DocTestSetup = quote
    import MathOptInterface as MOI
end
DocTestFilters = [r"MathOptInterface|MOI"]
```

# SymbolicAD

The `Nonlinear.SymbolicAD` submodule contains data structures and functions for
working with the symbolic derivatives of a nonlinear optimization problem. This
page explains the API and describes the rationale behind its design.

## Background

The code in `SymbolicAD` is inspired by Hassan Hijazi's work on
[coin-or/gravity](https://github.com/coin-or/Gravity), a high-performance
algebraic modeling language in C++.

Hassan made the following observations:

* For large-scale models, symbolic differentiation is slower than other
  automatic differentiation techniques, such as the reverse-mode algorithm
  implemented in `MOI.Nonlinear.ReverseAD`.
* However, most large-scale nonlinear programs have a lot of structure.
* Gravity asks the user to provide structure in the form of
  _template constraints_, where the user gives the symbolic form of the
  constraint as well as a set of data to convert from a symbolic form to the
  numerical form.
* Instead of differentiating each constraint in its numerical form, we can
  compute one symbolic derivative of the constraint in symbolic form, and then
  plug in the data to get the numerical derivative of each function.
* As a final step, if users don't provide the structure, we can still infer it,
  perhaps with less accuracy, by comparing the expression tree of each
  constraint.

The symbolic differentiation approach of Gravity works well when the problem is
large with few unique constraints. For example, a model like:
```julia
model = Model()
@variable(model, 0 <= x[1:10_000] <= 1)
@constraint(model, [i=1:10_000], sin(x[i]) <= 1)
@objective(model, Max, sum(x))
```
is ideal, because although the Jacobian matrix has 10,000 rows, we can compute
the derivative of `sin(x[i])` as `cos(x[i])`, and then fill in the Jacobian by
evaluating the derivative function instead of having to differentiate 10,000
expressions.
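
As a sketch of this idea using the API described below (this is illustrative,
not how the solver integration is implemented internally), we can differentiate
the template `sin(x)` once; filling in the Jacobian then reduces to evaluating
`cos` at each of the 10,000 points:

```julia
import MathOptInterface as MOI

x = MOI.VariableIndex(1)
# The shared symbolic template for every constraint `sin(x[i]) <= 1`:
f = MOI.ScalarNonlinearFunction(:sin, Any[x])
# One symbolic differentiation, reused for all 10,000 rows:
df_dx = MOI.Nonlinear.SymbolicAD.derivative(f, x)  # cos(x)
```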

The symbolic differentiation approach of Gravity works poorly if there are a
large number of unique constraints in the model (which would require a lot of
expressions to be symbolically differentiated), or if the nonlinear functions
contain a large number of nonlinear terms (which would make the symbolic
derivative expensive to compute).

`SymbolicAD` started life as [MathOptSymbolicAD.jl](https://github.com/lanl-ansi/MathOptSymbolicAD.jl),
development of which began in early 2022. This initial version of `SymbolicAD`
used the `Symbolics.jl` package to compute the symbolic derivatives. In 2025, we
rewrote MathOptSymbolicAD.jl to remove the dependence on `Symbolics.jl`, and,
since the rewrite depended only on MathOptInterface, we added it to
`MOI.Nonlinear` as a new submodule.

For more details on `MathOptSymbolicAD.jl`, see Oscar's [JuMP-dev 2022 talk](https://www.youtube.com/watch?v=d_X3gj3Iz-k),
although note that the syntax has changed since the original recording.

## Use SymbolicAD with JuMP

To use `SymbolicAD` with JuMP, set the [`AutomaticDifferentiationBackend`](@ref)
attribute to [`Nonlinear.SymbolicMode`](@ref):

```julia
using JuMP, Ipopt
model = Model(Ipopt.Optimizer)
set_attribute(
    model,
    MOI.AutomaticDifferentiationBackend(),
    MOI.Nonlinear.SymbolicMode(),
)
@variable(model, x[1:2])
@objective(model, Min, (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2)
optimize!(model)
```

To revert to the default sparse reverse-mode algorithm, set the
[`AutomaticDifferentiationBackend`](@ref) attribute to
[`Nonlinear.SparseReverseMode`](@ref).
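
For example, a minimal snippet (reusing the `model` from the example above):

```julia
set_attribute(
    model,
    MOI.AutomaticDifferentiationBackend(),
    MOI.Nonlinear.SparseReverseMode(),
)
```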

## `simplify`

Use [`Nonlinear.SymbolicAD.simplify`](@ref) to simplify nonlinear expressions.
The simplification algorithm performs simple rewrites such as lifting nested
summations:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> f = MOI.ScalarNonlinearFunction(
           :+,
           Any[MOI.ScalarNonlinearFunction(:+, Any[1.0, x]), 2.0 * x + 3.0],
       )
+(+(1.0, MOI.VariableIndex(1)), 3.0 + 2.0 MOI.VariableIndex(1))

julia> MOI.Nonlinear.SymbolicAD.simplify(f)
4.0 + 3.0 MOI.VariableIndex(1)
```

and trivial identities such as ``x^1 = x``:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> f = MOI.ScalarNonlinearFunction(:^, Any[x, 1])
^(MOI.VariableIndex(1), (1))

julia> MOI.Nonlinear.SymbolicAD.simplify(f)
MOI.VariableIndex(1)
```

The list of rewrites that will be made is intentionally limited to keep the
codebase simple. `Nonlinear.SymbolicAD` is not a substitute for a Computer
Algebra System (CAS). For example, we do not detect the relationship
``\sin(x)^2 + \cos(x)^2 = 1``:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> sin_x = MOI.ScalarNonlinearFunction(:sin, Any[x])
sin(MOI.VariableIndex(1))

julia> cos_x = MOI.ScalarNonlinearFunction(:cos, Any[x])
cos(MOI.VariableIndex(1))

julia> f = MOI.ScalarNonlinearFunction(
           :+,
           Any[
               MOI.ScalarNonlinearFunction(:^, Any[sin_x, 2]),
               MOI.ScalarNonlinearFunction(:^, Any[cos_x, 2]),
           ],
       )
+(^(sin(MOI.VariableIndex(1)), (2)), ^(cos(MOI.VariableIndex(1)), (2)))

julia> MOI.Nonlinear.SymbolicAD.simplify(f)
+(^(sin(MOI.VariableIndex(1)), (2)), ^(cos(MOI.VariableIndex(1)), (2)))
```

In addition to [`Nonlinear.SymbolicAD.simplify`](@ref), there is an in-place
version [`Nonlinear.SymbolicAD.simplify!`](@ref) that may make changes to the
existing function.
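
As a sketch of the difference between the two (we assume here that both return
the simplified function, and that `simplify!` is free to mutate its argument):

```julia
import MathOptInterface as MOI

x = MOI.VariableIndex(1)
f = MOI.ScalarNonlinearFunction(
    :+,
    Any[MOI.ScalarNonlinearFunction(:+, Any[1.0, x]), 2.0],
)
# `simplify` returns a new, simplified function and does not modify `f`:
g = MOI.Nonlinear.SymbolicAD.simplify(f)
# `simplify!` may modify `f` in-place, so use the return value afterwards:
h = MOI.Nonlinear.SymbolicAD.simplify!(f)
```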

## `variables`

Use [`Nonlinear.SymbolicAD.variables`](@ref) to return a sorted list of the
variables that appear in the function:

```jldoctest
julia> x = MOI.VariableIndex.(1:3)
3-element Vector{MathOptInterface.VariableIndex}:
 MOI.VariableIndex(1)
 MOI.VariableIndex(2)
 MOI.VariableIndex(3)

julia> f = MOI.ScalarNonlinearFunction(:atan, Any[x[3], 2.0 * x[1]])
atan(MOI.VariableIndex(3), 0.0 + 2.0 MOI.VariableIndex(1))

julia> MOI.Nonlinear.SymbolicAD.variables(f)
2-element Vector{MathOptInterface.VariableIndex}:
 MOI.VariableIndex(1)
 MOI.VariableIndex(3)
```

## `derivative`

Use [`Nonlinear.SymbolicAD.derivative`](@ref) to compute the symbolic derivative
of a function with respect to a decision variable:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> f = MOI.ScalarNonlinearFunction(:sin, Any[x])
sin(MOI.VariableIndex(1))

julia> MOI.Nonlinear.SymbolicAD.derivative(f, x)
cos(MOI.VariableIndex(1))
```

Note that the resulting expression can often be simplified. Thus, in most cases
you should call [`Nonlinear.SymbolicAD.simplify`](@ref) on the expression before
using it in other places:

```jldoctest
julia> x = MOI.VariableIndex(1)
MOI.VariableIndex(1)

julia> f = MOI.ScalarNonlinearFunction(:sin, Any[x + 1.0])
sin(1.0 + 1.0 MOI.VariableIndex(1))

julia> df_dx = MOI.Nonlinear.SymbolicAD.derivative(f, x)
*(cos(1.0 + 1.0 MOI.VariableIndex(1)), 1.0)

julia> MOI.Nonlinear.SymbolicAD.simplify!(df_dx)
cos(1.0 + 1.0 MOI.VariableIndex(1))
```

## `gradient_and_hessian`

In some cases, you may want to compute the gradient and (sparse) Hessian matrix
of a function. One way to achieve this is by recursively calling
[`Nonlinear.SymbolicAD.derivative`](@ref) on the result of [`Nonlinear.SymbolicAD.derivative`](@ref)
for each variable in the list of [`Nonlinear.SymbolicAD.variables`](@ref). But,
to simplify the process, you should use [`Nonlinear.SymbolicAD.gradient_and_hessian`](@ref):

```jldoctest
julia> x = MOI.VariableIndex.(1:2);

julia> f = MOI.ScalarNonlinearFunction(:sin, Any[1.0 * x[1] + 2.0 * x[2]])
sin(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2))

julia> y, ∇f, H, ∇²f = MOI.Nonlinear.SymbolicAD.gradient_and_hessian(f);

julia> y
2-element Vector{MathOptInterface.VariableIndex}:
 MOI.VariableIndex(1)
 MOI.VariableIndex(2)

julia> ∇f
2-element Vector{Any}:
 cos(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2))
 *(cos(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2)), 2.0)

julia> H
3-element Vector{Tuple{Int64, Int64}}:
 (1, 1)
 (1, 2)
 (2, 2)

julia> ∇²f
3-element Vector{Any}:
 -(sin(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2)))
 *(-(sin(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2))), 2.0)
 *(-(sin(0.0 + 1.0 MOI.VariableIndex(1) + 2.0 MOI.VariableIndex(2))), 4.0)
```

where:

* `y` is the list of variables that appear in `f`
* `∇f` is the vector of first partial derivatives of `f` with respect to each
  variable in `y`
* `H` and `∇²f` form a sparse Hessian matrix, where `H` is the `(row, column)`
  index of each element, and the corresponding element in `∇²f` is the second
  partial derivative of `f` with respect to `y[row]` and `y[column]`.

Unlike [`Nonlinear.SymbolicAD.derivative`](@ref), the gradient and Hessian
expressions have already been simplified; you do not need to call
[`Nonlinear.SymbolicAD.simplify`](@ref).
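
As a final hedged sketch, the `(row, column)` pairs in `H` can be used to
assemble a numeric sparse matrix once each expression in `∇²f` has been
evaluated at a point. The values in `v` below are hypothetical stand-ins for
those evaluations, and we store only the one triangle that `H` indexes:

```julia
import SparseArrays

v = [-0.14, -0.28, -0.56]   # hypothetical values of ∇²f at some point
rows = [r for (r, _) in H]  # [1, 1, 2]
cols = [c for (_, c) in H]  # [1, 2, 2]
n = length(y)
# One triangle of the symmetric Hessian as an n-by-n sparse matrix:
hessian = SparseArrays.sparse(rows, cols, v, n, n)
```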

docs/src/submodules/Nonlinear/reference.md
+11

@@ -72,6 +72,7 @@ Nonlinear.Evaluator
 Nonlinear.AbstractAutomaticDifferentiation
 Nonlinear.ExprGraphOnly
 Nonlinear.SparseReverseMode
+Nonlinear.SymbolicMode
 ```
 
 ## Data-structure
@@ -86,3 +87,13 @@ Nonlinear.parse_expression
 Nonlinear.convert_to_expr
 Nonlinear.ordinal_index
 ```
+
+## SymbolicAD
+
+```@docs
+Nonlinear.SymbolicAD.simplify
+Nonlinear.SymbolicAD.simplify!
+Nonlinear.SymbolicAD.variables
+Nonlinear.SymbolicAD.derivative
+Nonlinear.SymbolicAD.gradient_and_hessian
+```

docs/styles/config/vocabularies/JuMP/accept.txt
+2

@@ -47,6 +47,8 @@ Clp
 Goldfarb
 Gurobi
 GZip
+Hassan
+Hijazi
 Holy
 JSONSchema
 MOI

src/Nonlinear/Nonlinear.jl
+1

@@ -48,5 +48,6 @@ include("model.jl")
 include("evaluator.jl")
 
 include("ReverseAD/ReverseAD.jl")
+include("SymbolicAD/SymbolicAD.jl")
 
 end # module
