Direct collocation is now widely used for solving biomechanical optimal control problems. For example, OpenSim now includes Moco (Dembia et al. 2020) which allows users to find optimal control solutions with lower overhead. One advantage of direct collocation is the speed at which solutions can be found, which can be orders of magnitude faster than methods which require sequentially integrating the differential equations. Direct collocation methods can still result in overall slower speed when searching for global minima and choices in the software design can reduce computational efficiency. For example, Moco relies on finite differences for gradients which may reduce performance and accuracy.
Solving an optimal control problem resulting from direct collocation requires evaluating the system’s equations of motion, its Jacobian, and possibly its Hessian thousands to millions of times, but this evaluation is parallelizable over multiple compute cores and can leverage the massive sparsity of the equations. Maximizing the compute and memory performance of the functions that evaluate the constraints and their derivatives allow the subsequent optimization algorithms to solve very large problems, both in simulation duration and number of degrees of freedom.
Early dynamics simulation software used computer-aided algebra techniques, which allowed for analytical derivatives when forming equations of motion, but this has faded from popularity with the growth of numerical physics engines. Code generated from computer-aided algebra can be optimized for evaluation performance using both compiler pre-optimizations and built-in compiler optimizations that can be harder to apply to numeric physics engines.
We will demonstrate software that leverages symbolic formulations to generate computationally efficient parallelized functions based on implicit dynamics to evaluate the non-linear programming problem’s constraints and its Jacobian based on the ideas of (van den Bogert, Blana, and Heinrich 2011). We demonstrate solving a 3D arm-muscle driven bicycle-rider model that has three degrees of freedom and seven algebraic constraints resulting in millions of floating point operations in the implicit equations of motion.
| Symbolic | Symbolic | Constraint | Jacobian | Symbolic | OCP | NLP | Objective | Gradient | Constraint | Jacobian |
| EoM [s] | Jacobian [s] | Clang [s] | Clang [s] | OCP [s] | Solve [s] | iterations | evaluations | evaluations | evaluations | evaluations |
| 14.9 | 35.4 | 58.3 | 66.7 | 169.8 | 73.0 | 286 | 1098 | 286 | 1098 | 292 |
The non-linear equations of motion for musculoskeletal-vehicle systems can be described by differential algebraic equations that are functions of the states \(\mathbf{y} \in \mathbb{R}^n\), controls \(\mathbf{r}\), and constant parameters \(\mathbf{p}\), in general. The implicit equations take this form: \[\begin{aligned} \mathbf{f}(\dot{\mathbf{y}}(t), \mathbf{y}(t), \mathbf{r}(t), \mathbf{p}) = \mathbf{0} \in \mathbb{R}^n \end{aligned}\] These equations can contain elements such as musculotendon force-velocity relationships, kinematic loops, friction and collision forces, etc. We write these equations as analytically differentiable functions. To map these to non-linear programming constraints we discretize the equations with backward Euler differentiation over a constant time step and code generate functions in C using opty (Moore and van den Bogert 2018) that evaluate the constraints and Jacobian over all \(N\) nodes while exploiting the high sparsity of the Jacobian. The symbolic Jacobian is calculated in forward mode applying the chain rule through all common sub-expressions which efficiently handles differentiating equations of motion with many mathematical operations. This results in cacheable computationally efficient numerical functions that evaluate the non-linear programming problem constraints and its Jacobian. Evaluation of the objective is generally a low computational cost, so this is only done in Python, but also with a symbolic-to-numeric conversion.
To demonstrate our methods, we developed a bicycle-rider model that is the non-linear Carvallo-Whipple model of the vehicle with four additional rigid bodies representing the upper and lower arm body segments using SymBRiM (Stienstra, Brockie, and Moore 2023). We include four lumped Hill-type muscle models representing the extensor and flexor groups for the elbow. The model has seven holonomic constraints and four nonholonomic constraints resulting in a 10th order model with seven additional algebraic equations. This results in 17 equations with 2.8M floating point operations in the implicit symbolic equations of motion. After efficient Jacobian formulation and compiler pre-optimizations, this reduces to 10K and 69K operations for evaluating the constraints and Jacobian, respectively. This numerical evaluation time depends on the number of nodes, so proportional to \(N\times\)79K. For \(N=\) 10K we time the evaluation of these two functions with and without parallelization using OpenMP 4.5 seeing a 4.6\(\times\) and 3.1\(\times\) speed up over 23 ms and 190 ms, respectively. Lastly, we define a minimal muscle excitation lane change tracking task, starting from zero speed up to a mean speed of 3 m/s over the fixed duration of 6 seconds, as an optimal control problem.
Table [tab:performance] shows timings for solving a 200 node lane change problem. Building the model and forming the OCP takes 185 seconds with most time spent compiling the generated functions with Clang. This is a fixed cost regardless of the size of \(N\) and can be cached to disk and need not be built again unless the dynamics model changes. Ipopt solves the problem in 73 seconds. Fig. 1 shows an example optimal solution.
Our methods focus on maximizing the performance of evaluating the constraint and Jacobian numerical functions. The performance of these functions scales linearly with the number of collocation nodes and can be sped up multiplicatively by the number of compute cores available in the best-case scenario. When Ipopt needs to evaluate these functions thousands or millions of times, for example in a long-duration problem, the overall compute time can be greatly reduced.
P. Stahlecker contributed to opty; funds were from CZI.