Dynamical models

Aaron Meyer (based on slides from Bruce Tidor)

Ordinary Differential Equations

  • ODE models are typically most useful when we already have an idea of the system components
    • As opposed to data-driven approaches, which apply when we don’t know how to connect the data
    • Incredibly powerful for making specific predictions about how a system works
  • Limits of these approaches:
    • Results can be extremely sensitive to missing components or model errors
    • Can quickly explode in complexity
    • May rely on variables that are impossible to measure

Applications of ODE models

Molecular kinetics

Remember BE 110!

Let’s say we have two ligands that dimerize, then this dimer binds to a receptor as one unit:

\[ L_f + L_f \leftrightarrow L_D \]

\[ L_D + R_f \leftrightarrow R_b \]

If we want to know about how these species interact, we can model their behavior with the rate equations:

BOARD
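A sketch of what the board derivation might look like, assuming mass-action kinetics with forward/reverse rate constants \(k_1, k_{-1}\) for dimerization and \(k_2, k_{-2}\) for receptor binding (the symbols here are illustrative and may differ from the board):

\[\frac{d[L_f]}{dt} = -2 k_1 [L_f]^2 + 2 k_{-1} [L_D]\]

\[\frac{d[L_D]}{dt} = k_1 [L_f]^2 - k_{-1} [L_D] - k_2 [L_D][R_f] + k_{-2} [R_b]\]

\[\frac{d[R_f]}{dt} = -k_2 [L_D][R_f] + k_{-2} [R_b]\]

\[\frac{d[R_b]}{dt} = k_2 [L_D][R_f] - k_{-2} [R_b]\]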

Pharmacokinetics

flowchart LR
    drugInjection("drug injection") --> C1
    
    subgraph CentralCompartment["central compartment"]
        C1["C<sub>1</sub><br/>V<sub>1</sub>"]
    end
    
    subgraph PeripheralCompartment["peripheral compartment"]
        C2["C<sub>2</sub><br/>V<sub>2</sub>"]
    end
    
    C1 -->|k1| C2
    C2 -->|k2| C1
    C1 -->|ke| Elimination


Applications of ODE models: Pharmacokinetics

  • Central compartment corresponds to the plasma in the body.
    • \(V_1\) is the distribution volume of plasma in the body.
    • \(C_1\) is the concentration of drug in the plasma.
  • Peripheral compartment represents a group of organs that significantly take up the particular drug.
    • \(V_2\) is the volume of this group of organs.
    • \(C_2\) is the concentration of drug in the group of organs.

Applications of ODE models: Pharmacokinetics

  • \(k_e\) is the rate constant for clearance.
    • \(k_e C_1 V_1\) is the mass of drug/time that’s cleared.
  • \(k_1\) is the rate constant for mass transfer from the central to peripheral compartment.
    • \(k_1 C_1 V_1\) is the mass of drug/time that transfers from the central to peripheral compartment.
  • \(k_2\) is the rate constant for mass transfer from the peripheral to central compartment.
    • \(k_2 C_2 V_2\) is the mass of drug/time that transfers from the peripheral to the central compartment.

Applications of ODE models: Pharmacokinetics

  • Have a bolus i.v. injection
    • No drug in either compartment for \(t<0\)
    • Total dose \(D\) (μg) administered at once at \(t=0\)
    • Drug distribution occurs instantaneously in the central compartment.
      • Also get well-mixed instantaneously.
      • Concentration in central compartment at \(t=0\) is \(C_1(0) = D/V_1\) μg/mL
  • No chemical reactions in the compartment
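The bolus setup above can be simulated directly with `solve_ivp`; a minimal sketch, where all parameter values and variable names are hypothetical choices for illustration:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical parameter values, for illustration only
V1, V2 = 3000.0, 7000.0      # distribution volumes (mL)
k1, k2, ke = 0.5, 0.3, 0.2   # rate constants (1/hr)
D = 500.0                    # bolus dose (ug)

def two_compartment(t, C):
    """Mass balance on each compartment, divided by its volume to give dC/dt."""
    C1, C2 = C
    dC1 = (-(k1 + ke) * C1 * V1 + k2 * C2 * V2) / V1
    dC2 = (k1 * C1 * V1 - k2 * C2 * V2) / V2
    return [dC1, dC2]

# C1(0) = D/V1 from the instantaneous, well-mixed bolus; C2(0) = 0
sol = solve_ivp(two_compartment, (0.0, 24.0), [D / V1, 0.0],
                t_eval=np.linspace(0.0, 24.0, 100))
```

A useful sanity check: total drug mass \(C_1 V_1 + C_2 V_2\) can only decrease over time, via the \(k_e\) term.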

Applications of ODE models: Population kinetics

Lotka-Volterra Equations

  • Prey population \(x\), predator population \(y\):

\[\dot{x} = \alpha x - \beta x y\]

\[\dot{y} = \delta x y - \gamma y\]

  • \(\alpha\): prey birth rate; \(\beta\): predation rate per encounter
  • \(\gamma\): predator death rate; \(\delta\): predator growth per prey consumed
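These equations can be integrated numerically in a few lines; a sketch with illustrative parameter values:

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha, beta, delta, gamma = 1.0, 0.1, 0.075, 1.5  # illustrative rates

def lotka_volterra(t, z):
    x, y = z  # prey, predator
    return [alpha * x - beta * x * y,
            delta * x * y - gamma * y]

sol = solve_ivp(lotka_volterra, (0.0, 30.0), [10.0, 5.0],
                t_eval=np.linspace(0.0, 30.0, 1000), rtol=1e-8)
```

The system conserves \(V = \delta x - \gamma \ln x + \beta y - \alpha \ln y\) along trajectories, so checking its drift is a convenient test of integration accuracy.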

Types of Analysis

Note about difference from other models we’ve covered

  • ODE models can be part of inference techniques, just like the other models we’ve covered
  • But we often don’t have a symbolic expression for the answer
    • We have to simulate the model every time
    • We can only work with the input–output behavior of the black box
  • In this respect, what we do with ODE models will be very similar to what you could do with any computational simulation

Analytic vs Numerical Modeling

Analytic

  • Wider range of parameters
  • Avoid numerical problems
  • Physical intuition more direct
  • Often must simplify model

Numerical

  • Can handle complex models
  • Dependence on parameters & initial conditions
  • Physical insight may be difficult to extract
  • Convergence, numerical stability

Analytic vs Numerical Modeling

Reality often requires a combination of the two:

  • Use analytic treatment to study entire parameter space
  • Use numerical treatment to study interesting regions
  • Use both to handle complex behavior

Stability Analysis

  • Can solve for the steady states of a system: \[\frac{dF}{dt} = 0\]
  • These steady states can be either stable or unstable points
    • At a stable point, the slope of \(\frac{dF}{dt}\) with respect to \(F\) is negative
    • In the multivariate case, this means the eigenvalues of the Jacobian have negative real parts
  • Steady-state points aren’t necessarily realistic or feasible!
    • Non-negative least squares (NNLSQ) can solve for candidate points
    • Only simulating the system ensures they are accessible
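A minimal sketch of this workflow in one dimension, using logistic growth \(\frac{dF}{dt} = rF(1 - F/K)\) as a stand-in system:

```python
import numpy as np
from scipy.optimize import fsolve

r, K = 1.0, 10.0
f = lambda F: r * F * (1.0 - F / K)   # dF/dt for logistic growth

# Solve dF/dt = 0 numerically from a nearby guess
F_ss = fsolve(f, x0=8.0)[0]

# Stability: estimate the slope of dF/dt at the fixed point
h = 1e-6
slope = (f(F_ss + h) - f(F_ss - h)) / (2.0 * h)
stable = slope < 0   # negative slope => stable point
```

Here the nonzero fixed point \(F = K\) is stable (slope \(-r\)), while the origin is unstable.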

Linear vs. Nonlinear Systems

  • Linear models are easier to simulate and understand than non-linear
    • Linearity: If \(\mathbf{x}_1\) and \(\mathbf{x}_2\) are both solutions, then \(c_1 \mathbf{x}_1 + c_2 \mathbf{x}_2\) is also a solution
  • Linear systems tend to be separable (effective decoupling)
  • Non-linear systems exhibit interesting properties

Linearity & Separability

Linear separability and superposition principles in dynamical systems

Phase Portraits

\[ \dot{\vec{x}} = \vec{f}(\vec{x}) \]

Phase portrait showing trajectories in a dynamical system

Non-linear systems

  • No general analytic approach to finding trajectory
  • So, goal is to understand qualitative trajectory behavior

Features in Phase Portraits

Features of phase portraits including fixed points and closed orbits

Solving a Set of Equations for Phase Portrait

  • Numerical computation
    • i.e., Runge-Kutta integration
  • Qualitative
    • Sufficient for some purposes
  • Analytic/symbolic
    • Elegant, though not always tractable

Step 1: Find the fixed points

Identification of fixed points where derivatives are zero

Step 2: Determine stability of fixed points

  • If the system moves slightly away from a fixed point, will it return or move further away?
  • Equivalently: as time approaches infinity, does the system tend toward or away from a given fixed point?
  • Worked example, for the system \(\dot{x} = x + e^{-y}\), \(\dot{y} = -y\):
    • The \(y\) solution must be of the form \(y = y_0 e^{-t}\) (because \(\dot{y} = \frac{dy}{dt} = -y\))
    • So \(y \rightarrow 0\) as \(t \rightarrow \infty\)
  • Thus, \(\dot{x} = x + e^{-y}\) becomes \(\dot{x} \rightarrow x + 1\) for long times
    • This has exponentially growing solutions
    • Toward \(\infty\) for \(x > -1\) and \(-\infty\) for \(x < -1\)

Solution grows exponentially in at least one dimension, so it is unstable.

Step 3: Sketch nullclines

Nullclines are the sets of points for which \(\dot{x} = 0\) or \(\dot{y} = 0\), so flow is either horizontal or vertical.

Nullclines separating regions of horizontal and vertical flow

Step 4: Plot flow lines

Flow lines indicating system trajectories and saddle point behavior

Existence & Uniqueness

Consider a non-linear system \(\dot{\mathbf{x}} = f(\mathbf{x})\) with a given initial condition.

  • Existence and uniqueness of solution guaranteed if \(f\) is continuously differentiable
  • Corollary: Trajectories do not intersect, because if they did, then there would be two solutions for the same initial condition at the crossing point

Linearization About Fixed Points

Linearization of a non-linear system around a fixed point

Solving Linearized Systems

Eigenvalue analysis for solving linearized dynamical systems

Example

Example phase portrait of a linearized system

More Examples

Various types of fixed points in phase portraits

Classification of Fixed Points

Classification of fixed points based on eigenvalue properties

Eigenvalue Behavior Summary

| Eigenvalues | Fixed Point Type | Behavior |
|---|---|---|
| Real, both negative | Stable node | Trajectories converge to fixed point |
| Real, both positive | Unstable node | Trajectories diverge from fixed point |
| Real, opposite signs | Saddle point | Stable in one direction, unstable in another |
| Complex, negative real part | Stable spiral | Oscillatory decay toward fixed point |
| Complex, positive real part | Unstable spiral | Oscillatory divergence from fixed point |
| Purely imaginary | Center | Neutral closed orbits (undamped oscillation) |
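The table above translates directly into code; a sketch that classifies the fixed point of a 2×2 linear system from its eigenvalues:

```python
import numpy as np

def classify(J):
    """Classify the fixed point of x' = J x for a 2x2 matrix J."""
    ev = np.linalg.eigvals(np.asarray(J, dtype=float))
    re, im = ev.real, ev.imag
    if np.allclose(im, 0.0):          # real eigenvalues
        if (re < 0).all():
            return "stable node"
        if (re > 0).all():
            return "unstable node"
        return "saddle point"
    if np.allclose(re, 0.0):          # purely imaginary
        return "center"
    return "stable spiral" if (re < 0).all() else "unstable spiral"
```

For example, `classify([[-1, 0], [0, -2]])` returns `"stable node"` and `classify([[0, 1], [-1, 0]])` returns `"center"`.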

Relevance for Nonlinear Dynamics

  • So far, we have said that we can find the fixed points of a nonlinear system, linearize about each one, and characterize the local dynamics of the nonlinear model by the corresponding linear model.
  • Is this always true? Do the nonlinearities ever disturb this approach?
  • A theorem can be proven which states
    • That all the regions on the previous slide are “robust” (nodes, spirals, saddles) and correspond between linear and nonlinear models.
    • But that all the lines on the previous slide are “delicate” (centers, stars, degenerate nodes, non-isolated fixed points) and can have different behaviors in linear and non-linear models.

Bifurcations

  • The phase portraits we have been looking at describe the trajectory of the system for a given set of initial conditions, but with “fixed” parameters (the rate constants in the equations, for instance).
  • What we might like is a series of phase portraits corresponding to different sets of parameters.
  • Many will be qualitatively similar. The interesting ones will be where a small change of parameters creates a qualitative change in the phase portrait (bifurcations).
  • What we will find is that fixed points & closed orbits can be created/destroyed and stabilized/destabilized.

Saddle-Node Bifurcation

Diagram of a saddle-node bifurcation
  • Canonical example: \(\dot{x} = r + x^2\)
    • For \(r < 0\): two fixed points (one stable, one unstable)
    • At \(r = 0\): the two fixed points collide and annihilate
    • For \(r > 0\): no fixed points — system escapes to infinity

Hopf Bifurcation

  • A Hopf bifurcation occurs when a fixed point changes stability and a limit cycle (sustained oscillation) is born or destroyed
  • Eigenvalues of \(J\) cross the imaginary axis as a parameter varies
  • Common in biological oscillators: circadian rhythms, cell cycle checkpoints, neural firing

Example: Genetic Control Network

Genetic Control Network

The Griffith (1971) model of genetic control is a canonical example of a saddle-node bifurcation giving rise to bistability: two coexisting stable steady states separated by an unstable one. In the model:

  • x = protein concentration
  • y = mRNA concentration
Griffith (1971) genetic control network model diagram

Genetic Control Network

Biochemical version of a bistable switch:

  1. The only stable points are (a) no protein and no mRNA, or (b) a fixed nonzero composition
  2. If the degradation rates are too great, the only stable point is the origin
Phase portrait of a genetic control network acting as a bistable switch

Implementation

Testing

  • Many properties one can test
    • Mass balance
    • Changes upon parameter adjustment
  • Good to test these before and after integration
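For example, mass balance in a closed reversible reaction \(A \leftrightarrow B\) can be asserted after integration (a sketch; the system and rates here are illustrative):

```python
import numpy as np
from scipy.integrate import solve_ivp

def ab_model(t, y, kf, kr):
    """Closed reversible reaction A <-> B under mass-action kinetics."""
    A, B = y
    return [-kf * A + kr * B, kf * A - kr * B]

sol = solve_ivp(ab_model, (0.0, 10.0), [1.0, 0.0], args=(2.0, 1.0),
                t_eval=np.linspace(0.0, 10.0, 50))

# The reaction only interconverts A and B, so total mass must stay constant
total = sol.y.sum(axis=0)
assert np.allclose(total, total[0], atol=1e-6), "mass balance violated"
```

The same assertion run after a parameter adjustment (e.g., doubling `kf`) checks that the model still conserves mass while the equilibrium composition shifts.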

Python Packages

SciPy provides several interfaces for ODE solving: scipy.integrate.odeint, scipy.integrate.ode, and scipy.integrate.solve_ivp.

Notes:

  • All can solve stiff and non-stiff equations.
  • scipy.integrate.ode exposes multiple backends; use set_integrator to select (e.g., 'vode', 'lsoda').
  • solve_ivp uses the method argument instead (default 'RK45'; use 'Radau' or 'BDF' for stiff systems).

Code Example: Pendulum Oscillation

The second order differential equation for the angle \(\theta\) of a pendulum acted on by gravity with friction can be written:

\[\theta''(t) + b\,\theta'(t) + c\sin(\theta(t)) = 0\]

where \(b\) and \(c\) are positive constants, and a prime denotes a time derivative. To solve this equation with odeint, we must first convert it to a system of first order equations. By defining the angular velocity \(\omega(t) = \theta'(t)\), we obtain the system:

\[\theta'(t) = \omega(t)\]

\[\omega'(t) = -b\,\omega(t) - c\sin(\theta(t))\]

Pendulum Oscillation

Let y be the vector \([\theta, \omega]\). We implement this system in Python as:

import numpy as np

def pend(y, t, b, c):
    theta, omega = y
    dydt = [omega, -b*omega - c*np.sin(theta)]
    return dydt

We assume the constants are \(b = 0.25\) and \(c = 5.0\):

b, c = 0.25, 5.0

Pendulum Oscillation

For initial conditions, we assume the pendulum is nearly vertical with \(\theta(0) = \pi - 0.1\), and is initially at rest, so \(\omega(0) = 0\). Then the vector of initial conditions is

y0 = [np.pi - 0.1, 0.0]

We generate a solution at 101 evenly spaced samples in the interval \(0 \leq t \leq 10\). So our array of times is:

t = np.linspace(0, 10, 101)

Pendulum Oscillation

Call odeint to generate the solution. To pass the parameters b and c to pend, we give them to odeint using the args argument.

from scipy.integrate import odeint
sol = odeint(pend, y0, t, args=(b, c))

The solution is an array with shape (101, 2). The first column is \(\theta(t)\), and the second is \(\omega(t)\). The following code plots both components.

Pendulum Oscillation

import matplotlib.pyplot as plt
plt.plot(t, sol[:, 0], 'b', label='theta(t)')
plt.plot(t, sol[:, 1], 'g', label='omega(t)')
plt.legend(loc='best')
plt.xlabel('t')
plt.grid()
plt.show()

Pendulum Oscillation

Plot of pendulum angle and angular velocity over time

Analysis Example: The Cell Cycle

Consider a simple model of cells transitioning between two phases of the cell cycle:

graph LR
    G1[G1 Phase] -->|α| G2S[G2/S Phase]
    G2S -->|β| G1
    G1 -->|γ| Death[Death]
    G2S -->|γ| Death
    
    style G1 fill:#f9d5e5,stroke:#333,stroke-width:2px
    style G2S fill:#a1e7e6,stroke:#333,stroke-width:2px
    style Death fill:#f8b195,stroke:#333,stroke-width:2px

Can this model oscillate under physically meaningful situations and, if so, when?
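One way to investigate: write down the rate equations implied by the diagram (a sketch, assuming simple first-order transitions between the phases) and examine the eigenvalues of the system matrix.

```python
import numpy as np

# Assumed rate equations (first-order mass-action transitions):
#   dG1/dt  = -(alpha + gamma) * G1 + beta * G2S
#   dG2S/dt =  alpha * G1 - (beta + gamma) * G2S
def eigenvalues(alpha, beta, gamma):
    J = np.array([[-(alpha + gamma), beta],
                  [alpha, -(beta + gamma)]])
    return np.linalg.eigvals(J)

# The eigenvalues work out to -gamma and -(alpha + beta + gamma):
# real and negative for any positive rates, so under these assumptions
# the linear model cannot oscillate
rng = np.random.default_rng(0)
for a, b, g in rng.uniform(0.01, 10.0, size=(100, 3)):
    assert np.allclose(eigenvalues(a, b, g).imag, 0.0)
```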

Implementation Note: Stiff Systems

  • Very roughly, most ODE solvers take steps inversely proportional to the rate at which the state is changing
  • For systems where there are two processes operating on differing timescales, this can be problematic
    • If everything happens really fast, the system will come to equilibrium quickly
    • If everything is slow, you can take longer steps
  • A typical biological example: receptor–ligand binding equilibrates in milliseconds, while downstream transcriptional responses occur over hours — a \(10^6\)-fold difference in timescale
  • Stiff solvers additionally make use of the Jacobian matrix
    • This, very roughly, allows them to keep track of these differences in timescales
  • odeint can automatically find this for you
    • Sometimes it’s faster/better to provide this as parameter Dfun
  • For stiff problems with solve_ivp, prefer method='Radau' or method='BDF'

Fitting ODE Models to Data

  • ODE models have no closed-form solution in general — must simulate to evaluate the likelihood
  • Fitting is an outer optimization loop around numerical integration:
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import minimize

def residuals(params):
    # Simulate the model at the observation times, then score the fit
    sol = solve_ivp(model, t_span, y0, args=(params,), t_eval=t_obs)
    return np.sum((sol.y.T - y_obs)**2)  # sum of squared residuals

result = minimize(residuals, params_init)
  • Can constrain to oscillatory solutions by checking eigenvalues of \(J\) at steady state
    • Oscillation requires eigenvalues with nonzero imaginary part

Machine Learning as a Dynamical System

Optimization as a Discrete-Time ODE

  • Every iterative ML optimizer is a discrete-time dynamical system
    • State vector: model parameters \(\theta_t\) (and, for adaptive optimizers, auxiliary moment estimates)
    • Update rule: \(\theta_{t+1} = \theta_t - \eta \, g_t\) (analogue of an Euler integration step)
  • The learning rate \(\eta\) plays the role of the integration step size
    • Too small: slow convergence (fine-grained but inefficient)
    • Too large: oscillation or divergence — exactly as in stiff ODE solvers

Stability Analysis of Gradient Descent

For a quadratic loss \(L(\theta) = \frac{1}{2}\theta^T H \theta\), the gradient descent update is:

\[\theta_{t+1} = (I - \eta H)\,\theta_t\]

The discrete eigenvalues are \(\lambda_i = 1 - \eta h_i\), where \(h_i\) are eigenvalues of the Hessian \(H\).

| Eigenvalue \(\lambda_i\) | Fixed-Point Type | Behavior |
|---|---|---|
| \(0 < \lambda_i < 1\) | Stable node | Monotone convergence |
| \(-1 < \lambda_i < 0\) | Stable spiral | Oscillatory convergence |
| \(\lambda_i < -1\) | Unstable | Divergence |
| \(\lambda_i = 0\) | Dead-beat | Converges in one step |

Stability Analysis of Gradient Descent

Stability requires \(0 < \eta < \frac{2}{h_\text{max}}\) — controlled entirely by the largest Hessian eigenvalue.

Three regimes of gradient descent for different learning rates: monotone convergence, oscillatory convergence, and divergence
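The three regimes can be reproduced in a few lines; a sketch on a 1-D quadratic loss:

```python
import numpy as np

def run_gd(eta, h=1.0, theta0=1.0, steps=50):
    """Gradient descent on L = h*theta^2/2: theta <- (1 - eta*h) * theta."""
    theta, traj = theta0, [theta0]
    for _ in range(steps):
        theta = (1.0 - eta * h) * theta
        traj.append(theta)
    return np.array(traj)

monotone = run_gd(eta=0.5)    # lambda = 0.5: monotone convergence
oscillate = run_gd(eta=1.5)   # lambda = -0.5: oscillatory convergence
diverge = run_gd(eta=2.5)     # lambda = -1.5: divergence past 2/h
```

With \(h = 1\), the stability boundary \(\eta = 2/h_\text{max}\) sits at exactly 2, and the oscillatory trajectory alternates sign while still shrinking.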

The Condition Number Problem

  • Realistic loss surfaces have Hessian eigenvalues spanning many orders of magnitude (ill-conditioned)
  • For \(L(\theta) = \frac{a}{2}\theta_1^2 + \frac{b}{2}\theta_2^2\) with \(a \gg b\):
    • Stability requires \(\eta < 2/a\) — set by the stiff direction
    • The slow direction \(\theta_2\) then requires \(\sim \frac{a}{b}\) more steps to converge
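A two-parameter sketch of this: with curvatures \(a = 100\) and \(b = 1\), a step size chosen for the stiff direction leaves the slow direction crawling.

```python
import numpy as np

a, b = 100.0, 1.0
H = np.diag([a, b])        # Hessian of the quadratic loss
eta = 1.0 / a              # step size dictated by the stiff direction
theta = np.array([1.0, 1.0])
for _ in range(200):
    theta = theta - eta * (H @ theta)
# theta[0] (stiff direction) converges immediately (1 - eta*a = 0);
# theta[1] has only contracted by 0.99**200, roughly 0.13
```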

The Condition Number Problem

  • This is precisely the stiff ODE problem: fast and slow timescales coexist
    • Just as stiff ODE solvers need the Jacobian, adaptive optimizers track local curvature

Phase portraits of gradient descent on an ill-conditioned quadratic loss showing monotone, oscillatory, and diverging trajectories

Learning Rate as a Bifurcation Parameter

  • As \(\eta\) increases past \(1/h_\text{max}\), the fixed point changes from a stable node to a stable spiral — a bifurcation
  • Past \(2/h_\text{max}\): the fixed point becomes unstable — training diverges
  • This is structurally identical to the bifurcations we studied for ODE systems

Bifurcation diagram showing how final parameter values depend on learning rate, with stable, oscillatory, and divergent regimes

Adam as a Coupled Dynamical System

Adam maintains three state variables — exactly like an ODE system with three coupled equations:

\[m_t = \beta_1 m_{t-1} + (1-\beta_1)\,g_t \qquad \text{(1st moment — smoothed gradient)}\]

\[v_t = \beta_2 v_{t-1} + (1-\beta_2)\,g_t^2 \qquad \text{(2nd moment — smoothed curvature)}\]

\[\theta_t = \theta_{t-1} - \eta \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \varepsilon} \qquad \text{(parameter update)}\]

  • \(m_t\) and \(v_t\) are state variables with their own dynamics (exponential moving averages); \(\hat{m}_t = m_t/(1-\beta_1^t)\) and \(\hat{v}_t = v_t/(1-\beta_2^t)\) are their bias-corrected estimates
  • \(\beta_1, \beta_2\) are decay constants — analogous to rate constants in ODE models
  • The effective step size \(\eta / \sqrt{v_t}\) adapts to local curvature, making Adam more robust to ill-conditioning than SGD
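The update rules above translate directly into code; a sketch (not a production optimizer) minimizing \(L = \theta^2/2\), whose gradient is simply \(\theta\):

```python
import numpy as np

def adam_step(theta, m, v, g, t, eta=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: theta, m, v are the three coupled state variables."""
    m = b1 * m + (1 - b1) * g            # 1st moment (smoothed gradient)
    v = b2 * v + (1 - b2) * g ** 2       # 2nd moment (smoothed squared gradient)
    m_hat = m / (1 - b1 ** t)            # bias corrections
    v_hat = v / (1 - b2 ** t)
    theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    theta, m, v = adam_step(theta, m, v, g=theta, t=t)
```

Note the early iterations: far from the optimum the effective step is roughly \(\eta\) per iteration, independent of the gradient magnitude, because \(\hat{m}_t/\sqrt{\hat{v}_t} \approx \pm 1\).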

Adam as a Coupled Dynamical System

Adam optimizer state variables over training: parameter trajectory, first moment, second moment, and effective step size

Reading & Resources

Review Questions

  1. What are the steady-state solutions of an ODE model? How do you solve for them?
  2. What is a phase-space diagram?
  3. What property of ODE models ensures that solutions can never cross in phase space?
  4. What is a Jacobian matrix? How do you calculate it from an ODE system?
  5. What are the eigenvalues and eigenvectors of a matrix?
  6. What is the behavior of an ODE system if its eigenvalues are all positive and real? How about negative and real?
  7. What is the behavior of an ODE system if its eigenvalues are complex with positive real part? How about with negative real part?
  8. What does it mean if the eigenvalues of an ODE system give conflicting answers (e.g. one is real and negative, the other is imaginary)?
  9. How can you fit an ODE model to a series of measurements over time?
  10. How could you constrain an ODE system so that you only consider parameter values that have oscillation?
  11. Why does gradient descent on an ill-conditioned loss surface behave like a stiff ODE? What do adaptive optimizers like Adam do to address this?
  12. What is the analogue of a bifurcation in the context of optimizer dynamics? What happens at the stability boundary \(\eta = 2/h_\text{max}\)?