Architecture¶
Developer-oriented view of the crate. Read concepts
first for the abstractions this chapter assumes.
This page covers four things, in order:
- Module map — where everything lives
- Data flow — how values travel from user input to packed frame
- Algorithms — pseudo-code for the three nested loops
- Hot path — what one objective evaluation does
- Invariants and conventions — load-bearing rules
Module map¶
src/
├── lib.rs public re-exports + rustdoc chapters
├── packer.rs Molpack builder + pack() driver + phase / iteration loops
├── target.rs Target — molecule type + per-molecule restraints
├── restraint.rs Restraint trait + 14 concrete *Restraint structs
├── region.rs Region trait + And/Or/Not + RegionRestraint
├── relaxer.rs Relaxer / RelaxerRunner + TorsionMcRelaxer
├── handler.rs Handler trait + 4 built-in observers
├── objective.rs compute_f / compute_g / compute_fg + Objective impl
├── context/ PackContext = single owner of mutable packing state
│ ├── pack_context.rs
│ ├── model.rs immutable topology + inputs
│ ├── state.rs mutable per-iteration state
│ └── work_buffers.rs scratch arrays (xcart, gxcar, …)
├── constraints/ EvalMode / EvalOutput facade
├── gencan/ bound-constrained quasi-Newton optimizer
│ ├── mod.rs pgencan / gencan / tn_linesearch
│ ├── cg.rs conjugate-gradient inner solve
│ └── spg.rs spectral projected gradient fallback
├── initial.rs initial random placement + restmol pre-fit
├── movebad.rs worst-molecule perturbation heuristic
├── euler.rs Euler angles ↔ rotation matrices
├── cell.rs cell-list neighbor lookup
├── frame.rs PackContext ↔ molrs::Frame conversions
├── validation.rs post-pack correctness check
├── script/ .inp parser + lowering to Targets
├── api/ builder facade re-exports
└── bin/molpack/ CLI front-end (cli feature)
Dependency direction¶
lib.rs
│
packer.rs (driver — depends on everything below)
│
┌──────────┬──────┴──────┬──────────┬──────────┐
▼ ▼ ▼ ▼ ▼
target initial gencan movebad handler
│ │ │ │
│ └────────┐ │ │
▼ ▼ ▼ ▼
restraint + region context/PackContext
│
▼
objective.rs ← hot path
│
└── constraints/ (EvalMode facade)
target / restraint / region are pure data — no driver imports.
packer is the only module that imports everything else. objective
is the narrow waist through which all per-atom work flows.
Data flow¶
USER INPUTS ─→ Target / Molpack builders
Frame, count, restraints,
handlers, tolerance, seed
─→ pack() entry
a. broadcast global → per-target restraints
b. snapshot every Target
c. build PackContext
ModelData (immutable topology)
RuntimeState (x, coor, radius)
WorkBuffers (xcart, gxcar, scratch)
d. flatten restraints → CSR pool
e. initial placement → x[0..6·ntotmol]
PER-ITERATION ─→ evaluate(x, mode, &mut g)
GENCAN reads x, → expand_molecules: x → xcart
reads f / g via → restraint penalties per atom
&mut dyn Objective → cell list + pair penalties
→ project gradient back: gxcar → g
returns f_total, fdist, frest
OUTPUT ─→ Frame
pack_with_report() also exposes
converged, fdist, frest
Three rules govern this flow:
PackContextowns mutable state. GENCAN, movebad, handlers, and the phase driver all take&mut PackContext(writers) or&PackContext(observers). No other module owns mutable state across iterations.Arc<dyn Restraint>for polymorphic storage. Cheap clone (refcount bump) into the per-atom CSR pool. The hot path does one virtual call per restraint per atom.- GENCAN is decoupled.
gencan/pgencantakes&mut dyn Objective, not&mut PackContext. Synthetic objectives (Rosenbrock, Booth, Beale) exercise the optimizer in isolation.
Coordinate layout¶
The optimizer variable vector x packs centers of mass and Euler angles:
Cartesian atom positions xcart: Vec<[F; 3]> of length ntotat are
expanded each evaluation:
where i is molecule type, m is copy index, a is atom index.
Algorithms¶
Three nested loops drive the packer.
Outer: pack() (one call)¶
fn pack(targets, max_loops):
validate inputs (non-empty, valid PBC, atoms > 0)
broadcast Molpack.global_restraints → each target's molecule_restraints
split targets into free / fixed
build PackContext
run init_passes of restmol(): // geometric pre-fit, no pair kernel
for each free target type:
place molecules randomly inside their restraints
relax restraint penalties only
handlers.on_start, handlers.on_initialized
for phase in 0 ..= ntype:
if phase < ntype:
comptype[i] := (i == phase) // PER-TYPE pre-compaction
else:
comptype[i] := true // ALL-TYPES main phase
report := run_phase(phase, max_loops, …)
if report.error_phase: break
handlers.on_finish
build Frame; pack_with_report() also returns converged/fdist/frest
Why per-type pre-compaction first: if every type optimizes simultaneously from a random start, cross-type interference traps the solver in shallow minima. Compacting one type at a time inside its own restraint volume gives the all-types phase a much better seed.
Middle: run_phase (one phase)¶
fn run_phase(phase_id, max_loops):
handlers.on_phase_start(phase_info)
radscale := discale // start with inflated radii (default 1.1)
relax_runners := build relaxer runners for this phase
// Quick-exit: if the unscaled objective is already below precision,
// skip the whole phase.
if evaluate_unscaled(sys, x).below(precision): return Converged
for loop_idx in 0 .. max_loops:
result := run_iteration(loop_idx, radscale, relax_runners)
radscale := decay(radscale) // → 1.0 over the phase
handlers.on_step(step_info, sys)
if result.converged: return Converged
if handlers.should_stop(): return EarlyStop
return MaxLoops
radscale starts at discale (1.1) and decays toward 1.0 over the
phase. This soft-starts the pair penalty: the optimizer first sees
slightly oversized atoms (easier to push apart) and tightens to true
tolerance as the phase progresses.
Inner: run_iteration (one outer step)¶
fn run_iteration(loop_idx, radscale, runners):
// 1. Movebad — relocate the K worst molecules.
if movebad enabled:
identify atoms with largest restraint + pair penalty
perturb their COM/Euler within init_box_half_size
// 2. Relaxers — update reference geometry per type (count == 1 only).
for (type, runner) in runners:
runner.on_iter(ref_coords, f_current, &mut evaluate, rng)
if accepted: write back new ref_coords
// 3. GENCAN — bound-constrained quasi-Newton solve.
pgencan(x, &mut sys, params, precision)
// Internally: tn_linesearch → CG inner solve → SPG fallback,
// each step calls sys.evaluate(x, mode, g).
// 4. Convergence check on the unscaled objective.
f_unscaled := evaluate_unscaled(sys, x)
fimp := percentage improvement vs previous loop
converged := fdist < precision AND frest < precision
return { converged, fimp, fdist, frest }
GENCAN itself runs three nested solvers:
pgencan: project x onto bounds, then call gencan
gencan: truncated-Newton outer; calls tn_linesearch
tn_ls: conjugate-gradient line search; SPG fallback if CG stalls
Each leaf step calls sys.evaluate(x, mode, &mut g) — the hot path.
Hot path: objective evaluation¶
PackContext::evaluate is invoked O(10³–10⁴) times per pack() run.
Performance lives here.
evaluate(x, mode, g) dispatches by mode:
FOnly → compute_f
GradientOnly → compute_g
FAndGradient → compute_fg
RestMol → compute_fg (init phase, pair kernel skipped)
compute_fg is the canonical path — it does five steps:
1. expand_molecules(x):
for each molecule type t, copy m, atom a:
xcart[icart] := com_t,m + R(eul_t,m) · ref_coords[t, a]
2. accumulate_constraint_value_and_gradient (per atom icart):
range := iratom_offsets[icart] .. iratom_offsets[icart + 1]
for &irest in iratom_data[range]:
f += sys.restraints[irest].fg(xcart[icart], scale, scale2,
&mut grad_xcart[icart])
// Linear penalties consume `scale`; quadratic consume `scale2`.
3. insert_atom_in_cell (per atom):
linked-list bucket atoms into cells
cell side ≈ 2 × max_radius × radscale
4. accumulate_pair_fg (or _parallel under rayon):
for each non-empty cell c:
for each neighbor cell c′ in 13-cell stencil:
for each (i ∈ c, j ∈ c′):
d := pbc_distance(xi, xj)
σ := (rᵢ + rⱼ) · radscale
if d < σ:
penalty := (σ − d)²
grad_xcart[i] += d penalty / d xi
grad_xcart[j] += d penalty / d xj
5. project_cartesian_gradient:
for each molecule m, atom a:
g_com[m] += grad_xcart[icart]
g_euler[m] += Jᵀ(eul_m, ref_a) · grad_xcart[icart]
// J = ∂xcart/∂eul, derived once per molecule from R(eul).
Cost breakdown: steps 1–3 are O(N_atoms); step 4 is
O(N_atoms × neighbor_avg) ≈ O(N_atoms × 32) and dominates wall time on
realistic workloads. Step 4 is the rayon parallelization point
(accumulate_pair_fg_parallel), reducing into per-atom gradient slots
via AtomicU64 (since Cell<f64> is not Sync).
The Arc<dyn Restraint> virtual call in step 2 measured at +0.22% e2e
versus the prior monomorphic dispatch — a negligible cost for the
flexibility of user-defined restraints.
Invariants and conventions¶
Gradient accumulation. Restraint::fg accumulates the true
gradient (∂penalty/∂x) into g with +=. Optimizer negates for descent.
Multiple restraints may touch one atom, so never overwrite.
Two-scale contract. Linear penalties (Packmol kinds ⅔/6/7/10/11)
consume scale; quadratic penalties (kinds ⅘/8/9/12/13/14/15) consume
scale2. Each impl Restraint picks one internally.
Rotation convention. R_new = δR · R_old (LEFT multiplication).
Single-atom tests cannot detect LEFT/RIGHT bugs — always test with
≥ 2 atoms.
Coordinate layout. GENCAN's x is [com₀..n, eul₀..n] of length
6·ntotmol. Cartesian atom positions xcart are Vec<[F; 3]> of length
ntotat.
Thread safety. All trait objects are Send + Sync. Interior
mutability inside parallel reductions uses AtomicU64 with
f64::to_bits / f64::from_bits — Cell<f64> is not Sync.
Scope equivalence.
There is no separate global-restraint storage path. The broadcast at
pack() entry is the implementation.
Restraint vs Constraint. Packmol implements all 15 "constraints" as
soft penalties. Naming reflects mechanism, not user intent → Restraint.
Direction-3 extension pattern. Every extension trait follows the
same shape: public trait, N concrete pub struct impls, user types
impl Trait identically. No Builtin* / Native* wrappers in the
public API.
init1 short-circuit. Set during the initial geometric pre-fit.
Skips the pair kernel — the restraint-only objective is enough to get
atoms into their regions before pair conflicts matter.
Cheatsheet¶
| Question | Where to look |
|---|---|
| How is one restraint's penalty computed for one atom? | restraint.rs::*::f / *::fg |
Where does with_global_restraint broadcast? |
packer.rs::pack (top of fn) |
| Where is the per-atom CSR pool built? | packer.rs::pack (CSR build loop) |
How are x ↔ Cartesian coords expanded? |
objective.rs::expand_molecules, euler.rs::eulerrmat |
| Where is the pair-overlap kernel? | objective.rs::accumulate_pair_fg_parallel |
| What does the initial pre-fit do? | initial.rs::initial, initial.rs::restmol |
| How is precision-based termination tested? | gencan/mod.rs::packmolprecision |
What does movebad do? |
movebad.rs::movebad |
| How is torsion MC wired in? | relaxer.rs::TorsionMcRelaxer::on_iter |
| Where does periodic boundary wrap apply? | context/pack_context.rs::pbc_distance |