Automated Free Energy Simulations with the Bennett Acceptance Ratio (BAR) Method
autoBAR is a lightweight automation tool for running alchemical free energy simulations using Tinker and Tinker9. It supports the polarizable AMOEBA and AMOEBA+ force fields and uses the Bennett Acceptance Ratio for free energy estimation.
| Category | Requirements |
|---|---|
| Python (≥ 3.8) | numpy, scipy (only for parmOPT.py), ruamel.yaml (both < 0.18 and ≥ 0.18 are supported) |
| Simulation Software | Tinker (CPU) and Tinker9 (GPU) |
A conda environment file is provided:
conda env create -f environment.yml
conda activate autobarautoBAR/
├── autoBAR.py # Main driver script
├── environment.yml # Conda environment specification
├── tests/ # Test suite
├── dat/ # Default configuration files
│ ├── settings.yaml # Example settings template
│ ├── gas.key # Default gas-phase key file
│ ├── liquid.key # Default liquid-phase key file
│ ├── orderparams_courser # Coarser lambda schedule (18 windows)
│ ├── orderparams_default # Standard lambda schedule (26 windows)
│ └── tinker.env # Tinker executable paths
├── utils/ # Utility modules
│ ├── checkautobar.py # Progress monitoring for MD and BAR jobs
│ ├── elescale.py # Electrostatic parameter scaling
│ ├── parmOPT.py # Parameter optimization targeting experimental HFE and liquid density
│ └── submitTinker.py # Cluster job submission helper
└── examples/ # Ready-to-use example systems
├── Ion-HFE/ # Na⁺ hydration free energy
└── Phenol-HFE/ # Phenol hydration free energy
Edit dat/tinker.env to point to your local Tinker and Tinker9 installations:
export TINKER8=/path/to/tinker
export DYNAMIC8="$TINKER8/dynamic"
export ANALYZE8="$TINKER8/analyze"
export BAR8="$TINKER8/bar"
export MINIMIZE8="$TINKER8/minimize"
export tk9home=/path/to/tinker9/build
export DYNAMIC9="$tk9home/dynamic9"
export ANALYZE9="$tk9home/analyze9"
export BAR9="$tk9home/bar9"
export MINIMIZE9="$tk9home/minimize9"Place the following four files in your working directory:
| File | Description |
|---|---|
gas_xyz |
Ligand Tinker .xyz file |
box_xyz |
Solvated system Tinker .xyz file (box dimensions on line 2) |
parameters |
Tinker force field parameter file (.prm) |
settings.yaml |
Simulation settings — see the template |
Tip: Custom
.keyfiles are supported. Setliquid_keyand/orgas_keyinsettings.yamlto use your own key files instead of the defaults.
Key settings to configure (see dat/settings.yaml for the full reference):
# Lambda schedule: "courser" (18 windows) or "default" (26 windows)
# (the value is case-insensitive; "courser" is the literal key the code expects)
lambda_window: courser
# Input files
gas_xyz: ligand.xyz
box_xyz: solvated.xyz
parameters: forcefield.prm
# Liquid-phase MD settings
liquid_md_total_time: 1.25 # ns
liquid_md_time_step: 2.0 # fs (RESPA integrator)
liquid_md_ensemble: NPT
liquid_md_temperature: 300.0 # K
# Gas-phase MD settings (set gas_md_total_time to 0 to skip)
gas_md_total_time: 1.25 # ns
gas_md_time_step: 0.1 # fs (stochastic dynamics)
gas_md_temperature: 300.0 # K
# Cluster nodes for job distribution
node_list:
- node01
- node02
# One-step perturbation (only used when a "<parameters>_XX" file exists):
# YES → reuse the reference trajectory → true one-step FEP
# NO → re-run dynamics for a new state → BAR (see the section below)
copy_arc_for_perturb: YESautoBAR can be run in interactive or automated mode.
Run individual steps as needed:
# Step 1: Generate simulation input files
python autoBAR.py setup
# Step 2: Run MD simulations across all lambda windows
python autoBAR.py dynamic
# Step 3: Perform BAR free energy analysis
python autoBAR.py bar
# Step 4: Collect and summarize results
python autoBAR.py resultRun the entire workflow end-to-end:
python autoBAR.py autoIn automated mode, autoBAR will:
- Generate all input files (
setup) - Submit MD jobs and wait for completion (
dynamic) - Submit BAR analysis jobs and wait for completion (
bar) - Collect and print the final free energy (
result)
setup ──► dynamic ──► bar ──► result
│ │ │ │
│ │ │ └─ Parse .ene files, sum ΔG, write result.txt
│ │ └────────── Run BAR on trajectory pairs (arc0, arc1)
│ └──────────────────── Run MD at each λ window (liquid: GPU, gas: CPU)
└─────────────────────────────── Generate .key/.xyz per λ window, minimize
Only the liquid phase is needed (small ion → gas phase is skipped automatically):
examples/Ion-HFE/
├── Na.xyz # Gas-phase ion structure
├── Na-water.xyz # Solvated ion + water box
├── water03.prm # Force field parameters
├── settings.yaml # Simulation settings
└── result.txt # Reference output
Both gas and liquid phases are required:
examples/Phenol-HFE/
├── phenol.xyz # Gas-phase phenol structure
├── phenol_solv.xyz # Solvated phenol + water box
├── amoeba09.prm # AMOEBA force field parameters
├── settings.yaml # Simulation settings
└── result.txt # Reference output
autoBAR can evaluate the free-energy difference to one or more perturbed end states (e.g. small force-field parameter changes) without redefining the lambda schedule:
- Place one or more perturbed parameter files named
<parameters>_XX(whereXX=01to99) in the working directory next to your main parameter file. For example, ifparameters: forcefield.prm, the perturbed files areforcefield.prm_01,forcefield.prm_02, … - Each perturbed file defines a new end state, appended to the lambda schedule as an extra window.
- No other change to
settings.yamlis required — autoBAR detects the files automatically. - Results are reported as
FEP_001,FEP_002, … inresult.txt.
The estimator used for the perturbation depends on whether the reference and perturbed
states share a trajectory, which is controlled by the copy_arc_for_perturb setting:
copy_arc_for_perturb |
Trajectory used for the perturbed state | Estimator |
|---|---|---|
YES (default) |
Reuses the reference end-state trajectory (the same .arc is symlinked for both states) |
One-step FEP (Zwanzig exponential averaging over a single ensemble) |
NO |
Re-runs dynamics with the perturbed parameters to generate a separate .arc |
BAR (two independent trajectories, one per state) |
In other words, the result is true one-step FEP only when the same trajectory file is
reused (copy_arc_for_perturb: YES). When a fresh trajectory is simulated for the
perturbed state, the calculation is an ordinary two-state BAR — even though the column in
result.txt is still labeled FEP_XXX.
YES is much cheaper (no extra MD) and is appropriate for small parameter changes where the
two states overlap well; NO is more robust for larger perturbations. Both modes are useful
for sensitivity analysis and parameter optimization (see utils/parmOPT.py).
The utils/parmOPT.py utility optimizes force field parameters to simultaneously match experimental hydration free energy (HFE) and neat liquid density at one or more temperatures:
python utils/parmOPT.pyIt uses scipy.optimize.least_squares (TRF, soft-L1 loss) with a custom Jacobian. The HFE row is evaluated via autoBAR's one-step perturbation: each trial point writes a perturbed <parameters>_01 file and runs autoBAR.py auto, so with copy_arc_for_perturb: YES (the recommended setting here) the HFE change is obtained by one-step FEP over the reference trajectory. Each density row uses the fluctuation formula (Wang et al. 2013, Eq. 4) applied to the most recent per-temperature trajectory, with all $ANALYZE9 jobs submitted to the GPU cluster in parallel alongside the autoBAR HFE run.
Add these fields to settings.yaml (copy utils/settings.yaml as a starting point):
# --- parmOPT required ---
parameters: forcefield.prm # Tinker parameter file
expt_hfe: -4.75 # Experimental HFE (kcal/mol)
# Single-temperature density target (scalar form):
temperature: 298.15 # Simulation temperature (K)
expt_density: 997.0 # Experimental density in kg/m³
# NOTE: units are kg/m³ — water ≈ 997, ethanol ≈ 789
liquid_dir: neat_liquid # Directory for neat-liquid MD (must NOT be "liquid"
# — autoBAR already uses ./liquid/ for HFE windows)
production_time: 2.0 # Production simulation time (ns)
opt_params: "vdwpair-401-402 3.8 0.05" # Force field term + initial parameter values
params_range: "0.5 0.02" # Search range (±) per parameter; use 0 to fix
# --- Liquid MD settings (shared between HFE and neat-liquid simulations) ---
liquid_md_time_step: 2.0 # Integration timestep (fs)
liquid_md_write_freq: 0.1 # Trajectory output interval (ps)
liquid_md_pressure: 1.0 # Pressure (atm)
# --- parmOPT optional ---
hfe_weight: 1.0 # Relative importance weight for the HFE residual (default: 1.0)
density_weight: 1.0 # Relative importance weight for the density residual (default: 1.0)
hfe_denom: 2.24 # Scale denominator for HFE (kcal/mol).
# Default: sqrt(|expt_hfe|). Override to fix the normalization
# scale across multiple optimization runs.
density_denom: 31.6 # Scale denominator for density (kg/m³).
# Default: std-dev of expt_densities (multi-T) or
# sqrt(|expt_density|) (single-T). Override as needed.
liquid_base: neat_liq # Coordinate/trajectory base name inside liquid_dir
# (default: "neat_liq"); sets xyz/key/sh/arc/dyn names
liquid_key: neat_liq # Key file basename (default: same as liquid_base)
equil_time: 0.5 # Equilibration time (ns) — discarded before averaging (default: 0.02)
hfe_liquid_key: path/to/liquid.key # Override the HFE liquid key template used to auto-generate neat_liq.key
# (default: dat/liquid.key in the autoBAR repo)
checking_time: 60 # Polling interval (s) for MD and analyze job completion (default: 60)The optimizer minimizes:
cost = (hfe_weight × ΔHFE / hfe_denom)² + Σ (density_weight_i × Δρ_i / density_denom)²
There are two separate knobs:
| Knob | Purpose | Settings key |
|---|---|---|
| Denom | Normalizes the physical scale of each property so residuals are dimensionless | hfe_denom, density_denom |
| Weight | Controls the relative importance of HFE vs. density after normalization | hfe_weight, density_weight |
The denominator converts each raw residual (in its physical unit) into a dimensionless number. parmOPT computes sensible defaults automatically — the same convention used by ForceBalance:
| Property | Default Denom |
Rationale |
|---|---|---|
| HFE | sqrt(|expt_hfe|) |
e.g. expt_hfe = −5 kcal/mol → denom ≈ 2.24 kcal/mol |
| Density (single T) | sqrt(|expt_density|) |
e.g. expt_density = 997 kg/m³ → denom ≈ 31.6 kg/m³ |
| Density (multi T) | std_dev(expt_densities) |
spread of the experimental values across temperatures |
With these defaults a typical HFE error of ~1 kcal/mol and a typical density error of ~10 kg/m³ both produce a normalized residual of ~0.3–0.5, so hfe_weight = density_weight = 1.0 gives a balanced starting point without any manual tuning.
Override the defaults only when you want to lock the normalization scale across multiple runs (e.g. to make cost values comparable between different optimization attempts):
hfe_denom: 2.24 # kcal/mol — fix at sqrt(5) regardless of expt_hfe
density_denom: 31.6 # kg/m³ — fix at sqrt(997) regardless of expt_densityOnce the denominators have put both residuals on the same scale, the weights express how much you care about one property relative to the other. With the default denominators, hfe_weight = density_weight = 1.0 is a reasonable starting point for most organic solvents.
Adjust when the optimization consistently sacrifices one property for the other:
- If the optimizer fits HFE well but density drifts — increase
density_weight(or decreasehfe_weight). - If density is well reproduced but HFE error is large — increase
hfe_weight. - A factor-of-2 change in a weight shifts the cost ratio by 4×; start with small adjustments.
The WtNormRes column in parmOPT.log shows weight × Δ / denom for each property at every optimizer step. Watch this column to see which property is driving the cost.
Practical starting point (works for most solvents):
# Let denom defaults handle unit normalization; use equal weights.
hfe_weight: 1.0
density_weight: 1.0If you have a strong prior on acceptable errors you can also set weights as inverse acceptable errors after normalization. For example, if you want a 0.5 kcal/mol HFE error to cost the same as a 5 kg/m³ density error, and the defaults give hfe_denom ≈ 2.24 and density_denom ≈ 31.6:
# normalized acceptable errors
hfe_threshold = 0.5 / 2.24 ≈ 0.22
density_threshold = 5 / 31.6 ≈ 0.16
# set weights so both thresholds give cost contribution ≈ 1
hfe_weight = 1 / 0.22 ≈ 4.5 → round to 5.0
density_weight = 1 / 0.16 ≈ 6.3 → round to 6.0
Providing densities at multiple temperatures simultaneously constrains the parameter search and reduces overfitting.
Replace the scalar temperature / expt_density keys with their list counterparts:
# Replace these scalar keys:
# temperature: 298.15
# expt_density: 997.0
# With these list keys (lengths must match):
temperatures: [278.15, 298.15, 318.15] # temperatures in K
expt_densities: [999.9, 997.0, 989.3] # experimental densities in kg/m³ at each T
density_weights: [1.0, 1.0, 1.0] # optional per-T weights; replaces density_weightImportant:
expt_densitiesvalues must be in kg/m³, not g/cm³. Common reference values: water 278 K → 999.9, 298 K → 997.0, 318 K → 989.3 kg/m³.
Weight normalization: each density_weight is divided by the number of temperatures internally, so the total density contribution to the cost is the same whether you fit at one temperature or ten. The user-specified weights control the relative importance of each temperature; the aggregate HFE/density balance is unchanged. The effective weights are printed to parmOPT.log at startup.
autoBAR runs the HFE calculation once per optimizer call. Neat-liquid MD jobs are submitted to the GPU cluster in parallel with the HFE run: each temperature gets its own coordinate symlink, run script, and GPU card, so all temperatures run simultaneously. Before submitting, parmOPT checks the existing .arc file for each temperature — if it already contains the required number of frames (e.g., after a restart), the submission is skipped; if the run is partial and a .dyn checkpoint exists, a resume script is generated with only the remaining steps. parmOPT waits for the HFE job and all neat-liquid MD jobs to finish before moving to the next optimizer step.
For the Jacobian, parmOPT first strips the equilibration frames from each trajectory, writing a *-prod.arc file containing only production snapshots. It then writes one $ANALYZE9 run script per (parameter perturbation, temperature) pair, submits all scripts to the GPU cluster simultaneously via submitTinker.py, and waits for all output logs before assembling the density sensitivity rows of the Jacobian.
opt_params and params_range accept either a single string (one group) or a YAML list (multiple groups). Each list entry is one independent force-field term; all groups are optimized simultaneously.
opt_params:
- "vdw-401 3.80 0.050" # R and epsilon on atom type 401
- "vdw-402 3.60 0.060" # R and epsilon on atom type 402
- "chgpen-403 5.00 0.800" # two chgpen parameters on atom type 403
params_range:
- "0.30 0.02"
- "0.30 0.02"
- "0.50 0.05"The optimizer sees a single flat parameter vector formed by concatenating the free parameters from all groups; write_prm splits it back and writes one line per group.
Use 0 in params_range to hold a parameter constant. That parameter is excluded from the optimizer but still written to the parameter file at its initial value:
# Optimize R only; keep epsilon fixed at 0.050
opt_params: "vdw-401 3.80 0.050"
params_range: "0.30 0"This works for both the single-string and list forms:
opt_params:
- "vdw-401 3.80 0.050" # optimize R only
- "chgpen-403 5.00 0.800" # optimize both params
params_range:
- "0.30 0" # 0 → epsilon is fixed
- "0.50 0.05"Fixed parameters are annotated as (fixed) in each step's log line in parmOPT.log.
Important: set
liquid_dirto a name other thanliquid(e.g.,neat_liquid). autoBAR creates./liquid/and./gas/for the HFE alchemical windows; a conflictingliquid_dir: liquidwould mix those files with the pure-liquid MD trajectories.
The only file you need to supply in liquid_dir is the Tinker coordinate file.
The key and per-temperature shell scripts are auto-generated at startup:
| File | Source | Description |
|---|---|---|
neat_liq.xyz |
User-supplied | Simulation box Tinker coordinates |
neat_liq.key |
Auto-generated (if absent) | Derived from the HFE liquid key template by removing vdw-annihilate, vdw-lambda, ele-lambda, and ligand lines; shared by all temperatures |
neat_liq_{T}K.xyz |
Auto-generated (symlink) | Per-temperature symlink → neat_liq.xyz; Tinker writes neat_liq_{T}K.arc so parallel GPU runs don't overwrite each other |
neat_liq_{T}K.key |
Auto-generated (symlink) | Per-temperature symlink → neat_liq.key; ensures Tinker names its output after the temperature-tagged coordinate file |
neat_liq_{T}K.sh |
Auto-generated (always) | Per-temperature NPT run script (sources dat/tinker.env, calls $DYNAMIC9); each is submitted to a separate GPU card |
Example directory after a two-temperature run at 278 K and 298 K with Jacobian computed:
neat_liquid/
├── neat_liq.xyz # user-supplied
├── neat_liq.key # shared key file (PARAMETERS updated each opt step)
├── neat_liq_278K.xyz # symlink → neat_liq.xyz
├── neat_liq_278K.key # symlink → neat_liq.key
├── neat_liq_278K.sh # MD run script for 278 K
├── neat_liq_278K-md.log # MD log
├── neat_liq_278K.arc # full trajectory (equil + production)
├── neat_liq_278K.dyn # MD checkpoint for resume
├── neat_liq_278K-prod.arc # production-only arc (equil frames stripped)
├── neat_liq_278K-prm01.key # analyze key: prm perturb +Δ, 278 K
├── neat_liq_278K-prm01-analyze.sh # analyze script for above
├── neat_liq_278K-prm01-analyze.log # ANALYZE9 output
├── neat_liq_298K.xyz # symlink → neat_liq.xyz
├── neat_liq_298K.key # symlink → neat_liq.key
├── neat_liq_298K.sh # MD run script for 298 K
├── neat_liq_298K-md.log # MD log
├── neat_liq_298K.arc # full trajectory
├── neat_liq_298K.dyn # MD checkpoint
├── neat_liq_298K-prod.arc # production-only arc
├── neat_liq_298K-prm01.key # analyze key: prm perturb +Δ, 298 K
├── neat_liq_298K-prm01-analyze.sh # analyze script for above
└── neat_liq_298K-prm01-analyze.log # ANALYZE9 output
For
Nfree parameters,2Nparameter perturbations are created per optimizer call; the analyze files are labeledprm01…prm{2N}and all submitted to the cluster simultaneously.
For hydration free energy calculations:
| Setting | Value | Notes |
|---|---|---|
lambda_window |
courser |
18 windows — good accuracy with less cost |
liquid_md_total_time |
1.25 ns | Last 80% (1 ns) used in BAR analysis |
liquid_md_time_step |
2.0 fs | Works well with RESPA integrator |
gas_md_total_time |
1.25 ns | Last 80% (1 ns) used in BAR analysis |
gas_md_time_step |
0.1 fs | Required for gas-phase stochastic dynamics |