FastMDXplora

Fully Automated SysTem for Molecular Dynamics eXploration


FastMDXplora explores a protein's behavior end to end from a single command. Given a structure file (or just a PDB ID), it runs setup, simulation, analysis, and reporting in sequence, then hands back publication-ready results:

  setup  →  simulation  →  analysis  →  report

Highlights

  • Explore a protein's full dynamics with a single command, covering setup, simulation, analysis, and reporting
  • Probe protein-ligand binding automatically with analyses for pose stability, contacts, and protein-ligand hydrogen bonds
  • Reach beyond plain MD with built-in PLUMED enhanced sampling (metadynamics, umbrella sampling, steered MD) and a full analysis suite that turns trajectories into slide-ready, publication-quality figures
  • Scale from a quick single-protein exploration to large-scale parallel campaigns, driven the same way from the CLI or the Python API

Phases of FastMDXplora

Phase        What it does
setup        Cleans up your structure and builds a simulation-ready system: fixes missing atoms, adds hydrogens, solvates, and adds ions.
simulation   Runs the molecular dynamics (energy minimization, equilibration, and production), with optional enhanced sampling.
analysis     Computes the standard structural and dynamic metrics (and protein-ligand metrics when a ligand is present), with figures ready to use.
report       Packages everything into a slide deck, a written report, and a self-contained bundle you can share.

Installation

FastMDXplora's four phases have different dependency footprints. The analysis and report phases work from pip alone; the setup and simulation phases need PDBFixer + OpenMM, which are distributed primarily through conda-forge. That gives two install routes; pick the one that covers the phases you need.

Full install (all four phases, from the git repo)

The setup/simulation chemistry stack (OpenMM, PDBFixer) installs most reliably from conda-forge, so the full install uses the bundled environment.yml. We recommend mamba (a faster conda solver); plain conda works too.

git clone https://github.com/aai-research-lab/FastMDXplora.git
cd FastMDXplora
mamba env create -f environment.yml || conda env create -f environment.yml
conda activate fastmdxplora
pip install .

Don't have mamba? Either install Miniforge (see below), or just use conda; the || above falls back to it automatically.

Analysis + report only (from PyPI)

If you only need to analyze existing trajectories and build reports (no simulation), plain pip is enough, no conda required:

pip install fastmdxplora              # primary package
pip install fastmdx                    # alias (resolves to fastmdxplora)

This gives a fully working analysis + report pipeline, slide deck included (python-pptx is a core dependency). The setup and simulation phases require the chemistry stack; if it is missing, any setup or simulation run fails with a clear missing-dependency message. Add the stack via conda-forge (recommended, reliable across platforms):

conda install -c conda-forge pdbfixer openmm

or best-effort via the [md] pip extras (PDBFixer wheels are unavailable on some platforms, so conda is preferred):

pip install "fastmdxplora[md]"

Development install

git clone https://github.com/aai-research-lab/FastMDXplora.git
cd FastMDXplora
mamba env create -f environment.yml || conda env create -f environment.yml
conda activate fastmdxplora
pip install -e ".[test]"               # editable, with the test dependencies

On Windows PowerShell, use the Python launcher and the virtual environment's activation script:

cd C:\path\to\FastMDXplora               # your local clone of the repository
py -3.11 -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip setuptools wheel
python -m pip install -e ".[test]"
python -m fastmdxplora.cli.main --version
python -m fastmdxplora.cli.main info

If PowerShell blocks activation, allow local scripts for your user account:

Set-ExecutionPolicy -Scope CurrentUser RemoteSigned

You can also skip activation and call the environment's Python directly:

.\.venv\Scripts\python.exe -m pip install -e ".[test]"
.\.venv\Scripts\python.exe -m fastmdxplora.cli.main --version

Verify

fastmdx --version
fastmdx info                           # versions + detected backends (OpenMM/PDBFixer)

If fastmdx is not on PATH, these module commands are the safest fallback:

python -m fastmdxplora.cli.main --version
python -m fastmdxplora.cli.main info

Check which OpenMM platforms are available (CPU/CUDA/OpenCL):

python - <<'PY'
import openmm as mm
plats = [mm.Platform.getPlatform(i).getName() for i in range(mm.Platform.getNumPlatforms())]
print("Available platforms:", plats)
print("CUDA available" if "CUDA" in plats else "CPU-only: simulations will run on CPU")
PY

conda-forge package (coming soon). A single-command conda install -c conda-forge fastmdxplora (pulling every dependency, all four phases working out of the box) is planned once the recipe clears review. Until then, use the git + environment.yml route above.

Mamba / Miniforge (optional)

mamba is a drop-in, faster replacement for the conda solver, helpful because solving the OpenMM/CUDA stack is exactly where the classic solver is slow. If you don't have it, the easiest source is Miniforge (conda + mamba, preconfigured for conda-forge):

# Linux (x86_64); see https://conda-forge.org/miniforge/ for macOS/Windows/ARM
curl -L -o "$HOME/Miniforge3.sh" \
  "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh"
bash "$HOME/Miniforge3.sh" -b -p "$HOME/miniforge3"
source "$HOME/miniforge3/etc/profile.d/conda.sh"
conda init "$(basename "$SHELL")"

If mamba still isn't on PATH afterward, add it to the base environment:

conda install -n base -c conda-forge mamba

For other operating systems (macOS Intel/Apple Silicon, Linux ARM64, Windows), grab the matching installer from the Miniforge releases page.

Troubleshooting: fastmdx is not recognized

On Windows, especially with Microsoft Store Python or PowerShell, you may see:

fastmdx : The term 'fastmdx' is not recognized as the name of a cmdlet, function, script file, or operable program.

First check which Python installed FastMDXplora and where console scripts are written:

python -m pip show fastmdxplora
python -c "import sys; print(sys.executable)"
python -c "import sysconfig; print(sysconfig.get_path('scripts'))"
python -m fastmdxplora.cli.main --version

If import fastmdxplora works but fastmdx is missing, the console-script directory is not on PATH. Use python -m fastmdxplora.cli.main ... as the portable fallback, or reinstall from the same Python to recreate the script:

python -m pip install -e .
python -m fastmdxplora.cli.main info

Avoid mixing multiple Python installs in the same terminal. The Python used for python -m pip install ... should be the same one used for python -m fastmdxplora.cli.main ....
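
The PATH checks above can also be gathered into one script. This is a minimal sketch using only the Python standard library; the helper name locate_console_script is hypothetical and is not part of FastMDXplora. Run it with the same Python you used for pip install.

```python
import shutil
import sysconfig
from pathlib import Path


def locate_console_script(name="fastmdx"):
    """Report where a console script is expected and whether PATH can find it.

    Hypothetical helper for illustration; not part of FastMDXplora.
    """
    scripts_dir = Path(sysconfig.get_path("scripts"))
    on_path = shutil.which(name)  # None when the shell cannot find the command
    # Console scripts are plain files on POSIX and .exe launchers on Windows.
    candidates = [scripts_dir / name, scripts_dir / f"{name}.exe"]
    installed = [str(p) for p in candidates if p.exists()]
    return {
        "scripts_dir": str(scripts_dir),
        "on_path": on_path,
        "installed_here": installed,
    }


if __name__ == "__main__":
    info = locate_console_script()
    for key, value in info.items():
        print(f"{key}: {value}")
    if info["installed_here"] and info["on_path"] is None:
        print("Script exists but scripts_dir is not on PATH; add it or use python -m.")
```

If installed_here is non-empty while on_path is None, the script was installed but its directory is missing from PATH, which matches the situation described above.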

Examples

Command line

Run the full pipeline (setup → simulate → analyze → report):

fastmdx explore --system protein.pdb

Fetch a structure from the PDB by ID (auto-detected, fetched from RCSB):

fastmdx explore --system 1L2Y

Tune per-phase options (flags are namespaced by phase):

fastmdx explore -s protein.pdb --setup-ph 7.4 --simulate-duration-ns 100 --simulate-platform CUDA

For a short local smoke test before a longer production run, use the gentle preset:

fastmdx explore -s protein.pdb --include setup simulation --simulate-preset gentle --simulate-platform CPU

Run only specific phases:

fastmdx explore -s protein.pdb --include setup simulation

Run a single phase (bare flags, no phase prefix):

fastmdx setup -s protein.pdb --ph 6.5
fastmdx simulate -s protein.pdb --output run_001 --duration-ns 50 --platform CUDA
fastmdx analyze --output run_001 --analyses rmsd rmsf rg
fastmdx report --output run_001 --no-slides

Drive a whole study from a config file (-c and -config also work):

fastmdx explore --config study.yml

Generate a commented config template to edit:

fastmdx init-config -o study.yml

The -s, -system, and --system forms are equivalent; xplore is an alias of explore.

Python API

Run the full pipeline:

from fastmdxplora import FastMDXplora

fmdx = FastMDXplora(system="protein.pdb")
fmdx.explore()

Specify options and select phases:

fmdx = FastMDXplora(system="1L2Y")          # PDB ID, fetched from RCSB
results = fmdx.explore(
    include=["setup", "simulation", "analysis"],
    options={
        "simulation": {"duration_ns": 100, "temperature_K": 310, "platform": "CUDA"},
        "analysis":   {"include": ["rmsd", "rg", "cluster"]},
    },
)
# explore() always returns a list of runs (a single study is a list of one)
for run in results:
    print(run.run_id, run.status)
    for phase in run.phases:
        print("  ", phase.name, phase.status)

Run a config file (one system, many systems, or a parameter sweep, all the same way):

fmdx = FastMDXplora(config="study.yml")
fmdx.explore()

Preview a run without executing (CLI --dry-run, or dry_run=True):

FastMDXplora(config="campaign.yml").explore(dry_run=True)

Recommended alias: import fastmdxplora as fastmdx.

See Configuration files and Many systems and parameter sweeps for the YAML format, batches, sweeps, and parallel execution.

Configuration files

For anything beyond a quick run, capture the whole study in a single YAML file instead of a long flag list. The same file drives both the CLI and the Python API. Input is always given as a systems: list (even for a single system), so the file looks the same whether you study one protein or a dozen.

Generate a commented template to start from:

fastmdx init-config                    # writes fastmdxplora.yml (comprehensive)
fastmdx init-config --minimal -o study.yml   # short starter

A study.yml looks like:

systems:
  - id: protein1
    system: protein.pdb        # PDB/CIF path, 4-char PDB ID, or sequence

output: ./my_study
include: [setup, simulation, analysis, report]

setup:
  ph: 7.4
  ion_concentration_M: 0.15

simulation:
  duration_ns: 100.0         # production length (equilibration is separate)
  temperature_K: 310.0
  platform: CUDA

analysis:
  include: [rmsd, rmsf, rg, cluster]
  selection: "name CA"
  options:
    cluster:
      methods: [kmeans, hierarchical]
      n_clusters: 5

report:
  title: "My MD Study"

Run it from the CLI or the API:

fastmdx explore --config study.yml     # also: -c, -config

from fastmdxplora import FastMDXplora
FastMDXplora(config="study.yml").explore()

With a single system and no sweep, the output uses the familiar flat layout (my_study/setup/, my_study/simulation/, …) with the usual manifest.json and resolved_config.yml. Three things make this robust:

  • Flags override the file. fastmdx explore --config study.yml --simulate-duration-ns 50 keeps everything in the file but runs 50 ns. Precedence is: command-line flags / API kwargs > config file > built-in defaults.
  • Strict validation. A typo like pH: (wrong case) or simulaton: is rejected with a did-you-mean suggestion, so a misspelled key never silently runs with the default.
  • Reproducibility. Every run writes resolved_config.yml, the fully-merged configuration that actually ran (defaults + file + overrides). Feed it straight back to --config to reproduce the study exactly.
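
The precedence rule can be pictured as a layered merge. The following is a minimal sketch of that idea, not FastMDXplora's actual implementation; the function names and the default values are assumptions made for illustration.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict where values in `override` win over `base`, key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested phase sections instead of replacing them wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def resolve_config(defaults, file_config, overrides):
    """Layer the three sources, lowest precedence first."""
    return deep_merge(deep_merge(defaults, file_config), overrides)


# Illustrative values only; the real built-in defaults are not listed in this README.
defaults = {"simulation": {"duration_ns": 10.0, "temperature_K": 300.0}}
file_config = {"simulation": {"duration_ns": 100.0, "platform": "CUDA"}}
overrides = {"simulation": {"duration_ns": 50.0}}  # e.g. --simulate-duration-ns 50

resolved = resolve_config(defaults, file_config, overrides)
print(resolved)
```

Here the override sets the duration to 50 ns, the platform comes from the file, and the temperature falls through from the (assumed) defaults. The merged dictionary plays the role that resolved_config.yml plays in a real run.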

For a quick command-line one-off, -s/--system is shorthand that builds a one-element systems list for you:

fastmdx explore -s protein.pdb --simulate-duration-ns 50

Many systems and parameter sweeps

Because input is always a systems: list, studying several systems is just adding entries. Add a sweep: block to vary parameters, and FastMDXplora runs the full cross-product, each as a complete, self-contained study.

output: ./trpcage_campaign
include: [setup, simulation, analysis, report]

systems:
  - id: trpcage1
    system: trpcage.pdb
  - id: trpcage2
    system: trpcage.pdb
    setup: { ph: 6.5 }                 # optional per-system overrides

sweep:
  simulation.temperature_K: [300, 310, 320]   # dotted phase.option → values
  simulation.pressure_bar: [1.0, 1.2]          # multiple axes → cross-product

That config produces 2 systems × 3 temperatures × 2 pressures = 12 runs. When there is more than one run, each goes in its own runs/<id>/ subdirectory, indexed by a top-level batch_manifest.json, with a cross-run comparison/ report:

trpcage_campaign/
  batch_manifest.json
  comparison/                                        (cross-run report)
  runs/
    trpcage1__temperature_K-300__pressure_bar-1.0/   (a full study)
    trpcage1__temperature_K-300__pressure_bar-1.2/
    ...
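
To see where the 12 runs come from, the cross-product can be enumerated by hand. This is a sketch under assumptions: expand_sweep is a hypothetical helper, and the run-id scheme is inferred from the directory names shown above rather than taken from the FastMDXplora source.

```python
from itertools import product


def expand_sweep(system_ids, sweep):
    """Enumerate (run_id, swept_values) for every system x sweep combination."""
    axes = list(sweep)  # dotted keys such as "simulation.temperature_K"
    runs = []
    for system_id in system_ids:
        for combo in product(*(sweep[axis] for axis in axes)):
            # Assumed naming scheme: <id>__<option>-<value>__<option>-<value>
            parts = [f"{axis.split('.')[-1]}-{value}" for axis, value in zip(axes, combo)]
            run_id = "__".join([system_id, *parts])
            runs.append((run_id, dict(zip(axes, combo))))
    return runs


sweep = {
    "simulation.temperature_K": [300, 310, 320],
    "simulation.pressure_bar": [1.0, 1.2],
}
runs = expand_sweep(["trpcage1", "trpcage2"], sweep)
print(len(runs))   # 2 systems x 3 temperatures x 2 pressures
print(runs[0][0])
```

Adding a third axis multiplies the run count again, so it is worth previewing a campaign with --dry-run before launching it.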

Run it exactly as any other config:

fastmdx explore --config campaign.yml

from fastmdxplora import FastMDXplora
FastMDXplora(config="campaign.yml").explore()

Each run is identical in structure to a single study (its own manifest.json, resolved_config.yml, and phase directories), so existing analysis tooling works per-run unchanged. Option precedence within a run is base config < per-system overrides < swept value. Misspelled sweep axes are rejected with the list of valid options, and a failed run is recorded while the others continue.
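
Rejecting an unknown sweep axis with a suggestion can be sketched with the standard library's difflib. This illustrates the general technique only; the validator name and the list of valid axes are assumptions (a subset built from options that appear in this README), not FastMDXplora's actual validation code.

```python
import difflib

# Illustrative subset of axes, assembled from options shown in this README.
VALID_SWEEP_AXES = [
    "setup.ph",
    "setup.ion_concentration_M",
    "simulation.duration_ns",
    "simulation.temperature_K",
    "simulation.pressure_bar",
]


def validate_sweep_axes(sweep, valid=VALID_SWEEP_AXES):
    """Raise ValueError on the first unknown axis, suggesting the closest valid one."""
    for axis in sweep:
        if axis in valid:
            continue
        close = difflib.get_close_matches(axis, valid, n=1)
        hint = f" Did you mean '{close[0]}'?" if close else ""
        raise ValueError(
            f"Unknown sweep axis '{axis}'.{hint} Valid axes: {', '.join(valid)}"
        )


try:
    validate_sweep_axes({"simulation.temperture_K": [300, 310]})
except ValueError as err:
    print(err)
```

Failing before any run starts is the point: a misspelled axis would otherwise sweep nothing and silently produce runs that all use the default.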

Cross-run comparison report

After a multi-run study, FastMDXplora automatically builds a comparison/ report at the batch root that turns a directory of runs into a single analysis:

  • Overlays: every run's per-frame trace (RMSD, Rg, Q-value, total SASA) drawn on one set of axes, labelled by its swept value, so divergence across the sweep is visible at a glance.
  • Trends: each run reduced to a summary scalar (e.g. mean RMSD over the trajectory) and plotted against the swept parameter, giving a structure-property relationship.
  • comparison_summary.csv: one row per run with the summary scalars, ready for further analysis.
  • comparison_report.md: a written report tying the figures together, with a one-line quantitative takeaway per property (e.g. "across temperature_K 300 → 320, mean RMSD increases 0.21 → 0.23 nm").

It degrades gracefully (errored runs and missing analyses are skipped) and can be turned off with report: { comparison: false }.

Parallel execution

By default runs execute sequentially. An optional execution: block runs several at once:

execution:
  mode: parallel          # sequential (default) | parallel
  workers: 2              # how many runs at once
  devices: [0, 1]         # GPU indices: one run pinned per device
  continue_on_error: true

Parallelism is process-based: each run is a separate subprocess, which keeps each run's OpenMM context isolated and sidesteps Python's GIL. On GPU, the safe pattern is one run per GPU: list your devices and workers are pinned to device indices round-robin. Oversubscribing a single GPU is slower than running sequentially, so on GPU workers should not exceed the number of devices. When workers is unset, it defaults to one per device (GPU) or to the CPU count capped at the run count (CPU).
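
The worker and device rules above can be written out as a small sketch. The function names are hypothetical and the logic only mirrors the behavior described in this section; it is not the scheduler FastMDXplora ships.

```python
import os


def default_workers(n_runs, devices=None):
    """Default worker count: one per GPU device, else CPU count capped at the run count."""
    if devices:
        return len(devices)
    return max(1, min(os.cpu_count() or 1, n_runs))


def worker_devices(workers, devices):
    """Pin each worker to a device index in round-robin order."""
    return [devices[i % len(devices)] for i in range(workers)]


def oversubscribed(workers, devices):
    """True when more workers than devices would share a GPU."""
    return workers > len(devices)


devices = [0, 1]
workers = default_workers(n_runs=12, devices=devices)
print(workers, worker_devices(workers, devices))
print(oversubscribed(3, devices))  # a third worker would land on device 0 again
```

With two devices and three workers, the round-robin wraps and two workers share device 0, which is the oversubscription case the text advises against.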

Outputs by phase

Each phase writes to a dedicated subdirectory under the project output root, with a structured parameters manifest so every artifact is traceable to the options that produced it.

Phase        Key outputs
setup        prepared.pdb, solvated.pdb, setup_parameters.json
simulation   production.dcd, topology.pdb, simulation_parameters.json
analysis     <analysis>/*.dat, <analysis>/*.png, analysis_manifest.json
report       report.md, slides.pptx, project_bundle.zip
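
A quick way to confirm that a finished study produced its key outputs is to check for the files listed in the table. This is a sketch with assumptions: missing_outputs is a hypothetical helper, the analysis/ and report/ directory names are assumed to follow the setup/ and simulation/ pattern of the flat layout, and per-analysis files are skipped because their names depend on which analyses ran.

```python
from pathlib import Path

# File names come from the table above; directory names for analysis/ and
# report/ are an assumption based on the flat single-study layout.
EXPECTED_OUTPUTS = {
    "setup": ["prepared.pdb", "solvated.pdb", "setup_parameters.json"],
    "simulation": ["production.dcd", "topology.pdb", "simulation_parameters.json"],
    "analysis": ["analysis_manifest.json"],
    "report": ["report.md", "slides.pptx", "project_bundle.zip"],
}


def missing_outputs(output_root, phases=None):
    """Return expected key outputs that are absent, as paths relative to the root."""
    root = Path(output_root)
    missing = []
    for phase in phases or EXPECTED_OUTPUTS:
        for name in EXPECTED_OUTPUTS[phase]:
            path = root / phase / name
            if not path.is_file():
                missing.append(path.relative_to(root).as_posix())
    return missing


# Check only the phases that were actually run, e.g. after --include setup simulation.
print(missing_outputs("my_study", phases=["setup", "simulation"]))
```

Note that slides.pptx is expected to be absent when the report was built with --no-slides, so treat the result as a checklist rather than a hard failure.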

Documentation

Documentation is available at fastmdxplora.readthedocs.io and is actively expanding.

Citation

If you use FastMDXplora in your work, please cite:

Aina, A.; Kwan, D. FastMDAnalysis: Software for Automated Analysis of Molecular Dynamics Trajectories. J. Comput. Chem. 2026, 47, e70350. DOI: 10.1002/jcc.70350

@article{aina2026fastmd,
  author  = {Aina, Adekunle and Kwan, Derrick},
  title   = {FastMDAnalysis: Software for Automated Analysis of Molecular Dynamics Trajectories},
  journal = {Journal of Computational Chemistry},
  volume  = {47},
  number  = {8},
  pages   = {e70350},
  year    = {2026},
  doi     = {10.1002/jcc.70350},
}

Contributing

Contributions are welcome. See CONTRIBUTING.md. FastMDXplora follows the Contributor Covenant.

License

MIT. See LICENSE.

Acknowledgements

FastMDXplora is developed in the AAI Research Lab at California State University Dominguez Hills. It builds on a deep ecosystem of open-source scientific Python: MDTraj, OpenMM, PDBFixer, NumPy, SciPy, scikit-learn, Matplotlib, python-pptx, and many others.