A lightweight Python implementation of (1) single-stage neural NMF and (2) a two-stage case-control NMF that separates baseline (control) programs from case-specific programs, with optional foundation model–informed gene-similarity regularization.
DeltaNMF is a neural-network reformulation of non-negative matrix factorization (NMF) designed for single-cell transcriptomics. It supports:
- One-stage (control-only) factorization to learn a baseline program dictionary
- Two-stage (case-control) factorization that learns baseline programs on controls, then learns additional case-specific programs on case cells while keeping baseline programs active
- Optional gene-similarity regularization via a precomputed gene–gene similarity matrix derived from transformer gene embeddings (e.g., scGPT or TranscriptFormer)
- GPU acceleration: consensus initialization + PyTorch optimization (when CUDA is available)
- Interpretable gene programs: returns W (genes × programs) and H (programs × cells)
- Foundation-model regularization: adds a Laplacian penalty using a precomputed gene–gene similarity matrix (S_E) from scGPT or TranscriptFormer embeddings
- Two-stage case-control modeling: explicitly separates shifted usage of shared programs from the emergence of novel case-specific programs
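As a rough illustration of what a Laplacian penalty on gene loadings looks like, the sketch below computes the textbook graph-Laplacian smoothness term tr(WᵀLW) with L = D − S. The helper name `laplacian_penalty` is hypothetical, and whether DeltaNMF normalizes the Laplacian or scales the term differently is not stated here, so treat the formula as an assumption rather than the package's exact loss.

```python
import numpy as np

def laplacian_penalty(W, S):
    """Graph-Laplacian smoothness penalty tr(W^T L W), with L = D - S.

    W: (n_genes, n_programs) non-negative loadings.
    S: (n_genes, n_genes) symmetric, non-negative gene-gene similarity.
    Hypothetical helper for illustration; not part of the DeltaNMF API.
    """
    degree = S.sum(axis=1)
    L = np.diag(degree) - S
    return float(np.trace(W.T @ L @ W))

# Genes 0 and 1 are similar; gene 2 is unrelated to both.
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
W_similar = np.array([[1.0], [1.0], [0.0]])    # similar genes load equally
W_different = np.array([[1.0], [0.0], [0.0]])  # similar genes load differently

print(laplacian_penalty(W_similar, S))    # 0.0
print(laplacian_penalty(W_different, S))  # 1.0
```

The penalty is zero when similar genes receive the same loading and grows as their loadings diverge, which is what pushes related genes into the same program.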
DeltaNMF expects expression matrices in genes × cells orientation:
- X_control / X_ntc: shape (n_genes, n_control_cells)
- X_case / X_spec: shape (n_genes, n_case_cells) (two-stage only)
- gene_names: list/array of length n_genes matching the rows of X_*
If you start from an .h5ad, deltanmf/io.py::h5ad_to_npy(...) will:
- split cells by adata.obs[condition_key]
- extract adata.X or adata.layers[layer]
- convert to dense (if sparse)
- transpose to genes × cells
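The conversion steps above can be sketched with NumPy alone. This is a minimal stand-in, not the actual `h5ad_to_npy` implementation: the function name `split_to_genes_by_cells`, its arguments, and the `"NTC"`/`"KO"` labels are assumptions made for the example.

```python
import numpy as np

def split_to_genes_by_cells(X_cells_by_genes, conditions, control_label):
    """Split a cells x genes matrix by condition and return genes x cells arrays.

    Hypothetical stand-in for illustration; the real h5ad_to_npy may differ.
    """
    # Densify if the matrix is a SciPy-style sparse matrix.
    if hasattr(X_cells_by_genes, "toarray"):
        X_cells_by_genes = X_cells_by_genes.toarray()
    X = np.asarray(X_cells_by_genes, dtype=np.float32)
    is_control = np.asarray(conditions) == control_label
    X_control = X[is_control].T   # genes x control cells
    X_case = X[~is_control].T     # genes x case cells
    return X_control, X_case

X = np.arange(12).reshape(4, 3)            # 4 cells, 3 genes (AnnData orientation)
conditions = ["NTC", "KO", "NTC", "KO"]    # made-up condition labels
X_control, X_case = split_to_genes_by_cells(X, conditions, "NTC")
print(X_control.shape, X_case.shape)       # (3, 2) (3, 2)
```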
Edit scripts/run_onestage.py and set:
- h5ad_path
- out
- ntc_key, condition_key
- resource choice (scgpt vs transcriptformer)
- K
Then run:
    python scripts/run_onestage.py

If you're calling the API directly, stage-1 NTC fitting can run in either mode:
- full-batch (default): use_minibatch_ntc=False
- minibatch over cells: use_minibatch_ntc=True and minibatch_size_ntc=40960 (or your preferred size)
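To make "minibatch over cells" concrete, here is a small sketch of batching along the cell (column) axis of a genes × cells matrix. The generator `iter_cell_minibatches` is a hypothetical helper; how DeltaNMF shuffles or schedules batches internally is not described here.

```python
import numpy as np

def iter_cell_minibatches(n_cells, batch_size, rng):
    """Yield shuffled column-index batches covering every cell once per epoch."""
    order = rng.permutation(n_cells)
    for start in range(0, n_cells, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
X = np.ones((5, 10), dtype=np.float32)      # genes x cells
batches = list(iter_cell_minibatches(X.shape[1], batch_size=4, rng=rng))
print([len(b) for b in batches])            # [4, 4, 2]

# Each batch selects columns (cells); all genes are kept.
print(X[:, batches[0]].shape)               # (5, 4)
```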
Edit scripts/run_twostage.py and set:
- h5ad_path
- out
- ntc_key, condition_key, and optionally case_key
- resource choice (scgpt vs transcriptformer)
- k values
Then run:
    python scripts/run_twostage.py

Two-stage has the same stage-1 option:
- full-batch baseline fitting: stage1_use_minibatch_ntc=False
- minibatch baseline fitting: stage1_use_minibatch_ntc=True and stage1_minibatch_size_ntc=40960
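The two-stage idea itself (fit baseline programs on controls, then freeze them and learn extra programs on case cells) can be illustrated in a few lines of NumPy. Note the substitution: this sketch uses classic Lee-Seung multiplicative updates on random data instead of DeltaNMF's PyTorch optimization, and it omits the FM regularizer. The function `nmf_mu` and all sizes are assumptions for the example.

```python
import numpy as np

EPS = 1e-9

def nmf_mu(X, W, H, n_iter=200, n_fixed=0):
    """Lee-Seung multiplicative updates for X ~ W @ H (Frobenius loss).

    The first n_fixed columns of W are held fixed; everything else is updated.
    """
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + EPS)
        W_new = W * (X @ H.T) / (W @ H @ H.T + EPS)
        W[:, n_fixed:] = W_new[:, n_fixed:]
    return W, H

rng = np.random.default_rng(0)
n_genes, k_base, k_spec = 30, 3, 2
X_control = rng.random((n_genes, 40))   # genes x control cells
X_case = rng.random((n_genes, 50))      # genes x case cells

# Stage 1: baseline programs from control cells only.
W_base, _ = nmf_mu(X_control, rng.random((n_genes, k_base)), rng.random((k_base, 40)))

# Stage 2: freeze baseline columns, learn extra case-specific columns on case cells.
# Usages (H) are re-learned for all programs, so baseline programs stay active.
W0 = np.hstack([W_base, rng.random((n_genes, k_spec))])
H0 = rng.random((k_base + k_spec, 50))
W_all, H_case = nmf_mu(X_case, W0.copy(), H0, n_fixed=k_base)

print(np.allclose(W_all[:, :k_base], W_base))  # True: baseline programs unchanged
print(W_all.shape, H_case.shape)               # (30, 5) (5, 50)
```

In the case fit, the first `k_base` rows of `H_case` capture shifted usage of shared programs, while the last `k_spec` columns of `W_all` hold the candidate case-specific programs.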
Running without foundation-model regularization (use_fm, default True):
Both run_onestage_deltanmf and run_twostage_deltanmf accept a use_fm flag. When use_fm=False, the FM Laplacian term is disabled entirely: S_E is never loaded, the gene set is not intersected with S_E's gene list, and the solver runs plain (regularizer-free, aside from non-negativity) NMF / two-stage case-control NMF. Use this when you want a pure-data baseline, lack a precomputed S_E, or want to compare against the FM-regularized variant.
- One-stage: run_onestage_deltanmf(..., use_fm=False); S_E_PATH / S_E_GENES_PATH may be omitted.
- Two-stage: run_twostage_deltanmf(..., use_fm=False); S_E_PATH / S_E_GENES_PATH may be omitted, and stage1_rel_alpha, stage2_rel_alpha, stage2_rel_gamma are ignored.
Default (use_fm=True) preserves the existing behavior: S_E is loaded, genes are intersected with S_E's gene list, and FM regularization is applied according to the rel_alpha/rel_gamma parameters.
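For intuition about what "genes are intersected with S_E's gene list" involves, the following sketch restricts an expression matrix and a similarity matrix to their shared genes. It is an assumption-laden illustration: `align_to_similarity` is a hypothetical helper, the gene IDs are made up, and DeltaNMF's own alignment code may order or filter genes differently.

```python
import numpy as np

def align_to_similarity(X, gene_names, S_E, S_E_genes):
    """Restrict X and S_E to their shared genes, in the dataset's gene order.

    Hypothetical helper; DeltaNMF's own alignment code may differ in detail.
    """
    s_index = {g: i for i, g in enumerate(S_E_genes)}
    keep = [i for i, g in enumerate(gene_names) if g in s_index]
    s_rows = [s_index[gene_names[i]] for i in keep]
    X_aligned = X[keep, :]
    S_aligned = S_E[np.ix_(s_rows, s_rows)]
    genes = [gene_names[i] for i in keep]
    return X_aligned, S_aligned, genes

gene_names = ["ENSG01", "ENSG02", "ENSG03"]   # dataset genes (made-up IDs)
S_E_genes = ["ENSG03", "ENSG01", "ENSG99"]    # similarity-matrix gene order
X = np.arange(6, dtype=np.float32).reshape(3, 2)
S_E = np.arange(9, dtype=np.float32).reshape(3, 3)

X_a, S_a, genes = align_to_similarity(X, gene_names, S_E, S_E_genes)
print(genes)        # ['ENSG01', 'ENSG03']
print(S_a.shape)    # (2, 2)
```

Genes missing from either side are dropped, which is why a naming mismatch between the dataset and S_E shrinks the usable gene set.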
Stage-2 memory mode (stage2_use_hybrid_memory, default False):
By default, stage 2 uses the standard GPU solver (solve_specific_with_fixed_ntc), which loads the entire X_spec matrix into VRAM upfront. This is simple and fast but will OOM if X_spec is too large to fit alongside the model parameters.
For large case matrices, set stage2_use_hybrid_memory=True to use a memory-aware GPU solver (solve_specific_with_fixed_ntc_hybrid_fast) that adapts to available VRAM:
- Tier 1: keeps X_spec on GPU (like the default solver) but auto-tunes the batch size based on the VRAM remaining after model allocation
- Tier 2: if VRAM is insufficient (< 40% free or OOM), keeps X_spec on CPU and streams batches to GPU via pinned memory
The solver prints VRAM diagnostics at startup so you can see which tier was selected. Use this when your case matrix is large enough that the default solver OOMs.
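The tier choice can be pictured as a simple decision rule. The sketch below is hypothetical: the function `choose_stage2_tier` is not part of DeltaNMF, and the "matrix does not fit in free VRAM" check merely stands in for the OOM condition, which the real solver presumably detects at runtime. Only the 40% threshold comes from the description above.

```python
def choose_stage2_tier(free_bytes, total_bytes, x_spec_bytes, min_free_frac=0.40):
    """Pick a memory tier from VRAM figures (hypothetical decision rule).

    Tier 1: X_spec stays on the GPU. Tier 2: X_spec stays on the CPU and is streamed.
    On a CUDA machine, torch.cuda.mem_get_info() returns (free, total) in bytes.
    """
    free_frac = free_bytes / total_bytes
    if free_frac < min_free_frac or x_spec_bytes >= free_bytes:
        return 2
    return 1

GIB = 1024 ** 3
print(choose_stage2_tier(20 * GIB, 24 * GIB, x_spec_bytes=4 * GIB))   # 1
print(choose_stage2_tier(8 * GIB, 24 * GIB, x_spec_bytes=4 * GIB))    # 2 (only 33% free)
print(choose_stage2_tier(20 * GIB, 24 * GIB, x_spec_bytes=30 * GIB))  # 2 (does not fit)
```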
DeltaNMF’s optional regularization uses a gene–gene similarity matrix S_E, aligned to your dataset’s genes via a companion gene-order JSON.
The runner scripts expect:
- resources/<model>/S_E_relu.npy
- resources/<model>/genes_order.json
You can generate scGPT or TranscriptFormer-derived files by running:
- resources/transcriptformer/create_transformer_similarity_matrix_transcriptformer.py
- resources/scgpt/create_transformer_similarity_matrix_scgpt.py
After generation, place outputs at:
- resources/<model>/S_E_relu.npy
- resources/<model>/genes_order.json
- Gene alignment matters: S_E must use the same gene naming convention as gene_names (the code aligns and filters genes; mismatches reduce coverage). Use Ensembl IDs.
- Memory: S_E is dense and can be large. Use float32 and store on disk; load only when needed.
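To put the memory note in numbers, the sketch below estimates the size of a dense similarity matrix and shows one way to defer loading with NumPy's memory mapping (`np.load(..., mmap_mode="r")`). The 20,000-gene figure and the demo file name are illustrative assumptions, and DeltaNMF itself may load S_E differently.

```python
import os
import tempfile
import numpy as np

def dense_similarity_bytes(n_genes, dtype=np.float32):
    """Bytes needed for a dense n_genes x n_genes matrix of the given dtype."""
    return n_genes * n_genes * np.dtype(dtype).itemsize

print(dense_similarity_bytes(20_000) / 1e9)              # 1.6 (GB, float32)
print(dense_similarity_bytes(20_000, np.float64) / 1e9)  # 3.2 (GB, float64)

# Save a small float32 matrix, then memory-map it so data is read from disk on demand.
path = os.path.join(tempfile.mkdtemp(), "S_E_demo.npy")
np.save(path, np.eye(4, dtype=np.float32))
S_E = np.load(path, mmap_mode="r")
print(S_E.dtype, S_E.shape)                              # float32 (4, 4)
```

Storing the matrix as float32 halves the footprint relative to float64, and memory mapping avoids holding the full matrix in RAM until it is actually used.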
If you use this code, please cite:
Karpurapu, A., Gersbach, C. A., & Singh, R. (2026). DeltaNMF: A Two-Stage Neural NMF for Differential Gene Program Discovery. bioRxiv. https://doi.org/10.64898/2026.01.22.701049
@article{karpurapu2026deltanmf,
title = {DeltaNMF: A Two-Stage Neural NMF for Differential Gene Program Discovery},
author = {Karpurapu, Anish and Gersbach, Charles A. and Singh, Rohit},
journal = {bioRxiv},
year = {2026},
doi = {10.64898/2026.01.22.701049},
url = {https://www.biorxiv.org/content/10.64898/2026.01.22.701049v1}
}

CC BY-NC-SA 4.0
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
For questions and issues, please open a GitHub issue or contact the maintainers.