feat: xenium patch tiling/stitching, cellpose downscaling, module updates, and bug fixes #133

Merged
heylf merged 20 commits into nf-core:dev from an-altosian:feature/xenium-processing-updates
Mar 24, 2026

@an-altosian an-altosian commented Mar 20, 2026

Summary

Major feature additions and improvements to the nf-core/spatialxe pipeline for Xenium spatial transcriptomics processing:

New Features

  • Xenium patch tiling/stitching: Support for processing large Xenium datasets by tiling images into patches, running segmentation per-patch, and stitching results back together
  • Cellpose downscaling: Automatic image downscaling for cellpose segmentation to reduce memory usage on large FOVs
  • Ficture optional features: Added optional ficture segmentation-free analysis mode
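The patch tiling and stitching idea can be illustrated with a minimal sketch. It assumes square patches with a fixed pixel overlap; the names `tile_bounds`, `to_global`, `patch`, and `overlap` are hypothetical and are not taken from the pipeline, whose actual tiling logic may differ.

```python
from typing import Iterator, NamedTuple, Tuple


class Patch(NamedTuple):
    """Pixel bounds of one tile, half-open: [y0, y1) x [x0, x1)."""
    y0: int
    y1: int
    x0: int
    x1: int


def tile_bounds(height: int, width: int, patch: int, overlap: int) -> Iterator[Patch]:
    """Yield overlapping patches that together cover a height x width image.

    `patch` and `overlap` are hypothetical parameters for illustration,
    not options taken from the pipeline.
    """
    if patch <= 0 or not 0 <= overlap < patch:
        raise ValueError("need patch > 0 and 0 <= overlap < patch")
    step = patch - overlap
    # Stop once the remaining strip is already covered by the previous patch.
    for y0 in range(0, max(height - overlap, 1), step):
        for x0 in range(0, max(width - overlap, 1), step):
            # Clip patches at the image border.
            yield Patch(y0, min(y0 + patch, height), x0, min(x0 + patch, width))


def to_global(patch: Patch, y: int, x: int) -> Tuple[int, int]:
    """Map a patch-local pixel coordinate back to full-image coordinates."""
    return patch.y0 + y, patch.x0 + x
```

For a 100 x 100 image with `patch=40` and `overlap=10`, `tile_bounds` yields a 3 x 3 grid of patches. Stitching then amounts to shifting each per-patch result by its patch origin, as `to_global` does, and resolving objects that fall in the overlap regions (not shown here).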

Module Updates

  • cellpose: Updated to latest nf-core/modules (0780b96) with GPU support improvements, zero-cell detection, and custom resource labels
  • stardist: Updated to latest nf-core/modules (4e783502) — Seqera Containers GPU image with CUDA support, model input channel, topic-based version channels
  • opt/flip, opt/stat, opt/track: Updated to latest nf-core/modules (7d3e5c9d) — linting cleanup
  • All local modules: Migrated from versions.yml to topic-based version channels
  • All template-based Python modules: Converted to resources/usr/bin pattern

Bug Fixes

  • Fixed baysor version string detection (now correctly reports 0.7.1)
  • Fixed get_coordinates stub output
  • Fixed cellpose meta.yml to match main.nf outputs
  • Regenerated all module .diff patch files in correct nf-core format
  • Updated linting.yml to match nf-core template
  • Refreshed all test snapshots for current pipeline output

Refactoring

  • Generic resource labels across all modules (process_high, process_gpu_single)
  • Renamed transcript channels for consistency

Test plan

  • All 24 nf-test shards pass (12 latest-everything + 12 NF 25.04.0)
  • nf-core lint passes (0 failures, 640 tests passed)
  • pre-commit checks pass
  • All module .diff patches verified against upstream bases

an-altosian and others added 9 commits March 20, 2026 23:50
…napshots

- Restore extract_dapi and convert_mask_uint32 modules lost during PR nf-core#123 merge
- Remove flows/cells outputs from cellpose meta.yml to match patched main.nf
- Update cellpose.diff to include meta.yml patch alongside main.nf patch
- Restore stardist.diff patch file (process_high label change)
- Update proseg snapshot for v3.1.0 container versions.yml hash
- Refresh all workflow test snapshots for current pipeline output structure

All 21 nf-tests pass with --profile=+docker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The nf-core linter doesn't support topic-based version outputs
(versions_cellpose, versions_python, versions_torch). Revert to the
simple versions: key matching the upstream module structure.

Include all meta.yml changes in cellpose.diff so the linter can
correctly verify local vs patched-upstream.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ore tools

Reverted meta.yml to exact upstream version (ad65d06). Used `nf-core modules
patch cellpose` to generate the correct .diff file containing only main.nf
changes. This fixes the nf-core lint "too many values to unpack" error caused
by manual meta.yml modifications.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
nf-core 3.4.1 has a bug causing "too many values to unpack" when
linting modules with topic-based version outputs. 3.5.2 fixes this.
Also reformats modules.json to pass prettier pre-commit hook.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… nf-core tools

Both diffs had malformed patch format (standard diff format with a/ b/
prefix or timestamps) instead of nf-core patch format. This caused
nf-core lint to crash with "too many values to unpack" when parsing
the --- lines.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…version channels

- Convert 29 local modules to use `topic: versions` channel pattern instead of writing versions.yml files
- Remove versions.yml generation from 12 Python template scripts
- Use dynamic version detection via eval() for all tools (no hardcoded versions)
- Standardize on python3 instead of python for version commands
- Fix version channel references in 11 subworkflows
- Update linting.yml action SHAs to match nf-core 3.5.2 template
- Remove orphaned XENIUM_PATCH_FILTER block from modules.config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…pattern

Migrate all 12 local modules from template() Python scripts to the
nf-core resources/usr/bin/ convention with executable CLI scripts.
This resolves eval() output failures since eval() only works with
Bash process scripts, not Python template interpreters.

Changes:
- Convert Python templates to argparse-based CLI scripts in resources/usr/bin/
- Update main.nf files to use bash script blocks calling CLI scripts
- Remove all template() calls from local modules
- Enable nextflow.enable.moduleBinaries in nextflow.config
- Remove version channel mixing from subworkflows (topic channels handle it)
- Add topic channel subscriber in main workflow for version collection
- Update all test assertions and snapshots for new version emit names
- Fix pre-existing bugs: split_transcripts missing mkdir, spatialdata/merge
  error message, spatialdata/meta sys.err, ficture/preprocess stub outputs
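
As a rough illustration of the conversion described above, an argparse-based script placed under resources/usr/bin/ might look like the following sketch. The option names `--input` and `--output`, the `--version` flag, and the placeholder version string are assumptions for illustration, not the interface of any script in this PR.

```python
#!/usr/bin/env python3
"""Hypothetical CLI script in the style of resources/usr/bin/ (illustrative only)."""
import argparse
from typing import List, Optional

__version__ = "0.1.0"  # placeholder, not the version of any real tool


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line interface (all option names are assumptions)."""
    parser = argparse.ArgumentParser(description="Example module script (hypothetical).")
    parser.add_argument("--input", required=True, help="Path to the input file")
    parser.add_argument("--output", required=True, help="Path to the output file")
    # Prints the version and exits, so a Bash command can capture it.
    parser.add_argument("--version", action="version", version=__version__)
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Parse arguments and run; returns a process exit code."""
    args = build_parser().parse_args(argv)
    # Real work would happen here; this sketch only echoes the arguments.
    print(f"{args.input} -> {args.output}")
    return 0

# An executable script would finish with: raise SystemExit(main())
```

Because the process script block is now plain Bash that calls the script by name, version commands can be evaluated by Bash rather than by a Python template interpreter.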

All 21 nf-test tests pass locally with Docker.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

an-altosian commented Mar 21, 2026

Hi @heylf, lol, I apologize for another PR with 100 files modified. But I guess this is the price we pay for making nf-core linting happy. 😄 I will run another round of real-data testing and let you know when it is ready to merge.

an-altosian and others added 7 commits March 22, 2026 15:54
- Fix cellpose patch: remove flows/cells output, use *_cp_masks.tif glob
- Fix ficture --features argument to be optional when no gene list provided
- Fix get_coordinates and extract_data bugs from code review
- Fix baysor preprocess to use resources/usr/bin pattern consistently
- Fix resolift container and version channel
- Update cellpose meta.yml and diff patches
- Update modules.config with corrected ext.args
- Update baysor subworkflows for proper channel handling
- Clean README contributor link (an-altosian → dongzehe)
- Update linting workflow and nf-core config
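
The ficture `--features` fix in the list above comes down to passing the flag only when a gene list exists. A minimal sketch of that pattern follows; the helper name `build_args` and the base command are placeholders, and the pipeline itself may implement this in its Nextflow config rather than in Python.

```python
from typing import List, Optional


def build_args(base: List[str], features: Optional[str] = None) -> List[str]:
    """Return the argument list, adding --features only when a gene list is given."""
    args = list(base)  # copy so the caller's list is not modified
    if features:
        args += ["--features", features]
    return args
```

Calling `build_args(["run"])` leaves the command unchanged, while `build_args(["run"], "genes.tsv")` appends the flag and its value.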

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
nf-core tools 3.4.1 cannot parse the cellpose module output format
(4-element tuples for topic-based version channels), causing
"too many values to unpack (expected 2, got 4)" lint error.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The resolift module now emits ["resolift", "1.0"] instead of
["python", "3.12.3"] after migration to topic-based version channels.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resource labels:
- Add process_xl label (30 CPUs, 240 GB) for large processes
- Add process_gpu_single label for single-GPU processes (Cellpose, StarDist)
- Remove hardcoded AWS-specific resource values from base.config
- Remove per-process withName overrides (SEGGER_TRAIN, SEGGER_PREDICT, MULTIQC)
- Rename gpu_single → process_gpu_single for nf-core consistency
- Segger train/predict use process_xl + process_gpu labels
- Segger create_dataset uses process_xl label

Channel naming:
- Rename ch_transcripts_parquet → ch_transcripts_file throughout pipeline
- Rename BAYSOR_PREPROCESS emit from transcripts_parquet → transcripts_file
  (output is CSV, not Parquet — name was misleading)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix baysor version eval: replace broken Julia one-liner (escaped
  single quotes fail in eval()) with baysor --version command
- Fix get_coordinates stub: bare echo produces empty stdout, now
  emits valid placeholder coordinates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…t emit name

- Update baysor preview/run/segfree module snapshots: "" -> "0.7.1"
- Update pipeline-level snapshots (image/preview/segfree): null -> "0.7.1"
- Fix preprocess test: transcripts_parquet -> transcripts_file (matching emit rename)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update cellpose module from ad65d06 to 0780b96 (topic channel migration).
Restores flows/cells optional outputs. Diff now only changes labels
(process_high + process_gpu_single) and adds zero-cell detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@an-altosian an-altosian force-pushed the feature/xenium-processing-updates branch from c2f19f2 to 4974467 Compare March 22, 2026 17:07
an-altosian and others added 3 commits March 22, 2026 17:33
…f-core patch format

The unzip.diff had system diff -u format with absolute paths and timestamps,
and xeniumranger-import-segmentation.diff had git a/b/ prefixes. Both cause
nf-core lint to crash with "too many values to unpack (expected 2, got 4)"
in components_differ.py which expects bare relative paths.
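
To make the three header styles concrete, here is a small checker that flags a `---` or `+++` line that is not a bare relative path. It is a sketch of the formats described in this commit message, not the parsing code in nf-core tools, and the function name `header_path_problems` is hypothetical.

```python
from pathlib import PurePosixPath
from typing import List


def header_path_problems(line: str) -> List[str]:
    """Return reasons why a '---'/'+++' diff header is not a bare relative path.

    These rules illustrate the formats described above; they are an
    assumption, not the actual checks performed by nf-core tools.
    """
    if not line.startswith(("--- ", "+++ ")):
        raise ValueError("not a diff header line")
    # Everything after the marker, split on whitespace (tabs included).
    fields = line[4:].rstrip("\n").split()
    problems = []
    if len(fields) != 1:
        problems.append("extra fields after the path (timestamp?)")
    path = fields[0] if fields else ""
    if PurePosixPath(path).is_absolute():
        problems.append("absolute path")
    if path.startswith(("a/", "b/")):
        problems.append("git a/ or b/ prefix")
    return problems
```

A header such as `--- modules/nf-core/unzip/main.nf` (an illustrative path) passes cleanly, whereas a `diff -u` header with an absolute path and a timestamp splits into four whitespace-separated fields and is flagged. Paths containing spaces would also be flagged by this simplified check.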

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- stardist: update to 4e783502 (Seqera Containers GPU image, model input,
  topic-based versions) — this is our own merged upstream PR
- opt/flip, opt/stat, opt/track: update to 7d3e5c9d (linting cleanup)
- multiqc: not updated — new upstream has breaking input signature change
  (tuple val(meta)) that requires workflow adaptation
- All .diff patch files regenerated against new upstream bases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Bump actions/checkout and actions/upload-artifact to latest pinned SHAs
to match the nf-core pipeline template.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@an-altosian an-altosian changed the title from "fix: restore missing modules and update CI snapshots" to "feat: xenium patch tiling/stitching, cellpose downscaling, module updates, and bug fixes" Mar 22, 2026
Update multiqc from old 6-positional-arg signature to new tuple-based
input/output format. Adapt all 3 workflow call sites (MULTIQC_PRE_XR_RUN,
MULTIQC_POST_XR_RUN, MULTIQC) to construct tuple input with meta map.
Bump multiqc to 1.33, regenerate patch for custom xenium-extra container.
Update test snapshots for new stub file entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@DongzeHE DongzeHE requested a review from heylf March 22, 2026 20:16

heylf commented Mar 24, 2026

Oh wow. No worries. I will go over. If this solves the linting issues then big big thanks!

@heylf heylf left a comment

LGTM

@heylf heylf merged commit fb0cb32 into nf-core:dev Mar 24, 2026
30 checks passed