Mismatches channels with multiple samples in the input sheet #97
an-altosian wants to merge 16 commits into
Warning: A newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.2.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
khersameesh24
left a comment
Hi Dongze, can you please go over the comments below?
Done. Also, I want to bring the nf-core module for proseg to your attention. They now have their own Docker image as well, although it has not been upgraded to proseg version 3 yet.
The latest commit fixes a cellpose error I encountered, as suggested by MIT-LCP/wfdb-python#493. I am working on enabling GPU for cellpose because it runs really slowly on CPU.
Hi @khersameesh24, I removed all bidcell-related code. Now it is good to merge.
Hi @heylf, this PR is about the channel mismatch issue we discussed previously. Please let me know what you think. Thanks!
Hi @DongzeHE, I am working to get the pipeline to run with multiple samples from the samplesheet. Will merge your changes with it.
Wonderful! Feel free to let me know if there is anything I can help with!
FYI, all my changes are for getting the pipeline to run with multiple samples from the samplesheet. I am fixing all the subworkflows to support multi-sample. Can you add your GitHub username to the contributors section of the README file?
Hi @khersameesh24 and @heylf, I just went through the PRs for multi-sample support. I think they applied most of the changes I made in this PR, including the bug fix for the cellpose module. Great work! I guess the only remaining task is adding my name somewhere in the contributor section, so that I can safely delete this PR. Thanks!
@an-altosian I have included your details in the config file, so this PR can be closed now.
Addressed in #102.
Description of Changes
This PR addresses what is, in my opinion, an important file-mismatching bug in the `proseg` subworkflow that occurs when processing multiple samples. The changes ensure that all input and output channels within the subworkflow are correctly paired using sample-specific metadata.
Reason for Changes
I'm applying the `proseg` subworkflow to several datasets and have identified a file-mismatching issue. The current `proseg` subworkflow, like most in `spatialxe`, does not pair channels by sample before passing them to downstream tasks. Because Nextflow channels operate on a First-In, First-Out (FIFO) basis (see here), and each channel receives items in the order its upstream tasks finish, this can lead to incorrect file pairings when processing multiple samples.
For example, the `proseg2baysor` module outputs multiple channels. One channel, for the transcript assignment file, is tagged by `meta`. Another, for the polygon mask file, is not. When these are passed to the `xeniumranger import-segmentation` module (aliased as `xris`), the pipeline discards the `meta` tag from the transcript channel. It then passes two "file-only" channels to `xris` individually, along with the original xenium bundle channel.
Due to the FIFO nature of these independent channels, it's nearly certain that `xris` will receive mismatched files when a run includes multiple input samples. This is exactly what happened in my recent analysis.
To fix this, I propose updating the input and output blocks of the affected modules. The changes I've made for the `proseg` subworkflow ensure that all input files are matched by `meta.id` and all output files are tagged with `meta`. I suggest applying these changes to the other subworkflows as well to avoid file mismatches, and I can help with that.
Additional Notes
`bidcell` Workflow: This PR also includes an initial draft of the `bidcell` workflow. It is not called anywhere in the pipeline because it is not fully functional: it depends on single-cell reference data, an input type not yet supported by the pipeline. I suggest merging this code as a foundation, and I will continue its development once the required input functionality is added.
PR checklist
- Code lints (`nf-core lint`).
- Test suite passes (`nextflow run . -profile test,docker --outdir <OUTDIR>`).
- `docs/usage.md` is updated.
- `docs/output.md` is updated.
- `CHANGELOG.md` is updated.
- `README.md` is updated (including new tool citations and authors/contributors).
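Illustration of the pairing problem
The mismatch described under Reason for Changes can be reproduced outside Nextflow. The following is a minimal Python sketch, not pipeline code. The sample IDs, file names, and the helpers `pair_by_position` and `pair_by_id` are all hypothetical, and plain lists stand in for channels. It assumes the polygon output for one sample happens to be emitted before the other, which is enough to break positional pairing.

```python
# Each item mimics a Nextflow tuple of (meta, file).
# Sample IDs and file names are made up for illustration.
transcripts = [
    ({"id": "sampleA"}, "sampleA_transcripts.csv"),
    ({"id": "sampleB"}, "sampleB_transcripts.csv"),
]

# Assumption: the polygon task for sampleB finished first,
# so its output entered the channel first.
polygons = [
    ({"id": "sampleB"}, "sampleB_polygons.geojson"),
    ({"id": "sampleA"}, "sampleA_polygons.geojson"),
]


def pair_by_position(left, right):
    """Drop meta and pair files purely by arrival order (FIFO)."""
    return [(l_file, r_file) for (_, l_file), (_, r_file) in zip(left, right)]


def pair_by_id(left, right):
    """Pair files whose meta['id'] matches, regardless of arrival order."""
    right_by_id = {meta["id"]: r_file for meta, r_file in right}
    return [
        (meta, l_file, right_by_id[meta["id"]])
        for meta, l_file in left
        if meta["id"] in right_by_id
    ]


positional = pair_by_position(transcripts, polygons)
keyed = pair_by_id(transcripts, polygons)

print("by position:", positional)  # sampleA transcripts end up with sampleB polygons
print("by meta.id: ", keyed)       # every tuple holds files from a single sample
```

In Nextflow itself, keyed pairing of this kind is usually expressed by keeping `meta` in every emitted tuple and combining channels with the `join` operator. Whether that matches the exact changes made in this PR is not shown here.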