Description of feature
At the moment, DNA and RNA samples are combined in the consensus module of the pipeline by patient, as given in the samplesheet. In more complex scenarios, where there is more than one sample per patient, this does not work. Ideally, DNA and RNA should be combined by sample name to make sure the proper pairs are assigned together.
To dos:
- Revise code for DNA-RNA combinations in the vcf_consensus subworkflow
- Change patient combinations for sample combinations
- Review ids of filenames and how outputs are named (many are named through sample as well, which could cause conflicts if DNA and RNA have the same name).
- Review samplesheet: potentially add data_type=RNA/DNA to the samplesheet instead of identifying RNA by the status, i.e. status should only define whether a sample is tumour or normal, and data_type whether it is DNA or RNA.
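To make the proposal concrete, here is a minimal sketch in Python (not the pipeline's Nextflow code) of pairing by sample rather than by patient. It assumes a samplesheet with patient, sample, status and the proposed data_type column; the row values, the `pair_dna_rna` helper and the `_DNA`/`_RNA` id suffixes are all hypothetical and only illustrate one way to avoid naming conflicts.

```python
from collections import defaultdict

# Hypothetical samplesheet rows. The data_type column is the addition proposed
# in this issue; every value below is made up for illustration.
rows = [
    {"patient": "patient1", "sample": "tumourA", "status": "tumour", "data_type": "DNA"},
    {"patient": "patient1", "sample": "tumourA", "status": "tumour", "data_type": "RNA"},
    {"patient": "patient1", "sample": "tumourB", "status": "tumour", "data_type": "DNA"},
    {"patient": "patient1", "sample": "tumourB", "status": "tumour", "data_type": "RNA"},
    {"patient": "patient1", "sample": "normal1", "status": "normal", "data_type": "DNA"},
]


def pair_dna_rna(rows):
    """Pair tumour DNA and RNA rows that share both patient and sample name."""
    grouped = defaultdict(dict)
    for row in rows:
        if row["status"] != "tumour":
            continue  # normals are not part of a DNA-RNA pair in this sketch
        key = (row["patient"], row["sample"])
        grouped[key][row["data_type"]] = row

    pairs = []
    for (patient, sample), by_type in sorted(grouped.items()):
        # Only emit a pair when both data types exist for the same sample.
        if "DNA" in by_type and "RNA" in by_type:
            pairs.append({
                "patient": patient,
                "sample": sample,
                # Suffix ids with the data type so DNA and RNA outputs
                # that share a sample name do not collide.
                "dna_id": f"{sample}_DNA",
                "rna_id": f"{sample}_RNA",
            })
    return pairs


pairs = pair_dna_rna(rows)
for pair in pairs:
    print(pair)
```

Grouping on (patient, sample) instead of patient alone keeps tumourA and tumourB as separate pairs even though they belong to the same patient.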
This comes from a very useful discussion with @tdanhorn, thank you.
Note:
How samples are currently combined when there are several samples for the same patient is unclear, due to a lack of testing and test data. A workaround is to treat the desired combinations as different patients. For example, patient1 might become patient1_A, patient1_B, patient1_C, etc. A major flaw of this workaround is that you need to specify the normal sample for each patient as well, which is redundant if you want to use the same normal for most samples. Alternatively, one could submit the pipeline separately for each patient set and use -resume on every run, so the normal processes are cached and only the tumours are processed. Not ideal, but hopefully this will get solved in time. If you have any feedback or want to propose any solutions, please feel free to comment.
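The renaming workaround can be sketched as a small Python helper that rewrites samplesheet rows before the pipeline is launched. This is only an illustration under stated assumptions: the column names, the row values and the `split_patient` function are hypothetical and not part of the pipeline.

```python
# Hypothetical samplesheet rows for one patient with two tumour samples
# that share a single normal. All values are made up for illustration.
rows = [
    {"patient": "patient1", "sample": "tumourA", "status": "tumour", "data_type": "DNA"},
    {"patient": "patient1", "sample": "tumourA", "status": "tumour", "data_type": "RNA"},
    {"patient": "patient1", "sample": "tumourB", "status": "tumour", "data_type": "DNA"},
    {"patient": "patient1", "sample": "normal1", "status": "normal", "data_type": "DNA"},
]


def split_patient(rows, patient, groups):
    """Rename one patient into patient_<suffix> groups.

    groups maps a suffix (e.g. "A") to the tumour sample names that belong
    to that group. Normal rows are copied into every group, which is the
    redundancy the workaround introduces.
    """
    out = []
    for row in rows:
        if row["patient"] != patient:
            out.append(dict(row))  # other patients are left untouched
            continue
        for suffix, samples in groups.items():
            if row["status"] == "normal" or row["sample"] in samples:
                out.append({**row, "patient": f"{patient}_{suffix}"})
    return out


expanded = split_patient(rows, "patient1", {"A": ["tumourA"], "B": ["tumourB"]})
for row in expanded:
    print(row["patient"], row["sample"], row["status"], row["data_type"])
```

The normal row appears once per group in the result, which shows why the workaround becomes redundant when most samples share the same normal.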