This project adds a module to the nf-core/sarek pipeline that filters out genomic contaminants. This is analogous to what the nf-core/rnaseq pipeline currently does with BBSplit. We will focus on using xengsort for the filtering. The steps are:
- finishing off the xengsort module
- making a subworkflow to run xengsort
- integrating this subworkflow into sarek
All skill levels welcome.
Goals
Introduce filtering of genomic contaminants as a feature of the sarek pipeline.