GATK GermlineCNVCaller Procedure

Hello

I am trying to use the GATK4 gCNV caller on exome sequencing data to call CNVs from read depth, but I am confused about how to actually use the tool. From my understanding, I need to run the workflow on training samples in COHORT mode, then run the workflow again on test samples. I have finished running my training samples through DetermineGermlineContigPloidy and GermlineCNVCaller, but I do not know whether I should run the test samples using RUN mode or use PostProcessIntervals.

One other question: where would the call and model "shards" be?

Answers

  • sleeslee Member, Broadie, Dev ✭✭✭

    @dislek you might find the main gCNV tutorial at https://gatkforums.broadinstitute.org/gatk/discussion/11684 useful, in addition to the supplementary notebooks that @bhanuGandham linked. Hopefully, this main tutorial makes the intended workflow more clear.

    In COHORT mode, GermlineCNVCaller will learn a denoising model while simultaneously calling CNVs in your training samples. You can subsequently use this denoising model to run additional test samples in CASE mode, which will simply call CNVs in those samples.
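To make the COHORT/CASE distinction above concrete, here is a minimal sketch that only builds the two GermlineCNVCaller argument lists for a single shard and prints them. It runs nothing. The flag names are written as I recall them from the GATK 4.1 tutorial, so please confirm them with `gatk GermlineCNVCaller --help` for your release. All file and directory names, and the helper `gcnv_caller_cmd` itself, are hypothetical.

```python
from typing import List, Optional


def gcnv_caller_cmd(
    run_mode: str,
    read_counts: List[str],
    ploidy_calls: str,
    output_dir: str,
    output_prefix: str,
    intervals: Optional[str] = None,
    model_dir: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a GermlineCNVCaller argument list for one shard.

    Flag names are assumptions based on the GATK 4.1 tutorial.
    """
    cmd = ["gatk", "GermlineCNVCaller", "--run-mode", run_mode]
    if run_mode == "COHORT":
        # COHORT mode learns the denoising model on this shard's intervals.
        if intervals is None:
            raise ValueError("COHORT mode needs the shard's interval list")
        cmd += ["-L", intervals, "--interval-merging-rule", "OVERLAPPING_ONLY"]
    elif run_mode == "CASE":
        # CASE mode reuses a model directory written by an earlier COHORT run.
        if model_dir is None:
            raise ValueError("CASE mode needs a model directory from a COHORT run")
        cmd += ["--model", model_dir]
    else:
        raise ValueError(f"unknown run mode: {run_mode}")
    for path in read_counts:
        cmd += ["-I", path]
    cmd += [
        "--contig-ploidy-calls", ploidy_calls,
        "--output", output_dir,
        "--output-prefix", output_prefix,
    ]
    return cmd


# Hypothetical training run on shard 1 of 2.
cohort = gcnv_caller_cmd(
    "COHORT",
    ["train1.counts.tsv", "train2.counts.tsv"],
    "ploidy-cohort-calls",
    "gcnv-cohort",
    "shard_1of2",
    intervals="shard_1of2.interval_list",
)
# Hypothetical test sample, called against the model from that same shard.
case = gcnv_caller_cmd(
    "CASE",
    ["test1.counts.tsv"],
    "ploidy-case-calls",
    "gcnv-case",
    "shard_1of2",
    model_dir="gcnv-cohort/shard_1of2-model",
)
print(" ".join(cohort))
print(" ".join(case))
```

The CASE command points `--model` at a directory named `<output-prefix>-model` inside the COHORT output directory. As far as I recall, that is where the model shard is written, with the call shard next to it as `<output-prefix>-calls`, but check your own output directory to be sure.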

    GermlineCNVCaller is designed to run on suitably sized shards of the exome/genome (however, each GermlineCNVCaller shard will need the result of DetermineGermlineContigPloidy, which is run over the entire exome/genome). Results from individual shards are then stitched together using PostprocessGermlineCNVCalls to create single-sample VCFs that cover the entire exome/genome.
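As a rough illustration of the stitching step described above, the sketch below assembles a PostprocessGermlineCNVCalls argument list for one sample from the per-shard directories. It assumes each GermlineCNVCaller shard wrote `<prefix>-model` and `<prefix>-calls` under its output directory, and that the flag names match what I remember from the GATK 4.1 tutorial. The helper names and all paths are hypothetical. Depending on the release, additional arguments (for example allosomal contigs or a sequence dictionary) may be needed, so check the tool's `--help`.

```python
import posixpath
from typing import List


def shard_dirs(output_dir: str, prefixes: List[str], kind: str) -> List[str]:
    """Return '<output_dir>/<prefix>-<kind>' for each shard prefix, in order."""
    return [posixpath.join(output_dir, f"{p}-{kind}") for p in prefixes]


def postprocess_cmd(
    model_output: str,
    calls_output: str,
    prefixes: List[str],
    ploidy_calls: str,
    sample_index: int,
    vcf_prefix: str,
) -> List[str]:
    """Build (but do not run) a PostprocessGermlineCNVCalls argument list.

    Flag names are assumptions based on the GATK 4.1 tutorial.
    """
    cmd = ["gatk", "PostprocessGermlineCNVCalls"]
    # Model and call shards are listed in the same shard order.
    for d in shard_dirs(model_output, prefixes, "model"):
        cmd += ["--model-shard-path", d]
    for d in shard_dirs(calls_output, prefixes, "calls"):
        cmd += ["--calls-shard-path", d]
    cmd += [
        "--contig-ploidy-calls", ploidy_calls,
        "--sample-index", str(sample_index),
        "--output-genotyped-intervals", f"{vcf_prefix}.intervals.vcf.gz",
        "--output-genotyped-segments", f"{vcf_prefix}.segments.vcf.gz",
    ]
    return cmd


# Hypothetical two-shard CASE run: models come from the COHORT output
# directory, calls from the CASE output directory.
cmd = postprocess_cmd(
    "gcnv-cohort", "gcnv-case",
    ["shard_1of2", "shard_2of2"],
    "ploidy-case-calls", 0, "test1",
)
print(" ".join(cmd))
```

This is run once per sample (one `--sample-index` each), which is what yields the single-sample VCFs covering the whole exome/genome.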

    On the other hand, PreprocessIntervals is a simple tool that is intended to help you construct bins for CollectReadCounts. It can either take a target list (e.g., for WES) or any other list of genomic regions via -L (e.g., autosomes + allosomes for WGS) and perform simple operations such as padding and dividing regions into equally sized bins.
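To show what "padding and dividing regions into equally sized bins" means in practice, here is a small stand-alone sketch of the arithmetic on a single region. It is not GATK's implementation and it ignores details such as merging overlapping targets after padding. The function name `pad_and_bin` and the coordinates are hypothetical. In the tool itself, I believe the corresponding options are `--padding` and `--bin-length`, with a bin length of 0 meaning no binning (the usual choice for WES targets).

```python
from typing import List, Tuple


def pad_and_bin(
    start: int,
    end: int,
    padding: int,
    bin_length: int,
    contig_length: int,
) -> List[Tuple[int, int]]:
    """Pad a 1-based inclusive region, then split it into bins.

    Illustrative only; not the algorithm GATK uses internally.
    """
    # Extend both sides, clipped to the contig boundaries.
    padded_start = max(1, start - padding)
    padded_end = min(contig_length, end + padding)
    if bin_length <= 0:
        # No binning requested: keep the padded region as one interval.
        return [(padded_start, padded_end)]
    bins = []
    s = padded_start
    while s <= padded_end:
        # The last bin may be shorter than bin_length.
        e = min(s + bin_length - 1, padded_end)
        bins.append((s, e))
        s = e + 1
    return bins


# A 1000 bp target padded by 250 bp on each side, then cut into 1000 bp bins.
print(pad_and_bin(1001, 2000, 250, 1000, 10_000))
# The same target with binning disabled.
print(pad_and_bin(1001, 2000, 250, 0, 10_000))
```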
