Information about the result of CNVDiscoveryPipeline

Hi Bob, I got the results of CNVDiscoveryPipeline.
In the VCF file:
SVTYPE=CNV. How can I tell whether this CNV is a DEL or a DUP?
And if it is a DUP, how many times is the CNV repeated?
And what do the CN, CNQ, CNL, and CNP fields mean, respectively?
Thank you very much!



  • bhandsaker Member, Broadie, Moderator admin

    Those four fields are defined in the VCF specification (and are listed in the VCF header).
    CN is the integer copy number call from Genome STRiP.
    CNQ/CNL/CNP are analogous to GQ/GL/GP. They represent:
    CNQ: phred-scaled quality of the CN call
    CNL: Vector of log10 likelihoods of each CN state starting from zero up to some maximum derived from the data (copy number states above the maximum have negligible likelihood)
    CNP: Like CNL, but a posterior likelihood based on the frequency distribution in the population estimated from the genotyped cohort
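
As a rough illustration of how a CNL-style vector can be read (this is not Genome STRiP code), the sketch below turns a list of log10 likelihoods into normalized probabilities and picks the most likely copy number. The function name `summarize_cnl`, the cap of 99, and the Phred-style quality formula are assumptions made for the example; the thread does not say how Genome STRiP actually derives CNQ.

```python
import math

def summarize_cnl(cnl):
    """Summarize a vector of log10 likelihoods indexed by copy number (0, 1, 2, ...).

    Returns (most likely copy number, normalized probabilities, Phred-style quality).
    The quality formula is an illustrative convention, not Genome STRiP's definition.
    """
    max_l = max(cnl)
    # Subtract the maximum before exponentiating to avoid numeric underflow.
    weights = [10 ** (l - max_l) for l in cnl]
    total = sum(weights)
    probs = [w / total for w in weights]
    best_cn = max(range(len(cnl)), key=lambda i: cnl[i])
    # Probability that the best call is wrong, summed over all other states.
    p_error = sum(p for i, p in enumerate(probs) if i != best_cn)
    qual = 99.0 if p_error <= 0 else min(99.0, -10 * math.log10(p_error))
    return best_cn, probs, qual

# Hypothetical CNL values for copy number states 0..3
best_cn, probs, qual = summarize_cnl([-10.0, -3.0, 0.0, -4.0])
print(best_cn, [round(p, 4) for p in probs], round(qual, 1))
```

With these made-up values the most likely state is copy number 2, and the other states together carry only about 0.1% of the probability.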

    You didn't mention CNF, but this is the "fractional" copy number, which is a point estimate of the most likely copy number based on read depth alone. This isn't currently in the VCF spec.

    To determine if a particular sample carries a deletion or duplication, compare CN to the expected ploidy for that sample at that site (i.e. taking into account sex on the sex chromosomes).
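
A minimal sketch of that comparison, assuming a human sample and a simplified ploidy model (2 on autosomes, sex-dependent on X and Y, pseudoautosomal regions ignored). The helper names `expected_ploidy` and `classify_sample` and the "REF" label are hypothetical and only for illustration. The difference `cn - ploidy` also gives the number of copies gained or lost relative to the expected state, which is one way to read "how many times" a duplication is repeated.

```python
def expected_ploidy(chrom, sex):
    """Expected copy number for a human sample (simplified; ignores PAR regions).

    sex is "M" or "F"; chrom may be given with or without a "chr" prefix.
    """
    name = chrom[3:] if chrom.startswith("chr") else chrom
    if name == "X":
        return 2 if sex == "F" else 1
    if name == "Y":
        return 0 if sex == "F" else 1
    return 2

def classify_sample(cn, ploidy):
    """Compare the called copy number (CN) to the expected ploidy."""
    if cn < ploidy:
        return "DEL"
    if cn > ploidy:
        return "DUP"
    return "REF"

# Hypothetical calls: (chromosome, sex, CN)
for chrom, sex, cn in [("chr1", "M", 3), ("X", "M", 1), ("X", "F", 1)]:
    ploidy = expected_ploidy(chrom, sex)
    print(chrom, sex, cn, classify_sample(cn, ploidy), cn - ploidy)
```

Note that CN=1 on chromosome X is the expected state for a male sample but a deletion for a female sample, which is why sex has to be taken into account.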

    You may also be interested in the CopyNumberClass annotator, which emits the distribution of observed copy numbers and also classifies the sites as DEL/DUP/MIXED (MIXED meaning that there is evidence for both deletion and duplication alleles compared to the reference).
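
To give a feel for what a site-level DEL/DUP/MIXED classification means, here is a simplified sketch that tallies observed copy numbers across samples and labels the site. It is not the CopyNumberClass annotator's implementation: the function `classify_site`, the "REF" label for sites with no variation, and the rule of comparing each sample's CN to its expected ploidy are all assumptions for this example (the real annotator reasons about deletion and duplication alleles).

```python
from collections import Counter

def classify_site(calls):
    """Classify a site from per-sample (cn, expected_ploidy) pairs.

    Returns (distribution of observed copy numbers, site label).
    """
    distribution = Counter(cn for cn, _ in calls)
    has_del = any(cn < ploidy for cn, ploidy in calls)
    has_dup = any(cn > ploidy for cn, ploidy in calls)
    if has_del and has_dup:
        label = "MIXED"
    elif has_del:
        label = "DEL"
    elif has_dup:
        label = "DUP"
    else:
        label = "REF"  # hypothetical label: no sample deviates from expected ploidy
    return distribution, label

# Hypothetical autosomal site: four samples with expected ploidy 2
distribution, label = classify_site([(2, 2), (1, 2), (3, 2), (2, 2)])
print(dict(distribution), label)
```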