
ContEst without normal.bam

Hello. We have only tumor samples and no paired normal samples. Can we use the panel_of_normal.vcf file that Mutect2 generated from over 100 normal samples as the genotypes? I tried the following command, but the result is empty. How can I run ContEst without a normal sample? Thanks.

java -jar GenomeAnalysisTK.jar \
-T ContEst \
-R hs37d.fa \
-I tumor.bam \
--genotypes Panel_of_normal.vcf \
--popfile hg19_population_stratified_af_hapmap_3.3.FIX.vcf.gz \
-L target.bed \
-isr INTERSECTION -o contamination_out.txt


Best Answer


  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭
    edited October 2016

    Hi @RebeccaS,

    ContEst requires the matched normal's genotypes because it calculates contamination from a subset of sites that are germline homozygous variant (hom-var) in your normal and non-hom-var in the tumor. If you don't have the normal BAM, you can use a VCF that genotypes the matched normal, e.g. genotype array data. Alternatively, you can subset your normal sample from a cohort VCF using SelectVariants. See this previous discussion and this one for additional information.
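
As a rough illustration of the SelectVariants route mentioned above, the following sketch assembles a GATK3 SelectVariants command that pulls one sample out of a multi-sample VCF. The file names cohort.vcf and normal_genotypes.vcf and the sample name NORMAL_01 are placeholders, not values from this thread; the reference file is the one used in the original command. The script only prints the command so it can be reviewed before it is run.

```shell
# Placeholder names (assumptions); replace with your own files and sample.
REF=hs37d.fa
COHORT_VCF=cohort.vcf
NORMAL_SAMPLE=NORMAL_01
OUT_VCF=normal_genotypes.vcf

# Assemble the GATK3 SelectVariants call; -sn keeps only the named sample column.
SELECT_CMD="java -jar GenomeAnalysisTK.jar -T SelectVariants -R $REF -V $COHORT_VCF -sn $NORMAL_SAMPLE -o $OUT_VCF"

# Print the command for review; to execute it, run: eval "$SELECT_CMD"
echo "$SELECT_CMD"
```

The resulting single-sample VCF would then be the file to pass to ContEst with --genotypes, assuming the selected sample comes from the same individual as the tumor.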

  • RebeccaS Member

    Thanks, Shlee.

    I don't understand the suggestion to subset the normal sample from a cohort VCF using SelectVariants. In the "this previous discussion" link you shared, I don't see any details about running SelectVariants. I can look up how to run the SelectVariants command, but I am not sure what I should select from panel_of_normal.vcf with it: homozygous sites with >80% of bases showing ALT and at least 50x coverage, or something else? Thanks.

  • shlee (Cambridge) Member, Broadie ✭✭✭✭✭


    Refer to the SelectVariants tool documentation for information on SelectVariants. I shared the other links for additional general background information that should be of interest to those using ContEst. As for your question, you need to select the matching normal sample from the cohort VCF that comes from the same individual as your tumor sample of interest. Just to confirm, does your cohort VCF contain sample-level columns?

    If what I mean is still unclear, then let me refer you to the workshop presentation that covers ContEst. Study the information presented in slides 15–19 of GATKwr12-9-Somatic_SNPs_and_indels.pdf in the March 2016 workshop (1603) folder. I expanded this portion of the presentation back in the day, so if anything is unclear, please ping me with your questions. This is the last workshop that extensively covers somatic variant calling in that it covers ContEst (the other workshops covering ContEst were 1511 and 1602). That means you won't find these slides in the later presentations on the same subject.
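
One quick way to check whether a VCF contains sample-level columns is to inspect its #CHROM header line. The following is a minimal sketch, assuming a standard tab-delimited VCF header; list_vcf_samples is a helper name made up for this example, not a GATK tool.

```shell
# List the sample columns of a VCF (plain or gzip-compressed).
# A sites-only VCF has no FORMAT/sample columns, so nothing is printed for it.
list_vcf_samples() {
    vcf="$1"
    case "$vcf" in
        *.gz) gzip -dc "$vcf" ;;
        *)    cat "$vcf" ;;
    esac | awk -F'\t' '
        /^#CHROM/ {
            # Columns 1-8 are the fixed fields, column 9 is FORMAT,
            # and sample names start at column 10.
            for (i = 10; i <= NF; i++) print $i
            exit
        }'
}
```

For example, `list_vcf_samples Panel_of_normal.vcf` prints one sample name per line. If it prints nothing, the file is sites-only and there is no sample column for SelectVariants -sn to select.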

  • RebeccaS Member


    Unfortunately, we don't have the matching normal sample in the cohort VCF. As I said in my original comment, we have only the tumor sample and no normal sample from the same individual. The panel of normals VCF that we have was constructed from 100 normal samples, but none of these normal samples match the tumor sample that I have. In this scenario, can we still use ContEst? Thanks again.
