GATK4: can't get the right CalculateContamination result

yyj Member
Question regarding CalculateContamination (GATK 4.1.2.0):

With CalculateContamination in matched tumor-normal mode, I get:
contamination error
NaN 1.0

When I look at the tumor.table and normal.table files generated by GetPileupSummaries, I don't see any unusual data structure or values.

What could be the problem?

Answers

  • bhanuGandham (Cambridge, MA) Member, Administrator, Broadie, Moderator

    Hi @yyj

    Can you please post the exact commands you used for GetPileupSummaries and CalculateContamination? Also, please post the first few records of the GetPileupSummaries table.

  • yyj Member
    @bhanuGandham
    The GetPileupSummaries commands are as follows:
    for sample in *.bam; do
        base=${sample%%.*}
        gatk4 GetPileupSummaries \
            -I "$sample" \
            -V ../../reference/somatic-hg38_af-only-gnomad.hg38.SNP_biallelic.vcf.gz \
            -O ./contamination/"$base"_getpileupsummaies.table
    done

    The CalculateContamination commands are as follows:
    for sample in *.bam; do
        base=${sample%%.*}
        gatk4 GetPileupSummaries \
            -I "$sample" \
            -V ../../reference/somatic-hg38_af-only-gnomad.hg38.SNP_biallelic.vcf.gz \
            -L ../../reference/S07604514_hs_hg38/S07604514_hs_hg38/S07604514_Padded.bed \
            -O ./contamination/"$base"_getpileupsummaies.table
    done

    The first 30 lines of getpileupsummaies.table:
    contig position ref_count alt_count other_alt_count allele_frequency
    chr1 12882 0 0 0 0.021
    chr1 13110 0 0 0 0.149
    chr1 13143 0 0 0 0.05
    chr1 13149 0 0 0 0.011
    chr1 13178 0 0 0 0.061
    chr1 13273 0 0 0 0.115
    chr1 13281 0 0 0 0.04
    chr1 13418 0 0 0 0.183
    chr1 13613 0 0 0 0.02
    chr1 13621 0 0 0 0.011
    chr1 13649 0 0 0 0.054
    chr1 13684 0 0 0 0.027
    chr1 13752 0 0 0 0.018
    chr1 13757 0 0 0 0.021
    chr1 14522 0 0 0 0.05
    chr1 14542 0 0 0 0.072
    chr1 14574 0 0 0 0.101
    chr1 14590 0 0 0 0.097
    chr1 14599 0 0 0 0.123
    chr1 14604 0 0 0 0.125
    chr1 14610 0 0 0 0.128
    chr1 14626 0 0 0 0.011
    chr1 14671 0 0 0 0.011
    chr1 14677 0 0 0 0.058
    chr1 14773 0 0 0 0.016
    chr1 14843 0 0 0 0.018
    chr1 14933 0 0 0 0.158
    chr1 14948 0 0 0 0.055
  • bhanuGandham (Cambridge, MA) Member, Administrator, Broadie, Moderator
    edited May 29

    @yyj
    1) --intervals is a required argument for GetPileupSummaries; see: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_hellbender_tools_walkers_contamination_GetPileupSummaries.php

    2) You wrote:

    The CalculateContamination commands are as follows:
    for sample in *.bam;
    do base=${sample%%.*};
    gatk4 GetPileupSummaries

    In this case you are using GetPileupSummaries instead of CalculateContamination. See: https://software.broadinstitute.org/gatk/documentation/tooldocs/current/org_broadinstitute_hellbender_tools_walkers_contamination_CalculateContamination.php
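
    The two points above can be sketched as a pair of shell functions: one runs GetPileupSummaries with an intervals file, the other runs CalculateContamination on the resulting tables. This is only a sketch under stated assumptions. The function names and the `GATK` variable are hypothetical, the paths are copied from the post above, and `-matched` is used here as the argument that supplies the matched normal's pileup table.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the GATK variable are illustrative, not from the thread.
GATK="${GATK:-gatk4}"   # launcher name as it appears in the post above

# Paths copied from the post above; adjust to your own layout.
SITES="../../reference/somatic-hg38_af-only-gnomad.hg38.SNP_biallelic.vcf.gz"
TARGETS="../../reference/S07604514_hs_hg38/S07604514_hs_hg38/S07604514_Padded.bed"

# Step 1: summarise read support at the common-variant sites for one BAM.
pileup_summary() {
    local bam="$1" out="$2"
    "$GATK" GetPileupSummaries \
        -I "$bam" \
        -V "$SITES" \
        -L "$TARGETS" \
        -O "$out"
}

# Step 2: estimate tumor contamination, passing the matched normal's table.
contamination() {
    local tumor_table="$1" normal_table="$2" out="$3"
    "$GATK" CalculateContamination \
        -I "$tumor_table" \
        -matched "$normal_table" \
        -O "$out"
}
```

    With hypothetical file names, one would call `pileup_summary tumor.bam tumor.table`, then `pileup_summary normal.bam normal.table`, and finally `contamination tumor.table normal.table contamination.table`.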

  • Rishabh (India) Member
    edited July 22
    I got the same result as @yyj. The commands I used were:

    java -jar /PathToGatk4.1.2.0 GetPileupSummaries \
        -I tumor.bam \
        -V output.vcf \
        -L V6.bed \
        -O tumor_getpileupsummaries.table

    OUTPUT
    #<METADATA>SAMPLE=Hg19_P1
    contig position ref_count alt_count other_alt_count allele_frequency
    chr1 981727 0 0 0 0.076
    chr1 986538 0 0 0 0.115
    chr1 986541 0 0 0 0.154
    chr1 986545 0 0 0 0.148
    chr1 986549 0 0 0 0.143
    chr1 986557 0 0 0 0.154

    I generated output.vcf with this command (is this approach right?):
    bcftools query -H -f '%CHROM %POS %ID %REF %ALT %QUAL %FILTER AF=[%AF]\t \n' normal1.vcf.gz > filtered.vcf
    Then I added the header to this file and kept only the biallelic sites with SelectVariants.

    Command used for CalculateContamination:
    java -jar /PathToGatk CalculateContamination \
        -I tumor_getpileupsummaries.table \
        -tumor-segmentation segments.table \
        -O tumor_calculatecontamination.table

    OUTPUT (segments.table)
    #<METADATA>SAMPLE=Hg19_P1
    contig start end minor_allele_fraction

    OUTPUT (tumor_calculatecontamination.table)
    sample contamination error
    Hg19_P1 NaN 1.0

    I am stuck at this point: no values come out except 0.

    Can you guide me on this issue, @bhanuGandham?
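
    Every row posted in this thread has zero ref_count, alt_count and other_alt_count. If that holds for the whole table, there may be no read evidence for the estimate to use, which would be consistent with the NaN result. That is an inference from the posted rows, not something confirmed in the thread. A minimal sketch for checking a full table, assuming the column order shown in the header above (the function name `covered_sites` is hypothetical):

```shell
# Count the sites in a GetPileupSummaries table that have any read support.
# Assumed columns: contig position ref_count alt_count other_alt_count allele_frequency
covered_sites() {
    awk '
        /^#/ || $1 == "contig" { next }     # skip the metadata and header lines
        { total++ }                         # every remaining line is one site
        ($3 + $4 + $5) > 0 { covered++ }    # at least one read observed at the site
        END { printf "%d of %d sites have reads\n", covered, total }
    ' "$1"
}
```

    Running `covered_sites tumor_getpileupsummaries.table` reports how many sites carry reads. If the count is zero, it may be worth checking that the BAM, the sites VCF and the intervals file use the same reference build and contig names.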
  • Rishabh (India) Member
    @bshifaw
    Thank you for the reply.
    I looked at the documents you provided, but the issue is still not resolved. Can you suggest something else?
  • bshifaw Member, Broadie, Moderator

    Does running CalculateContamination with the following jar file result in the same issue?
    contamination-patch-5-27-2019.jar
