GATK v3.7.0 SelectVariants discards variant

Hi

I've found an unusual case where a variant disappears during the hard-filtering workflow.

#Joint genotyping
/share/apps/jre-distros/jre1.8.0_101/bin/java -Djava.io.tmpdir=/state/partition1/tmpdir -Xmx16g -jar /share/apps/GATK-distros/GATK_3.7.0/GenomeAnalysisTK.jar \
-T GenotypeGVCFs \
-R /state/partition1/db/human/gatk/2.8/b37/human_g1k_v37.fasta \
-V GVCFs.list \
-L /data/diagnostics/pipelines/GermlineEnrichment/GermlineEnrichment-"$version"/"$panel"/"$panel"_ROI_b37.bed \
-o "$seqId"_variants.vcf \
-ped "$seqId"_pedigree.ped \
-dt NONE

After joint genotyping, GATK outputs a single record that carries both a true-positive heterozygous SNV (11:2906165A>G) and a deletion as alternate alleles:

11 2906165 . AGCCGGGGCCGGG GGCCGGGGCCGGG,A 5324.68 . AC=16,1;AF=0.364,0.023;AN=44;BaseQRankSum=0.786;ClippingRankSum=0.00;DP=707;ExcessHet=1.2164;FS=0.760;InbreedingCoeff=0.1366;MLEAC=16,1;MLEAF=0.364,0.023;MQ=60.20;MQRankSum=0.00;QD=13.31;ReadPosRankSum=-1.033e+00;SOR=0.643 GT:AD:DP:GQ:PGT:PID:PL 0/1:21,26,0:47:99:.:.:507,0,364,569,442,1011

I select SNVs for hard filtering with:

#Select SNPs
/share/apps/jre-distros/jre1.8.0_101/bin/java -Djava.io.tmpdir=/state/partition1/tmpdir -Xmx4g -jar /share/apps/GATK-distros/GATK_3.7.0/GenomeAnalysisTK.jar \
-T SelectVariants \
-R /state/partition1/db/human/gatk/2.8/b37/human_g1k_v37.fasta \
-V "$seqId"_variants.lcr.vcf \
-selectType SNP \
-L /data/diagnostics/pipelines/GermlineEnrichment/GermlineEnrichment-"$version"/"$panel"/"$panel"_ROI_b37.bed \
-o "$seqId"_snps.vcf \
-dt NONE

and Indels with:

#Select INDELs
/share/apps/jre-distros/jre1.8.0_101/bin/java -Djava.io.tmpdir=/state/partition1/tmpdir -Xmx16g -jar /share/apps/GATK-distros/GATK_3.7.0/GenomeAnalysisTK.jar \
-T SelectVariants \
-R /state/partition1/db/human/gatk/2.8/b37/human_g1k_v37.fasta \
-V "$seqId"_variants.lcr.vcf \
-selectType INDEL \
-L /data/diagnostics/pipelines/GermlineEnrichment/GermlineEnrichment-"$version"/"$panel"/"$panel"_ROI_b37.bed \
-o "$seqId"_indels.vcf \
-dt NONE

This row is absent from both output VCF files. See attached screenshot (order: all, SNPs, Indels).

Thanks
Matt

Answers

  • CardiffBioinf (Cardiff, Member)
    Accepted Answer

    My oversight, sorry. I found this answer:

    @Kurt said:
    That would be -selectType MIXED I believe.
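
To illustrate why the record above ends up as MIXED rather than SNP or INDEL, here is a minimal sketch that classifies a record from its REF and ALT columns. The function name `classify_variant_type` is hypothetical, and the rules are a simplified approximation of the type logic, not GATK's actual code: each ALT allele is typed on its own, and a record whose ALT alleles disagree on type is reported as MIXED. Skipping the `*` allele is also an assumption of this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: a simplified approximation of variant typing, not GATK's code.
# $1 = REF allele, $2 = comma-separated ALT alleles (VCF columns 4 and 5).
classify_variant_type() {
    awk -v ref="$1" -v alts="$2" 'BEGIN {
        n = split(alts, alt, ",")
        record_type = ""
        for (i = 1; i <= n; i++) {
            a = alt[i]
            # Assumption: "." (no ALT) and "*" (spanning deletion) are skipped.
            if (a == "." || a == "*") continue
            if (substr(a, 1, 1) == "<")          t = "SYMBOLIC"
            else if (length(a) != length(ref))   t = "INDEL"
            else if (length(a) == 1)             t = "SNP"
            else                                 t = "MNP"
            # Alleles of differing types make the whole record MIXED.
            if (record_type == "")       record_type = t
            else if (record_type != t)   record_type = "MIXED"
        }
        print (record_type == "" ? "NO_VARIATION" : record_type)
    }'
}

# The record from the question: a same-length substitution plus a deletion.
classify_variant_type AGCCGGGGCCGGG GGCCGGGGCCGGG,A   # prints MIXED
```

In the record from the question, the first ALT has the same 13-base length as REF and differs only at the first base, while the second ALT is a deletion, so the two alleles do not share a type. In GATK 3.x, `-selectType` can be given more than once, so `-selectType INDEL -selectType MIXED` should keep such records together with the indels.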

  • CardiffBioinf (Cardiff, Member)

    Here is the workaround I'm planning:

    #Select SNPs
    /share/apps/jre-distros/jre1.8.0_101/bin/java -Djava.io.tmpdir=/state/partition1/tmpdir -Xmx4g -jar /share/apps/GATK-distros/GATK_3.7.0/GenomeAnalysisTK.jar \
    -T SelectVariants \
    -R /state/partition1/db/human/gatk/2.8/b37/human_g1k_v37.fasta \
    -V "$seqId"_variants.lcr.vcf \
    -selectType SNP \
    -L /data/diagnostics/pipelines/GermlineEnrichment/GermlineEnrichment-"$version"/"$panel"/"$panel"_ROI_b37.bed \
    -o "$seqId"_snps.vcf \
    -dt NONE

    #Filter SNPs
    /share/apps/jre-distros/jre1.8.0_101/bin/java -Djava.io.tmpdir=/state/partition1/tmpdir -Xmx4g -jar /share/apps/GATK-distros/GATK_3.7.0/GenomeAnalysisTK.jar \
    -T VariantFiltration \
    -R /state/partition1/db/human/gatk/2.8/b37/human_g1k_v37.fasta \
    -V "$seqId"_snps.vcf \
    --filterExpression "QUAL < 30.0" \
    --filterName "LowQual" \
    --filterExpression "QD < 2.0" \
    --filterName "QD" \
    --filterExpression "FS > 60.0" \
    --filterName "FS" \
    --filterExpression "SOR > 3.0" \
    --filterName "SOR" \
    --filterExpression "MQ < 40.0" \
    --filterName "MQ" \
    --filterExpression "MQRankSum < -12.5" \
    --filterName "MQRankSum" \
    --filterExpression "ReadPosRankSum < -8.0" \
    --filterName "ReadPosRankSum" \
    -L /data/diagnostics/pipelines/GermlineEnrichment/GermlineEnrichment-"$version"/"$panel"/"$panel"_ROI_b37.bed \
    -o "$seqId"_snps_filtered.vcf \
    -dt NONE

    #Select non-snps (INDEL, MIXED, MNP, SYMBOLIC, NO_VARIATION)
    /share/apps/jre-distros/jre1.8.0_101/bin/java -Djava.io.tmpdir=/state/partition1/tmpdir -Xmx16g -jar /share/apps/GATK-distros/GATK_3.7.0/GenomeAnalysisTK.jar \
    -T SelectVariants \
    -R /state/partition1/db/human/gatk/2.8/b37/human_g1k_v37.fasta \
    -V "$seqId"_variants.lcr.vcf \
    --selectTypeToExclude SNP \
    -L /data/diagnostics/pipelines/GermlineEnrichment/GermlineEnrichment-"$version"/"$panel"/"$panel"_ROI_b37.bed \
    -o "$seqId"_non_snps.vcf \
    -dt NONE

    #Filter non-snps (INDEL, MIXED, MNP, SYMBOLIC, NO_VARIATION)
    /share/apps/jre-distros/jre1.8.0_101/bin/java -Djava.io.tmpdir=/state/partition1/tmpdir -Xmx4g -jar /share/apps/GATK-distros/GATK_3.7.0/GenomeAnalysisTK.jar \
    -T VariantFiltration \
    -R /state/partition1/db/human/gatk/2.8/b37/human_g1k_v37.fasta \
    -V "$seqId"_non_snps.vcf \
    --filterExpression "QUAL < 30.0" \
    --filterName "LowQual" \
    --filterExpression "QD < 2.0" \
    --filterName "QD" \
    --filterExpression "FS > 200.0" \
    --filterName "FS" \
    --filterExpression "SOR > 10.0" \
    --filterName "SOR" \
    --filterExpression "ReadPosRankSum < -20.0" \
    --filterName "ReadPosRankSum" \
    --filterExpression "InbreedingCoeff < -0.8" \
    --filterName "InbreedingCoeff" \
    -L /data/diagnostics/pipelines/GermlineEnrichment/GermlineEnrichment-"$version"/"$panel"/"$panel"_ROI_b37.bed \
    -o "$seqId"_non_snps_filtered.vcf \
    -dt NONE

    #Combine filtered VCF files
    /share/apps/jre-distros/jre1.8.0_101/bin/java -Djava.io.tmpdir=/state/partition1/tmpdir -Xmx4g -jar /share/apps/GATK-distros/GATK_3.7.0/GenomeAnalysisTK.jar \
    -T CombineVariants \
    -R /state/partition1/db/human/gatk/2.8/b37/human_g1k_v37.fasta \
    --variant "$seqId"_snps_filtered.vcf \
    --variant "$seqId"_non_snps_filtered.vcf \
    -o "$seqId"_combined_filtered.vcf \
    -genotypeMergeOptions UNSORTED \
    -dt NONE

    @Geraldine_VdAuwera if this sounds nuts please let me know

    Thanks
    Matt
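
A quick way to check that a split like the one above loses no records is to compare record counts, as in this minimal sketch. The helper names `count_records` and `check_split_complete` are hypothetical, and the sketch assumes uncompressed VCF files and an input that already lies within the same `-L` intervals as the two outputs.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; assume plain-text (uncompressed) VCF files.

# Count data records, i.e. non-empty lines that are not "#" header lines.
count_records() {
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}

# $1 = input VCF, $2 = SNP subset, $3 = non-SNP subset.
# Returns success only if the two subsets together hold as many records as the input.
check_split_complete() {
    local all snps rest
    all=$(count_records "$1")
    snps=$(count_records "$2")
    rest=$(count_records "$3")
    echo "input=$all snps=$snps non_snps=$rest"
    [ "$all" -eq $((snps + rest)) ]
}

# Example call (file names follow the commands above):
# check_split_complete "$seqId"_variants.lcr.vcf "$seqId"_snps.vcf "$seqId"_non_snps.vcf
```

Equal counts do not prove that the same records are present, but a mismatch would flag a dropped row such as the one in the original question.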

  • CardiffBioinf (Cardiff, Member)

    @Sheila Thanks, that's really helpful!
