We’re moving the GATK website, docs and forum to a new platform. Read the full story and breakdown of key changes on this blog.
If you happen to see a question you know the answer to, please do chime in and help your fellow community members. We encourage our fourm members to be more involved, jump in and help out your fellow researchers with their questions. GATK forum is a community forum and helping each other with using GATK tools and research is the cornerstone of our success as a genomics research community.We appreciate your help!
Test-drive the GATK tools and Best Practices pipelines on Terra
Check out this blog post to learn how you can get started with GATK and try out the pipelines in preconfigured workspaces (with a user-friendly interface!) without having to install anything.
We will be out of the office for a Broad Institute event from Dec 10th to Dec 11th 2019. We will be back to monitor the GATK forum on Dec 12th 2019. In the meantime we encourage you to help out other community members with their queries.
Thank you for your patience!
False indel PASSed but not covered in IGV
I came across a set of false positive indels obtained in some targeted sequencing experiments.
Situations are similar to the following, where a deletion looks to have high quality, according to quality and filter. Genotype quality looks also good. But in IGV the situation is different being not possible to view exactly the variant
chr1 68905301 GACTTGCTTATTGGATCTTCCTTGTC G 412.73 PASS AC=1;AF=0.500;AN=2;BaseQRankSum=-1.810e-01;ClippingRankSum=0.00;DP=99;ExcessHet=3.0103;FS=18.765;MLEAC=1;MLEAF=0.500;MQ=56.92;MQ0=0;MQRankSum=-3.160e-01;QD=4.17;ReadPosRankSum=-8.529e+00;SOR=2.712;set=variant2 GT:AD:DP:GQ:PGT:PID:PL 0/1:74,25:99:99:0|1:68905301_GA
Please find attached IGV screenshot of the previous variant. indel1.png and indel2.png represent the same variant with different zoom in.
The commands I used respectively for variant calling and filtering are the following:
java -Xmx64g -jar GenomeAnalysisTK.jar -T HaplotypeCaller -I EYE113_trim_align_sortidx.bam -R ucsc.hg19.fa -nct 10 --emitRefConfidence GVCF --variant_index_type LINEAR --variant_index_parameter 128000 -stand_call_conf 30 -stand_emit_conf 10 --min_base_quality_score 30 --intervals eye_haloplex.bed -A Coverage -A FisherStrand -A BaseQualityRankSumTest -A HaplotypeScore -A MappingQualityRankSumTest -A MappingQualityZero -A QualByDepth -A RMSMappingQuality -A ReadPosRankSumTest -A SpanningDeletions -o EYE113_varcall.g.vcf
java -Xmx25g -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -V EYE113_varcall.g.vcf -R ucsc.hg19.fa -stand_call_conf 30 -stand_emit_conf 10 -A Coverage -A FisherStrand -A BaseQualityRankSumTest -A HaplotypeScore -A InbreedingCoeff -A MappingQualityRankSumTest -A MappingQualityZero -A QualByDepth -A RMSMappingQuality -A ReadPosRankSumTest -o EYE113_TRY_genot.g.vcf
SelectVariants to select only Indels
java -Xmx8g -jar GenomeAnalysisTK.jar -T SelectVariants --variant EYE113_TRY_genot.g.vcf -R ucsc.hg19.fa -selectType INDEL -o EYE113_TRY_genot_varfilt_INDEL.vcf
Variant Filtration to filter only indels
/java -Xmx64g -jar GenomeAnalysisTK.jar -T VariantFiltration --variant EYE113_TRY_genot_varfilt_INDEL.vcf -R ucsc.hg19.fa --clusterWindowSize 10 --filterExpression "QD < 2.0 || FS > 200.0 || ReadPosRankSum < -20.0" --filterName "HARD_TO_VALIDATE" --filterExpression "QUAL < 30.0" --filterName "VeryLowQual" --filterExpression "QUAL > 30.0 && QUAL < 100.0" --filterName "LowQual" -o EYE113_TRY_genot_varfilt_INDEL_filt.vcf
//I did selectVariants and VariantFiltration for SNPs, too then I combine the two vcfs with CombineVariants.
I did not have any specific error in any of the program. Only during VariantFiltration I have the warning:
WARN 21:46:22,732 Interpreter - ![38,47]: 'QD < 2.0 || FS > 60.0 || MQ < 40.0 || MQRankSum < -12.5 || ReadPosRankSum < -8.0;' undefined variable MQRankSum
but, as far as I can see in the forum, I can ignore this.