GVCF error

psanchez820, Mexico City, Member

Hello, I ran the Genotype Caller using GATK version GenomeAnalysisTK-3.3-0. Now that we are in the filtering process, we are having trouble because of an error we found in the GVCF. Below is an example (the affected field is in bold): the record indicates that the variant has a second alternate allele when it does not. This is affecting the filtering process, and I am wondering how we could fix it.

Thanks!

-Paulina-

Final GVCF
5 37036492 . C ** T ** 301.07 PASS AC=3;AF=0.065;AN=46;BaseQRankSum=-7.200e-01;ClippingRankSum=0.00;DP=201;FS=2.105;GQ_MEAN=27.70;GQ_STDDEV=19.31;InbreedingCoeff=-0.1151;MLEAC=3;MLEAF=0.065;MQ=60.00;MQ0=0;MQRankSum=-3.540e-01;NCC=7;QD=13.69;ReadPosRankSum=-7.200e-01;SOR=0.242;VQSLOD=5.78;culprit=MQ GT:AD:DP:GQ:PL 0/2:13,0,3:16:19:19,57,345,0,288,279 0/0:19,0,0:19:0:0,0,400,0,400,400 0/1:3,2:5:34:34,0,74 0/0:2,0:2:6:0,6,69 0/2:3,0,1:4:12:12,21,75,0,54,51 0/0:5,0:5:15:0,15,176 0/2:17,0,3:20:12:12,63,458,0,395,386 0/0:16,0,0:16:0:0,0,359,0,359,359 0/0:1,0:1:3:0,3,35 0/0:2,0:2:6:0,6,67 0/0:2,0:2:6:0,6,60 0/0:13,0,0:13:5:0,5,368,5,368,368 0/0:17,0,0:17:16:0,16,485,16,485,485 0/0:4,0:4:12:0,12,137 0/0:3,0:3:0:0,0,37

Pan001N GVCF
5 37036492 . C <NON_REF> . . END=37036492 GT:DP:GQ:MIN_DP:PL 0/0:3:0:3:0,0,37
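One way to find every record affected by this kind of mismatch is to compare the allele indices used in each GT field against the number of alleles listed in ALT. The following is a minimal Python sketch, not part of GATK: the function name `alleles_out_of_range` is hypothetical, and the example record is an abbreviated copy of the one above (INFO shortened to `AC=3`, first three samples only, bold markers removed). It only reports the mismatch; it does not fix it.

```python
import re


def alleles_out_of_range(vcf_line):
    """Return (sample_index, GT) pairs whose GT uses an allele index not listed in ALT.

    Hypothetical helper for illustration; expects one VCF data line with the
    standard column order: CHROM POS ID REF ALT QUAL FILTER INFO FORMAT samples...
    """
    fields = vcf_line.split()  # accepts tabs or spaces
    alt, fmt, samples = fields[4], fields[8], fields[9:]
    # Index 0 is REF; indices 1..n_alt refer to the comma-separated ALT alleles.
    n_alt = 0 if alt == "." else len(alt.split(","))
    gt_index = fmt.split(":").index("GT")
    flagged = []
    for i, sample in enumerate(samples):
        gt = sample.split(":")[gt_index]
        # Genotypes may be unphased (/) or phased (|); "." marks a missing allele.
        indices = [int(a) for a in re.split(r"[/|]", gt) if a != "."]
        if any(a > n_alt for a in indices):
            flagged.append((i, gt))
    return flagged


# Abbreviated version of the record above (assumption: INFO and samples trimmed).
record = (
    "5\t37036492\t.\tC\tT\t301.07\tPASS\tAC=3\tGT:AD:DP:GQ:PL\t"
    "0/2:13,0,3:16:19:19,57,345,0,288,279\t"
    "0/0:19,0,0:19:0:0,0,400,0,400,400\t"
    "0/1:3,2:5:34:34,0,74"
)
print(alleles_out_of_range(record))  # [(0, '0/2')]
```

Running the function over every data line of the final VCF would list the sites where a genotype refers to a second alternate allele that ALT does not declare.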

Answers

  • Sheila, Broad Institute, Member, Broadie, Moderator

    @psanchez820
    Hi Paulina,

    This is indeed very strange. Can you post the exact commands you ran, starting with Haplotype Caller?

    Thanks,
    Sheila

  • psanchez820, Mexico City, Member

    @Geraldine_VdAuwera Ok, I'll run it and let you know.

    Thanks!

  • wubin, China, Member

    @Geraldine_VdAuwera said:
    Also, can you run the 3.4 version on this site and let us know if the error persists?

    A position that exists in "./target.bed" does not exist in the gVCF file, but after GenotypeGVCFs a SNP turns up at this position.

    I'm running the "HaplotypeCaller" walker to generate a gVCF file. The command line was as follows:

    java -Xmx15g -Djava.io.tmpdir=pwd/tmp \
    -jar ./GATK/GenomeAnalysisTK.jar \
    -T HaplotypeCaller \
    -R ./hg19/ucsc.hg19.fasta \
    -I ./output.recal.cleaned.bam \
    --dbsnp ./Data/dbsnp_138.hg19.excluding_sites_after_129.vcf \
    --emitRefConfidence GVCF \
    --variant_index_type LINEAR \
    --variant_index_parameter 128000 \
    -L ./target.bed \
    -o ./SNP_Indel_HaplotypeCaller.g.vcf

    I then used "GenotypeGVCFs" to generate a VCF file that contains only variants. The command line was as follows:

    ==============================================
    java -Xmx10g -Djava.io.tmpdir=pwd/tmp -jar ./GATK/GenomeAnalysisTK.jar \
    -T GenotypeGVCFs \
    -R ./hg19/ucsc.hg19.fasta \
    --variant ./SNP_Indel_HaplotypeCaller.g.vcf \
    -stand_call_conf 30 \
    -stand_emit_conf 10 \
    -o ./pedi_merged.vcf

    In the file "pedi_merged.vcf", I found many variants that cannot be found in the corresponding gVCF file, such as:

    ==============================================

    chr10 126089434 . G A 36.78 . AC=1;AF=0.500;AN=2;BaseQRankSum=0.736;ClippingRankSum=-7.360e-01;DP=3;FS=0.000;GQ_MEAN=26.00;MLEAC=1;MLEAF=0.500;MQ=60.00;MQ0=0;MQRankSum=-7.360e-01;NCC=0;QD=12.26;ReadPosRankSum=0.736;SOR=1.179 GT:AD:DP:GQ:PL 0/1:1,2:3:26:65,0,26

    This SNP cannot be found in the file "SNP_Indel_HaplotypeCaller.g.vcf". In that file, we see:

    chr10 126089432 . G <NON_REF> . . END=126089433 GT:DP:GQ:MIN_DP:PL 0/0:4:12:4:0,12,139

    chr10 126089435 . T <NON_REF> . . END=126089437 GT:DP:GQ:MIN_DP:PL 0/0:5:15:5:0,15,171

    So not only the SNP but even the position "chr10 126089434" is absent from the gVCF file, yet after "GenotypeGVCFs" we get a SNP that has no information in the corresponding gVCF file.
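Whether a position is covered by any record in a gVCF can be checked by comparing it against each record's POS..END span. The following is a minimal Python sketch under stated assumptions: `covering_records` is a hypothetical helper, not a GATK tool, and the two example records are written with the `<NON_REF>` ALT column that GATK reference blocks normally carry. Records without an `END=` entry are assumed to span the length of their REF allele.

```python
def covering_records(gvcf_lines, chrom, pos):
    """Return the gVCF data lines whose span (POS..END) includes the 1-based position.

    Hypothetical helper for illustration; header lines (starting with '#') are skipped.
    """
    hits = []
    for line in gvcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()  # accepts tabs or spaces
        rec_chrom, start, ref, info = fields[0], int(fields[1]), fields[3], fields[7]
        # Default span is the REF allele; reference blocks override it with END=.
        end = start + len(ref) - 1
        for entry in info.split(";"):
            if entry.startswith("END="):
                end = int(entry[4:])
        if rec_chrom == chrom and start <= pos <= end:
            hits.append(line)
    return hits


blocks = [
    "chr10\t126089432\t.\tG\t<NON_REF>\t.\t.\tEND=126089433\tGT:DP:GQ:MIN_DP:PL\t0/0:4:12:4:0,12,139",
    "chr10\t126089435\t.\tT\t<NON_REF>\t.\t.\tEND=126089437\tGT:DP:GQ:MIN_DP:PL\t0/0:5:15:5:0,15,171",
]
print(covering_records(blocks, "chr10", 126089434))  # [] : neither block covers the site
```

Applied to the whole gVCF rather than to two excerpted lines, an empty result would confirm that the site is truly missing, while a non-empty result would point to a separate record at that position that was overlooked.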

    When I used the "HaplotypeCaller" walker to generate the gVCF file, I used the "-L ./target.bed" argument. The file "./target.bed" contains the position "chr10 126089434":

    ==============================================

    chr10 126089161 126089800

    So a position that exists in "./target.bed" does not exist in the gVCF file, but after GenotypeGVCFs a SNP turns up at this position! Can anyone tell me what is wrong with my command line, or whether there is some other problem with the GATK "HaplotypeCaller"?
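The claim that the target file contains the site can be checked explicitly, keeping in mind that BED start coordinates are 0-based and the end is exclusive, while VCF positions are 1-based. A minimal Python sketch follows; `bed_contains` is a hypothetical helper and the single interval is the one quoted above.

```python
def bed_contains(bed_lines, chrom, pos):
    """Return True if the 1-based position falls inside any BED interval.

    Hypothetical helper for illustration. BED intervals are 0-based, half-open,
    so a 1-based position p is inside (start, end) when start < p <= end.
    """
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments, and track/browser header lines.
        if len(fields) < 3 or line.startswith("#") or fields[0] in ("track", "browser"):
            continue
        b_chrom, b_start, b_end = fields[0], int(fields[1]), int(fields[2])
        if b_chrom == chrom and b_start < pos <= b_end:
            return True
    return False


targets = ["chr10\t126089161\t126089800"]
print(bed_contains(targets, "chr10", 126089434))  # True
```

With this interval the site at 126089434 is well inside the target, so an off-by-one at the interval boundary does not explain the observation.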

  • Sheila, Broad Institute, Member, Broadie, Moderator