
VariantsToTable issue

Dear Gatk team,

I used VariantsToTable to extract specific fields with the following command:

gatk VariantsToTable -V Annotated_NG.hg19_multianno.vcf -F CHROM -F AF -F MAF -F DP -GF GT -GF AD -GF DP -GF GQ -GF PL -F gnomAD_genome_ALL -F gnomAD_genome_AFR -F gnomAD_genome_AMR -F gnomAD_genome_ASJ -F gnomAD_genome_EAS -F gnomAD_genome_FIN -F gnomAD_genome_NFE -F gnomAD_genome_OTH -O Saudi_NG_gnomAD.txt

And I got the following error:

htsjdk.tribble.TribbleException: The provided VCF file is malformed at approximately line number 544: unparsable vcf record with allele *TTCT, for input source: Annotated_NG.hg19_multianno.vcf

Could anyone help me with this issue?

Answers

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @Sakhaa

    It looks like there is an issue with the input VCF file. How was this file generated? Please run the ValidateVariants tool on Annotated_NG.hg19_multianno.vcf to detect the cause of the error.
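
    As a quick complement to ValidateVariants, the offending record can be located with a small script. The sketch below is not part of GATK. It assumes a tab-delimited VCF and relies on the VCF convention that a bare `*` is the spanning-deletion allele, so anything like `*TTCT` is reported. The function name `find_malformed_star_alleles` is made up for illustration.

```python
import gzip


def find_malformed_star_alleles(vcf_path):
    """Yield (line_number, chrom, pos, allele) for REF/ALT alleles that misuse '*'.

    A bare '*' in ALT is accepted (spanning deletion); symbolic alleles in
    angle brackets such as <NON_REF> or <*> are skipped. Any other allele
    containing '*' is reported, as is any '*' in REF.
    """
    opener = gzip.open if vcf_path.endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        for line_number, line in enumerate(handle, start=1):
            if line.startswith("#") or not line.strip():
                continue  # skip header and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 5:
                continue  # not a complete data line
            chrom, pos, ref, alt = fields[0], fields[1], fields[3], fields[4]
            if "*" in ref:
                yield line_number, chrom, pos, ref
            for allele in alt.split(","):
                if allele.startswith("<"):
                    continue  # symbolic allele
                if "*" in allele and allele != "*":
                    yield line_number, chrom, pos, allele
```

    Calling `list(find_malformed_star_alleles("Annotated_NG.hg19_multianno.vcf"))` would list the line numbers to inspect. The line reported by htsjdk is only approximate, so scanning the whole file is safer than looking at line 544 alone.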

  • Sakhaa (Member)

    Hi @bhanuGandham

    I ran the validation using the following command:
    gatk ValidateVariants -R $REF -V Annotated_NG.hg19_multianno.vcf --dbsnp $dbSN

    and I got the following error:
    A USER ERROR has occurred: Input Annotated_NG.hg19_multianno.vcf fails strict validation: one or more of the ALT allele(s) for the record at position 1:10409 are not observed at all in the sample genotypes of type:

    The record at position 1:10409 is:

    1   10409   .   ACCCTAACCCTAACCCTAACCCTAACCCTAAC    A,* 1478.73 .   AC=11,0;AF=0.0514019,0;AN=214;BaseQRankSum=0.524;ClippingRankSum=0;DP=3050;ExcessHet=9.4611;FS=30.723;InbreedingCoeff=-0.1038;MLEAC=10,18;MLEAF=0.083,0.15;MQ=24.2;MQRankSum=0.21;QD=20.39;ReadPosRankSum=-0.524;SF=0,1;SOR=3.228;NS=107;MAF=0.0514019,0;AC_Het=9,0;AC_Hom=2,0;AC_Hemi=0,0;HWE=0.236917,1;ExcHet=0.979421,1;ANNOVAR_DATE=2018-04-16;gnomAD_genome_ALL=0.0535;gnomAD_genome_AFR=0.0217;gnomAD_genome_AMR=0.0317;gnomAD_genome_ASJ=0.0625;gnomAD_genome_EAS=0.0130;gnomAD_genome_FIN=0.0443;gnomAD_genome_NFE=0.0784;gnomAD_genome_OTH=0.0556;ALLELE_END;ANNOVAR_DATE=2018-04-16;gnomAD_genome_ALL=.;gnomAD_genome_AFR=.;gnomAD_genome_AMR=.;gnomAD_genome_ASJ=.;gnomAD_genome_EAS=.;gnomAD_genome_FIN=.;gnomAD_genome_NFE=.;gnomAD_genome_OTH=.;ALLELE_END   GT:PID:PGT:GQ:DP:PL:AD  0/0:.:.:0:28:0,0,484,0,484,484:28,0,0   0/1:10409_ACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:78:4:78,0,90,84,96,181:2,2,0   0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:1|1:23:7:302,302,302,23,23,0:0,0,7   0/0:.:.:6:2:0,6,54,6,54,54:2,0,0    0/1:.:.:75:5:121,0,75,127,85,213:2,3,0  0/0:.:.:36:3:81,84,126,0,42,36:1,0,2    0/0:.:.:23:25:0,23,584,23,584,584:25,0,0    0/0:.:.:5:28:0,5,606,5,606,606:28,0,0   0/0:.:.:0:48:0,0,909,0,909,909:48,0,0   0/0:.:.:0:64:0,0,1161,0,1161,1161:64,0,0    0/0:.:.:0:67:0,0,1183,0,1183,1183:67,0,0    0/0:.:.:0:74:0,0,1457,0,1457,1457:74,0,0    0/0:.:.:21:9:0,22,631,21,630,629:9,0,0  0/0:.:.:42:50:0,42,630,42,630,630:50,0,0    0/0:.:.:0:79:0,0,1070,0,1070,1070:79,0,0    0/0:.:.:89:8:89,104,276,0,172,162:5,0,3 0/0:.:.:0:45:0,0,798,0,798,798:45,0,0   0/0:.:.:99:10:187,202,387,0,185,168:5,0,5   0/0:.:.:0:76:0,0,1573,0,1573,1573:76,0,0    0/0:.:.:29:2:29,32,74,0,42,39:1,0,1 0/0:.:.:0:58:0,0,1000,0,1000,1000:58,0,0    0/0:.:.:0:43:0,0,778,0,778,778:43,0,0   0/0:.:.:0:53:0,0,1011,0,1011,1011:53,0,0    0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:99:9:109,126,532,0,406,397:6,0,3 
0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:53:11:53,79,362,0,283,276:9,0,2  0/0:.:.:0:26:0,0,483,0,483,483:26,0,0   0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:14:6:0,16,276,14,274,273:5,0,1   0/0:.:.:0:43:0,0,448,0,448,448:43,0,0   0/0:.:.:0:25:0,0,406,0,406,406:25,0,0   0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:8:3:0,8,95,8,95,95:3,0,0 0/1:10409_ACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:23:4:23,0,89,32,92,124:3,1,0   0/0:.:.:0:22:0,0,349,0,349,349:22,0,0   0/0:.:.:0:28:0,0,354,0,354,354:28,0,0   0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:99:7:115,126,301,0,175,166:4,0,3 0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:39:2:39,42,84,0,42,39:1,0,1  0/0:.:.:18:6:0,18,304,18,304,303:6,0,0  0/0:.:.:36:41:0,36,540,36,540,540:41,0,0    0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:69:7:204,210,294,0,84,69:2,0,5   0/0:.:.:36:52:0,36,540,36,540,540:52,0,0    0/0:.:.:0:37:0,0,633,0,633,633:37,0,0   0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:53:7:53,71,458,0,387,380:6,0,1   0/0:.:.:0:29:0,0,236,0,236,236:29,0,0   0/0:.:.:0:57:0,0,961,0,961,961:57,0,0   0/0:.:.:0:62:0,0,1053,0,1053,1053:62,0,0    0/0:.:.:0:32:0,0,463,0,463,463:32,0,0   0/1:.:.:24:4:81,0,24,86,33,119:2,2,0    0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:0|1:7:3:0,7,233,7,233,233:3,0,0  0/0:.:.:99:6:117,126,243,0,117,108:3,0,3    0/0:.:.:70:4:117,120,199,0,79,70:1,0,3  0/0:.:.:36:38:0,36,540,36,540,540:38,0,0    0/1:.:.:99:8:145,0,146,157,158,316:4,4,0    0/0:.:.:0:27:0,0,565,0,565,565:27,0,0   0/0:.:.:24:35:0,24,360,24,360,360:35,0,0    0/1:.:.:14:6:14,0,115,30,118,148:5,1,0  0/0:.:.:0:36:0,0,623,0,623,623:36,0,0   0/0:.:.:0:42:0,0,548,0,548,548:42,0,0   0/1:.:.:99:5:109,0,133,115,142,257:2,3,0    0/0:.:.:0:36:0,0,641,0,641,641:36,0,0   0/0:.:.:16:5:0,16,192,16,192,192:5,0,0  1/1:.:.:3:1:44,3,0,45,4,46:0,1,0    0/0:10343_CCCTAACCCTA_C:0|1:12:4:0,12,163,12,163,163:4,0,0  0/0:.:.:0:24:0,0,473,0,473,473:24,0,0   
0/0:.:.:21:31:0,21,315,21,315,315:31,0,0    0/0:.:.:0:31:0,0,505,0,505,505:31,0,0   0/0:.:.:9:3:96,97,98,9,10,0:0,0,3   0/0:.:.:9:25:0,9,135,9,135,135:25,0,0   0/0:.:.:0:25:0,0,225,0,225,225:25,0,0   0/0:.:.:0:22:0,0,540,0,540,540:22,0,0   0/0:.:.:0:26:0,0,341,0,341,341:26,0,0   0/0:.:.:0:22:0,0,353,0,353,353:22,0,0   0/0:.:.:75:3:90,96,180,0,84,75:2,0,1    0/0:.:.:33:38:0,33,495,33,495,495:38,0,0    0/1:.:.:26:3:26,0,75,32,78,110:2,1,0    0/0:.:.:15:42:0,15,955,15,955,955:42,0,0    0/0:.:.:0:24:0,0,381,0,381,381:24,0,0   0/0:.:.:0:33:0,0,418,0,418,418:33,0,0   0/0:.:.:92:5:155,158,263,0,105,92:1,0,4 0/0:.:.:3:40:0,3,783,3,783,783:40,0,0   0/0:.:.:84:5:84,90,428,0,337,328:2,0,3  0/0:.:.:0:24:0,0,364,0,364,364:24,0,0   0/0:.:.:0:26:0,0,483,0,483,483:26,0,0   0/1:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:1|0:54:4:78,0,54,84,60,144:2,2,0 0/0:.:.:18:34:0,18,270,18,270,270:34,0,0    0/0:.:.:0:35:0,0,655,0,655,655:35,0,0   0/0:.:.:15:29:0,15,225,15,225,225:29,0,0    0/0:.:.:12:28:0,12,180,12,180,180:28,0,0    0/0:.:.:0:32:0,0,527,0,527,527:32,0,0   0/0:.:.:9:25:0,9,135,9,135,135:25,0,0   0/0:.:.:0:16:0,0,223,0,223,223:16,0,0   0/0:.:.:39:2:39,42,109,0,67,64:1,0,1    0/0:.:.:1:4:0,1,129,9,132,140:3,1,0 0/0:.:.:0:40:0,0,743,0,743,743:40,0,0   0/0:.:.:0:27:0,0,534,0,534,534:27,0,0   0/0:.:.:0:35:0,0,497,0,497,497:35,0,0   0/0:10403_ACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAAC_A:1|1:1:2:42,43,46,1,3,0:1,0,1 0/0:.:.:0:26:0,0,292,0,292,292:26,0,0   0/0:.:.:0:23:0,0,225,0,225,225:23,0,0   0/0:.:.:0:34:0,0,651,0,651,651:34,0,0   0/0:.:.:15:24:0,15,225,15,225,225:24,0,0    0/0:.:.:99:7:142,150,294,0,144,131:3,0,4    0/0:.:.:0:26:0,0,321,0,321,321:26,0,0   0/0:.:.:0:33:0,0,468,0,468,468:33,0,0   0/0:.:.:99:5:131,134,268,0,134,121:1,0,4    0/0:.:.:12:24:0,12,180,12,180,180:24,0,0    0/0:.:.:21:24:0,21,315,21,315,315:24,0,0    0/0:.:.:0:21:0,0,322,0,322,322:21,0,0   0/0:.:.:0:30:0,0,572,0,572,572:30,0,0
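
    The validation message above says that at least one ALT allele at 1:10409 is never referenced by a sample genotype. In the pasted record the ALT field is A,* and the genotypes appear to use only alleles 0 and 1, which would leave `*` unobserved. A minimal Python sketch for checking this on a single record follows. It assumes a tab-delimited line taken straight from the file (the copy pasted above shows spaces), and the function name `unobserved_alt_alleles` is hypothetical.

```python
import re


def unobserved_alt_alleles(vcf_line):
    """Return the ALT alleles of one VCF data line that no sample genotype uses."""
    fields = vcf_line.rstrip("\n").split("\t")
    alts = fields[4].split(",")
    format_keys = fields[8].split(":")
    gt_index = format_keys.index("GT")

    observed = set()
    for sample in fields[9:]:
        parts = sample.split(":")
        if gt_index >= len(parts):
            continue  # sample column has no GT entry
        # GT looks like 0/1 or 0|1; '.' marks a missing call and is ignored
        for allele_index in re.split(r"[/|]", parts[gt_index]):
            if allele_index.isdigit():
                observed.add(int(allele_index))

    # ALT alleles are numbered from 1 (0 is REF)
    return [alt for i, alt in enumerate(alts, start=1) if i not in observed]
```

    For a record like the one above this would return the `*` allele, which matches the ValidateVariants complaint.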
    
  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @Sakhaa

    The error message suggests that your vcf file is malformed. How was this vcf generated? Unfortunately there’s not much we can do for you if your files have formatting issues that seem to have been introduced by a third party program.

  • Sakhaa (Member)
    edited November 2019

    Thank you @bhanuGandham

    I used the germline Best Practices workflow. I generated two VCF files, one per cohort, and wanted to merge them into one file, so I used another tool to merge them.

    May I ask which GATK4 tool can be used to merge two cohorts into one VCF file?

  • bhanuGandham (Cambridge MA; Member, Administrator, Broadie, Moderator)

    Hi @Sakhaa

    Hmmm, if you generated GVCF files, you could use CombineGVCFs or GenomicsDBImport. If instead you want to combine VCF files from different cohorts, we do not have a tool for that purpose in GATK4. You could use CombineVariants from GATK3 for this, but at your own risk: we no longer support GATK3, although many users have found CombineVariants very helpful. Here is a link to its tool docs: https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_variantutils_CombineVariants.php
