Bug with VariantFiltration with missing genotype info column?

Hi there,

I noticed that when I combine GVCFs from multiple samples and genotype them, samples with missing genotypes are written as ./.:0,0:0

1) It looks like VariantFiltration fails when ./.:0,0:0 appears right after the GT:AD:DP:GQ:PL column, i.e., when the first sample has the missing genotype. If it happens for the 2nd or nth sample, it seems okay.

2) Why are the missing genotypes denoted by ./.:0,0:0 and not ./.:0,0:0:0:0? The GQ and PL fields don't have corresponding zeros. I am wondering whether other tools that take VCF files might reject them because of this.

Thanks,
Deepthi

P.S. I am using the latest build, v3.4-46-gbc02625.
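
For reference on question 2: the VCF specification allows trailing FORMAT fields to be dropped from a sample column (GT must stay if it is declared), so ./.:0,0:0 under GT:AD:DP:GQ:PL is a trimmed column rather than a malformed one. Below is a minimal Python sketch of what that trimming implies, assuming "." is used as the missing value for the dropped fields. The function name pad_sample_field is hypothetical and is not part of GATK.

```python
def pad_sample_field(format_spec: str, sample: str, missing: str = ".") -> str:
    """Pad a trimmed sample column so it has one value per FORMAT key.

    Assumption: dropped trailing fields are represented by the VCF
    missing value "." (not by zeros).
    """
    keys = format_spec.split(":")
    values = sample.split(":")
    if len(values) > len(keys):
        raise ValueError("sample column has more values than FORMAT keys")
    # Append one missing value for every FORMAT key that was trimmed.
    values += [missing] * (len(keys) - len(values))
    return ":".join(values)


print(pad_sample_field("GT:AD:DP:GQ:PL", "./.:0,0:0"))  # ./.:0,0:0:.:.
```

A fully populated column such as 1/1:0,2:2:6:76,6,0 passes through unchanged.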

Answers

  • dr153, Duke, Member
    edited October 2015

    Yes, I get an error message.

    Command:

    INFO 15:22:05,350 HelpFormatter - Program Args: -R /data/davelab/bix/resources/genomes/hg19/ucsc.hg19.fasta -T VariantFiltration --variant /data/davelab/projects/Xenomousie/xeno_mousie.gvcf.list.restricted.genotyped.vcf -o /data/davelab/projects/Xenomousie/xeno_mousie.gvcf.list.restricted.genotyped.filtereddpandmq.v2.vcf --filterExpression DP > 10 && MQ > 30 --filterName vcfqual

    The error message is:

    ERROR MESSAGE: Line 409: there aren't enough columns for line chr (we expected 9 tokens, and saw 1 ), for input source:

    The line with the error is below:

    chr1 990517 . C T 518.51 . AC=12;AF=1.00;AN=12;DP=18;FS=0.000;MLEAC=12;MLEAF=1.00;MQ=60.00;MQ0=0;QD=28.81;SOR=1.609 GT:AD:DP:GQ:PL ./.:0,0:0 ./.:0,0:0 1/1:0,2:2:6:76,6,0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 1/1:0,2:2:6:74,6,0 ./.:0,0:0 1/1:0,2:2:6:49,6,0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 1/1:0,3:3:9:79,9,0 ./.:0,0:0 ./.:0,0 1/1:0,2:2:6:70,6,0 ./.:0,0:0 ./.:0,0:0 ./.:0,0:0 1/1:0,7:7:21:193,21,0 ./.:0,0:0

    Is this because the DP values are zero, or because the genotypes are ./.?

    2) Thanks for the VCF format clarification.
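
The error above reports a line with a single token ("chr") where 9 were expected, so one quick check is whether the input file really contains a short line at that position. A rough Python sketch, assuming the VCF is tab-delimited and that 9 is the minimum column count for records with genotypes (8 fixed columns plus FORMAT). The function name find_short_lines and the demo lines are hypothetical.

```python
def find_short_lines(lines, min_columns=9):
    """Return (line_number, column_count, text) for short data lines.

    Header lines (starting with '#') and empty lines are skipped.
    Line numbers are 1-based, matching how parsers usually report them.
    """
    short = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text or text.startswith("#"):
            continue
        n_columns = len(text.split("\t"))
        if n_columns < min_columns:
            short.append((number, n_columns, text[:40]))
    return short


demo = [
    "##fileformat=VCFv4.1\n",
    "chr1\t990517\t.\tC\tT\t518.51\t.\tDP=18\tGT:AD:DP:GQ:PL\t./.:0,0:0\n",
    "chr\n",  # a truncated record, like the one named in the error
]
print(find_short_lines(demo))  # [(3, 1, 'chr')]
```

To scan a real file, pass an open file handle instead of the demo list, since the function only needs an iterable of lines.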

  • dr153, Duke, Member

    This solved my issue: I reran gatk-genotyper with "-never_trim_vcf_format_field" and then ran the VariantFiltration tool. That seemed to work!
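
One way to confirm that the rerun wrote full FORMAT fields is to look for sample columns that still carry fewer values than the FORMAT column declares. A small Python sketch under the assumption of a tab-delimited VCF record with FORMAT in column 9; the function name trimmed_samples is hypothetical.

```python
def trimmed_samples(record: str):
    """Return 1-based indices of sample columns shorter than FORMAT.

    Records without sample columns (fewer than 10 fields) yield [].
    """
    columns = record.rstrip("\r\n").split("\t")
    if len(columns) < 10:
        return []
    n_keys = len(columns[8].split(":"))
    return [
        index
        for index, sample in enumerate(columns[9:], start=1)
        if len(sample.split(":")) < n_keys
    ]


record = "\t".join([
    "chr1", "990517", ".", "C", "T", "518.51", ".", "DP=18",
    "GT:AD:DP:GQ:PL", "./.:0,0:0", "1/1:0,2:2:6:76,6,0", "./.:0,0",
])
print(trimmed_samples(record))  # [1, 3]
```

An empty list for every record would suggest that no sample column was trimmed.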

  • Sheila, Broad Institute, Member, Broadie, Moderator admin

    @dr153
    Hi,

    Thanks for reporting your solution!

    -Sheila
