Get all VCF files in proper format?

ankurc17ankurc17 BangaloreMember

Hi,

I ran the following command to generate a VCF file with HaplotypeCaller:
java -jar GenomeAnalysisTK.jar -T HaplotypeCaller --dbsnp chr7.vcf -R chr7.fa -I chr7.bam -A AlleleBalance -stand_call_conf 30.0 -stand_emit_conf 10.0 -dt NONE --output_mode EMIT_VARIANTS_ONLY -nt 1 -filterMBQ -rf BadCigar -rf MappingQualityZero -rf BadMate -o test.vcf

However, the output file has an inconsistent FORMAT column: instead of GT:AD:DP:GQ:PL for every variant, some records have only GT and others have GT:GQ:PL. Is there any way to make the FORMAT the same for all variants, so that when there is no value for DP, for example, the record contains NA or 0 instead of omitting the field altogether?

This VCF file feeds into downstream annotation analysis, so I need it in a consistent format that I can write a parser for.

Answers

  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie admin
    Yes, there is an argument to do this. I forget the exact name, but you can find it in the engine documentation.
  • Geraldine_VdAuweraGeraldine_VdAuwera Cambridge, MAMember, Administrator, Broadie admin

    Ah, my teammate @Sheila informs me that the argument I'm thinking of won't work. She will jump in here with some more helpful information shortly.

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @ankurc17
    Hi,

    Have a look at this thread for more information. However, in that case, the truncated FORMAT field seems to be only at non-variant sites. Are you saying you have some sites that are variant and missing the extra annotations? Can you post some examples?

    Thanks.
    Sheila

  • ankurc17ankurc17 BangaloreMember
    edited August 2016

    Hello @Sheila

    Some examples for your reference are

    chr19 34882985 rs77922621 A C 10.90 LowCoverage;LowQual;VeryLowQual AC=2;AF=1.00;AN=2;DB;DP=0;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=NaN;SOR=0.693 GT:GQ:PL 1/1:3:37,3,0

    chr10 120260837 rs2420480 T C 77.78 LowCoverage AC=1;AF=0.500;AN=2;BaseQRankSum=0.000;ClippingRankSum=0.000;DB;DP=4;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=19.44;ReadPosRankSum=-0.319;SOR=0.916 GT ./.

    chr10 135500177 . G C 62.74 LowCoverage AC=2;AF=1.00;AN=2;DP=2;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=39.00;QD=31.37;SOR=0.693 GT:AD:DP:GQ:PL 1/1:0,2:2:6:90,6,0
    chr10 135500179

    As you can see, the last variant has all five FORMAT fields, whereas the other two don't.

    I did try the --never_trim_vcf_format_field argument, but it didn't work.

  • ankurc17ankurc17 BangaloreMember

    Otherwise, the only thing I can think of is to convert my VCF file into a tab-delimited text file using VariantsToTable, get all the required fields in a proper order, match the records in the VCF and the tab-delimited file by chromosome and position, and regenerate the VCF file.

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @ankurc17
    Hi,

    Can you tell us if the missing fields happen very often or just in some small portion of your calls? We have a bug report in for a similar case.

    For now, yes, I think it is best to use VariantsToTable for your purpose.

    -Sheila

  • ankurc17ankurc17 BangaloreMember

    Hi,

    It is only for a few of the variants. I shall proceed with the approach discussed earlier and wait for an updated solution from your end.
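
For anyone hitting the same problem, a possible alternative to the VariantsToTable round trip is to pad the FORMAT column directly when parsing. The snippet below is only a minimal sketch and is not part of GATK: the names `TARGET_KEYS`, `normalize_sample` and `normalize_line` are made up for illustration, and it assumes tab-separated VCF records, a fixed target of GT:AD:DP:GQ:PL, and that `.` (the VCF missing-value marker) is an acceptable placeholder for absent fields.

```python
# Hypothetical helper names; the target layout is taken from the question above.
TARGET_KEYS = ["GT", "AD", "DP", "GQ", "PL"]


def normalize_sample(format_str, sample_str, target_keys=TARGET_KEYS):
    """Re-express one sample column in the target FORMAT, using '.' for absent keys."""
    keys = format_str.split(":")
    values = sample_str.split(":")
    # A sample column may carry fewer values than FORMAT has keys; pad with '.'.
    values += ["."] * (len(keys) - len(values))
    present = dict(zip(keys, values))
    return ":".join(present.get(key, ".") for key in target_keys)


def normalize_line(line, target_keys=TARGET_KEYS):
    """Rewrite FORMAT and all sample columns of one VCF record line."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 10:
        return line  # no FORMAT/sample columns (or an incomplete record)
    fmt = cols[8]
    cols[9:] = [normalize_sample(fmt, sample, target_keys) for sample in cols[9:]]
    cols[8] = ":".join(target_keys)
    return "\t".join(cols) + "\n"
```

With this sketch, a record whose FORMAT is GT:GQ:PL and whose sample column is 1/1:3:37,3,0 becomes GT:AD:DP:GQ:PL with 1/1:.:.:3:37,3,0. Note that any FORMAT keys outside the target list are dropped, so extend `TARGET_KEYS` if the downstream parser needs them.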
