Use VariantsToTable to extract alternate allele count

I'm using the following VariantsToTable command options to extract fields from a VCF file:

/usr/lib/jvm/jre-1.8.0-openjdk/bin/java -Xmx8g -jar /home/1GenomeRef/GATK/GATK_3.5/GenomeAnalysisTK.jar \
-T VariantsToTable \
-R ref.fa \
-V file1.vcf \
-F POS -F ID -F REF -F ALT -F QUAL -F FILTER -F AC -F AN -GF GT \
--showFiltered \
--out outputfile

This extracts the correct fields, but my original VCF file encodes each sample genotype (GT field) numerically, as allele indices (0/0, 0/1 or 1/1), whereas the output table reports the genotype as bases (C/T, for example). So the GT for sample1 might be "0/1" in my original file but is recoded as "C/T" in the new file.

I would prefer to retain the original genotype format but do not see an option that lets me request this. Is there such an option, or another tool I can apply that will quickly recode the new output file?

Thanks so much. (I am following the Best Practices guidelines. We are using GATK version 3.5 deliberately, to keep results as consistent as possible with older data called using version 3.5.)

Answers

  • AEGentry Member

    Thank you, @shlee for the very quick and helpful response.

    I decided to use VCFtools to extract the information I wanted, using the option:

    --extract-FORMAT-info GT

    It's not optimal because the VCFtools commands are generally less flexible, but it serves the purpose for now.
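
For anyone who would rather keep the VariantsToTable output and recode it afterwards, here is a minimal Python sketch of one way to do that. It is not a GATK or VCFtools feature. It assumes the table is tab-delimited with a header row, that it contains REF and ALT columns, that genotype columns are named like sample1.GT, and that multiple ALT alleles are comma-separated; check these against your own output file first. The function names recode_genotype and recode_table are made up for this example.

```python
import re
import sys


def recode_genotype(gt, ref, alt):
    """Map a base-coded genotype such as 'C/T' to allele indices such as '0/1'."""
    # Index 0 is REF; indices 1, 2, ... are the ALT alleles in listed order
    # (assumes a comma-separated ALT column).
    alleles = [ref] + alt.split(",")
    index = {allele: str(i) for i, allele in enumerate(alleles)}
    # Split on '/' or '|' but keep the separator so phasing is preserved.
    parts = re.split(r"([/|])", gt)
    # Alleles not found in REF/ALT (including missing calls '.') become '.'.
    return "".join(p if p in ("/", "|") else index.get(p, ".") for p in parts)


def recode_table(lines):
    """Yield table lines with every '<sample>.GT' column recoded to indices."""
    lines = iter(lines)
    header = next(lines).rstrip("\n").split("\t")
    ref_col = header.index("REF")
    alt_col = header.index("ALT")
    gt_cols = [i for i, name in enumerate(header) if name.endswith(".GT")]
    yield "\t".join(header)
    for line in lines:
        row = line.rstrip("\n").split("\t")
        for i in gt_cols:
            row[i] = recode_genotype(row[i], row[ref_col], row[alt_col])
        yield "\t".join(row)


if __name__ == "__main__":
    # Reads the table on standard input and writes the recoded table
    # to standard output.
    for out_line in recode_table(sys.stdin):
        print(out_line)
```

Saved under a name of your choice, it reads the table on standard input and writes the recoded table to standard output. Any allele that does not match REF or ALT is written as ".", so it is worth spot-checking a few rows against the original VCF.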
