VariantsToBinaryPed tool missing from GATK4?

gaelgarciagaelgarcia Member
edited May 2018 in Ask the GATK team

Hi all,

I've been prepping my data as input for GATK's variant manipulation tool VariantsToBinaryPed, but I just realized that the GATK4 documentation doesn't list this tool; it appears only in the GATK3 documentation. Is there a reason for this?

A bit of background - I need to convert my VCF and the accompanying family info into PLINK's binary PED fileset (.bed / .bim / .fam) to check for pedigree errors using KING.

I was able to install GATK4 after the usual new software hassles - and I'm worried I'll have to install GATK3 instead to actually use VariantsToBinaryPed! 😫

Running the example in the documentation (which I just realized is under GATK3):

       java -jar GenomeAnalysisTK.jar \
       -T VariantsToBinaryPed \
       -R reference.fasta \
       -V ~/MIPS_CSE/MIPS-02-13-18.vcf.bgz \
       -m ~/MIPS/03_IdentityCheck/KING/targeted_seq_ped.fam \
       -bed output.bed \
       -bim output.bim \
       -fam output.fam

returns:

`Error: Unable to access jarfile GenomeAnalysisTK.jar`

Will I have to install GATK3 to use this tool? If so, can GATK3 and GATK4 coexist on my system?

Thanks for any info.
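
For context on the error above: `java -jar` reports `Unable to access jarfile` when the path it is given does not resolve to a readable file, and a bare `GenomeAnalysisTK.jar` is looked up relative to the current directory. Below is a minimal sketch of a pre-flight check. The `GATK3_JAR` variable and the `check_jar` helper are hypothetical names used only for illustration; they are not part of GATK.

```shell
#!/bin/sh
# Hypothetical: path to the GATK3 jar. Defaults to the bare file name,
# which only works when the jar sits in the current directory.
GATK3_JAR="${GATK3_JAR:-GenomeAnalysisTK.jar}"

# check_jar PATH
# Succeeds if PATH is a readable file; otherwise prints a hint and fails.
check_jar() {
    if [ -r "$1" ]; then
        echo "found: $1"
    else
        echo "missing: $1 (java -jar would report 'Unable to access jarfile')" >&2
        return 1
    fi
}

# Example use (not executed here):
#   check_jar "$GATK3_JAR" && java -jar "$GATK3_JAR" -T VariantsToBinaryPed ...
```

Pointing `GATK3_JAR` at an absolute path avoids depending on the directory the command is run from.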


Answers

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @gaelgarcia
    Hi,

    Yep, you will need to install GATK3 if you want to use that tool. VariantsToBinaryPed is not in GATK4. But, perhaps you can find some outside tools to do what you want?

    GATK3 and GATK4 can indeed coexist on your system. It should not be much of an issue to download and run GATK3 from our Downloads section.

    -Sheila
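
As one possible "outside tool" route, PLINK 1.9 itself can read a VCF and write a .bed / .bim / .fam fileset with `--vcf` and `--make-bed`. This is only a sketch, not something recommended in this thread: the `vcf_to_plink_bed` wrapper and the file names are hypothetical, and it assumes `plink` (1.9) is on the PATH. Family and parent information is not taken from the VCF, so the resulting .fam would still need to be updated afterwards (PLINK offers `--update-ids` and `--update-parents` for that).

```shell
#!/bin/sh
# vcf_to_plink_bed VCF OUT_PREFIX
# Hypothetical wrapper: converts a VCF to PLINK's binary fileset
# (OUT_PREFIX.bed / .bim / .fam) using PLINK 1.9.
vcf_to_plink_bed() {
    vcf="$1"
    out_prefix="$2"

    # Fail early with a clear message if plink is not installed.
    if ! command -v plink >/dev/null 2>&1; then
        echo "plink not found on PATH" >&2
        return 127
    fi

    # --vcf reads the VCF, --make-bed writes the binary fileset,
    # --out sets the output prefix.
    plink --vcf "$vcf" --make-bed --out "$out_prefix"
}

# Example use (not executed here):
#   vcf_to_plink_bed MIPS-02-13-18.vcf.bgz output
```

Depending on how the sample IDs are written, PLINK may also need to be told how to split them into family and individual IDs, so check its documentation before relying on the defaults.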

  • gaelgarciagaelgarcia Member
    edited May 2018

    Hi @Sheila - Thank you for your answer.

    I was able to run VariantsToBinaryPed with the Docker image of GATK3. (I wasn't able to run it locally because my system consistently runs out of open file handles; please see here.)

    However, I am still not getting the desired output from the tool: the .bim and .bed files are empty, although the .fam file is being subset and ordered correctly according to the sample IDs in the VCF.

    I get the following output (a warning, with no explicit error):

    INFO 22:50:32,289 ProgressMeter - done 0.0 10.0 s 18.0 w 100.0% 10.0 s 0.0 s

    INFO 22:50:32,289 ProgressMeter - Total runtime 10.92 secs, 0.18 min, 0.00 hours

    Done. There were 1 WARN messages, the first 1 are repeated below.

    WARN 22:50:19,495 IndexDictionaryUtils - Track variant doesn't have a sequence dictionary built in, skipping dictionary validation

    I am wondering if this issue is related to this - the thread is from a few years ago but perhaps the issue has not been solved? Perhaps @Geraldine_VdAuwera knows something about this.

    Thanks again.
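
When a run finishes but the outputs are empty, two quick checks can help narrow things down: whether the input VCF actually contains variant records, and which of the three output files are empty. The sketch below is not a diagnosis of the problem in this post. The helper names `count_vcf_records` and `check_outputs` are hypothetical, and it assumes the VCF is gzip- or bgzip-compressed (bgzip output can be read by `gzip -dc`).

```shell
#!/bin/sh
# count_vcf_records FILE
# Prints the number of non-header lines in a (b)gzip-compressed VCF.
count_vcf_records() {
    gzip -dc < "$1" | grep -vc '^#'
}

# check_outputs PREFIX
# Reports whether PREFIX.bed, PREFIX.bim and PREFIX.fam exist and are
# non-empty; returns 1 if any of them is empty or missing.
check_outputs() {
    status=0
    for ext in bed bim fam; do
        if [ -s "$1.$ext" ]; then
            echo "$1.$ext: non-empty"
        else
            echo "$1.$ext: empty or missing"
            status=1
        fi
    done
    return "$status"
}

# Example use (not executed here):
#   count_vcf_records ~/MIPS_CSE/MIPS-02-13-18.vcf.bgz
#   check_outputs output
```

A record count of zero would point at the input file; a non-zero count with empty outputs would point at how the tool was invoked.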

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @gaelgarcia
    Hi,

    Can you post the exact command you ran? Can you also post the entire log output?

    Thanks,
    Sheila
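
One way to capture both the exact command and the complete log for a post like this is to record the command line in a file and append everything the tool prints to it. This is a minimal sketch; `run_logged` and the log file name are hypothetical and not part of GATK.

```shell
#!/bin/sh
# run_logged LOG_FILE COMMAND [ARGS...]
# Writes the command line to LOG_FILE, then runs the command and appends
# its stdout and stderr to the same file while still showing them on screen.
run_logged() {
    log_file="$1"
    shift

    # Record the command exactly as it was passed in.
    { printf 'Command:'; printf ' %s' "$@"; printf '\n'; } > "$log_file"

    # Run it; 2>&1 merges stderr so warnings and errors are logged too.
    # Note: the exit status of this pipeline is that of tee, not the command.
    "$@" 2>&1 | tee -a "$log_file"
}

# Example use (not executed here):
#   run_logged gatk3_run.log java -jar GenomeAnalysisTK.jar -T VariantsToBinaryPed ...
```

The resulting file can then be pasted or attached in full.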