How to select insertions or deletions from an indel VCF file?

I have an InDel VCF file that contains both insertions and deletions. I would like to select only insertions or only deletions from this file using the following command line.

$ java -jar $GATK -T SelectVariants -R ref_genome/IRGSP.fasta -V indel.vcf -selectType insertion -o insertion.vcf

From this command, I got the following error:

ERROR A USER ERROR has occurred (version 3.7-0-gcfedb67):
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions https://software.broadinstitute.org/gatk
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR MESSAGE: Could not create module String because an exception of type NullPointerException occurred caused by exception null

Can anyone tell me why this error occurs? Thanks.

Best Answer


  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭
    edited February 2017

    Hi @shis,

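    SelectVariants can subset indels as a class, but it cannot select insertions or deletions in isolation. The valid -selectType values are INDEL, SNP, MIXED, MNP, SYMBOLIC, and NO_VARIATION; "insertion" is not one of them, which is why your command fails. Please refer to the documentation here.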

    I think your other option is to use AWK to filter based on column contents. I have to think about this more.

  • shlee, Cambridge (Member, Broadie) ✭✭✭✭✭

    Yes, I think AWK is the way to go. You can compare the lengths of the strings in the REF and ALT columns and, using > or <, have the command print the matching rows to a new file.
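
The length comparison described above could be sketched roughly as follows. This is only an illustration, not a GATK feature: the function names select_insertions and select_deletions are made up for this example, and it assumes a tab-delimited, indel-only VCF with one ALT allele per record (a multiallelic ALT such as "A,ATG" or a symbolic allele would be misclassified by a plain length test).

```shell
# Hypothetical helpers (names are illustrative, not part of GATK).
# In a VCF, column 4 is REF and column 5 is ALT.
# Assumes one ALT allele per line; header lines (starting with '#') are kept.

# Insertion: ALT is longer than REF.
select_insertions() {
    awk -F'\t' '/^#/ || length($5) > length($4)' "$1"
}

# Deletion: ALT is shorter than REF.
select_deletions() {
    awk -F'\t' '/^#/ || length($5) < length($4)' "$1"
}

# Example use, with the file names from the question:
#   select_insertions indel.vcf > insertion.vcf
#   select_deletions  indel.vcf > deletion.vcf
```

Keeping the header lines in both outputs means the resulting files should still be readable as VCFs by downstream tools. If the input contains multiallelic records, they would need to be split into one allele per line first for this comparison to be meaningful.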

  • shis, USA (Member)

    Hi Shlee,
    Thank you so much for answering my question and for submitting it as a feature request for future GATK4 tools. I think it would be much easier for users if they could select either insertions or deletions from InDel variants. I really appreciate your initiative in getting this feature added to the GATK4 tools.

    Thanks, Shis

