CombineVariants in GATK4

Are there plans to add the CombineVariants tool to the GATK4.0 toolkit (it existed in previous GATK versions)? The only similar tool currently available in GATK4.0 Beta is GatherVCFs, which is very limited: it cannot concatenate unsorted VCFs or merge different INFO fields correctly.
Thanks! :)

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @Vladimir_Kovacevic
    Hi.

    There is a tool called MergeVcfs which you can use instead of CombineVariants. It looks like there is no documentation for it yet, but if you use --list with gatk-launch, it lists the available tools. We should have better documentation within the coming months when GATK4 is out of beta.

    -Sheila

  • Hi @Sheila!
    Thank you for your response and suggestion. We tried MergeVcfs and unfortunately it failed with two VCFs that pass with CombineVariants. Here is the error:
    Using GATK jar /GATK/gatk-4.beta.2/gatk-package-4.beta.2-local.jar
    Running:
    java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=1 -Dsnappy.disable=true -jar /GATK/gatk-4.beta.2/gatk-package-4.beta.2-local.jar MergeVcfs --input reheadered_subset.vcf --input tp.fp.subset.vcf --output output.vcf
    12:00:35.965 INFO NativeLibraryLoader - Loading libgkl_compression.dylib from jar:file:/GATK/gatk-4.beta.2/gatk-package-4.beta.2-local.jar!/com/intel/gkl/native/libgkl_compression.dylib
    [September 19, 2017 12:00:35 PM CEST] MergeVcfs --input reheadered_subset.vcf --input tp.fp.subset.vcf --output output.vcf --VALIDATION_STRINGENCY STRICT --COMPRESSION_LEVEL 1 --MAX_RECORDS_IN_RAM 500000 --CREATE_INDEX true --CREATE_MD5_FILE false --help false --version false --showHidden false --verbosity INFO --QUIET false --use_jdk_deflater false --use_jdk_inflater false
    [September 19, 2017 12:00:35 PM CEST] Executing as user@users-MacBook-Pro.local on Mac OS X 10.12.6 x86_64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_121-b13; Version: 4.beta.2
    12:00:41.015 INFO MergeVcfs - HTSJDK Defaults.COMPRESSION_LEVEL : 1
    12:00:41.016 INFO MergeVcfs - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    12:00:41.016 INFO MergeVcfs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    12:00:41.016 INFO MergeVcfs - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    12:00:41.016 INFO MergeVcfs - Deflater: IntelDeflater
    12:00:41.016 INFO MergeVcfs - Inflater: IntelInflater
    12:00:41.016 INFO MergeVcfs - Initializing engine
    12:00:41.016 INFO MergeVcfs - Done initializing engine
    12:00:41.573 INFO MergeVcfs - Processed 10,000 records. Elapsed time: 00:00:00s. Time for last 10,000: 0s. Last read position: 3:189,995,416
    12:00:41.785 INFO MergeVcfs - Processed 20,000 records. Elapsed time: 00:00:00s. Time for last 10,000: 0s. Last read position: 8:132,077,728
    12:00:41.957 INFO MergeVcfs - Shutting down engine
    [September 19, 2017 12:00:41 PM CEST] org.broadinstitute.hellbender.tools.picard.vcf.MergeVcfs done. Elapsed time: 0.10 minutes.
    Runtime.totalMemory()=352845824
    java.lang.IllegalStateException: The elements of the input Iterators are not sorted according to the comparator htsjdk.variant.variantcontext.VariantContextComparator
    at htsjdk.samtools.util.MergingIterator.next(MergingIterator.java:113)
    at org.broadinstitute.hellbender.tools.picard.vcf.MergeVcfs.doWork(MergeVcfs.java:126)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:115)
    at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:170)
    at org.broadinstitute.hellbender.cmdline.PicardCommandLineProgram.instanceMain(PicardCommandLineProgram.java:62)
    at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:131)
    at org.broadinstitute.hellbender.Main.mainEntry(Main.java:152)
    at org.broadinstitute.hellbender.Main.main(Main.java:230)

    Do you have any more suggestions?

    FYI @teodora_aleksic

  • Sheila, Broad Institute (Member, Broadie, Moderator)

    @Vladimir_Kovacevic
    Hi,

    Hmm. Can you confirm the VCFs pass ValidateVariants?

    Also, can you try deleting the VCF indices and re-generating them?

    Thanks,
    Sheila

  • said3427, Mexico (Member)
    edited September 2017

    I am moving to GATK4 and had the same question. It worked for me :smile:

    Thank you,
    Said MM

  • mjtiv, Newark, DE (Member)
    edited April 4

    It appears that MergeVcfs is built on top of Picard (GATK 4.0 mentions this too). So if you run into issues with this command, go to Picard and look at what they recommend: the input files must be sorted the same way, and the output file must have a file type designated (.vcf, etc.). Just skimming the error message above, I think the error is caused by the files not being sorted the same way.

    Here is a similar command using straight Picard:

    java -jar picard.jar MergeVcfs \
    I=sample_8_filtered_raw_SNPs.vcf \
    I=filtered_sample_8_raw_indels.vcf \
    O=combined_Filtered_Variants_4-4-2018.vcf
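
The IllegalStateException in the log above is raised by htsjdk's MergingIterator when records arrive out of order, so it can help to find the first offending record in each input before merging. The following is a minimal sketch, not part of GATK or Picard: the function name first_unsorted_record is hypothetical, and it assumes the expected order is the order of the ##contig lines in the VCF header, then ascending position. When the header has no ##contig lines, it falls back to the order in which contigs first appear. It reads uncompressed VCF text only.

```python
from typing import Iterable, Optional, Tuple


def first_unsorted_record(lines: Iterable[str]) -> Optional[Tuple[int, str, int]]:
    """Return (line_number, contig, position) of the first out-of-order
    record, or None if every record is in order.

    Assumption: records should follow the ##contig order of the header,
    then ascending POS. Contigs missing from the header are ranked by
    first appearance.
    """
    rank = {}        # contig name -> position in the expected contig order
    previous = None  # sort key of the previous record
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##contig=<"):
            # Simplistic parse: split the fields on commas and keep the ID.
            for field in line[len("##contig=<"):].rstrip(">").split(","):
                if field.startswith("ID="):
                    rank.setdefault(field[3:], len(rank))
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines and the remaining header lines
        chrom, pos = line.split("\t")[:2]
        key = (rank.setdefault(chrom, len(rank)), int(pos))
        if previous is not None and key < previous:
            return number, chrom, int(pos)
        previous = key
    return None
```

For example, calling first_unsorted_record(open("reheadered_subset.vcf")) on each input would report the line where the order first breaks. If both files pass on their own but list their contigs in a different order in the header, the merge can still fail, which matches the "sorted the same way" requirement mentioned above.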
