CombineVariants_ERROR

anupam Posts: 8 Member
edited January 2013 in Ask the GATK team

Hi. I want to merge two VCF files. I first selected only the indels from each file (using SelectVariants). Now I want to merge these two VCF files, which contain only indels, but when I run the command I get the following error:

ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace 
java.lang.NumberFormatException: For input string: "."

I ran this command:

java -jar -Xmx2g GenomeAnalysisTK.jar -R hg19_5.fasta -T CombineVariants -V indelsample1.vcf -V indelsample3.vcf -o indels1s3.vcf -genotypeMergeOptions UNIQUIFY

Could you please tell me what is causing this, and how to merge two VCF files that contain indels?

Thanks in advance.

Answers

  • Geraldine_VdAuwera Posts: 6,672 Administrator, GATK Developer admin

    Hi there, could you please post the entire error message?

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuwera Posts: 6,672 Administrator, GATK Developer admin

    Also, please validate your VCF files to make sure there's nothing wrong with them.

    Geraldine Van der Auwera, PhD

  • anupam Posts: 8 Member

    Hi. The VCF file was fine. First, I selected the indels by running this command:

    java -jar -Xmx2g GenomeAnalysisTK.jar -T SelectVariants -V input.vcf -selectType INDEL -o output.vcf

    In the same way, I extracted the indels from my other VCF file.

    Then I ran the combine command, as I described above.

    However, when I choose -selectType SNP instead, the program runs smoothly and combines the two VCF files. It only gives me the error when combining indels.

    Here is my full error message:

    INFO 07:36:26,913 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 07:36:26,942 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.2-16-g9f648cb, Compiled 2012/12/04 03:46:58
    INFO 07:36:26,942 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO 07:36:26,942 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO 07:36:26,947 HelpFormatter - Program Args: -R hg19_5.fasta -T CombineVariants -V indelsample1.vcf -V indelsample3.vcf -o indels1s3.vcf
    INFO 07:36:26,947 HelpFormatter - Date/Time: 2013/01/14 07:36:26
    INFO 07:36:26,947 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 07:36:26,947 HelpFormatter - ---------------------------------------------------------------------------------
    INFO 07:36:26,957 ArgumentTypeDescriptor - Dynamically determined type of indelsample1.vcf to be VCF
    INFO 07:36:26,959 ArgumentTypeDescriptor - Dynamically determined type of indelsample3.vcf to be VCF
    INFO 07:36:26,967 GenomeAnalysisEngine - Strictness is SILENT
    INFO 07:36:27,160 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE Target Coverage: 1000
    INFO 07:36:27,226 RMDTrackBuilder - Loading Tribble index from disk for file indelsample1.vcf
    INFO 07:36:27,300 RMDTrackBuilder - Loading Tribble index from disk for file indelsample3.vcf
    INFO 07:36:27,363 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO 07:36:27,363 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
    INFO 07:36:27,436 CombineVariants - Priority string not provided, using arbitrary genotyping order: null
    WARN 07:36:27,540 VCFUtils$HeaderConflictWarner - Allowing unequal description fields through: keeping INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele"> excluding INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
    WARN 07:36:27,540 VCFUtils$HeaderConflictWarner - Ignoring header line already in map: this header line = source=SelectVariants already present header = source=CGAPipeline_2.0.2.19;cgatools_1.6.0
    WARN 07:36:27,569 VCFUtils$HeaderConflictWarner - Ignoring header line already in map: this header line = SelectVariants="analysis_type=SelectVariants input_file=[] read_buffer_size=null phone_home=STANDARD gatk_key=null tag=NA read_filter=[] intervals=null excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=hg19_5.fasta nonDeterministicRandomSeed=false disableRandomization=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=BY_SAMPLE downsample_to_fraction=null downsample_to_coverage=1000 enable_experimental_downsampling=false baq=OFF baqGapOpenPenalty=40.0 performanceLog=null useOriginalQualities=false BQSR=null quantize_quals=0 disable_indel_quals=false emit_original_quals=false preserve_qscores_less_than=6 defaultBaseQualities=-1 validation_strictness=SILENT remove_program_records=false keep_program_records=false unsafe=null num_threads=1 num_cpu_threads_per_data_thread=1 num_io_threads=0 monitorThreadEfficiency=false num_bam_file_handles=null read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false generateShadowBCF=false logging_level=INFO log_to_file=null help=false variant=(RodBinding name=variant source=sortmsample1-7492.vcf) discordance=(RodBinding name= source=UNBOUND) concordance=(RodBinding name= source=UNBOUND) out=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub no_cmdline_in_header=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sites_only=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub bcf=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sample_name=[] sample_expressions=null sample_file=null exclude_sample_name=[] exclude_sample_file=[] select_expressions=[] excludeNonVariants=false excludeFiltered=false regenotype=false restrictAllelesTo=ALL keepOriginalAC=false mendelianViolation=false mendelianViolationQualThreshold=0.0 select_random_fraction=0.0 remove_fraction_genotypes=0.0 selectTypeToInclude=[INDEL] keepIDs=null fullyDecode=false forceGenotypesDecode=false justRead=false maxIndelSize=2147483647 ALLOW_NONOVERLAPPING_COMMAND_LINE_SAMPLES=false filter_mismatching_base_and_quals=false" already present header = SelectVariants="analysis_type=SelectVariants input_file=[] read_buffer_size=null phone_home=STANDARD gatk_key=null tag=NA read_filter=[] intervals=null excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=hg19_5.fasta nonDeterministicRandomSeed=false disableRandomization=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=BY_SAMPLE downsample_to_fraction=null downsample_to_coverage=1000 enable_experimental_downsampling=false baq=OFF baqGapOpenPenalty=40.0 performanceLog=null useOriginalQualities=false BQSR=null quantize_quals=0 disable_indel_quals=false emit_original_quals=false preserve_qscores_less_than=6 defaultBaseQualities=-1 validation_strictness=SILENT remove_program_records=false keep_program_records=false unsafe=null num_threads=1 num_cpu_threads_per_data_thread=1 num_io_threads=0 monitorThreadEfficiency=false num_bam_file_handles=null read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false generateShadowBCF=false logging_level=INFO log_to_file=null help=false variant=(RodBinding name=variant source=sortmsample3.vcf) discordance=(RodBinding name= source=UNBOUND) concordance=(RodBinding name= source=UNBOUND) out=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub no_cmdline_in_header=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sites_only=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub bcf=org.broadinstitute.sting.gatk.io.stubs.VariantContextWriterStub sample_name=[] sample_expressions=null sample_file=null exclude_sample_name=[] exclude_sample_file=[] select_expressions=[] excludeNonVariants=false excludeFiltered=false regenotype=false restrictAllelesTo=ALL keepOriginalAC=false mendelianViolation=false mendelianViolationQualThreshold=0.0 select_random_fraction=0.0 remove_fraction_genotypes=0.0 selectTypeToInclude=[INDEL] keepIDs=null fullyDecode=false forceGenotypesDecode=false justRead=false maxIndelSize=2147483647 ALLOW_NONOVERLAPPING_COMMAND_LINE_SAMPLES=false filter_mismatching_base_and_quals=false"
    WARN 07:36:27,570 VCFUtils$HeaderConflictWarner - Ignoring header line already in map: this header line = fileDate=20121120 already present header = fileDate=20121212
    WARN 07:36:27,570 VCFUtils$HeaderConflictWarner - Ignoring header line already in map: this header line = source_MEAN_GC_CORRECTED_CVG=GS00884-DNA_A01:58.76 already present header = source_MEAN_GC_CORRECTED_CVG=GS01131-DNA_A01:58.26
    WARN 07:36:27,570 VCFUtils$HeaderConflictWarner - Ignoring header line already in map: this header line = source_NUMBER_LEVELS=GS00884-DNA_A01:5 already present header = source_NUMBER_LEVELS=GS01131-DNA_A01:5
    INFO 07:36:28,712 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    java.lang.NumberFormatException: For input string: "."
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:470)
        at java.lang.Integer.valueOf(Integer.java:570)
        at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec.decodeInts(AbstractVCFCodec.java:680)
        at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec.createGenotypeMap(AbstractVCFCodec.java:639)
        at org.broadinstitute.sting.utils.codecs.vcf.AbstractVCFCodec$LazyVCFGenotypesParser.parse(AbstractVCFCodec.java:92)
        at org.broadinstitute.sting.utils.variantcontext.LazyGenotypesContext.decode(LazyGenotypesContext.java:130)
        at org.broadinstitute.sting.utils.variantcontext.LazyGenotypesContext.getGenotypes(LazyGenotypesContext.java:120)
        at org.broadinstitute.sting.utils.variantcontext.GenotypesContext.iterator(GenotypesContext.java:461)
        at org.broadinstitute.sting.utils.variantcontext.VariantContextUtils.mergeGenotypes(VariantContextUtils.java:853)
        at org.broadinstitute.sting.utils.variantcontext.VariantContextUtils.simpleMerge(VariantContextUtils.java:514)
        at org.broadinstitute.sting.gatk.walkers.variantutils.CombineVariants.map(CombineVariants.java:292)
        at org.broadinstitute.sting.gatk.walkers.variantutils.CombineVariants.map(CombineVariants.java:114)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:243)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano$TraverseLociMap.apply(TraverseLociNano.java:231)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:287)
        at org.broadinstitute.sting.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:252)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:120)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:67)
        at org.broadinstitute.sting.gatk.traversals.TraverseLociNano.traverse(TraverseLociNano.java:23)
        at org.broadinstitute.sting.gatk.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:74)
        at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:281)
        at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:237)
        at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:147)
        at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:94)

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.2-16-g9f648cb):
    ERROR
    ERROR Please visit the wiki to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: For input string: "."
    ERROR ------------------------------------------------------------------------------------------
  • Geraldine_VdAuwera Posts: 6,672 Administrator, GATK Developer admin

    I see that you're using version 2.2-16. Could you please try again (from the first step) with the latest version (2.3-9)? This may be a bug that has been fixed recently.

    Geraldine Van der Auwera, PhD
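
While retrying with a newer version, it may also help to find the records that trigger the exception. The stack trace fails in decodeInts while building the genotype map, which suggests that a per-sample field parsed as a list of integers contains a "." entry. The sketch below is not part of GATK; it is a plain Python scan under the assumption that AD and PL are the fields in question (the function name and the key list are illustrative, so adjust them to the FORMAT keys in your files).

```python
import io

# Assumption: AD and PL are the per-sample fields parsed as integer lists.
# This is a guess based on the stack trace, not confirmed GATK behaviour.
INT_LIST_KEYS = ("AD", "PL")


def find_missing_in_int_lists(lines, keys=INT_LIST_KEYS):
    """Return (line_no, chrom, pos, sample_index, key, value) for every
    genotype subfield in `keys` whose comma-separated value contains '.'."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.split("\t")
        if len(cols) < 10:
            continue  # sites-only record, no genotype columns
        fmt = cols[8].split(":")
        for sample_idx, sample in enumerate(cols[9:]):
            # Trailing subfields may be omitted, so zip stops at the shorter list.
            for key, value in zip(fmt, sample.split(":")):
                if key in keys and "." in value.split(","):
                    hits.append((line_no, cols[0], cols[1], sample_idx, key, value))
    return hits


if __name__ == "__main__":
    # Tiny made-up VCF; the second record has a '.' inside the AD list.
    demo = io.StringIO(
        "##fileformat=VCFv4.1\n"
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n"
        "chr1\t100\t.\tAT\tA\t.\tPASS\t.\tGT:AD\t0/1:12,7\n"
        "chr1\t200\t.\tG\tGC\t.\tPASS\t.\tGT:AD\t1/.:.,5\n"
    )
    for hit in find_missing_in_int_lists(demo):
        print(hit)
```

To run it on a real file, pass an open file handle instead of the in-memory demo, for example `find_missing_in_int_lists(open("indelsample1.vcf"))`. The scan flags both a wholly missing value (".") and a partially missing list (".,5"); whether GATK rejects both forms is not something this thread establishes.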
