The list of input alleles must contain <NON_REF> as an allele but that is not the case at position.

Dear GATK team,
There is a bug when running CombineGVCFs. I have 330 sample GVCF files, all of which were produced in "-ERC GVCF" mode by HaplotypeCaller.

The error output is:
==========start at : Wed Sep 16 17:10:28 HKT 2015 ==========
INFO 17:10:33,406 HelpFormatter - ---------------------------------------------------------------------------------
INFO 17:10:33,412 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.4-46-gbc02625, Compiled 2015/07/09 17:38:12
INFO 17:10:33,413 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 17:10:33,413 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 17:10:33,418 HelpFormatter - Program Args: -et NO_ET -K /ifshk5/PC_PA_EU/USER/zhangbaifeng/software/gatk/zhangbaifeng_genomics.cn.key -T CombineGVCFs -R /ifshk1/BC_CANCER/01bin/DNA/software/pipeline/CSAP_v5.2.4/Database/human_19/hg19_fasta_GATK/hg19.fasta --variant 1.sample.g.vcf --variant 2.sample.g.vcf ...--variant 330.sample.g.vcf
-o /ifshk7/BC_RES/TECH/PMO/zhangbaifeng/330.snp.analysis/330.sample.GVCF/haplotypecaller/vcf/cohort.g.vcf
INFO 17:10:33,450 HelpFormatter - Executing as zhangbaifeng@login-0-3.local on Linux 2.6.18-194.blc amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14.
INFO 17:10:33,450 HelpFormatter - Date/Time: 2015/09/16 17:10:33
INFO 17:10:33,451 HelpFormatter - ---------------------------------------------------------------------------------
INFO 17:10:33,451 HelpFormatter - ---------------------------------------------------------------------------------
INFO 17:10:43,301 GenomeAnalysisEngine - Strictness is SILENT
INFO 17:10:43,571 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
INFO 17:10:54,982 GenomeAnalysisEngine - Preparing for traversal
INFO 17:10:54,994 GenomeAnalysisEngine - Done preparing for traversal
INFO 17:10:54,995 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 17:10:54,995 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 17:10:54,996 ProgressMeter - Location | sites | elapsed | sites | completed | runtime | runtime

ERROR ------------------------------------------------------------------------------------------
ERROR A USER ERROR has occurred (version 3.4-46-gbc02625):
ERROR
ERROR This means that one or more arguments or inputs in your command are incorrect.
ERROR The error message below tells you what is the problem.
ERROR
ERROR If the problem is an invalid argument, please check the online documentation guide
ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
ERROR
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
ERROR
ERROR MESSAGE: The list of input alleles must contain <NON_REF> as an allele but that is not the case at position 15274; please use the Haplotype Caller with gVCF output to generate appropriate records
ERROR ------------------------------------------------------------------------------------------

Could you tell me how to solve it? Thanks very much.

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    You need to narrow down the source of the problem by running on subsets of the files. This will show if it always happens or if there is one or more particular files that are associated with the problem.

  • OK. Meanwhile, I want to ask whether it is a problem that the GVCF files were generated in "-ERC GVCF" mode by HaplotypeCaller version 2.8, because I used CombineGVCFs version 3.4 to combine them. My HaplotypeCaller script is:
    /ifshk5/PC_PA_EU/USER/huxiaoshu/softwares/java/jre1.7.0_45/bin/java -Xmx4g -Djava.io.tmpdir=/ifshk7/BC_RES/TECH/PMO/zhangbaifeng/330.snp.analysis/330.sample.GVCF/java_tmp -jar /ifshk5/PC_PA_EU/USER/maolikai/bin/GATK/GenomeAnalysisTK.jar \
    -et NO_ET -K /ifshk5/PC_PA_EU/USER/zhangbaifeng/software/gatk/zhangbaifeng_genomics.cn.key \
    -T HaplotypeCaller -R /ifshk1/BC_CANCER/01bin/DNA/software/pipeline/CSAP_v5.2.4/Database/human_19/hg19_fasta_GATK/hg19.fasta -ERC GVCF --variant_index_type LINEAR --variant_index_parameter 128000 \
    -I /ifshk5/PC_PA_EU/USER/zhangbaifeng/deadapter_bwa/Hani_M16/3_18/deadapter_bwa/result/Hani_M16_003_M_RA/result_alignment/Hani_M16_003_M_RA.recal.rmdup.bam \
    --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 \
    -o /ifshk7/BC_RES/TECH/PMO/zhangbaifeng/330.snp.analysis/330.sample.GVCF/haplotypecaller/vcf/Hani_M16_003_M_RA.recal.raw_variants.g.vcf

    Thanks very much.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Oh, that is probably the source of your problem. You shouldn't mix and match versions, especially major versions like 2.x and 3.x, and especially between analysis steps that are tightly coupled.

  • Thanks for your reply, I agree with you. But I want to know why I couldn't find a "CombineGVCFs" module in version 2.8. I could successfully use the 2.8 HaplotypeCaller with "-ERC GVCF" to generate .g.vcf files, so I expected version 2.8 to have a corresponding "CombineGVCFs" module. My analysis pipeline depends on GATK 2.8 and I spent a lot of time generating these .g.vcf files, so I don't want to redo it with a 3.x version. Could you tell me how to run the next step on .g.vcf files from version 2.8? Thanks very much.

  • Thanks very much. My .g.vcf files were produced by HaplotypeCaller "-ERC GVCF" in version 2.8, so CombineGVCFs and GenotypeGVCFs from version 3.0 may not work on these files.

    @Sheila said:
    @Dragon_fire
    Hi,

    Both CombineGVCFs and GenotypeGVCFs were introduced in version 3.0
    http://gatkforums.broadinstitute.org/discussion/3862/release-notes-for-gatk-version-3-0

    -Sheila

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    The GVCF format emitted by HC in 2.8 was immature and did not contain all the information that is now contained in GVCFs emitted by 3.x versions.

  • Thanks. For these immature GVCF files, could you tell me what I should do next?

  • Thanks very much.

    @Geraldine_VdAuwera said:
    It would really be better to redo the calling with a 3.x version. If that's not an option, all you can do is analyze the files the old-fashioned way, as if they were regular VCFs. Sorry to be the bearer of bad news.

  • Hi @Geraldine_VdAuwera,
    Sorry for bothering you again. I have BAM files that have undergone BQSR (with GATK v2.8). I am wondering whether it is a good idea to use the existing recalibrated base quality scores for HaplotypeCaller in GVCF mode (with GATK v3.4)? Thanks.

  • Sheila (Broad Institute; Member, Broadie, Moderator)

    @Dragon_fire
    Hi,

    If you have time, it is best to rerun BQSR on the BAMs. However, if time is an issue, simply using the latest version for variant calling will be fine.

    -Sheila

  • Fine, thanks.
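
A per-file scan is one possible way to narrow down which inputs trigger the error before rerunning CombineGVCFs on subsets. The sketch below is not part of GATK. It only assumes what the error message states, namely that every record is expected to list <NON_REF> among its ALT alleles, and that the .g.vcf files are uncompressed, tab-delimited text. The function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

NON_REF = "<NON_REF>"  # symbolic allele named in the CombineGVCFs error message


def records_missing_non_ref(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (chrom, pos) for each record whose ALT column lacks <NON_REF>."""
    missing = []
    for line in lines:
        # Skip blank lines and header lines ("##..." metadata and "#CHROM").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue  # too few columns to hold an ALT field; ignore
        chrom, pos, alt = fields[0], int(fields[1]), fields[4]
        # ALT is a comma-separated allele list (column 5 of a VCF record).
        if NON_REF not in alt.split(","):
            missing.append((chrom, pos))
    return missing


def suspect_files(paths: Iterable[str]) -> Dict[str, Tuple[str, int]]:
    """Map each file that has an offending record to its first such record."""
    suspects = {}
    for path in paths:
        with open(path) as handle:  # assumption: plain text, not gzip/bgzip
            missing = records_missing_non_ref(handle)
        if missing:
            suspects[path] = missing[0]
    return suspects
```

For example, `suspect_files(["1.sample.g.vcf", "2.sample.g.vcf"])` would return only the files that contain at least one record without <NON_REF>, together with the first offending position. If every file from an older HaplotypeCaller version is reported, that points to a format difference between versions and not to a few damaged files.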
