HaplotypeCaller non-descript error

Hello all,

I'm trying to run HaplotypeCaller on a bam that I've created following the Best Practices, however it fails and gives me the following error message:

ERROR ------------------------------------------------------------------------------------------

That is literally the extent of the message. Preceding that is a list of scaffold/contig titles ending with a ']', proceedign that is my terminal command line.

I turned on logging (added "-log log.txt" to the end of my command) and the log contained the following:
INFO 11:30:11,884 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:30:11,886 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.6-0-g89b7209, Compiled 2016/06/01 22:27:29
INFO 11:30:11,886 HelpFormatter - Copyright (c) 2010-2016 The Broad Institute
INFO 11:30:11,886 HelpFormatter - For support and documentation go to https://www.broadinstitute.org/gatk
INFO 11:30:11,886 HelpFormatter - [Fri Sep 02 11:30:11 EDT 2016] Executing on Linux 3.16.0-77-generic amd64
INFO 11:30:11,886 HelpFormatter - Java HotSpot(TM) 64-Bit Server VM 1.8.0_101-b13 JdkDeflater
INFO 11:30:11,889 HelpFormatter - Program Args: -T HaplotypeCaller -R Csativa_genome.fa -I GBS_Csativa_clean_dedup.bam --genotyping_mode DISCOVERY -stand_emit_conf 10 -stand_call_conf 30 -o GBS_Csativa_clean_raw-variants.vcf -log HC_clean.log
INFO 11:30:11,891 HelpFormatter - Executing as [email protected] on Linux 3.16.0-77-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_101-b13.
INFO 11:30:11,891 HelpFormatter - Date/Time: 2016/09/02 11:30:11
INFO 11:30:11,892 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:30:11,892 HelpFormatter - --------------------------------------------------------------------------------
INFO 11:30:11,905 GenomeAnalysisEngine - Strictness is SILENT
INFO 11:55:56,387 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 500
INFO 11:55:56,392 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 11:55:58,019 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 1.63
INFO 11:55:59,327 HCMappingQualityFilter - Filtering out reads with MAPQ < 20

The command I'm using is:
java -jar /Apps/gatk/GenomeAnalysisTK.jar -T HaplotypeCaller -R Csativa_genome.fa -I GBS_Csativa_clean_dedup.bam --genotyping_mode DISCOVERY -hets 0.01 -stand_emit_conf 10 -stand_call_conf 30 -o GBS_Csativa_clean_raw-variants.vcf

Any assistance would be appreciated, even if only to identify what the error actually is, I couldn't find anything on the forums with this kind of non-specific error. If you need more information I will gladly post it

Best Answers

Answers

  • MunhollMunholl WindsorMember

    Sorry, the command I pasted was not the correct one, this error came from a run that did not have the "-hets 0.01" argument.

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @Munholl
    Hi,

    I suspect it is an error with your reference contigs mismatching your BAM contigs. Are you using the exact same reference you aligned your BAM file to? Have a look at this article.

    -Sheila

  • MunhollMunholl WindsorMember

    Hi Sheila,

    I only have two fast* files in the directory, one is the reference fasta and the other is the fastq I aligned to generate the BAM. Everything I've done has been based on those two files so that can't be the issue.

    My entire pipeline has been the following:
    fastq_quality_trimmer -v -t 30 -l 80 -i Cannabis_GBS_Plate1.fastq -o Q30L80_Cannabis_GBS_Plate1.fastq

    samtools faidx Csativa_genome.fa

    java -jar /Apps/picard/build/libs/picard.jar CreateSequenceDictionary R=Csativa_genome.fa O=Csativa_genome.dict

    bwa aln -t 12 -0 Csativa_genome.fa Q30L80_Cannabis_GBS_Plate1.fastq >GBS_Csativa_bwa.bwaln

    bwa samse -r '@RG\tID:Csativa\tSM:GBS\tPL:illumina\tLB:lib1\tPU:unit1' Csativa_genome.fa GBS_Csativa_bwa.bwaln Q30L80_Cannabis_GBS_Plate1.fastq > GBS_Csativa_bwa.sam

    java -jar /Apps/picard/build/libs/picard.jar CleanSam INPUT=GBS_Csativa_bwa.sam OUTPUT=GBS_Csativa_bwa_clean.sam

    java -jar /Apps/picard/build/libs/picard.jar SortSam INPUT=GBS_Csativa_bwa_clean.sam OUTPUT=GBS_Csativa_clean_sorted.bam SORT_ORDER=coordinate

    java -jar /Apps/picard/build/libs/picard.jar MarkDuplicates INPUT=GBS_Csativa_clean_sorted.bam OUTPUT=GBS_Csativa_clean_dedup.bam METRICS_FILE=GBS_Csativa_clean_metrics.txt

    java -jar /Apps/picard/build/libs/picard.jar BuildBamIndex INPUT=GBS_Csativa_clean_dedup.bam

    java -jar /Apps/gatk/GenomeAnalysisTK.jar -T HaplotypeCaller -R Csativa_genome.fa -I GBS_Csativa_clean_dedup.bam --genotyping_mode DISCOVERY -hets 0.01 -stand_emit_conf 10 -stand_call_conf 30 -o GBS_Csativa_clean_raw-variants.vcf

    Do any of those steps alter the Csativa_genome.fa file?

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @Munholl
    Hi,

    No, they should not alter the reference FASTA file. Can you please post the reference .dict file and the BAM header here?

    Thanks,
    Sheila

  • MunhollMunholl WindsorMember

    Hi Sheila,

    I attached a txt for each. Let me know if you need anything else and thank you for your help.

  • MunhollMunholl WindsorMember
    edited September 2016

    @Sheila

    Hi Sheila,

    I followed your Best Practices instructions to develop the BAM, did I miss a step or make a mistake in my posted commands earlier?
    Also, wouldn't that mean there are scaffolds that didn't have anything align to them as the BAM is from the alignment while the dict is the reference genome? if there's no alignment wouldn't that scaffold be ignored when haplotype calling anyways?
    Lastly, aside from writing a script to filter out scaffolds that had no alignment is there an easy way to get them to match?

    Issue · Github
    by Sheila

    Issue Number
    1250
    State
    closed
    Last Updated
    Assignee
    Array
    Milestone
    Array
    Closed By
    vdauwera
  • MunhollMunholl WindsorMember

    @Geraldine_VdAuwera
    @Sheila

    I completely missed that and feel foolish for it :open_mouth:

    This should get me back on track, thanks for the help!

  • MunhollMunholl WindsorMember

    In case someone comes here with a similar issue, I had to rerun "bwa index" to correct the naming issue.

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @Munholl

    Thanks for reporting your solution!

    -Sheila

Sign In or Register to comment.