ERROR stack trace Code exception

shinken Irapuato Member

Hi, I am running RealignerTargetCreator and I am getting the following error:

   INFO  22:02:18,183 HelpFormatter - -------------------------------------------------------------------------------- 
   INFO  22:02:18,207 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.5-0-g36282e4, Compiled      2015/11/25 04:03:56 
   INFO  22:02:18,208 HelpFormatter - Copyright (c) 2010 The Broad Institute 
   INFO  22:02:18,208 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
   INFO  22:02:18,213 HelpFormatter - Program Args: -T RealignerTargetCreator -R /LUSTRE/usuario/egonzalez/referencias/Zea_mays.AGPv3.22.dna.genome.fasta -I dedup_sorted_JE001.bam -o dedup_sorted_JE001.bam-forIndelRealigner.intervals 
   INFO  22:02:18,226 HelpFormatter - Executing as [email protected] on Linux 2.6.32-642.el6.x86_64 amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_91-b14. 
   INFO  22:02:18,227 HelpFormatter - Date/Time: 2017/02/12 22:02:18 
   INFO  22:02:18,228 HelpFormatter - -------------------------------------------------------------------------------- 
   INFO  22:02:18,229 HelpFormatter - -------------------------------------------------------------------------------- 
   INFO  22:02:19,769 GenomeAnalysisEngine - Strictness is SILENT 
   INFO  22:02:20,252 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target   Coverage: 1000 
   INFO  22:02:20,262 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
   INFO  22:02:20,441 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.18 
   INFO  22:02:22,993 GATKRunReport - Uploaded run statistics report to AWS S3 
   ##### ERROR ------------------------------------------------------------------------------------------
   ##### ERROR stack trace 
   java.lang.NullPointerException
       at java.util.TreeMap.compare(TreeMap.java:1290)
       at java.util.TreeMap.put(TreeMap.java:538)
       at java.util.TreeSet.add(TreeSet.java:255)
       at org.broadinstitute.gatk.utils.sam.ReadUtils.getSAMFileSamples(ReadUtils.java:70)
       at org.broadinstitute.gatk.engine.samples.SampleDBBuilder.addSamplesFromSAMHeader(SampleDBBuilder.java:66)
       at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.initializeSampleDB(GenomeAnalysisEngine.java:846)
       at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:296)
       at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
       at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
       at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
       at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:106)
   ##### ERROR ------------------------------------------------------------------------------------------
   ##### ERROR A GATK RUNTIME ERROR has occurred (version 3.5-0-g36282e4):
   ##### ERROR
   ##### ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
   ##### ERROR If not, please post the error message, with stack trace, to the GATK forum.
   ##### ERROR Visit our website and forum for extensive documentation and answers to 
   ##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
   ##### ERROR
   ##### ERROR MESSAGE: Code exception (see stack trace for error itself)

This is strange, because I ran the same commands on other data and this did not happen. Do you know what is going on?

Best Answer

  • shinken Irapuato
    Accepted Answer

    Thank you very much. I changed two parameters in the commands I use to preprocess the files.

    I have several BAM files, so my scripts look like this:

    map

       for i in $(cat Lista_para_preproceso.txt); do nombre=$(echo $i | cut -d "_" -f1); bwa mem -t 8 -M -R '@RG\tID:something\tSM:something\tPL:SE\tLB:no\tPU:unit1' REF.fasta "$nombre""_1_clean.fq.gz" "$nombre""_2_clean.fq.gz" > "$nombre"".sam" ; done
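
In the map command above, ID and SM are written as the placeholder `something`. If each sample should carry its own name in the header, one option is to build the read group string from `$nombre`. This is only a sketch, not the original script: `read_group` is a hypothetical helper, and the PL, LB and PU values are illustrative placeholders.

```shell
# Build a bwa mem -R read-group string whose ID and SM come from the sample name.
# The output contains literal backslash-t sequences, which is what bwa mem -R expects.
# PL, LB and PU are placeholder values chosen for illustration only.
read_group() {
    printf '@RG\\tID:%s\\tSM:%s\\tPL:ILLUMINA\\tLB:lib1\\tPU:unit1' "$1" "$1"
}
```

It could then be passed to bwa as `-R "$(read_group "$nombre")"` in place of the fixed string.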
    

    sort

       for i in *.sam; do nombre=${i%%.*}; java -jar "$PICARD" SortSam INPUT="$i" OUTPUT="sorted_${nombre}.bam" SORT_ORDER=coordinate; done
    

    dedup

       for i in sorted_*.bam; do nombre=${i%%.*}; java -jar "$PICARD" MarkDuplicates INPUT="$i" OUTPUT="dedup_${nombre}.bam" REMOVE_DUPLICATES=true METRICS_FILE="${nombre}_metrics.txt"; done
    

    index

       for i in dedup_*.bam; do java -jar "$PICARD" BuildBamIndex INPUT="$i"; done
    

    indel_realignment

       for i in dedup_*.bam; do java -jar "$GATK" -T RealignerTargetCreator -R REF.fasta -I "$i" -o "${i}-forIndelRealigner.intervals"; done

       for i in *-forIndelRealigner.intervals; do nombre=${i%-forIndelRealigner.intervals}; si=${nombre%%.*}; java -jar "$GATK" -T IndelRealigner -R REF.fasta -I "$nombre" -targetIntervals "$i" -o "${si}-indelrealigned.bam"; done
    

    The problem was mainly in the header: I had forgotten to write the SM tag in the bwa mem script, and it is now there inside the -R flag. I also changed REMOVE_SEQUENCING_DUPLICATES=true to REMOVE_DUPLICATES=true in the dedup script, but the header was probably the only problem.
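
A quick way to confirm this kind of header problem is to look for @RG lines that lack an SM field. The sketch below is not part of the original scripts: `check_rg_sm` is a hypothetical helper name, and it assumes the header text is piped in from something like `samtools view -H` (samtools is not used elsewhere in this thread). The `getSAMFileSamples` frame in the stack trace is at least consistent with a read group whose sample name is missing.

```shell
# Print every @RG header line that has no SM: field.
# Reads SAM header text on stdin; returns status 1 if any such line is found, 0 otherwise.
check_rg_sm() {
    awk -F'\t' '
        /^@RG/ {
            has_sm = 0
            # Field 1 is the @RG record type; the tags start at field 2.
            for (f = 2; f <= NF; f++) if ($f ~ /^SM:./) has_sm = 1
            if (!has_sm) { print "missing SM: " $0; bad = 1 }
        }
        END { exit (bad ? 1 : 0) }
    '
}
```

For example, `samtools view -H dedup_sorted_JE001.bam | check_rg_sm` would print any offending @RG line and return a non-zero status. A header written with the corrected -R flag should print nothing.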

    Best Wishes,

    Eric

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Try validating your input file; it's possible that this is caused by a data problem that is not being handled gracefully. Also please check whether the error still reproduces with the latest version (3.7).
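
The reply does not name a validation tool. One common choice is Picard's ValidateSamFile, so the following is only a sketch under that assumption: `validate_bam` is a hypothetical wrapper name, and `$PICARD` is assumed to point at the Picard jar, as in the scripts in this thread.

```shell
# Summarise validation problems in a BAM file with Picard ValidateSamFile.
# Assumes $PICARD holds the path to the Picard jar.
validate_bam() {
    java -jar "$PICARD" ValidateSamFile INPUT="$1" MODE=SUMMARY
}
```

It would be called as `validate_bam dedup_sorted_JE001.bam`. MODE=SUMMARY prints a count per problem type rather than one line per record.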
