
IndelRealigner no initialization

seqseek (Member; Posts: 11)
edited September 2012 in Ask the GATK team

I got this when I ran IndelRealigner. The output BAM is empty.

INFO  15:19:35,568 TraversalEngine - Total runtime 0.00 secs, 0.00 min, 0.00 hours
INFO  15:19:36,910 GATKRunReport - Uploaded run statistics report to AWS S3

It didn't initialize. The sample is aligned to a specific region of the genome, and I did use the -L option. For the whole-genome alignment of the same sample, I don't have any problems. Do you know why?

Answers

  • Geraldine_VdAuwera (Administrator, Dev admin; Posts: 10,469)

    Can you post the full stack trace?

    Geraldine Van der Auwera, PhD

  • seqseek (Member; Posts: 11)
    edited September 2012
    INFO  15:54:24,747 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  15:54:24,751 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.0-39-gd091f72, Compiled 2012/08/10 15:55:35 
    INFO  15:54:24,751 HelpFormatter - Copyright (c) 2010 The Broad Institute 
    INFO  15:54:24,752 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
    INFO  15:54:24,752 HelpFormatter - Program Args: -I test.sorted.bam -R hg19_bundle/ucsc.hg19.fasta -T IndelRealigner -targetIntervals targetcreator.interval_list -o test.out.bam -known hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf -known hg19_bundle/1000G_phase1.indels.hg19.vcf --consensusDeterminationModel KNOWNS_ONLY -L my.interval_list -LOD 0.4 
    INFO  15:54:24,753 HelpFormatter - Date/Time: 2012/09/25 15:54:24 
    INFO  15:54:24,753 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  15:54:24,753 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  15:54:24,766 ArgumentTypeDescriptor - Dynamically determined type of hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF 
    INFO  15:54:24,768 ArgumentTypeDescriptor - Dynamically determined type of hg19_bundle/1000G_phase1.indels.hg19.vcf to be VCF 
    INFO  15:54:24,854 GenomeAnalysisEngine - Strictness is SILENT 
    INFO  15:54:24,946 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
    INFO  15:54:24,968 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 
    INFO  15:54:24,988 RMDTrackBuilder - Loading Tribble index from disk for file hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf 
    WARN  15:54:25,080 VCFStandardHeaderLines$Standards - Repairing standard header line for field GQ because -- type disagree; header has Float but standard is Integer 
    INFO  15:54:25,084 RMDTrackBuilder - Loading Tribble index from disk for file hg19_bundle/1000G_phase1.indels.hg19.vcf 
    WARN  15:54:25,169 VCFHeader - Found GL format, but no PL field.  As the GATK now only manages PL fields internally automatically adding a corresponding PL field to your VCF header 
    WARN  15:54:25,171 VCFStandardHeaderLines$Standards - Repairing standard header line for field AC because -- count types disagree; header has UNBOUNDED but standard is A -- descriptions disagree; header has 'Alternate Allele Count' but standard is 'Allele count in genotypes, for each ALT allele, in the same order as listed' 
    WARN  15:54:25,171 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has INTEGER but standard is A -- descriptions disagree; header has 'Global Allele Frequency based on AC/AN' but standard is 'Allele Frequency, for each ALT allele, in the same order as listed' 
    INFO  15:54:29,685 TraversalEngine - Total runtime 0.00 secs, 0.00 min, 0.00 hours 
    INFO  15:54:30,914 GATKRunReport - Uploaded run statistics report to AWS S3 
    
  • seqseek (Member; Posts: 11)

    The attached file is better formatted.

    Attachment: stack_trace.txt (3K)

  • Geraldine_VdAuwera (Administrator, Dev admin; Posts: 10,469)

    To post a well-formatted stack trace, just copy-paste it into any good text editor that can indent the entire block for you, indent it, then copy-paste that back into the forum post.

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuwera (Administrator, Dev admin; Posts: 10,469)

    I see you are passing your intervals as a list file. Check that it is properly formatted. The most likely explanation for your problem is that the engine is not finding any valid intervals to run IndelRealigner on.

    Geraldine Van der Auwera, PhD
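
Editorial sketch, not part of the original thread: one way to sanity-check an intervals file before passing it to -L is to compare each entry's contig name against the contigs in the reference sequence dictionary. The helper names below (`parse_interval`, `check_intervals`) are hypothetical, and the sketch assumes the plain one-per-line `contig:start-stop` form; Picard-style files with a SAM header and tab-separated columns, or contig names that themselves contain a colon, would need different parsing.

```python
import re

# Matches "contig", "contig:pos", or "contig:start-stop".
# Assumption: contig names contain no colon or whitespace.
INTERVAL_RE = re.compile(r"^([^:\s]+)(?::(\d+)(?:-(\d+))?)?$")


def parse_interval(text):
    """Return (contig, start, stop), or None if the string is malformed."""
    m = INTERVAL_RE.match(text.strip())
    if m is None:
        return None
    contig, start, stop = m.group(1), m.group(2), m.group(3)
    start = int(start) if start is not None else None
    # A single position "contig:pos" is treated as start == stop.
    stop = int(stop) if stop is not None else start
    if start is not None and stop < start:
        return None
    return contig, start, stop


def check_intervals(lines, known_contigs):
    """Collect (line_number, text, reason) for entries that look unusable."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        parsed = parse_interval(text)
        if parsed is None:
            problems.append((number, text, "malformed"))
        elif parsed[0] not in known_contigs:
            problems.append((number, text, "unknown contig"))
    return problems
```

With contig names taken from ucsc.hg19 (which use a `chr` prefix), an entry such as `20:10000-20000` would be reported as an unknown contig, while `chr1:6000000-7000000` would pass.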

  • seqseek (Member; Posts: 11)

    Thanks for your quick reply! The interval list that I used is correctly formatted. I used it for all the other processes and didn't have any problems. And 112 indels fall within my intervals.

  • Geraldine_VdAuwera (Administrator, Dev admin; Posts: 10,469)

    OK then, to be sure we can rule out the file, can you try rerunning the same command but passing an interval directly (like -L 20:10000-20000)? Or even a simpler interval like -L 20. Just make sure there are reads in it.

    Geraldine Van der Auwera, PhD
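
Editorial sketch, not part of the original thread: to confirm that an interval actually contains reads, one option is to count alignment start positions in SAM text (for example, the output of `samtools view`). This is a rough check under stated assumptions: `count_reads_in_interval` is a hypothetical helper, it looks only at the RNAME and POS columns, and it ignores read length, so reads that start before the interval but extend into it are not counted.

```python
def count_reads_in_interval(sam_lines, contig, start, stop):
    """Count records whose 1-based POS lies in [start, stop] on contig."""
    count = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # a SAM record has 11 mandatory fields
        rname, pos = fields[2], int(fields[3])
        if rname == contig and start <= pos <= stop:
            count += 1
    return count
```

A count of zero for the interval given to -L would suggest that the tool has nothing to traverse there, although it would not by itself explain why.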

  • seqseek (Member; Posts: 11)
    edited September 2012

    It's the same output.

    INFO  17:40:28,440 HelpFormatter - Program Args: -I test.sorted.bam -R hg19_bundle/ucsc.hg19.fasta -T IndelRealigner -targetIntervals targetcreator.interval_list -o test.out.bam -known hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf -known hg19_bundle/1000G_phase1.indels.hg19.vcf --consensusDeterminationModel KNOWNS_ONLY -L chr1:6000000-7000000 -LOD 0.4 
    INFO  17:40:28,440 HelpFormatter - Date/Time: 2012/09/25 17:40:26 
    INFO  17:40:28,441 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  17:40:28,441 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  17:40:28,457 ArgumentTypeDescriptor - Dynamically determined type of hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF 
    INFO  17:40:28,459 ArgumentTypeDescriptor - Dynamically determined type of hg19_bundle/1000G_phase1.indels.hg19.vcf to be VCF 
    INFO  17:40:28,543 GenomeAnalysisEngine - Strictness is SILENT 
    INFO  17:40:28,632 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
    INFO  17:40:28,653 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 
    INFO  17:40:28,672 RMDTrackBuilder - Loading Tribble index from disk for file hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf 
    WARN  17:40:28,765 VCFStandardHeaderLines$Standards - Repairing standard header line for field GQ because -- type disagree; header has Float but standard is Integer 
    INFO  17:40:28,769 RMDTrackBuilder - Loading Tribble index from disk for file hg19_bundle/1000G_phase1.indels.hg19.vcf 
    WARN  17:40:28,863 VCFHeader - Found GL format, but no PL field.  As the GATK now only manages PL fields internally automatically adding a corresponding PL field to your VCF header 
    WARN  17:40:28,865 VCFStandardHeaderLines$Standards - Repairing standard header line for field AC because -- count types disagree; header has UNBOUNDED but standard is A -- descriptions disagree; header has 'Alternate Allele Count' but standard is 'Allele count in genotypes, for each ALT allele, in the same order as listed' 
    WARN  17:40:28,865 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has INTEGER but standard is A -- descriptions disagree; header has 'Global Allele Frequency based on AC/AN' but standard is 'Allele Frequency, for each ALT allele, in the same order as listed' 
    INFO  17:40:30,652 TraversalEngine - Total runtime 0.00 secs, 0.00 min, 0.00 hours 
    INFO  17:40:32,634 GATKRunReport - Uploaded run statistics report to AWS S3 
    
  • seqseek (Member; Posts: 11)

    I figured it out! It's the header. When I mapped to the restricted region, I kept a simpler version of the header, which evidently doesn't agree with the reference dict. I changed the header and it works fine. Thanks for your help.

  • Geraldine_VdAuwera (Administrator, Dev admin; Posts: 10,469)

    Glad you found a solution to your problem! File headers/dicts were going to be my next suggestion :)

    Geraldine Van der Auwera, PhD
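
Editorial sketch, not part of the original thread: the cause identified here, a header that does not agree with the reference dictionary, can be looked for by comparing the `@SQ` lines on both sides. The sketch assumes the header text comes from something like `samtools view -H` and that the dictionary is the `.dict` file next to the reference FASTA, which uses the same `@SQ` line layout. `parse_sq_lines` and `compare_dictionaries` are hypothetical names. It reports only differences in contig names and lengths; it does not reproduce GATK's own validation and it ignores contig order.

```python
def parse_sq_lines(header_lines):
    """Map contig name (SN) to length (LN) from @SQ lines, in file order."""
    contigs = {}
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        tags = {}
        for field in line.rstrip("\n").split("\t")[1:]:
            # Split on the first colon only, so values such as
            # "UR:file:/path" keep their own colons.
            key, _, value = field.partition(":")
            tags[key] = value
        if "SN" in tags and "LN" in tags:
            contigs[tags["SN"]] = int(tags["LN"])
    return contigs


def compare_dictionaries(bam_header_lines, dict_lines):
    """Report contigs that differ between a BAM header and a reference dict."""
    bam = parse_sq_lines(bam_header_lines)
    ref = parse_sq_lines(dict_lines)
    return {
        "missing_from_bam": [n for n in ref if n not in bam],
        "missing_from_reference": [n for n in bam if n not in ref],
        "length_mismatch": [n for n in bam if n in ref and bam[n] != ref[n]],
    }
```

Any entry under `missing_from_reference` or `length_mismatch` would be a reasonable first thing to investigate when a run finishes immediately with an empty output.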
