IndelRealigner no initialization

seqseek Member Posts: 11
edited September 2012 in Ask the GATK team

I got the output below when I ran IndelRealigner. The output BAM is empty.

INFO  15:19:35,568 TraversalEngine - Total runtime 0.00 secs, 0.00 min, 0.00 hours
INFO  15:19:36,910 GATKRunReport - Uploaded run statistics report to AWS S3

It didn't initialize. The sample is aligned to a specific region of the genome, and I did use the -L option. With a whole-genome alignment of the same sample, I don't have any problems. Do you know why?

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    Can you post the full stack trace?

    Geraldine Van der Auwera, PhD

  • seqseek Member Posts: 11
    edited September 2012
    INFO  15:54:24,747 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  15:54:24,751 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.0-39-gd091f72, Compiled 2012/08/10 15:55:35 
    INFO  15:54:24,751 HelpFormatter - Copyright (c) 2010 The Broad Institute 
    INFO  15:54:24,752 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
    INFO  15:54:24,752 HelpFormatter - Program Args: -I test.sorted.bam -R hg19_bundle/ucsc.hg19.fasta -T IndelRealigner -targetIntervals targetcreator.interval_list -o test.out.bam -known hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf -known hg19_bundle/1000G_phase1.indels.hg19.vcf --consensusDeterminationModel KNOWNS_ONLY -L my.interval_list -LOD 0.4 
    INFO  15:54:24,753 HelpFormatter - Date/Time: 2012/09/25 15:54:24 
    INFO  15:54:24,753 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  15:54:24,753 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  15:54:24,766 ArgumentTypeDescriptor - Dynamically determined type of hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF 
    INFO  15:54:24,768 ArgumentTypeDescriptor - Dynamically determined type of hg19_bundle/1000G_phase1.indels.hg19.vcf to be VCF 
    INFO  15:54:24,854 GenomeAnalysisEngine - Strictness is SILENT 
    INFO  15:54:24,946 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
    INFO  15:54:24,968 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 
    INFO  15:54:24,988 RMDTrackBuilder - Loading Tribble index from disk for file hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf 
    WARN  15:54:25,080 VCFStandardHeaderLines$Standards - Repairing standard header line for field GQ because -- type disagree; header has Float but standard is Integer 
    INFO  15:54:25,084 RMDTrackBuilder - Loading Tribble index from disk for file hg19_bundle/1000G_phase1.indels.hg19.vcf 
    WARN  15:54:25,169 VCFHeader - Found GL format, but no PL field.  As the GATK now only manages PL fields internally automatically adding a corresponding PL field to your VCF header 
    WARN  15:54:25,171 VCFStandardHeaderLines$Standards - Repairing standard header line for field AC because -- count types disagree; header has UNBOUNDED but standard is A -- descriptions disagree; header has 'Alternate Allele Count' but standard is 'Allele count in genotypes, for each ALT allele, in the same order as listed' 
    WARN  15:54:25,171 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has INTEGER but standard is A -- descriptions disagree; header has 'Global Allele Frequency based on AC/AN' but standard is 'Allele Frequency, for each ALT allele, in the same order as listed' 
    INFO  15:54:29,685 TraversalEngine - Total runtime 0.00 secs, 0.00 min, 0.00 hours 
    INFO  15:54:30,914 GATKRunReport - Uploaded run statistics report to AWS S3 
    
  • seqseek Member Posts: 11

    The attached file is better formatted.

    Attachment: stack_trace.txt (3K)
  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    To post a well-formatted stack trace, paste it into any good text editor that can indent an entire block, indent it, then paste it back into the forum post.

    Geraldine Van der Auwera, PhD

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    I see you are passing your intervals as a list file. Check that it is properly formatted; the most likely explanation for your problem is that the engine is not finding any valid intervals to run IndelRealigner on.

    Geraldine Van der Auwera, PhD
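
As an illustration of the format check suggested above, here is a minimal sketch in Python. The thread never shows the contents of my.interval_list, so this assumes a Picard-style interval list: SAM-style header lines starting with @, followed by five tab-separated fields per interval (contig, 1-based start, end, strand, name). The function name check_interval_list is hypothetical and is not part of GATK or Picard.

```python
def check_interval_list(text):
    """Return a list of problems found in Picard-style interval_list text.

    Assumed layout (not confirmed by the thread): '@' header lines with
    @SQ entries, then lines of: contig <TAB> start <TAB> end <TAB> strand <TAB> name.
    """
    contigs = {}      # contig name -> length, taken from @SQ header lines
    problems = []
    n_intervals = 0   # number of lines with the expected five fields

    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        if line.startswith("@"):
            if line.startswith("@SQ"):
                tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
                if "SN" in tags and "LN" in tags:
                    contigs[tags["SN"]] = int(tags["LN"])
            continue

        fields = line.split("\t")
        if len(fields) != 5:
            problems.append(f"line {lineno}: expected 5 tab-separated fields, got {len(fields)}")
            continue
        contig, start, end, strand, _name = fields
        n_intervals += 1

        if contig not in contigs:
            problems.append(f"line {lineno}: contig {contig!r} is not declared in the header")
        if not (start.isdigit() and end.isdigit()):
            problems.append(f"line {lineno}: start and end must be positive integers")
            continue
        start, end = int(start), int(end)
        if start < 1 or end < start:
            problems.append(f"line {lineno}: invalid coordinates {start}-{end}")
        elif contig in contigs and end > contigs[contig]:
            problems.append(f"line {lineno}: end {end} is past the end of {contig}")
        if strand not in ("+", "-"):
            problems.append(f"line {lineno}: strand must be '+' or '-'")

    if n_intervals == 0:
        problems.append("no well-formed interval lines found")
    return problems
```

An empty result only means the file is internally consistent under these assumptions; it does not show that the contigs also match the BAM or the reference.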

  • seqseek Member Posts: 11

    Thanks for your quick reply! The interval list that I used is correctly formatted. I used it for all the other processing steps and didn't have any problems. There are also 112 indels that fall within my intervals.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    OK then, to rule out the file, can you try rerunning the same command but passing an interval directly (like -L 20:10000-20000)? Or even a simpler interval like -L 20. Just make sure there are reads in it.

    Geraldine Van der Auwera, PhD
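
One detail worth checking when passing an interval directly: the contig name has to match the naming used by the reference. The examples above use 20, while the reference in this thread is ucsc.hg19.fasta, which uses chr-prefixed names such as chr1. The sketch below, in Python, parses a simple contig, contig:start, or contig:start-stop string and checks it against a table of contig lengths. The name parse_interval and the contig_lengths argument are hypothetical, and contig names that themselves contain a colon are not handled.

```python
import re

# contig, optionally followed by :start or :start-stop
_INTERVAL_RE = re.compile(r"^(?P<contig>[^:]+)(?::(?P<start>\d+)(?:-(?P<stop>\d+))?)?$")


def parse_interval(text, contig_lengths):
    """Parse an interval string and validate it against known contig lengths.

    Returns (contig, start, stop) with 1-based inclusive coordinates.
    Raises ValueError if the string is malformed, the contig is unknown,
    or the coordinates fall outside the contig.
    """
    m = _INTERVAL_RE.match(text.strip())
    if m is None:
        raise ValueError(f"cannot parse interval {text!r}")

    contig = m["contig"]
    if contig not in contig_lengths:
        raise ValueError(f"contig {contig!r} is not in the reference dictionary")
    length = contig_lengths[contig]

    if m["start"] is None:
        start, stop = 1, length                  # whole contig
    else:
        start = int(m["start"])
        stop = int(m["stop"]) if m["stop"] else start   # single position

    if not 1 <= start <= stop <= length:
        raise ValueError(f"coordinates {start}-{stop} are outside {contig} (length {length})")
    return contig, start, stop


# Example: chr1 is 249250621 bp long in hg19.
lengths = {"chr1": 249250621}
print(parse_interval("chr1:6000000-7000000", lengths))
```

With this table, parse_interval("20", lengths) raises ValueError because no contig named 20 exists, which is the kind of mismatch that would leave the engine with nothing to traverse.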

  • seqseek Member Posts: 11
    edited September 2012

    It's the same output.

    INFO  17:40:28,440 HelpFormatter - Program Args: -I test.sorted.bam -R hg19_bundle/ucsc.hg19.fasta -T IndelRealigner -targetIntervals targetcreator.interval_list -o test.out.bam -known hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf -known hg19_bundle/1000G_phase1.indels.hg19.vcf --consensusDeterminationModel KNOWNS_ONLY -L chr1:6000000-7000000 -LOD 0.4 
    INFO  17:40:28,440 HelpFormatter - Date/Time: 2012/09/25 17:40:26 
    INFO  17:40:28,441 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  17:40:28,441 HelpFormatter - --------------------------------------------------------------------------------- 
    INFO  17:40:28,457 ArgumentTypeDescriptor - Dynamically determined type of hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf to be VCF 
    INFO  17:40:28,459 ArgumentTypeDescriptor - Dynamically determined type of hg19_bundle/1000G_phase1.indels.hg19.vcf to be VCF 
    INFO  17:40:28,543 GenomeAnalysisEngine - Strictness is SILENT 
    INFO  17:40:28,632 SAMDataSource$SAMReaders - Initializing SAMRecords in serial 
    INFO  17:40:28,653 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02 
    INFO  17:40:28,672 RMDTrackBuilder - Loading Tribble index from disk for file hg19_bundle/Mills_and_1000G_gold_standard.indels.hg19.vcf 
    WARN  17:40:28,765 VCFStandardHeaderLines$Standards - Repairing standard header line for field GQ because -- type disagree; header has Float but standard is Integer 
    INFO  17:40:28,769 RMDTrackBuilder - Loading Tribble index from disk for file hg19_bundle/1000G_phase1.indels.hg19.vcf 
    WARN  17:40:28,863 VCFHeader - Found GL format, but no PL field.  As the GATK now only manages PL fields internally automatically adding a corresponding PL field to your VCF header 
    WARN  17:40:28,865 VCFStandardHeaderLines$Standards - Repairing standard header line for field AC because -- count types disagree; header has UNBOUNDED but standard is A -- descriptions disagree; header has 'Alternate Allele Count' but standard is 'Allele count in genotypes, for each ALT allele, in the same order as listed' 
    WARN  17:40:28,865 VCFStandardHeaderLines$Standards - Repairing standard header line for field AF because -- count types disagree; header has INTEGER but standard is A -- descriptions disagree; header has 'Global Allele Frequency based on AC/AN' but standard is 'Allele Frequency, for each ALT allele, in the same order as listed' 
    INFO  17:40:30,652 TraversalEngine - Total runtime 0.00 secs, 0.00 min, 0.00 hours 
    INFO  17:40:32,634 GATKRunReport - Uploaded run statistics report to AWS S3 
    
  • seqseek Member Posts: 11

    I figured it out! It's the header. When I mapped to the restricted region, I kept a simpler version of the header, which obviously doesn't work well with the reference dict. I changed the header and it works fine. Thanks for your help.
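
For anyone hitting the same symptom, a rough way to spot this kind of mismatch is to compare the @SQ lines of the BAM header (for example, the text printed by samtools view -H) with those in the reference sequence dictionary. The sketch below is in Python and rests on assumptions: the thread does not say exactly what was wrong with the header, so the three checks (unknown contigs, differing lengths, differing order) are simply common causes, and the names read_sq and compare_headers are hypothetical.

```python
def read_sq(header_text):
    """Return [(name, length), ...] from the @SQ lines of SAM-header text, in file order."""
    contigs = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
            contigs.append((tags["SN"], int(tags["LN"])))
    return contigs


def compare_headers(bam_header, ref_dict):
    """List differences between a BAM header and a reference .dict (both as text)."""
    bam = read_sq(bam_header)
    ref_list = read_sq(ref_dict)
    ref = dict(ref_list)
    problems = []

    if not bam:
        problems.append("BAM header has no @SQ lines")

    for name, length in bam:
        if name not in ref:
            problems.append(f"{name}: not in reference dictionary")
        elif ref[name] != length:
            problems.append(f"{name}: length {length} in BAM, {ref[name]} in reference")

    # Contigs present in both files should appear in the same relative order.
    shared_in_bam_order = [name for name, _ in bam if name in ref]
    shared = set(shared_in_bam_order)
    shared_in_ref_order = [name for name, _ in ref_list if name in shared]
    if shared_in_bam_order != shared_in_ref_order:
        problems.append("shared contigs appear in a different order")

    return problems


# Toy data with made-up lengths, for illustration only.
ref_text = "@HD\tVN:1.0\n@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:2000\n"
bam_text = "@HD\tVN:1.0\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:chr2\tLN:1999\n"
for problem in compare_headers(bam_text, ref_text):
    print(problem)
```

An empty list means the two headers agree on these three points only; it is not a guarantee that GATK will accept the pair.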

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,413 admin

    Glad you found a solution to your problem! File headers/dicts were going to be my next suggestion :)

    Geraldine Van der Auwera, PhD
