Tutorial: error running examples

On the forum page

http://gatkforums.broadinstitute.org/discussion/1209/how-to-run-the-gatk-for-the-first-time#latest

there are two examples. The first runs fine. The second generates this error:

MESSAGE: Bad input: We encountered a non-standard non-IUPAC base in the provided reference: '10'

The input files are the same, though. I only changed "Reads" to "Loci" in the command, and since I am running Unix I did not need to retype the entire command. This command works fine:

java -jar GenomeAnalysisTK.jar -T CountReads -R exampleFASTA.fasta -I exampleBAM.bam

This command produces the error:

java -jar GenomeAnalysisTK.jar -T CountLoci -R exampleFASTA.fasta -I exampleBAM.bam -o output.txt

Any suggestions?

Comments

  • Carneiro (Charlestown, MA; Member)
    edited May 2013

    I cannot replicate that error. Maybe your FASTA file is corrupted?

    
    $ java -jar ../../dist/GenomeAnalysisTK.jar -T CountLoci -R exampleFASTA.fasta -I exampleBAM.bam -o output.txt                                                    [17:28:15]
    INFO  17:28:21,280 HelpFormatter - ---------------------------------------------------------------------------------
    INFO  17:28:21,282 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-76-gf39bc59, Compiled 2013/05/21 17:23:44
    INFO  17:28:21,282 HelpFormatter - Copyright (c) 2010 The Broad Institute
    INFO  17:28:21,282 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
    INFO  17:28:21,286 HelpFormatter - Program Args: -T CountLoci -R exampleFASTA.fasta -I exampleBAM.bam -o output.txt
    INFO  17:28:21,286 HelpFormatter - Date/Time: 2013/05/21 17:28:21
    INFO  17:28:21,286 HelpFormatter - ---------------------------------------------------------------------------------
    INFO  17:28:21,286 HelpFormatter - ---------------------------------------------------------------------------------
    INFO  17:28:21,409 GenomeAnalysisEngine - Strictness is SILENT
    INFO  17:28:21,498 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000
    INFO  17:28:21,524 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
    INFO  17:28:21,541 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
    INFO  17:28:21,642 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
    INFO  17:28:21,652 GenomeAnalysisEngine - Done creating shard strategy
    INFO  17:28:21,652 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO  17:28:21,653 ProgressMeter -        Location processed.sites  runtime per.1M.sites completed total.runtime remaining
    INFO  17:28:21,768 ProgressMeter -            done        2.05e+03    0.0 s       55.0 s     97.3%         0.0 s     0.0 s
    INFO  17:28:21,769 ProgressMeter - Total runtime 0.12 secs, 0.00 min, 0.00 hours
    INFO  17:28:21,851 MicroScheduler - 0 reads were filtered out during traversal out of 33 total (0.00%)
    INFO  17:28:22,438 GATKRunReport - Uploaded run statistics report to AWS S3
    
    
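One way to follow up on the corruption idea is to scan the reference for bytes that are not IUPAC nucleotide codes. The sketch below is only an illustration: the function name `find_non_iupac` is hypothetical, and the accepted alphabet is the standard IUPAC letter set in upper and lower case, which may not be exactly what GATK itself accepts. It reports byte values because, if the '10' in the error message is a byte value (an assumption about how the message is formatted), it would correspond to a line feed rather than a literal "10" in the sequence.

```python
# IUPAC nucleotide letter codes, upper- and lowercase (assumed alphabet;
# GATK's own accepted set is not confirmed here).
IUPAC = set(b"ACGTURYSWKMBDHVNacgturyswkmbdhvn")


def find_non_iupac(path, limit=10):
    """Return up to `limit` (line_no, column, byte_value) tuples for
    bytes on sequence lines that are not IUPAC nucleotide codes."""
    hits = []
    with open(path, "rb") as handle:
        # Binary mode splits lines on b"\n" only, so a stray carriage
        # return stays on the line and is reported as byte 13.
        for line_no, raw in enumerate(handle, start=1):
            if raw.startswith(b">"):
                continue  # skip FASTA header lines
            line = raw[:-1] if raw.endswith(b"\n") else raw
            for col, value in enumerate(line, start=1):
                if value not in IUPAC:
                    hits.append((line_no, col, value))
                    if len(hits) >= limit:
                        return hits
    return hits


# Example use:
# for line_no, col, value in find_non_iupac("exampleFASTA.fasta"):
#     print(f"line {line_no}, column {col}: byte {value}")
```

An empty result would suggest the FASTA content itself is clean, in which case the index (.fai) or dictionary files may be worth checking instead.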
  • Strange. What kind of corruption would let the data run through -T CountReads but not through -T CountLoci?

  • CarneiroCarneiro Charlestown, MAMember
    edited May 2013

    Great question. It should visit exactly the same locations in the reference.

    I'm afraid I don't have an answer for what you are observing. The error reports an invalid base in the reference FASTA. Can you compute MD5 checksums of the reference, the dict, and the index?

    
    $ md5sum exampleFASTA.fasta exampleFASTA.fasta.fai exampleFASTA.dict
    36880691cf9e4178216f7b52e8d85fbe  exampleFASTA.fasta
    c50494fca6bb42ae02f26e9f0c585ee6  exampleFASTA.fasta.fai
    852fa68dbe31f42743c060ad2913279c  exampleFASTA.dict
    
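Where `md5sum` is not available, the same comparison can be sketched in Python. The expected values below are copied from the checksums posted above; the helper names `md5_of` and `compare_checksums` are hypothetical, and a match is only meaningful if your files come from the same version of the example bundle.

```python
import hashlib
import os

# Checksums as posted in the reply above.
EXPECTED = {
    "exampleFASTA.fasta": "36880691cf9e4178216f7b52e8d85fbe",
    "exampleFASTA.fasta.fai": "c50494fca6bb42ae02f26e9f0c585ee6",
    "exampleFASTA.dict": "852fa68dbe31f42743c060ad2913279c",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_checksums(expected, directory="."):
    """Return {name: (actual_md5_or_None, matches)} for each expected file.

    A missing file is reported as (None, False).
    """
    results = {}
    for name, want in expected.items():
        path = os.path.join(directory, name)
        actual = md5_of(path) if os.path.isfile(path) else None
        results[name] = (actual, actual == want)
    return results


# Example use, run from the directory holding the example files:
# for name, (actual, ok) in compare_checksums(EXPECTED).items():
#     print(name, actual, "OK" if ok else "MISMATCH")
```

A mismatch on only the .fai or .dict file would point at the index or dictionary rather than at the FASTA itself.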