UnifiedGenotyper gives empty output on test set

Jetse Member Posts: 2
edited June 2013 in Ask the GATK team

java -jar GenomeAnalysisTK.jar -T UnifiedGenotyper -dcov 1 -R smallRefGenome.fa -I testWh.bam -o test.vcf

Gives me this output:

INFO  15:48:37,070 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:48:37,074 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO  15:48:37,074 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO  15:48:37,074 HelpFormatter - For support and documentation go to
INFO  15:48:37,081 HelpFormatter - Program Args: -T UnifiedGenotyper -dcov 1 -R smallRefGenome.fa -I testWh.bam -o test.vcf
INFO  15:48:37,081 HelpFormatter - Date/Time: 2013/06/11 15:48:37
INFO  15:48:37,081 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:48:37,081 HelpFormatter - --------------------------------------------------------------------------------
INFO  15:48:37,220 GenomeAnalysisEngine - Strictness is SILENT
INFO  15:48:37,331 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1
INFO  15:48:37,342 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO  15:48:37,366 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO  15:48:37,504 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO  15:48:37,527 GenomeAnalysisEngine - Done creating shard strategy
INFO  15:48:37,528 ProgressMeter -        Location processed.sites  runtime per.1M.sites completed total.runtime remaining
INFO  15:48:38,635 ProgressMeter -            done        1.20e+03    1.0 s       15.3 m     99.9%         1.0 s     0.0 s
INFO  15:48:38,636 ProgressMeter - Total runtime 1.11 secs, 0.02 min, 0.00 hours
INFO  15:48:38,636 MicroScheduler - 0 reads were filtered out during traversal out of 418 total (0.00%)
INFO  15:48:45,809 GATKRunReport - Uploaded run statistics report to AWS S3

The reference genome is 1200 nt long. All 418 reads are 100 nt long and map between positions 100 and 1100 of this reference. The reads were generated by Illumina sequencing and mapped with BWA. The BAM file contains paired-end data, but none of the reads are properly paired.
The output looks as if everything worked, and with -dcov 1 I should find many SNPs, yet the output is empty.
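
A quick way to confirm that the VCF really holds no calls is to count its non-header lines, since VCF header lines all start with `#`. A minimal sketch; the function name `count_vcf_records` is made up for illustration and is not part of GATK:

```shell
# Count variant records in a VCF file: every non-blank line that does
# not start with '#'. (count_vcf_records is a hypothetical helper name.)
# Usage: count_vcf_records test.vcf
count_vcf_records() {
    # "n + 0" forces a numeric 0 when no record was seen.
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}
```

A result of 0 means the file holds only header lines, i.e. the caller ran but emitted no variants.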

All steps I did before GATK, after mapping:

Convert SAM to BAM
samtools view -u -S -b test.sam > test.bam

Sort the BAM file
samtools sort test.bam testSorted

Add the missing read group header line to the BAM file
java -jar AddOrReplaceReadGroups.jar I=testSorted.bam O=testWh.bam LB=test PL=illumina PU=lane SM=samplename

Index the BAM file
samtools index ../testFiles/output//test/testWh.bam
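
The four steps above could be collected into one script along these lines. This is only a sketch under some assumptions: the `run` and `prepare_bam` helpers and the `DRY_RUN` switch are made up for illustration, `-o` is used in place of the shell redirect, all files are assumed to sit in the current directory, and the `samtools sort` call keeps the older "output prefix" syntax used above (newer samtools releases expect `-o` instead).

```shell
# With DRY_RUN=1 each command is printed instead of executed.
# (run, prepare_bam and DRY_RUN are hypothetical names for this sketch.)
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 = file prefix, e.g. "test" for test.sam
prepare_bam() {
    # 1. Convert SAM to BAM.
    run samtools view -u -S -b -o "$1.bam" "$1.sam"
    # 2. Sort by coordinate (older samtools syntax: output prefix, no extension).
    run samtools sort "$1.bam" "$1Sorted"
    # 3. Add the read group header line.
    run java -jar AddOrReplaceReadGroups.jar "I=$1Sorted.bam" "O=$1Wh.bam" \
        LB=test PL=illumina PU=lane SM=samplename
    # 4. Index the final BAM.
    run samtools index "$1Wh.bam"
}
```

Setting `DRY_RUN=1` before calling `prepare_bam test` prints the four commands so they can be checked before anything is run.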

Does anyone see where I made a mistake?

Is there another setting that I have to use to find SNPs with a small dataset on a small reference genome?

Best Answer


  • Jetse Member Posts: 2

    Thank you for your quick response. I took a look at the BAM files in IGV and there were no SNPs.
    This is just for testing, so I don't need any significant SNPs; a few random SNPs would work too. I think the best option is to create a real SNP in the BAM file and check whether I can recover it.
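
One way to plant such a SNP is to edit the read sequences in SAM text form. The sketch below is only an illustration: the helper name `plant_snp` is made up, it only touches reads whose CIGAR is a single match block (e.g. `100M`), and it leaves base qualities unchanged. The chosen base must differ from the reference base at that position for it to show up as a variant, and whether it is actually called will still depend on coverage and caller settings.

```shell
# Replace the base at a 1-based reference position in every read that
# covers it. SAM text is read from stdin and written to stdout.
# (plant_snp is a hypothetical helper name, not a samtools/GATK command.)
# $1 = reference position, $2 = replacement base
plant_snp() {
    awk -v pos="$1" -v alt="$2" 'BEGIN { FS = OFS = "\t" }
    /^@/ { print; next }                        # header lines pass through
    $6 ~ /^[0-9]+M$/ && $10 != "*" {            # simple all-match CIGARs only
        off = pos - $4 + 1                      # 1-based offset into SEQ
        if (off >= 1 && off <= length($10))
            $10 = substr($10, 1, off - 1) alt substr($10, off + 1)
    }
    { print }'
}

# Possible usage (position 600, base T and testSnp.bam are arbitrary examples):
#   samtools view -h testWh.bam | plant_snp 600 T | samtools view -S -b - > testSnp.bam
#   samtools index testSnp.bam
```

Because SAM stores SEQ on the forward reference strand, the same substitution applies to forward and reverse reads.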

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,414

    That sounds good. We do provide test data in our resource bundle if you want to play with some real SNPs.

    Geraldine Van der Auwera, PhD
