UnifiedGenotyper gives empty output on test set

Jetse Member
edited June 2013 in Ask the GATK team

java -jar GenomeAnalysisTK.jar -T UnifiedGenotyper -dcov 1 -R smallRefGenome.fa -I testWh.bam -o test.vcf

Gives me this output:

INFO 15:48:37,070 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:48:37,074 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02
INFO 15:48:37,074 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 15:48:37,074 HelpFormatter - For support and documentation go to
INFO 15:48:37,081 HelpFormatter - Program Args: -T UnifiedGenotyper -dcov 1 -R smallRefGenome.fa -I testWh.bam -o test.vcf
INFO 15:48:37,081 HelpFormatter - Date/Time: 2013/06/11 15:48:37
INFO 15:48:37,081 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:48:37,081 HelpFormatter - --------------------------------------------------------------------------------
INFO 15:48:37,220 GenomeAnalysisEngine - Strictness is SILENT
INFO 15:48:37,331 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1
INFO 15:48:37,342 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
INFO 15:48:37,366 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.02
INFO 15:48:37,504 GenomeAnalysisEngine - Creating shard strategy for 1 BAM files
INFO 15:48:37,527 GenomeAnalysisEngine - Done creating shard strategy
INFO 15:48:37,527 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 15:48:37,528 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
INFO 15:48:38,635 ProgressMeter - done 1.20e+03 1.0 s 15.3 m 99.9% 1.0 s 0.0 s
INFO 15:48:38,636 ProgressMeter - Total runtime 1.11 secs, 0.02 min, 0.00 hours
INFO 15:48:38,636 MicroScheduler - 0 reads were filtered out during traversal out of 418 total (0.00%)
INFO 15:48:45,809 GATKRunReport - Uploaded run statistics report to AWS S3

The reference genome is 1200 nt long. All 418 reads are 100 nt long and map between positions 100 and 1100 of this reference. The reads were generated on an Illumina platform and mapped with BWA. The BAM file contains paired-end data, but none of the reads are properly paired.
The log looks as if everything worked, yet the output is empty. With -dcov 1 I expected to find many SNPs...
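To confirm that the VCF really contains header lines only, one option is to count the lines that do not start with `#`. This is a minimal sketch; `count_vcf_records` is a hypothetical helper name, not a GATK tool.

```shell
# Hypothetical helper: count the non-header lines in a VCF file.
count_vcf_records() {
    # $1 = path to a VCF file.
    # grep -c prints the count but exits non-zero when it is 0,
    # so "|| true" keeps the function from reporting a failure.
    grep -vc '^#' "$1" || true
}

# Usage: count_vcf_records test.vcf
```

A result of 0 means the file holds only the VCF header, which would match the behaviour described above.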

These are all the steps I ran after mapping and before GATK:

Convert the SAM file to BAM
samtools view -u -S -b test.sam > test.bam

Sort the BAM file
samtools sort test.bam testSorted

Add the missing read group header line to the BAM file
java -jar AddOrReplaceReadGroups.jar I=testSorted.bam O=testWh.bam LB=test PL=illumina PU=lane SM=samplename

Index the BAM file
samtools index ../testFiles/output//test/testWh.bam
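After these steps it can be worth checking that the read group actually ended up in the BAM header, since GATK needs a sample name for every read. A minimal sketch follows; `has_read_group` is a hypothetical helper that reads SAM header text on standard input, and the usage line assumes `samtools` is on the PATH.

```shell
# Hypothetical helper: succeed only if the header text on stdin
# contains an @RG line that carries an SM (sample) tag.
has_read_group() {
    grep -q '^@RG.*[[:space:]]SM:'
}

# Usage (samtools view -H prints only the header):
#   samtools view -H testWh.bam | has_read_group && echo "read group present"
```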

Does anyone see where I made a mistake?

Is there another setting I have to use to find SNPs in a small dataset on a small reference genome?

Best Answer


  • Jetse Member

    Thank you for your quick response. I took a look at the BAM files in IGV and there were no SNPs.
    This is just for testing, so I don't need any significant SNPs; some random SNPs will work too. I think the best option is to create a real SNP in the BAM file and check whether I can find it again.
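    One possible way to plant such a SNP is to edit the read sequences in SAM text form and convert the result back to BAM. The sketch below is only an illustration under stated assumptions: `inject_snp` is a hypothetical helper, position 600 and the file name testSnp.bam are made up, and only reads with a pure-match CIGAR (for example 100M) are modified, so that the offset in the read equals the offset on the reference. The replacement base has to differ from the reference base at that position for a SNP to appear.

```shell
# Hypothetical helper: replace one base at a 1-based reference position
# in every read that covers it. Reads SAM on stdin, writes SAM on stdout.
inject_snp() {
    # $1 = reference position (1-based), $2 = replacement base
    awk -v target="$1" -v alt="$2" 'BEGIN { FS = OFS = "\t" }
        /^@/ { print; next }                  # pass header lines through
        {
            start = $4; len = length($10)
            # Only touch ungapped alignments that span the target position.
            if ($6 ~ /^[0-9]+M$/ && target >= start && target < start + len) {
                off = target - start + 1      # 1-based offset within the read
                $10 = substr($10, 1, off - 1) alt substr($10, off + 1)
            }
            print
        }'
}

# Usage (assumes samtools is on PATH; names and position are examples):
#   samtools view -h testWh.bam | inject_snp 600 T | samtools view -S -b - > testSnp.bam
#   samtools index testSnp.bam
```

    Because SAM stores read sequences in reference orientation, the same substitution applies to forward and reverse reads.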

  • Geraldine_VdAuwera (Cambridge, MA) Member, Administrator, Broadie

    That sounds good. We do provide test data in our resource bundle if you want to play with some real SNPs.
