QualifyMissingIntervals - clarification on input files

dreadrea CanadaMember
edited January 2015 in Ask the GATK team

Greetings GATK users,
I'm trying to run QualifyMissingIntervals in GATK, and want to verify the output of my command. I am using:

java -jar GenomeAnalysisTK.jar -T QualifyMissingIntervals -o outputtest.grp -R ref.fasta -I input.bam -L list.interval_list --targetsfile targets.intervals

My interval list looks like this:

@HD VN:1.4  SO:coordinate
@SQ SN:1    LN:4000000
chromosome  1   4000000 +   target1
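
One thing that can be checked mechanically in a list like this is whether every contig name used in an interval record is also declared in an @SQ header line. The following is a minimal sketch, assuming whitespace-separated fields as shown above; `check_interval_list` is a hypothetical helper written for illustration and is not part of GATK or Picard.

```python
def check_interval_list(text):
    """Return contig names used in interval records but not declared in @SQ lines."""
    declared = set()  # names taken from the SN: field of @SQ header lines
    used = set()      # names taken from the first column of interval records
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split()
        if line.startswith("@"):
            if fields[0] == "@SQ":
                for field in fields[1:]:
                    if field.startswith("SN:"):
                        declared.add(field[3:])
            continue
        used.add(fields[0])
    return sorted(used - declared)


# The interval list from this post, copied as-is.
example = """@HD VN:1.4  SO:coordinate
@SQ SN:1    LN:4000000
chromosome  1   4000000 +   target1
"""

print(check_interval_list(example))  # prints ['chromosome']
```

For the list above, the check reports `chromosome`, because the header declares a sequence named `1` while the interval record uses `chromosome`. Whether this mismatch explains the output is not confirmed here; it is only a consistency check.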

This is a subset of my targets file, which was output by RealignerTargetCreator:

chromosome:889608-889611
chromosome:926218-926667
... 24 lines
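
To sanity-check entries in this format, each line can be parsed as `contig:start-stop` and its size computed. This is a minimal sketch, assuming 1-based inclusive coordinates and allowing a single-position form `contig:pos`; `parse_interval` is a hypothetical helper, not a GATK function.

```python
import re

# contig:start-stop, with the "-stop" part optional for single positions
INTERVAL_RE = re.compile(r"^(?P<contig>[^:]+):(?P<start>\d+)(?:-(?P<stop>\d+))?$")


def parse_interval(text):
    """Parse 'contig:start-stop' and return (contig, start, stop, size)."""
    match = INTERVAL_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a contig:start-stop interval: {text!r}")
    start = int(match["start"])
    stop = int(match["stop"]) if match["stop"] else start
    if stop < start:
        raise ValueError(f"stop is before start: {text!r}")
    # Size assumes both endpoints are included in the interval.
    return match["contig"], start, stop, stop - start + 1


for line in ["chromosome:889608-889611", "chromosome:926218-926667"]:
    print(parse_interval(line))
```

Under these assumptions the first target spans 4 bases and the second spans 450.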

My output gives me data on only a single interval:

INTERVAL              GC          BQ           MQ           DP            POS_IN_TARGET  TARGET_SIZE  BAITED  MISSING_SIZE  INTERPRETATION
chromosome:1-4411709  0.65615955  31.01751693  42.77457476  421.83747409  -3522098       4            true    4000000       UNKNOWN

I suspect that one of my files is formatted improperly, but I can't figure out which one. I have tried several variations of the -L and --targetsfile inputs, based on both the documentation and previous forum posts, but to no avail: most of them result in the command not running at all.

I would very much appreciate any help that might be provided!
