
ComputeInsertSizeDistributions

Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)
edited September 2012 in GenomeSTRiP Documentation

1. Introduction

The ComputeInsertSizeDistributions walker traverses a set of BAM files to generate histograms of insert sizes.

The insert size histograms are stored in a binary file format. Many histograms can be stored in the same file. The histograms are identified by <Sample, Library, ReadGroup> triples. The trailing components of a triple can be null. For example, if histograms are computed library-by-library (the default), then the ReadGroup in each triple will be null.
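
To make the keying scheme concrete, here is a minimal Python sketch of histograms indexed by <Sample, Library, ReadGroup> triples with null trailing components. It models only the keying described above, not the tool's binary file format or its implementation. The names HistogramKey, make_key, build_histograms and the level parameter are hypothetical, the sample and library names are invented illustration data, and the "sample" aggregation level is an assumption inferred from "the trailing components can be null".

```python
from collections import Counter, defaultdict
from typing import NamedTuple, Optional


class HistogramKey(NamedTuple):
    """Hypothetical model of a <Sample, Library, ReadGroup> triple."""
    sample: str
    library: Optional[str] = None      # None stands in for a null component
    read_group: Optional[str] = None   # None stands in for a null component


def make_key(sample, library, read_group, level="library"):
    """Build a key whose trailing components are None below the chosen level."""
    if level == "sample":        # assumed level: Library and ReadGroup null
        return HistogramKey(sample)
    if level == "library":       # the documented default: ReadGroup null
        return HistogramKey(sample, library)
    if level == "read_group":    # all three components populated
        return HistogramKey(sample, library, read_group)
    raise ValueError(f"unknown level: {level!r}")


def build_histograms(records, level="library"):
    """Count insert sizes per key.

    records: iterable of (sample, library, read_group, insert_size) tuples.
    Returns a dict mapping HistogramKey -> Counter of insert_size -> count.
    """
    histograms = defaultdict(Counter)
    for sample, library, read_group, insert_size in records:
        key = make_key(sample, library, read_group, level)
        histograms[key][insert_size] += 1
    return dict(histograms)


# Invented example data: one sample, two libraries, three read groups.
records = [
    ("sampleA", "lib1", "rg1", 300),
    ("sampleA", "lib1", "rg2", 300),
    ("sampleA", "lib1", "rg2", 310),
    ("sampleA", "lib2", "rg3", 450),
]

by_library = build_histograms(records)                       # default level
by_read_group = build_histograms(records, level="read_group")
```

With the default library-level keying, the two read groups of lib1 collapse into a single histogram whose key has a null ReadGroup, which is the case the example in the text describes.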

See also MergeInsertSizeDistributions, ComputeInsertStatistics.

2. Inputs / Arguments

  • -I <bam-file> : The set of input BAM files.

  • -md <directory> : The metadata directory. Currently only used to check for a default list of excluded read groups.

  • -overwrite : If true (the default), overwrite the output file; otherwise, append to it.

  • -createEmpty : If true, create a zero-length output file when the input contains no paired reads (default false).

3. Outputs

  • -O <histogram-file> : Location of the output binary histogram file.