Frequently Asked Questions

Geraldine_VdAuwera, Cambridge, MA (Member, Administrator, Broadie)
edited September 2012 in GenomeSTRiP Documentation

1. What does error message X mean?

See the FAQ section on frequently encountered errors.

2. Can I use Genome STRiP to do discovery or genotyping in a single high-coverage individual?

Genome STRiP is designed to discover and genotype variants in populations and
uses the information from multiple individuals simultaneously. Typically you
will need data from at least 20 or 30 individuals to get good results.

That being said, it may be possible to use a "background population" along
with a single high-coverage individual to run Genome STRiP. The background
population does not need to have the same depth of coverage as the target
genome you want to process, but reads will need to be aligned to the same
reference sequence. A good background population might be 50 or so individuals
from the 1000 Genomes Project chosen from diverse population groups. This
approach has not been widely tested, although I have looked at targeted
resequencing loci using this strategy with some success. If you try this
strategy, please share your experiences.

3. Does Genome STRiP only work with deletions?

In the current version, only deletions (relative to the reference) are
supported in discovery and genotyping. We are actively working on discovery
and genotyping of other kinds of structural variants.

4. Is the source code available?

Not at this time, but we are planning to release the source code shortly.

5. Can I run discovery on a small genomic region?

If you have whole-genome sequence data, you can run on just a small region
using the standard -L argument to the GATK. For example


If you have targeted resequencing data, where you have only sequenced a small
subset of the genome, then you additionally need to set the effective genome
size to be smaller. To do this, you currently need to modify the configuration
parameters in conf/genstrip_parameters.txt (the file location is specified
with the -configFile command line argument).

You will need to change these three parameters:

input.genomeSize = A + X + Y 
input.genomeSizeMale = 2*A + X + Y 
input.genomeSizeFemale = 2*A + 2*X

where A is the total size of the autosomal reference and X and Y are the
lengths of the X and Y chromosomes. Note that genomeSize is in haploid bases
while genomeSizeMale and genomeSizeFemale are in diploid bases.
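
As a rough sketch of the arithmetic above, the following Python helper computes the three values and prints them in the same `name = value` form. The function names are hypothetical and not part of Genome STRiP, and passing the number of targeted bases for A, X and Y (rather than whole-chromosome lengths) is an assumption about how you would apply the formulas to targeted data.

```python
def effective_genome_sizes(autosomal_bases, x_bases=0, y_bases=0):
    """Compute the three genome-size parameters from base counts.

    autosomal_bases, x_bases and y_bases are haploid base counts
    (A, X and Y in the formulas above). Hypothetical helper, for
    illustration only.
    """
    return {
        # Haploid size: one copy of everything.
        "input.genomeSize": autosomal_bases + x_bases + y_bases,
        # Diploid male: two autosome copies, one X, one Y.
        "input.genomeSizeMale": 2 * autosomal_bases + x_bases + y_bases,
        # Diploid female: two autosome copies, two X copies.
        "input.genomeSizeFemale": 2 * autosomal_bases + 2 * x_bases,
    }


def format_parameters(sizes):
    """Render the values as 'name = value' lines for the config file."""
    return "\n".join(f"{name} = {value}" for name, value in sizes.items())


if __name__ == "__main__":
    # Example: 5 Mb of autosomal targets, nothing on X or Y (assumed numbers).
    print(format_parameters(effective_genome_sizes(5_000_000)))
```

With no X or Y bases, the male and female values both come out to twice the haploid size, which matches the autosome-only shortcut.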

Of course, if your target region doesn't include X or Y, then just set
genomeSizeMale and genomeSizeFemale to 2*genomeSize. See the installtest
configuration file for an example, where the effective genome size is set to

Geraldine Van der Auwera, PhD
