edited September 2012

### 1. What does error message X mean?

See the FAQ section on frequently encountered errors.

### 2. Can I use Genome STRiP to do discovery or genotyping in a single high-coverage individual?

Genome STRiP is designed to discover and genotype variants in populations and
uses the information from multiple individuals simultaneously. Typically you
will need data from at least 20 or 30 individuals to get good results.

That being said, it may be possible to use a "background population" along
with a single high-coverage individual to run Genome STRiP. The background
population does not need to have the same depth of coverage as the target
genome you want to process, but reads will need to be aligned to the same
reference sequence. A good background population might be 50 or so individuals
from the 1000 Genomes Project chosen from diverse population groups. This
approach has not been widely tested, although I have looked at targeted
resequencing loci using this strategy with some success. If you try this

### 3. Does Genome STRiP only work with deletions?

In the current version, only deletions (relative to the reference) are
supported in discovery and genotyping. We are actively working on discovery
and genotyping of other kinds of structural variants.

### 4. Is the source code available?

Not at this time, but we are planning to release the source code shortly.

### 5. Can I run discovery on a small genomic region?

If you have whole-genome sequence data, you can run on just a small region
using the standard -L argument to the GATK, for example -L chr1:1000000-2000000.

If you have targeted resequencing data, where you have only sequenced a small
subset of the genome, then you additionally need to set the effective genome
size to be smaller. To do this, you currently need to modify the configuration
parameters in conf/genstrip_parameters.txt (the file location is specified
with the -configFile command line argument).

You will need to change these three parameters:

input.genomeSize = A + X + Y
input.genomeSizeMale = 2*A + X + Y
input.genomeSizeFemale = 2*A + 2*X


where A is the total size of the autosomal reference and X and Y are the
lengths of the X and Y chromosomes. Note that genomeSize is in haploid bases
while genomeSizeMale and genomeSizeFemale are in diploid bases.
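
As a rough illustration of the arithmetic above, the following sketch computes the three values from the autosomal, X and Y lengths. The function name and the toy lengths are hypothetical and are not taken from any real reference genome; substitute the lengths from your own reference.

```python
def genome_size_parameters(autosome_bases, x_bases, y_bases):
    """Return the three effective genome size settings described above.

    genomeSize is in haploid bases; the male and female values are diploid.
    """
    return {
        "input.genomeSize": autosome_bases + x_bases + y_bases,
        "input.genomeSizeMale": 2 * autosome_bases + x_bases + y_bases,
        "input.genomeSizeFemale": 2 * autosome_bases + 2 * x_bases,
    }


# Toy lengths chosen only for illustration (not a real reference genome).
params = genome_size_parameters(
    autosome_bases=1_000_000, x_bases=150_000, y_bases=50_000
)

# Print the values in the "name = value" form used in the parameter list above.
for name, value in params.items():
    print(f"{name} = {value}")
```

The printed lines can then be copied into the configuration file in place of the existing values.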

Of course, if your target region doesn't include X or Y, then just set
genomeSizeMale and genomeSizeFemale to 2*genomeSize. See the installtest
configuration file for an example, where the effective genome size is set to
200Kb.
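
For targeted data on autosomes only, one way to arrive at these numbers is to add up the lengths of the targeted intervals. The sketch below assumes that the effective genome size is simply the total number of targeted bases, that the intervals do not overlap, and that none of them lie on X or Y; the helper names and the example intervals are hypothetical. Interval coordinates are treated as 1-based and inclusive, as in the GATK -L syntax.

```python
def interval_length(interval):
    """Length in bases of a 'contig:start-end' interval (1-based, inclusive)."""
    _, span = interval.rsplit(":", 1)
    start, end = (int(value.replace(",", "")) for value in span.split("-"))
    if end < start:
        raise ValueError(f"interval end precedes start: {interval}")
    return end - start + 1


def targeted_genome_sizes(intervals):
    """Effective sizes for autosomal targets, assuming no overlapping intervals."""
    haploid = sum(interval_length(interval) for interval in intervals)
    return {
        "input.genomeSize": haploid,
        # With no X or Y in the target, both diploid sizes are 2 * genomeSize.
        "input.genomeSizeMale": 2 * haploid,
        "input.genomeSizeFemale": 2 * haploid,
    }


# Hypothetical targets totalling 200,000 bases.
sizes = targeted_genome_sizes(["chr20:1-150000", "chr21:500001-550000"])
for name, value in sizes.items():
    print(f"{name} = {value}")
```

If your targets overlap, merge them before summing, otherwise the shared bases are counted twice.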

Post edited by Geraldine_VdAuwera on

Geraldine Van der Auwera, PhD


• Member Posts: 2

Does Genome STRiP now find CNVs other than deletions?

• FranceMember Posts: 44 ✭✭

Is the source code available?

Not at this time, but we are planning to release the source code shortly.

shortly ?

I would like to add an option to ComputeInsertSizeHistogramsWalker so that it can use SM instead of LB from the SAM header (was http://gatkforums.broadinstitute.org/wdl/discussion/6339/).

• UC DavisMember Posts: 7

You really need to update this or provide a link to a current FAQ.

• University of Sussex, UKMember Posts: 117 ✭✭

@RoddyP said:
Does Genome STRiP now find CNVs other than deletions?

Yes, I've attached a png and pdf of a heterozygous 4Kb duplication, identified and genotyped using the GS CNV pipeline (not the GS SV pipeline).

Beware of single events that GS splits into multiple calls.