The current GATK version is 3.7-0

Parallel running in GATK

In HaplotypeCaller (HC), CombineGVCFs, and GenotypeGVCFs, besides running each chromosome separately in parallel, can I also break a chromosome into smaller sections and run each section in parallel?

Thanks!

Answers

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    That should be fine for GenotypeGVCFs. Not sure about HC and CombineGVCFs as you may run into edge cases at the section starts and ends.

  • mcg255 (San Francisco; Member)

    Hi, I just posted a similar question, but specific to GenotypeGVCFs. Apologies, it's taken a lot of searching to get here! :)

    So is this an affirmative answer that we can use sub-chromosome splits, say 1 Mb, for scatter-gather of GenotypeGVCFs?

    If so, two more follow-ups :)

    1. Suppose we had a large multi-sample, whole-genome combined gVCF that we wanted to run this strategy on. Would we...

    A. Split that gVCF into many small, interval-only gVCFs and invoke GATK separately on each of them, as in the following pseudocode recipe?

    # make small, interval gVCFs from whole genome gVCFs
    
    SelectVariants -L 1:1-1,000,000 -V CohortWholeGenome.gvcf -o Split1.gvcf
    SelectVariants -L 1:1,000,001-2,000,000 -V CohortWholeGenome.gvcf -o Split2.gvcf
    ... 
    SelectVariants -L 22:... -V CohortWholeGenome.gvcf -o SplitN.gvcf
    
    # invoke GenotypeGVCFs on each interval gVCF
    
    GenotypeGVCFs -L 1:1-1,000,000 -V Split1.gvcf -o Split1.vcf
    GenotypeGVCFs -L 1:1,000,001-2,000,000 -V Split2.gvcf -o Split2.vcf
    ...
    GenotypeGVCFs -L 22:... -V SplitN.gvcf -o SplitN.vcf
    

    B. Run GenotypeGVCFs once for each interval, using just the large, combined whole-genome gVCF, passing the interval of interest with -L?

    # invoke GenotypeGVCFs on each interval, using whole genome gVCF
    
    GenotypeGVCFs -L 1:1-1,000,000 -V CohortWholeGenome.gvcf -o Split1.vcf
    GenotypeGVCFs -L 1:1,000,001-2,000,000 -V CohortWholeGenome.gvcf -o Split2.vcf
    ...
    GenotypeGVCFs -L 22:... -V CohortWholeGenome.gvcf -o SplitN.vcf
    
    2. Are there any caveats to our choice of intervals?
      In a similar question about using small intervals for HaplotypeCaller scatter-gather, rpoplin mentioned that the intervals should not overlap. Does anything like that apply here?
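
For approach B, the interval list can be generated rather than typed out by hand. Below is a minimal Python sketch, assuming 1-based inclusive coordinates and a fixed 1 Mb chunk size; it tiles each contig with non-overlapping intervals and builds one command string per interval. The function names and the toy contig lengths are hypothetical, and the command strings mirror the pseudocode in the post above rather than complete GATK invocations.

```python
# Sketch: tile each contig with non-overlapping, 1-based inclusive intervals
# and build one GenotypeGVCFs command string per interval (approach B).
# Function names and contig lengths are illustrative assumptions.

def tile_contig(contig, length, chunk=1_000_000):
    """Return non-overlapping 'contig:start-end' strings covering 1..length."""
    intervals = []
    start = 1
    while start <= length:
        end = min(start + chunk - 1, length)  # last chunk may be shorter
        intervals.append(f"{contig}:{start}-{end}")
        start = end + 1  # next interval begins right after this one ends
    return intervals


def genotype_commands(contig_lengths, gvcf, chunk=1_000_000):
    """Build one pseudocode command string per interval; nothing is executed."""
    commands = []
    for contig, length in contig_lengths.items():
        for interval in tile_contig(contig, length, chunk):
            n = len(commands) + 1  # running output index: Split1, Split2, ...
            commands.append(
                f"GenotypeGVCFs -L {interval} -V {gvcf} -o Split{n}.vcf"
            )
    return commands


if __name__ == "__main__":
    # Toy contig lengths, not real chromosome sizes.
    toy = {"1": 2_500_000, "22": 1_000_000}
    for cmd in genotype_commands(toy, "CohortWholeGenome.gvcf"):
        print(cmd)
```

Because each interval starts one base after the previous one ends, the tiles cover every position exactly once, which is the non-overlap property raised in the question. Whether 1 Mb is a sensible chunk size for a given cohort is a separate question that this sketch does not answer.
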
  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie)

    Hah, this thread was old! We've learned a lot since then. Let's continue the discussion in your other thread where I just responded.
