GATK: SNP calling

Jayce Member Posts: 9

Hi,

Can anyone advise which gives the bigger speedup: splitting the BAM files, or splitting the -L regions (intervals)? Assume that I have 500 BAM files. Will splitting the BAM files into 22 per-chromosome files increase the speed further compared to splitting the -L intervals? Has anyone tried this before? Thank you.

Best Answer

Answers

  • Jayce Member Posts: 9

    Hi, I was referring to UnifiedGenotyper (SNP calling) in GATK.

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,371 admin

    Have a look at these documents, which cover the available parallelism modes:

    http://www.broadinstitute.org/gatk/guide/tagged?tag=parallelism&tab=docs

    Note that the -L argument is meant to specify target intervals if you are working with exome data.

    Geraldine Van der Auwera, PhD

  • Jayce Member Posts: 9

    Thanks, Geraldine, for your answer. I understand that parallelism (multi-threading with -nt) does help, but I just want to confirm whether the speed is equivalent: with the same -nt setting, is splitting the -L regions equivalent to splitting the BAM files? I need this answer, on top of -nt multi-threading, to speed up my very large sample set. I don't want to spend time splitting the BAM files if that is equivalent, speed-wise, to splitting the -L interval regions. Thank you for your help and advice :)

  • Jayce Member Posts: 9
    edited March 2013

    Thank you for your clear advice. Thank you very much :)
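
For readers wondering what "splitting the -L region" looks like in practice, here is a minimal sketch, not a recommendation from the thread. It builds one UnifiedGenotyper command line per chromosome, each restricted with -L, while the original BAM files stay untouched. The file names (`reference.fasta`, `bams.list`, `calls.chrN.vcf`), the thread count, and the helper `scatter_commands` are hypothetical, and the contig names must match your reference (for example `1` versus `chr1`). It makes no claim about which approach is faster.

```python
# Sketch only: prints per-chromosome command lines, it does not run GATK.
# All file names below are hypothetical placeholders.
REFERENCE = "reference.fasta"
BAM_LIST = "bams.list"  # assumed: a text file listing the input BAM paths
THREADS = 4             # value passed to -nt; pick what suits your machine


def scatter_commands(chromosomes, reference=REFERENCE,
                     bam_list=BAM_LIST, threads=THREADS):
    """Return one UnifiedGenotyper command (as an argument list) per chromosome."""
    commands = []
    for chrom in chromosomes:
        commands.append([
            "java", "-jar", "GenomeAnalysisTK.jar",
            "-T", "UnifiedGenotyper",
            "-R", reference,
            "-I", bam_list,
            "-L", str(chrom),          # restrict this job to one contig
            "-nt", str(threads),       # same multi-threading setting per job
            "-o", "calls.chr{}.vcf".format(chrom),
        ])
    return commands


if __name__ == "__main__":
    # Contig names assumed to be "1".."22"; adjust to match your reference.
    autosomes = [str(i) for i in range(1, 23)]
    for cmd in scatter_commands(autosomes):
        print(" ".join(cmd))
```

Each printed line can be submitted as a separate job. The per-chromosome VCFs then need to be combined afterwards, a step this sketch leaves out.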
