GATK SNP calling

Hi,

Can anyone advise which gives the bigger speed-up: splitting the BAM files, or splitting the -L regions? Assume I have 500 BAM files. Will splitting them into 22 per-chromosome files speed things up more than splitting the -L regions (intervals)? Has anyone tried this before? Thank you.

Answers

  • Jayce (Member)

    Hi, I was referring to the UnifiedGenotyper (SNP calling) in GATK.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie admin)

    Have a look at these documents, which cover the available parallelism modes:

    http://www.broadinstitute.org/gatk/guide/tagged?tag=parallelism&tab=docs

    Note that the -L argument is meant to specify target intervals if you are working with exome data.

  • Jayce (Member)

    Thanks for your answer, Geraldine. I understand that parallelism (multi-threading with -nt) does help, but I just want to confirm whether the speed is equivalent: with the same -nt setting, is splitting the -L regions as fast as splitting the BAM files? I really need this answer, on top of -nt multi-threading, to speed up the processing of my very large sample set. I don't want to spend time splitting the BAM files if that is equivalent, speed-wise, to splitting the -L interval regions. Thank you for your help and advice :)

  • Jayce (Member)
    edited March 2013

    Thank you for your clear advice. Thank you very much :)
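
For readers who land on this thread: the "split by -L" option discussed above can be sketched as one UnifiedGenotyper job per interval, each reading the same unsplit BAM files. This is only an illustrative sketch and does not settle which approach is faster. The file names (GenomeAnalysisTK.jar, reference.fasta, the sample BAMs), the output naming scheme, and the contig names "1" to "22" are assumptions; contig names must match your reference (for example "chr1"), and the BAM files are assumed to be indexed.

```python
def build_commands(bam_files, intervals, reference="reference.fasta",
                   jar="GenomeAnalysisTK.jar", threads=4):
    """Build one UnifiedGenotyper command line per interval.

    Every job gets the full list of BAM files and is restricted to a
    single interval with -L, so the BAMs themselves are never split.
    All paths and the output naming scheme are hypothetical.
    """
    commands = []
    for interval in intervals:
        cmd = ["java", "-jar", jar, "-T", "UnifiedGenotyper", "-R", reference]
        for bam in bam_files:
            cmd += ["-I", bam]          # same inputs for every job
        cmd += ["-L", interval,         # restrict this job to one interval
                "-nt", str(threads),    # data threads within the job
                "-o", "calls.{}.vcf".format(interval)]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    bams = ["sample1.bam", "sample2.bam"]          # hypothetical inputs
    autosomes = [str(n) for n in range(1, 23)]     # assumed contig names
    for cmd in build_commands(bams, autosomes):
        print(" ".join(cmd))
```

The script only prints the command lines; they could then be submitted to a cluster scheduler or run in parallel, and the per-interval VCFs combined afterwards.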