Regarding ploidy in HaplotypeCaller for multiple replicates of pooled RNAseq

cjaln1994, Melbourne, Member

I am a little confused about the best practices for running HaplotypeCaller to call variants given the pooled nature of my study. Any feedback is super appreciated!

I have 10 replicates of pooled RNAseq data for each of two samples (10 replicates for Sample A, 10 replicates for Sample B). By pooled I mean each replicate contains mRNA from 20 individuals, all mixed together with no barcoding (population genetics study).

I had planned to just merge the BAM files of these replicates, which have RGSMs of SampleA and SampleB, and simply run HaplotypeCaller for Sample A and Sample B. However, that would mean setting ploidy = 2 x 200 = 400. This seems very high!
Would it be better to run HaplotypeCaller on each replicate separately, without merging the BAM files, setting ploidy = 2 x 20 = 40, and then use a tool such as CombineVariants to stack my VCF files into two samples for downstream comparisons?
Any advice?
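
The ploidy arithmetic for the two options can be checked with a short sketch. The function name `pooled_ploidy` and its parameters are hypothetical, introduced only for illustration, and the sketch assumes every pooled individual is diploid and contributes to the pool. It only computes the number one would pass as the ploidy setting; it does not run any GATK tool.

```python
def pooled_ploidy(individuals_per_pool, pools_merged=1, base_ploidy=2):
    """Effective ploidy of a pooled sample.

    Assumption: each individual carries `base_ploidy` chromosome copies,
    so the pool behaves like one sample with that many copies in total.
    """
    return base_ploidy * individuals_per_pool * pools_merged


# Option 1: merge all 10 replicates of a sample into one BAM (200 individuals).
merged_ploidy = pooled_ploidy(individuals_per_pool=20, pools_merged=10)

# Option 2: call each replicate on its own (20 individuals per pool).
per_replicate_ploidy = pooled_ploidy(individuals_per_pool=20)

print("merged:", merged_ploidy)
print("per replicate:", per_replicate_ploidy)
```

Calling per replicate keeps the ploidy ten times smaller than merging, which is the trade-off the question is asking about.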

Best Answer


  • cjaln1994, Melbourne, Member

    I also thought about trying this:

    1. Set unique sample IDs for all 20 replicates (e.g. SampleA1, SampleA2, etc. and SampleB1, SampleB2, etc.)
    2. Merge the files and run HC, which recognises them as 20 separate samples (run HC with ploidy = 2 x 20 = 40)
    3. At the VCF stage, find a way to merge these variants such that SampleA1, SampleA2, etc. all collapse into SampleA, and SampleB1, SampleB2, etc. collapse into SampleB
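
One possible reading of step 3 is sketched below, purely as an illustration. It assumes that "collapsing" means summing per-allele read depths (the VCF `AD` values) across the replicates of each sample at a site. That is an assumption, not a GATK feature or a recommended method. The function names and the example depths are hypothetical, and the sketch works on plain Python dictionaries rather than parsing a real VCF.

```python
import re
from collections import defaultdict


def group_replicates(replicate_ids):
    """Map each base sample name to its replicate IDs.

    Assumption: replicate IDs are the sample name plus a trailing number,
    e.g. "SampleA1" and "SampleA10" both belong to "SampleA".
    """
    groups = defaultdict(list)
    for rid in replicate_ids:
        base = re.sub(r"\d+$", "", rid)  # strip the trailing replicate number
        groups[base].append(rid)
    return dict(groups)


def collapse_allele_depths(ad_by_replicate, groups):
    """Sum per-allele depths across the replicates of each sample at one site."""
    collapsed = {}
    for base, rids in groups.items():
        # zip(*...) lines up the depth of each allele across replicates
        per_allele = zip(*(ad_by_replicate[r] for r in rids))
        collapsed[base] = [sum(depths) for depths in per_allele]
    return collapsed


# Hypothetical [ref, alt] depths at a single site for four replicates.
ad = {
    "SampleA1": [30, 10],
    "SampleA2": [25, 15],
    "SampleB1": [40, 0],
    "SampleB2": [38, 2],
}
groups = group_replicates(ad.keys())
pooled = collapse_allele_depths(ad, groups)
print(pooled)
```

Summing depths discards the per-replicate genotypes, so it only suits downstream comparisons that work from allele counts or frequencies.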