Original BAM file vs -bamout BAM file: which one should I rely on?
Dear GATK authors and other scientists,
I wonder which BAM file is the 'correct' one. Let me explain. I have to select some interesting variants from my VCF files (called by HaplotypeCaller), and then I'm going to confirm them in the wet lab. I'd like to avoid false positives, so I prepared several filtering strategies with strict conditions. First of all, I'd like to see my variant in the BAM file (in IGV). However, sometimes the variant of interest is present only in the BAM from -bamout, and there is no alteration at that position in the original BAM file.
Yes, I know there is a similar question here:
And Sheila responded that this is the result of the reassembly done by HaplotypeCaller, which may change the positions of the reads. I understand this: you are using the de Bruijn graph to reconstruct the candidate haplotypes and select the one with the best likelihood.
However, should I take a variant like this into consideration (example below) or treat it as a false positive? What do you think? I have dozens of variants like this one.
Same position, and the variant is present in the VCF file with a good score. I see the 'variant pattern' in some reads in the bamout file, but not in the original BAM. I think it's pretty suspicious, and this may be a group of false positives. What do you think?
Actually, I see an 'ArtificialHaplotypeRG' section in IGV for the bamout BAM file that contains this group of variants, so should I ignore them?
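To make the question concrete, here is a minimal sketch of how I could separate the 'ArtificialHaplotypeRG' records from the realigned sample reads in the bamout output before counting support for a variant. It assumes the bamout BAM has been exported to SAM text and that the artificial haplotypes carry an `RG:Z:ArtificialHaplotypeRG` tag, as IGV suggests. The function name, the read names, and the example records are invented for illustration, and this is not a GATK tool or the authors' method.

```python
def split_by_read_group(sam_lines, artificial_rg="ArtificialHaplotypeRG"):
    """Split SAM alignment lines into (sample_reads, artificial_haplotypes).

    Returns two lists of read names. Records whose RG tag equals
    `artificial_rg` go into the second list; all others go into the first.
    """
    sample_reads, artificial = [], []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.split("\t")
        # SAM has 11 mandatory columns; optional tags start at index 11.
        tags = fields[11:]
        rg = next((t[5:] for t in tags if t.startswith("RG:Z:")), None)
        if rg == artificial_rg:
            artificial.append(fields[0])
        else:
            sample_reads.append(fields[0])
    return sample_reads, artificial


# Invented example records (not real data): one header line,
# two sample reads and one artificial haplotype.
example_sam = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII\tRG:Z:sample1",
    "HC_hap0\t0\tchr1\t95\t60\t10M\t*\t0\t0\tACGTTCGTAC\t*\tRG:Z:ArtificialHaplotypeRG",
    "read2\t16\tchr1\t102\t60\t10M\t*\t0\t0\tGTACGTACGT\tIIIIIIIIII\tRG:Z:sample1",
]

sample_reads, artificial = split_by_read_group(example_sam)
print("sample reads:", sample_reads)
print("artificial haplotypes:", artificial)
```

The input lines could come from something like `samtools view -h` run on the bamout file. With the artificial records set aside, only the realigned sample reads would be counted as evidence for the variant.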