About panel sequencing
Hi, dear GATK team:
I am analysing a dataset covering the exons of 50 genes from human samples, sequenced on Ion Torrent, with an average depth of ~1000x. I marked duplicates (which Life advises against), and skipped realignment and base recalibration, since the regions are very small (mainly <200 bp) and the depth is high. I called SNPs with UnifiedGenotyper:
java -Djava.io.tmpdir=$tmp_dir -Xmx20G -jar $gatk_dir/GenomeAnalysisTK.jar -T UnifiedGenotyper -L $region -R $ref -glm SNP -mte -nct $thread_num --sample_ploidy $ploidy -I $bamfile --output_mode EMIT_VARIANTS_ONLY --dbsnp $db_vcf_file -o $gatk_vcf
In one of our cases, over 1000 raw SNPs were called from a normal sample, which is abnormal.
The read quality and mapping quality were fine.
I checked some of the low-scoring SNPs in IGV. They have far lower coverage than average (<200x), and some are below 10x.
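To quantify this across the whole call set rather than spot-checking in IGV, something like the following could tally the raw calls by depth. This is only a minimal sketch: it assumes an uncompressed VCF with a `DP` entry in the INFO column, and the function names and the 200x cutoff are illustrative choices, not part of GATK. Note that INFO `DP` is the depth reported by the caller after its own read filtering and any downsampling, so it may not match what IGV shows.

```python
import sys
from collections import Counter


def info_depth(info_field):
    """Return the integer DP value from a VCF INFO string, or None if absent."""
    for entry in info_field.split(";"):
        if entry.startswith("DP="):
            return int(entry[3:])
    return None


def tally_by_depth(vcf_lines, min_depth=200):
    """Count variant records below / at-or-above min_depth, based on INFO DP.

    min_depth is an illustrative cutoff, not a recommended value.
    """
    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and header lines (## meta lines and the #CHROM line).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        depth = info_depth(fields[7])  # column 8 of a VCF record is INFO
        if depth is None:
            counts["no_dp"] += 1
        elif depth < min_depth:
            counts["low_depth"] += 1
        else:
            counts["ok_depth"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python tally_depth.py calls.vcf
    with open(sys.argv[1]) as handle:
        print(dict(tally_by_depth(handle)))
```

If most of the ~1000 calls land in the low-depth bucket, that would suggest the excess comes from poorly covered or off-target positions rather than from the well-covered amplicons.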
Why did the caller call so many SNPs? What options or commands should I use to deal with this problem?