About panel sequencing

Hi, dear GATK team:
I am analysing a dataset of exomes of 50 genes from human samples, generated on Ion Torrent, with an average depth of ~1000x. I marked duplicates (which Life advises against) and skipped realignment and base recalibration, since the regions are very small (mainly <200bp) and the depth is high. I called SNPs with UnifiedGenotyper:

java -Djava.io.tmpdir=$tmp_dir -Xmx20G -jar $gatk_dir/GenomeAnalysisTK.jar -T UnifiedGenotyper -L $region -R $ref -glm SNP -mte -nct $thread_num --sample_ploidy $ploidy -I $bamfile --output_mode EMIT_VARIANTS_ONLY --dbsnp $db_vcf_file -o $gatk_vcf

In one of our cases, over 1000 raw SNPs were called from a normal sample, which is abnormal.
The read quality and mapping quality were fine.
I checked some of the low-scoring SNPs in IGV; they have far lower coverage (<200x), and some have less than 10x.
Why did the caller call so many SNPs? What options or commands should I use to deal with this problem?
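
To check the coverage pattern across the whole call set rather than a few sites in IGV, something like the sketch below could split the raw calls by depth. It is only an illustration: it assumes the VCF is uncompressed and that each record carries a `DP` entry in the INFO column. The function names and the `min_depth=200` default (taken from the <200x observation above) are my own choices, not GATK options.

```python
def parse_info(info_field):
    """Turn an INFO string such as 'DP=12;AF=0.5;DB' into a dict.

    Flag entries without '=' (e.g. 'DB') map to True.
    """
    out = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def split_by_depth(vcf_lines, min_depth=200):
    """Split VCF records into (kept, low_depth) by the INFO DP value.

    Each record is returned as (chrom, pos, qual, depth). The threshold
    is an assumed cutoff for illustration, not a recommended value.
    Records without a DP entry are treated as depth 0.
    """
    kept, low_depth = [], []
    for line in vcf_lines:
        # Skip header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, qual = fields[0], int(fields[1]), fields[5]
        info = parse_info(fields[7])
        depth = int(info.get("DP", 0))
        qual_value = float(qual) if qual != "." else None
        record = (chrom, pos, qual_value, depth)
        if depth >= min_depth:
            kept.append(record)
        else:
            low_depth.append(record)
    return kept, low_depth
```

Passing an open file handle for the output VCF to `split_by_depth` returns the two lists, so their lengths show how many of the raw SNPs sit in poorly covered positions.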
