VCF file size is not reduced after running 'ApplyRecalibration'

rcholic, Denver, Member
edited October 2013 in Ask the GATK team

I was expecting 'ApplyRecalibration' to reduce the size of the VCF files output by HaplotypeCaller, but the file size did not go down. Below are my command lines for VariantRecalibrator and ApplyRecalibration. Did I do anything wrong, or does the VCF file not always get smaller? Any suggestions for improving my command lines are also welcome.

java -Xmx4g -jar $CLASSPATH/GenomeAnalysisTK.jar \
-T VariantRecalibrator \
-R GATK_ref/hg19.fasta \
--input ../GATK/raw_variants_snps_indels-3.vcf \
-nt 6 \
-resource:hapmap,known=false,training=true,truth=true,prior=15.0 GATK_ref/hapmap_3.3.hg19.vcf \
-resource:omni,known=false,training=true,truth=true,prior=12.0 GATK_ref/1000G_omni2.5.hg19.vcf \
-resource:1000G,known=false,training=true,truth=false,prior=10.0 GATK_ref/1000G_phase1.snps.high_confidence.hg19.vcf \
-resource:dbsnp,known=true,training=false,truth=false,prior=2.0 GATK_ref/dbsnp_137.hg19.vcf \
-an QD -an MQRankSum -an ReadPosRankSum -an FS -an DP \
--maxGaussians 4 \
--numBadVariants 2000 \
-mode SNP \
-tranche 100.0 -tranche 99.9 -tranche 99.0 -tranche 90.0 \
-log ../GATK/VQSR/log/raw_variants_snps-3_snps_recal.log \
-recalFile ../GATK/VQSR/SNPs/snps-3_snp.recal.vcf \
-tranchesFile ../GATK/VQSR/SNPs/snps-3_snp.tranches \
-rscriptFile ../GATK/VQSR/SNPs/snps-3_snp_recal.plots.R

java -Xmx6g -Djava.awt.headless=true -jar $CLASSPATH/GenomeAnalysisTK.jar \
-T ApplyRecalibration \
-R GATK_ref/hg19.fasta \
-nt 5 \
--input ../GATK/raw_variants_snps_indels.vcf \
-mode SNP \
--ts_filter_level 99.0 \
-recalFile ../GATK/VQSR/SNPs/snps-3_snp.recal.vcf \
-tranchesFile ../GATK/VQSR/SNPs/snps-3_snp.tranches \
-log ../GATK/VQSR/SNPs/filtered/snps-3_snp.recal_filtered.log \
-o ../GATK/VQSR/SNPs/filtered/snps-3_snp.recal_filtered.vcf
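
One way to look into the file size question is to tally the FILTER column of the output VCF. This is only a rough sketch: it assumes that ApplyRecalibration keeps records that fail the tranche cutoff and marks them in FILTER rather than dropping them, which would explain an unchanged file size. The helper name `count_filters` is made up for illustration, and the path is the `-o` value from the command above.

```shell
# Hypothetical helper: count VCF records per FILTER value (column 7).
# Header lines (starting with '#') are skipped.
count_filters() {
    awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print n[f], f }' "$1" | sort -k2
}

# Output file from the ApplyRecalibration command above.
vcf=../GATK/VQSR/SNPs/filtered/snps-3_snp.recal_filtered.vcf
if [ -f "$vcf" ]; then
    count_filters "$vcf"
fi
```

If most records show something other than PASS, the file is still carrying the filtered sites, so its size alone would not tell you how many variants passed.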
