Inconsistency in picard CollectVariantCallingMetrics

vang Member Posts: 11
edited November 2015 in Ask the GATK team

I hope this is the right forum to ask about picard these days.
I ran CollectVariantCallingMetrics on a one-sample VCF file and got both the *.variant_calling_summary_metrics and *.variant_calling_detail_metrics files. With only one sample I would expect the two to be close to identical, but most values differ.
Command:
java -Djava.awt.headless=true -Xmx1500m -jar picard.jar CollectVariantCallingMetrics INPUT=${samplename}.final_variants.vcf.gz DBSNP=${DBSNP} OUTPUT=${samplename}.CollectVariantCallingMetrics

Output:

METRICS CLASS picard.vcf.CollectVariantCallingMetrics$VariantCallingDetailMetrics
SAMPLE_ALIAS  HET_HOMVAR_RATIO  TOTAL_SNPS  NUM_IN_DB_SNP  NOVEL_SNPS  FILTERED_SNPS  PCT_DBSNP  DBSNP_TITV
SAMPLENAME    1.515108          80563       79938          625         27362          0.992242   2.431405

METRICS CLASS picard.vcf.CollectVariantCallingMetrics$VariantCallingSummaryMetrics
TOTAL_SNPS  NUM_IN_DB_SNP  NOVEL_SNPS  FILTERED_SNPS  PCT_DBSNP  DBSNP_TITV
84172       82347          1825        29284          0.978318   2.401504

Picard version: 1.141
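
A quick way to see how far apart the two rows are is to check each one for internal consistency and then subtract them. The sketch below is only an illustration: the numbers are copied from the output above, while the relations it tests (NUM_IN_DB_SNP + NOVEL_SNPS = TOTAL_SNPS, and PCT_DBSNP = NUM_IN_DB_SNP / TOTAL_SNPS) are assumptions read off the column names, and the function names are hypothetical.

```python
# Values copied from the two metrics rows quoted in the post above.
detail = {"TOTAL_SNPS": 80563, "NUM_IN_DB_SNP": 79938, "NOVEL_SNPS": 625,
          "FILTERED_SNPS": 27362, "PCT_DBSNP": 0.992242, "DBSNP_TITV": 2.431405}
summary = {"TOTAL_SNPS": 84172, "NUM_IN_DB_SNP": 82347, "NOVEL_SNPS": 1825,
           "FILTERED_SNPS": 29284, "PCT_DBSNP": 0.978318, "DBSNP_TITV": 2.401504}

COUNT_COLUMNS = ("TOTAL_SNPS", "NUM_IN_DB_SNP", "NOVEL_SNPS", "FILTERED_SNPS")


def internally_consistent(row, tol=1e-6):
    """Check the arithmetic the column names suggest (an assumption, see lead-in)."""
    counts_add_up = row["NUM_IN_DB_SNP"] + row["NOVEL_SNPS"] == row["TOTAL_SNPS"]
    pct_matches = abs(row["NUM_IN_DB_SNP"] / row["TOTAL_SNPS"] - row["PCT_DBSNP"]) < tol
    return counts_add_up and pct_matches


def count_differences(a, b, columns=COUNT_COLUMNS):
    """Return b - a for every count column."""
    return {name: b[name] - a[name] for name in columns}


if __name__ == "__main__":
    print("detail consistent: ", internally_consistent(detail))
    print("summary consistent:", internally_consistent(summary))
    print("summary - detail:  ", count_differences(detail, summary))
```

Under those assumptions each row is self-consistent, so the two files appear to be counting different sets of SNPs rather than one of them containing an arithmetic error. The summary row has 3609 more SNPs than the detail row.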

Issue · Github, by Sheila
Issue Number: 374
State: closed
Closed By: chandrans

Answers

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,118 admin

    Hi @vang, this is indeed the right place to ask about Picard tools.

    I'm not sure what's going on here but we'll look into it. Have you tried using any other tools to evaluate which set of numbers might be the most accurate?

    Geraldine Van der Auwera, PhD

  • Sheila Broad Institute Member, Broadie, Moderator, Dev Posts: 4,443 admin

    @vang
    Hi,

    I just tried with the latest version of Picard, and I get the exact same numbers in both output files. I'm not sure if it is indeed an issue/bug with the version you are using, but can you try again with the latest version?

    Thanks,
    Sheila

  • vang Member Posts: 11

    Hi Sheila

    Too bad: I just tried version 2.0.1 and it gave exactly the same output as version 1.141, so I still get the inconsistency. Could we try on the same VCF file and compare results? Is the one you used publicly available?

    By the way, did Picard 2 move to Java 8? I used to use Java 7, but now it only works with version 8.

    Thanks

  • vang Member Posts: 11

    Hi Sheila,
    I removed all homozygous-reference sites with GATK's SelectVariants --excludeNonVariants, and now the summary and detail metrics agree perfectly. Great!

    However, the results from Picard and GATK VariantEval differ somewhat. Using the same VCF and dbSNP file, I get these TiTvRatio results:
    GATK:
    dbSNP: 2.41
    Novel: 1.51

    picard:
    dbSNP: 2.431405
    Novel: 1.753304

    Are there any differences in the way the results are calculated in the two methods?

    Thanks

  • Sheila Broad Institute Member, Broadie, Moderator, Dev Posts: 4,443 admin
    edited December 2015

    @vang
    Hi!

    I suspect you used -comp for the dbsnp file you input to VariantEval. If you use --dbsnp "your dbsnp file", you should get the same number from both tools :)

    -Sheila

  • vang Member Posts: 11

    Hmm, no. These are my parameters:
    -R ucsc.hg19.fasta -T VariantEval --eval:set1 final_variants.vcf.gz --dbsnp dbsnp_138.hg19.vcf -o final_variants.vcf.gz.eval.grp

    -Søren
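
The thread does not settle why the two tools report different Ti/Tv values. For reference, the ratio itself is simply the number of transitions (A<->G, C<->T) divided by the number of transversions. The sketch below is a minimal illustration of that definition, not the code used by Picard or VariantEval; the function name and the (ref, alt) input format are hypothetical.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = {"A", "C", "G", "T"}


def titv_ratio(snps):
    """Compute Ti/Tv from an iterable of (ref, alt) allele pairs.

    Pairs that are not simple single-base substitutions (indels,
    symbolic alleles, N bases) are skipped.
    """
    transitions = 0
    transversions = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # not a SNP in the sense used here
        if (ref, alt) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if transversions == 0:
        return float("nan")  # ratio is undefined without transversions
    return transitions / transversions


if __name__ == "__main__":
    calls = [("A", "G"), ("C", "T"), ("A", "C"), ("AT", "A")]
    print(titv_ratio(calls))  # 2 transitions, 1 transversion, indel skipped
```

Because the result depends entirely on which records are counted (for example, whether filtered or multiallelic sites are included), two tools can give different ratios for the same VCF. Whether that explains the difference reported here is not established in the thread.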
