Inconsistency in picard CollectVariantCallingMetrics

vang Member Posts: 11
edited November 2015 in Ask the GATK team

I hope this is the right forum to ask about picard these days.
I ran CollectVariantCallingMetrics on a one-sample VCF file and got both the *.variant_calling_summary_metrics and *.variant_calling_detail_metrics files. With only one sample I would expect these to be close to identical; however, most values differ.
Command:
java -Djava.awt.headless=true -Xmx1500m -jar picard.jar CollectVariantCallingMetrics INPUT=${samplename}.final_variants.vcf.gz DBSNP=${DBSNP} OUTPUT=${samplename}.CollectVariantCallingMetrics

output:

METRICS CLASS picard.vcf.CollectVariantCallingMetrics$VariantCallingDetailMetrics
SAMPLE_ALIAS HET_HOMVAR_RATIO TOTAL_SNPS NUM_IN_DB_SNP NOVEL_SNPS FILTERED_SNPS PCT_DBSNP DBSNP_TITV
SAMPLENAME 1.515108 80563 79938 625 27362 0.992242 2.431405

METRICS CLASS picard.vcf.CollectVariantCallingMetrics$VariantCallingSummaryMetrics
TOTAL_SNPS NUM_IN_DB_SNP NOVEL_SNPS FILTERED_SNPS PCT_DBSNP DBSNP_TITV
84172 82347 1825 29284 0.978318 2.401504

picard version= 1.141
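
To make the size of the discrepancy concrete, here is a minimal sketch that subtracts the detail row from the summary row, using only the numbers quoted in the output above. The dictionary and function names are illustrative and not part of Picard.

```python
# Values copied from the two metrics rows quoted in the post.
detail = {"TOTAL_SNPS": 80563, "NUM_IN_DB_SNP": 79938,
          "NOVEL_SNPS": 625, "FILTERED_SNPS": 27362}
summary = {"TOTAL_SNPS": 84172, "NUM_IN_DB_SNP": 82347,
           "NOVEL_SNPS": 1825, "FILTERED_SNPS": 29284}


def count_gaps(detail_row, summary_row):
    """Return summary minus detail for every count column."""
    return {key: summary_row[key] - detail_row[key] for key in detail_row}


def pct_dbsnp(row):
    """Recompute PCT_DBSNP as NUM_IN_DB_SNP / TOTAL_SNPS, rounded to 6 places."""
    return round(row["NUM_IN_DB_SNP"] / row["TOTAL_SNPS"], 6)


gaps = count_gaps(detail, summary)
for column, gap in gaps.items():
    print(f"{column}: summary is higher by {gap}")

# Each row is internally consistent: known + novel adds up to the total.
print(detail["NUM_IN_DB_SNP"] + detail["NOVEL_SNPS"] == detail["TOTAL_SNPS"])
print(summary["NUM_IN_DB_SNP"] + summary["NOVEL_SNPS"] == summary["TOTAL_SNPS"])
```

Both rows are internally consistent, and the recomputed PCT_DBSNP matches the reported 0.992242 and 0.978318. So the summary file counts 3609 more SNP sites than the detail file, rather than miscounting the same set.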

GitHub issue filed by Sheila. Issue Number: 374. State: closed. Closed By: chandrans.
Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie Posts: 11,388 admin

    Hi @vang, this is indeed the right place to ask about Picard tools.

    I'm not sure what's going on here but we'll look into it. Have you tried using any other tools to evaluate which set of numbers might be the most accurate?

    Geraldine Van der Auwera, PhD

  • Sheila Broad Institute Member, Broadie, Moderator, Dev Posts: 4,583 admin

    @vang
    Hi,

    I just tried with the latest version of Picard, and I get the exact same numbers in both output files. I'm not sure if it is indeed an issue/bug with the version you are using, but can you try again with the latest version?

    Thanks,
    Sheila

  • vang Member Posts: 11

    Hi Sheila

    Too bad, I just tried version 2.0.1 and it gave exactly the same output as version 1.141, so I still get the inconsistency. Could we try the same VCF file and compare results? Is the one you used publicly available?

    By the way, did Picard 2 move to Java 8? I used to use Java 7, but now it only works with Java 8.

    Thanks

  • vang Member Posts: 11

    Hi Sheila,
    I removed all homozygous-reference sites with GATK's SelectVariants --excludeNonVariants, and now the summary and detail metrics agree perfectly. Great!
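
For readers who want to check whether their own VCF contains such sites before filtering, here is a minimal sketch that tallies records by the sample's genotype. It assumes a plain-text, tab-separated VCF with a GT field in the FORMAT column. The function name and the toy records are invented for illustration. The thread does not confirm how Picard treats these sites internally, only that removing them made the two files agree.

```python
def split_by_genotype(vcf_lines, sample_index=0):
    """Count records where the sample carries a variant allele, is hom-ref, or is uncalled."""
    counts = {"variant": 0, "hom_ref": 0, "no_call": 0}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        format_keys = fields[8].split(":")
        sample_values = fields[9 + sample_index].split(":")
        genotype = sample_values[format_keys.index("GT")]
        # Treat phased (|) and unphased (/) genotypes the same way.
        called = [a for a in genotype.replace("|", "/").split("/") if a != "."]
        if not called:
            counts["no_call"] += 1
        elif all(allele == "0" for allele in called):
            counts["hom_ref"] += 1
        else:
            counts["variant"] += 1
    return counts


def toy_record(pos, genotype):
    """Build an invented single-sample VCF record for the demonstration."""
    return "\t".join(["chr1", str(pos), ".", "A", "G", "50", "PASS", ".",
                      "GT:DP", f"{genotype}:20"])


toy_vcf = ["##fileformat=VCFv4.2",
           toy_record(100, "0/1"), toy_record(200, "0/0"),
           toy_record(300, "1|1"), toy_record(400, "./.")]
counts = split_by_genotype(toy_vcf)
print(counts)
```

A non-zero hom_ref count on a single-sample file means the file lists sites at which that sample carries no variant allele, which is the kind of record --excludeNonVariants removes.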

    The results from Picard and GATK VariantEval differ somewhat. Using the same VCF and dbSNP file, I get these TiTvRatio results:
    GATK:
    dbSNP: 2.41
    Novel: 1.51

    picard:
    dbSNP: 2.431405
    Novel: 1.753304

    Are there any differences in the way the results are calculated in the two methods?

    Thanks
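
For reference, the quantity being compared is the ratio of transitions (A<->G, C<->T) to transversions (all other single-base changes). Here is a minimal sketch of that textbook definition. It is not the Picard or GATK implementation, and the thread does not establish which records each tool includes in its counts. The function name and the example alleles are invented.

```python
# Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(snps):
    """Return transitions / transversions for an iterable of (ref, alt) pairs."""
    transitions = 0
    transversions = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        if len(ref) != 1 or len(alt) != 1 or ref == alt:
            continue  # ignore indels and non-variant entries
        if (ref, alt) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    return transitions / transversions if transversions else float("nan")


example_snps = [("A", "G"), ("C", "T"), ("A", "C"), ("AT", "A")]
ratio = titv_ratio(example_snps)
print(ratio)
```

Because the ratio depends only on which SNPs are counted, two tools can report different values from the same files if they select records differently, for example in how they treat filtered or multi-allelic sites. Whether that explains the gap here is not settled in this thread.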

  • Sheila Broad Institute Member, Broadie, Moderator, Dev Posts: 4,583 admin
    edited December 2015

    @vang
    Hi!

    I suspect you used -comp for the dbsnp file you input to VariantEval. If you use --dbsnp "your dbsnp file", you should get the same number from both tools :)

    -Sheila

  • vang Member Posts: 11

    Hmm, no. These are my parameters:
    -R ucsc.hg19.fasta -T VariantEval --eval:set1 final_variants.vcf.gz --dbsnp dbsnp_138.hg19.vcf -o final_variants.vcf.gz.eval.grp

    -Søren
