question about odd result for VQSR

mikemike Posts: 103Member

Hi,

Recently I run into some odd observation in VQSR. I have 17 samples from a same family and I used all of 17 samples to call SNPs and after VQSR, I got the trench file like this:

Variant quality score tranches file

Version number 5

targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity 90.00,48637,716,2.9527,2.3302,4.8390,VQSRTrancheSNP0.00to90.00,SNP,26182,23563,0.9000 99.00,60114,1531,2.8057,2.3333,1.7766,VQSRTrancheSNP90.00to99.00,SNP,26182,25920,0.9900 99.90,67220,2884,2.7190,1.8222,-10.0009,VQSRTrancheSNP99.00to99.90,SNP,26182,26155,0.9990 100.00,69714,4998,2.6822,1.8300,-1122.0698,VQSRTrancheSNP99.90to100.00,SNP,26182,26182,1.0000

which seems fine. then for research purpose, I only used 5 samples of more tight relation such as two parents and their 3 immediate children and after VQSR, the trench file looks like below:

Variant quality score tranches file

Version number 5

targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity 90.00,50598,2279,2.6625,1.7993,-Infinity,VQSRTrancheSNP0.00to90.00,SNP,20850,20850,1.0000 99.00,50598,2279,2.6625,1.7993,-Infinity,VQSRTrancheSNP90.00to99.00,SNP,20850,20850,1.0000 99.90,50598,2279,2.6625,1.7993,-Infinity,VQSRTrancheSNP99.00to99.90,SNP,20850,20850,1.0000 100.00,50598,2279,2.6625,1.7993,-Infinity,VQSRTrancheSNP99.90to100.00,SNP,20850,20850,1.0000

Notice that the 5-sample VQSR tranch file has exactly the same thing throughout all thresholds: 90, 99, 99.90 and 100. and the VQSR modeling plot is also very odd, no plotting at all being seen (the pdf ifle was created but was almost blank in contrast to the normal projection plots I saw in other cases)

However, we did use the old version to call the same 5 samples before, and the trench file looks like below:

Variant quality score tranches file

Version number 4

targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,accessibleTruthSites,callsAtTruthSites,truthSensitivity 90.00,36407,361,2.8657,2.3119,5.0854,TruthSensitivityTranche0.00to90.00,20814,18732,0.9000 99.00,44097,638,2.7655,2.2222,2.2592,TruthSensitivityTranche90.00to99.00,20814,20605,0.9900 99.90,47947,1061,2.7078,1.8750,-7.4143,TruthSensitivityTranche99.00to99.90,20814,20793,0.9990 100.00,50426,2318,2.6645,1.7677,-647.3944,TruthSensitivityTranche99.90to100.00,20814,20814,1.0000

this time, it looks reasonable to me. This is troubling us since for 5 samples, the old version (V1.6-7) seems working fine, whereas the new version (V2.1-13) seems having issue or can not get further filtering by VQSR (90, 99 and 100 got the same result, I did repeat multiple times and got the same results), although for all of the 17 samples, the new version seems fine on VQSR.

So my questions are: 1. is it possible that in some occasion, VQSR can simply not work? 2. Why the old version seems working but not the new version for exactly the same set of 5-sample data?

Thanks a lot for your help!

Mike

Tagged:

Answers

  • mikemike Posts: 103Member

    Hi,

    After carefully checking, I found more info although still puzzled:

    I compared the old and new version commands side by side and realized a major difference in Unified genotyper command, in which the old version use -dcov 50 as default option suggested at the time (I do have the old documnetation for old version saved as in pdf file) and now in new version (V2) it changed to -dcov 200 (-dcov [50 for 4x, 200 for >30x WGS or Whole exome], see URL:http://www.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_sting_gatk_walkers_genotyper_UnifiedGenotyper.html) and so for V2, I did use -dcov 200, which caused the odd result as mentioned above, More surprisingly, when I re-run the v2 GATK using UG with -dcov 50 like in old version for the same dataset (just 5 samples), I got similar result as the old version result. the tranch file looks like below, which seems much reasonable in terms of TiTv ratios improvement from 99.9 to 99 to 90 etc:

    Variant quality score tranches file

    Version number 5

    targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity 90.00,35484,316,2.9264,2.1600,4.6455,VQSRTrancheSNP0.00to90.00,SNP,20849,18764,0.9000 99.00,44135,624,2.7757,2.1515,1.9238,VQSRTrancheSNP90.00to99.00,SNP,20849,20640,0.9900 99.90,47937,1007,2.7121,1.8523,-7.7956,VQSRTrancheSNP99.00to99.90,SNP,20849,20828,0.9990 100.00,50469,2133,2.6676,1.7890,-777.6778,VQSRTrancheSNP99.90to100.00,SNP,20849,20849,1.000

    Also the VQSR modeling plot looks also normal as I usually saw, whereas the one using UG -dcov 200 give a very odd plot (only one page and also almost no plot at all for the whole pdf file for the model plot)

    So my question is: why the option of dcov in UG step (although I used what was suggested by both old and new version documentations) has such big impact on the VQSR later disregard of whether it is old or new version I am running?

    Thanks a lot for any insights!

    Mike

  • rpoplinrpoplin Posts: 122GATK Developer mod

    Hi there,

    There seems to be a lot of details here. Can you please post the full command lines that you are using to help us understand what you are seeing.

    Cheers,

  • mikemike Posts: 103Member

    Hi, Ryan:

    Thanks for looking into this.

    I just posted the UG steps that I found may be the major causes, let me know if you need other info:

    java -jar /opt/nasapps/stow/GenomeAnalysisTK-2.2-4-g4a174fb/bin/GenomeAnalysisTK.jar -T UnifiedGenotyper -R /path/hg19.fa
    -I /path/S1.bam -I /path/S2.bam -I /path/S3.bam -I /path/S4.bam -I /path/S5.bam --dbsnp /Path/dbsnp_135.hg19.vcf -glm BOTH -o /Path/GATK_UG_5Samples_snps_indel.raw.afterRecal_2012_V2.vcf -stand_call_conf 50 -stand_emit_conf 10 -dcov 200 -L /Path/029720_D_BED_20101013.bed

    After this, I used SelectVariants to get SNPs only vcf file, which was subjected to VQSR step, then I got the odd VQSR tranches file as shown above

    However, for the same set of 5 samples, for UG step like above command, I just use -dcov 50 instead of -dcov 200, then I got the reasonable tranches file shown above (in my 2nd post).

    Is it odd? Did I do something wrong here?

    Thanks

    Mike

  • rpoplinrpoplin Posts: 122GATK Developer mod

    Hi there,

    I see that you are using an intervals list: -L /Path/029720_D_BED_20101013.bed Is this by any chance a targeted exome or smaller project? We don't recommend using the VQSR on such projects with small numbers of samples. My guess is that the fluctuations you are seeing don't have anything to do with the downsampling in the UG but are rather a sign of numerical instability in the VQSR with small numbers of variants. Take a look at our best practice recommendations for advice on how you might be able to use the VQSR with your project.

    I hope that helps,

  • mikemike Posts: 103Member

    Hi, Ryan:

    Thanks for the input!

    yes, it is a small project but as whole exome seq (the bed file is the whole exome target regions from agilent) with just a family of individuals with some of them with a type of cancer suspected to be inheritance-related or genetic cause. I double check the documentation of VQSR, which emphasize on the number of variant sites in the callsets at range of thousands of variants (except for using InbreedingCoeff needs at least 10 samples), and my dataset does have about 50k variants after UG for only 5 samples ( a subset of the family with Trio relations) and 70 k for all members of the big family (17), so I though it is reasonable to use VQSR, does it? I did on just 5 of them is because these 5 samples are with trio relation, we just want to see any improvement of the calls. I did do the variant calls and VQSR on all 17 samples of the whole family as well. which looks fine. So even if I have 50k variants in the 5-sample callset, the number of samples as 5 still not good enough?

    I just wonder why even for 5 sample callset, after using -docv 50 on UG step, and it seems now that VQSR has the separation for 0.90 and 0.99 thresholds as mentioned above, compared to if use -dcov 200?

    Any insight?

    Thanks again

    Mike

  • mikemike Posts: 103Member
    edited December 2012

    Hi, Ryan:

    I did try VQSR with --maxGaussians 4 (used to be --maxGaussians 6), now I got tranches file as below, which looks fine with me now:

    # Variant quality score tranches file
    # Version number 5
    targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity
    90.00,36765,342,2.8754,2.1963,4.9773,VQSRTrancheSNP0.00to90.00,SNP,20850,18765,0.9000
    99.00,43801,595,2.7815,2.2514,2.5320,VQSRTrancheSNP90.00to99.00,SNP,20850,20641,0.9900
    99.90,47986,1045,2.7042,1.7909,-9.6192,VQSRTrancheSNP99.00to99.90,SNP,20850,20829,0.9990
    100.00,50598,2279,2.6625,1.7993,-1049.0069,VQSRTrancheSNP99.90to100.00,SNP,20850,20850,1.0000
    

    did not realize that --maxGaussians has such great impact on this. What do you think?

    Thanks and best

    Mike

    Post edited by Geraldine_VdAuwera on
  • bredesonbredeson Posts: 24Member

    I have a question about the VQSR's tranches output. I've been scouring the GATK pages and I noticed that my tranches plot does not quite behave the way it is described in the howto:

    The idea is that the lowest tranche is highly specific but less sensitive (there are very few false positives but potentially many false negatives, i.e. missing calls), and each subsequent tranche in turn introduces additional true positive calls along with a growing number of false positive calls.

    My plots appear to be behaving in the opposite manner, or maybe I'm just misunderstanding what is meant by "the lowest tranche"?

    My tranches file:

    # Variant quality score tranches file
    # Version number 5
    targetTruthSensitivity,numKnown,numNovel,knownTiTv,novelTiTv,minVQSLod,filterName,model,accessibleTruthSites,callsAtTruthSites,truthSensitivity
    90.00,0,7954491,0.0000,1.9963,10.6484,VQSRTrancheSNP0.00to90.00,SNP,3856587,3470928,0.9000
    99.00,0,9848620,0.0000,2.0676,8.4217,VQSRTrancheSNP90.00to99.00,SNP,3856587,3818021,0.9900
    99.90,0,11022663,0.0000,2.1288,6.0801,VQSRTrancheSNP99.00to99.90,SNP,3856587,3852730,0.9990
    100.00,0,17485011,0.0000,2.4710,-687620.4316,VQSRTrancheSNP99.90to100.00,SNP,3856587,3856587,1.0000
    

    I'm working in a non-human organism, so I bootstrapped by very stringent filtering of an initial call set (3.8M input training SNPs - 18 samples = WGS) with GATK 2.5-2-gf57256b and fed it into VQSR with the raw VCF.

    GenomeAnalysisTK VariantRecalibrator \
    

    --reference_sequence $CASSAVA/genome/scaffolds+cpDNA.fasta \ --excludeIntervals $CASSAVA/genome/cassavaV4.1685010_168501.repeat.masked.srt.bed \ --mode SNP --dirichlet 0.01 \ --resource:bootstrap,known=false,training=true,truth=true,prior=8.0 bootstrap.red.HWE.flt.g60.recode.vcf \ --input Manihot_UG.rdx.vcf --recal_file Manihot_UG.rdx.DP.recal --tranches_file Manihot_UG.rdx.DP.tranches \ --rscript_file Manihot_UG.rdx.DP.Rplot \ -an BaseQRankSum -an HaplotypeScore -an QD -an DP -an ReadPosRankSum

  • bredesonbredeson Posts: 24Member

    Sorry about that being posted so many times. I think that was a Safari bug...

    Anyway, my question is: Although my number of variants increases at higher sensitivity, does it seem unusual that my FP variant fraction decreases at higher sensitivity as well? And at the 100% truth sensitivity level, no FPs are reported? Am I mis-interpreting these plots? Is this a bug in calculating the tranches plot? As far as I'm able to infer from the example plots posted on the tutorial pages, the Cumulative TPs are calculated by simply by summing the TPs from the previous tranche, yet I have Cumulative TPs reported for my lowest tranche and none for my highest-sensitivity tranche...

    Any help (correctly) interpreting these plots would be much appreciated!!!

    I'm leaving a couple of tranches plots at the public FTP (a v2.5-2 calculated plot and a v2.6-3 calculated plot):

    Manihot_UG.rdx.tranches-v2.5.png Manihot_UG.rdx.tranches-v2.6.png

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,804Administrator, GATK Developer admin

    Hi there, sorry for the delayed response. No worries about the bug, though you almost got banned by the automatic spam filter.

    I can't find your plots on the ftp, are you sure they uploaded successfully? You can attach them to a forum comment too, btw, which may be simpler.

    Geraldine Van der Auwera, PhD

  • bredesonbredeson Posts: 24Member

    Attached is the --rscript_file and the tranches plots for GATK v2.5 and v2.6, respectively.

    Manihot_UG.rdx.Rplot.png
    667 x 694 - 110K
    Manihot_UG.rdx.tranches-v2.5.png
    766 x 484 - 40K
    Manihot_UG.rdx.tranches-v2.6.png
    764 x 437 - 38K
  • rpoplinrpoplin Posts: 122GATK Developer mod

    Hi there,

    The ti/tv values for the tranches are quite a bit different for the two versions which is surprising. Can you post the log output for the two runs of VariantRecalibrator? There is a lot of interesting info in there that will help us debug.

    Thanks for your help,

  • bredesonbredeson Posts: 24Member

    I, unfortunately, no longer have the v2.6 logging output. Perhaps the difference comes from specifying my stringently-filtered (unknown) dataset as known=true/false in the --resources argument? At the time I was fiddling with parameters (setting known vs not). The v2.5 plot above was produced with known=false, and the v2.6 plot was produced with known=true. When I re-ran v2.5 with known=true the tranches plot looked very similar to the v2.6 plot. The log from the v2.5 (with known=true) is provided below (excluding ProgressMeter reports for current position in genome):

    To clarify, my issue is not about the difference between the two plots provided above, but rather that my fractions of TPs/FPs (in both plots) are "flipped" with respect to the titv values, relative to all of the example plots I've seen in the docs -- I'm expecting to see 100% tranche-specific SNPs in the 90-tranche with decreasing fraction of tranche-specific SNPs, and increasing FPs, as tranche sensitivity is increased, but this is not what I'm seeing. I'm wondering if there is an error in the way the tranche plot was calculated or if I had run VQSR incorrectly...

    INFO  22:43:06,315 ArgumentTypeDescriptor - Dynamically determined type of /projectb/projectdirs/plant/analysis/cassava/genome/cassavaV4.1685010_168501.repeat.masked.srt.bed to be BED 
    INFO  22:43:06,382 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  22:43:06,383 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02 
    INFO  22:43:06,383 HelpFormatter - Copyright (c) 2010 The Broad Institute 
    INFO  22:43:06,383 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
    INFO  22:43:06,389 HelpFormatter - Program Args: -T VariantRecalibrator --reference_sequence /projectb/projectdirs/plant/analysis/cassava/genome/scaffolds+cpDNA.fasta --excludeIntervals /projectb/projectdirs/plant/analysis/cassava/genome/cassavaV4.1685010_168501.repeat.masked.srt.bed --mode SNP --dirichlet 0.01 --resource:bootstrap,known=true,training=true,truth=true,prior=8.0 bootstrap.red.HWE.flt.g60.recode.vcf --input Manihot_UG.rdx.vcf --recal_file Manihot_UG.rdx.DP.recal --tranches_file Manihot_UG.rdx.DP.tranches --rscript_file Manihot_UG.rdx.DP.Rplot -an BaseQRankSum -an HaplotypeScore -an QD -an DP -an ReadPosRankSum 
    INFO  22:43:06,389 HelpFormatter - Date/Time: 2013/06/28 22:43:06 
    INFO  22:43:06,389 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  22:43:06,389 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  22:43:06,448 ArgumentTypeDescriptor - Dynamically determined type of Manihot_UG.rdx.vcf to be VCF 
    INFO  22:43:06,552 ArgumentTypeDescriptor - Dynamically determined type of bootstrap.red.HWE.flt.g60.recode.vcf to be VCF 
    INFO  22:43:07,383 GenomeAnalysisEngine - Strictness is SILENT 
    INFO  22:43:08,550 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 
    INFO  22:43:08,582 RMDTrackBuilder - Loading Tribble index from disk for file Manihot_UG.rdx.vcf 
    INFO  22:43:09,560 RMDTrackBuilder - Loading Tribble index from disk for file bootstrap.red.HWE.flt.g60.recode.vcf 
    INFO  22:43:10,655 IntervalUtils - Initial include intervals span 532668733 loci; exclude intervals span 7340626 loci 
    INFO  22:43:10,658 IntervalUtils - Excluding 7340626 loci from original intervals (1.38% reduction) 
    INFO  22:43:10,659 IntervalUtils - Processing 525328107 bp from intervals 
    INFO  22:43:11,093 GenomeAnalysisEngine - Creating shard strategy for 0 BAM files 
    INFO  22:43:11,129 GenomeAnalysisEngine - Done creating shard strategy 
    INFO  22:43:11,130 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING] 
    INFO  22:43:11,130 ProgressMeter -        Location processed.sites  runtime per.1M.sites completed total.runtime remaining 
    INFO  22:43:11,141 TrainingSet - Found bootstrap track:     Known = true    Training = true     Truth = true    Prior = Q8.0 
    INFO  23:08:02,647 VariantDataManager - BaseQRankSum:    mean = 1.66     standard deviation = 1.84 
    INFO  23:08:07,565 VariantDataManager - HaplotypeScore:      mean = 2.77     standard deviation = 2.29 
    INFO  23:08:12,461 VariantDataManager - QD:      mean = 10.14    standard deviation = 6.06 
    INFO  23:08:17,310 VariantDataManager - DP:      mean = 627.20   standard deviation = 141.27 
    INFO  23:08:22,139 VariantDataManager - ReadPosRankSum:      mean = 0.46     standard deviation = 1.53 
    INFO  23:08:30,001 VariantDataManager - Training with 3856585 variants after standard deviation thresholding. 
    INFO  23:08:30,034 GaussianMixtureModel - Initializing model with 30 k-means iterations... 
    INFO  23:13:01,766 VariantRecalibratorEngine - Finished iteration 0. 
    INFO  23:18:42,510 VariantRecalibratorEngine - Finished iteration 5.    Current change in mixture coefficients = 0.27554 
    INFO  23:23:58,014 VariantRecalibratorEngine - Finished iteration 10.   Current change in mixture coefficients = 0.12515 
    INFO  23:29:14,003 VariantRecalibratorEngine - Finished iteration 15.   Current change in mixture coefficients = 0.08745 
    INFO  23:34:30,042 VariantRecalibratorEngine - Finished iteration 20.   Current change in mixture coefficients = 0.07335 
    INFO  23:40:13,629 VariantRecalibratorEngine - Finished iteration 25.   Current change in mixture coefficients = 0.06055 
    INFO  23:46:03,272 VariantRecalibratorEngine - Finished iteration 30.   Current change in mixture coefficients = 0.05687 
    INFO  23:51:48,713 VariantRecalibratorEngine - Finished iteration 35.   Current change in mixture coefficients = 0.06024 
    INFO  23:57:35,193 VariantRecalibratorEngine - Finished iteration 40.   Current change in mixture coefficients = 0.05803 
    INFO  00:03:24,026 VariantRecalibratorEngine - Finished iteration 45.   Current change in mixture coefficients = 0.05324 
    INFO  00:09:08,881 VariantRecalibratorEngine - Finished iteration 50.   Current change in mixture coefficients = 0.04774 
    INFO  00:14:54,715 VariantRecalibratorEngine - Finished iteration 55.   Current change in mixture coefficients = 0.04201 
    INFO  00:20:44,183 VariantRecalibratorEngine - Finished iteration 60.   Current change in mixture coefficients = 0.03552 
    INFO  00:26:32,051 VariantRecalibratorEngine - Finished iteration 65.   Current change in mixture coefficients = 0.02855 
    INFO  00:32:22,767 VariantRecalibratorEngine - Finished iteration 70.   Current change in mixture coefficients = 0.02202 
    INFO  00:34:42,515 VariantRecalibratorEngine - Convergence after 72 iterations! 
    INFO  00:35:28,152 VariantRecalibratorEngine - Evaluating full set of 17485011 variants... 
    INFO  00:37:07,046 VariantDataManager - Found 0 variants overlapping bad sites training tracks. 
    INFO  00:38:00,059 VariantDataManager - Additionally training with worst 3.000% of passing data --> 524550 variants with LOD <= -33.4560. 
    INFO  00:38:00,059 GaussianMixtureModel - Initializing model with 30 k-means iterations... 
    INFO  00:39:18,063 VariantRecalibratorEngine - Finished iteration 0. 
    INFO  00:40:18,395 VariantRecalibratorEngine - Finished iteration 5.    Current change in mixture coefficients = 0.52601 
    INFO  00:41:18,504 VariantRecalibratorEngine - Finished iteration 10.   Current change in mixture coefficients = 0.13586 
    INFO  00:42:19,106 VariantRecalibratorEngine - Finished iteration 15.   Current change in mixture coefficients = 0.08304 
    INFO  00:43:19,823 VariantRecalibratorEngine - Finished iteration 20.   Current change in mixture coefficients = 0.06336 
    INFO  00:44:20,936 VariantRecalibratorEngine - Finished iteration 25.   Current change in mixture coefficients = 0.04779 
    INFO  00:45:21,891 VariantRecalibratorEngine - Finished iteration 30.   Current change in mixture coefficients = 0.03722 
    INFO  00:46:22,500 VariantRecalibratorEngine - Finished iteration 35.   Current change in mixture coefficients = 0.02795 
    INFO  00:47:23,425 VariantRecalibratorEngine - Finished iteration 40.   Current change in mixture coefficients = 0.02178 
    INFO  00:48:24,703 VariantRecalibratorEngine - Finished iteration 45.   Current change in mixture coefficients = 0.02168 
    INFO  00:49:25,391 VariantRecalibratorEngine - Finished iteration 50.   Current change in mixture coefficients = 0.02133 
    INFO  00:50:26,342 VariantRecalibratorEngine - Finished iteration 55.   Current change in mixture coefficients = 0.02017 
    INFO  00:50:38,529 VariantRecalibratorEngine - Convergence after 56 iterations! 
    INFO  00:50:47,249 VariantRecalibratorEngine - Evaluating full set of 17485011 variants... 
    INFO  01:09:34,114 TrancheManager - Finding 4 tranches for 17485011 variants 
    INFO  01:10:35,852 TrancheManager -   Tranche threshold 100.00 => selection metric threshold 0.000 
    INFO  01:10:44,787 TrancheManager -   Found tranche for 100.000: 0.000 threshold starting with variant 0; running score is 0.000  
    INFO  01:10:44,788 TrancheManager -   Tranche is Tranche ts=100.00 minVQSLod=-687620.4316 known=(3856587 @ 1.6917) novel=(13628424 @ 2.7844) truthSites(3856587 accessible, 3856587 called), name=anonymous] 
    INFO  01:10:44,789 TrancheManager -   Tranche threshold 99.90 => selection metric threshold 0.001 
    INFO  01:10:53,313 TrancheManager -   Found tranche for 99.900: 0.001 threshold starting with variant 6462348; running score is 0.001  
    INFO  01:10:53,314 TrancheManager -   Tranche is Tranche ts=99.90 minVQSLod=6.0801 known=(3852730 @ 1.6912) novel=(7169933 @ 2.4316) truthSites(3856587 accessible, 3852730 called), name=anonymous] 
    INFO  01:10:53,314 TrancheManager -   Tranche threshold 99.00 => selection metric threshold 0.010 
    INFO  01:11:01,750 TrancheManager -   Found tranche for 99.000: 0.010 threshold starting with variant 7636391; running score is 0.010  
    INFO  01:11:01,750 TrancheManager -   Tranche is Tranche ts=99.00 minVQSLod=8.4217 known=(3818021 @ 1.6880) novel=(6030599 @ 2.3723) truthSites(3856587 accessible, 3818021 called), name=anonymous] 
    INFO  01:11:01,751 TrancheManager -   Tranche threshold 90.00 => selection metric threshold 0.100 
    INFO  01:11:10,087 TrancheManager -   Found tranche for 90.000: 0.100 threshold starting with variant 9530520; running score is 0.100  
    INFO  01:11:10,088 TrancheManager -   Tranche is Tranche ts=90.00 minVQSLod=10.6484 known=(3470928 @ 1.6673) novel=(4483563 @ 2.3165) truthSites(3856587 accessible, 3470928 called), name=anonymous] 
    INFO  01:11:10,111 VariantRecalibrator - Writing out recalibration table... 
    INFO  01:15:42,042 VariantRecalibrator - Writing out visualization Rscript file... 
    INFO  01:15:42,146 VariantRecalibrator - Building BaseQRankSum x HaplotypeScore plot... 
    INFO  01:15:42,178 VariantRecalibratorEngine - Evaluating full set of 21267 variants... 
    INFO  01:15:45,775 VariantRecalibratorEngine - Evaluating full set of 21267 variants... 
    INFO  01:15:50,071 VariantRecalibrator - Building BaseQRankSum x QD plot... 
    INFO  01:15:50,076 VariantRecalibratorEngine - Evaluating full set of 8062 variants... 
    INFO  01:15:51,466 VariantRecalibratorEngine - Evaluating full set of 8062 variants... 
    INFO  01:15:53,063 VariantRecalibrator - Building BaseQRankSum x DP plot... 
    INFO  01:15:53,066 VariantRecalibratorEngine - Evaluating full set of 25298 variants... 
    INFO  01:15:57,431 VariantRecalibratorEngine - Evaluating full set of 25298 variants... 
    INFO  01:16:02,356 VariantRecalibrator - Building BaseQRankSum x ReadPosRankSum plot... 
    INFO  01:16:02,360 VariantRecalibratorEngine - Evaluating full set of 36974 variants... 
    INFO  01:16:08,772 VariantRecalibratorEngine - Evaluating full set of 36974 variants... 
    INFO  01:16:15,910 VariantRecalibrator - Building HaplotypeScore x QD plot... 
    INFO  01:16:15,912 VariantRecalibratorEngine - Evaluating full set of 8874 variants... 
    INFO  01:16:17,413 VariantRecalibratorEngine - Evaluating full set of 8874 variants... 
    INFO  01:16:19,169 VariantRecalibrator - Building HaplotypeScore x DP plot... 
    INFO  01:16:19,173 VariantRecalibratorEngine - Evaluating full set of 27846 variants... 
    INFO  01:16:23,870 VariantRecalibratorEngine - Evaluating full set of 27846 variants... 
    INFO  01:16:29,363 VariantRecalibrator - Building HaplotypeScore x ReadPosRankSum plot... 
    INFO  01:16:29,369 VariantRecalibratorEngine - Evaluating full set of 40698 variants... 
    INFO  01:16:36,254 VariantRecalibratorEngine - Evaluating full set of 40698 variants... 
    INFO  01:16:44,155 VariantRecalibrator - Building QD x DP plot... 
    INFO  01:16:44,158 VariantRecalibratorEngine - Evaluating full set of 10556 variants... 
    INFO  01:16:45,982 VariantRecalibratorEngine - Evaluating full set of 10556 variants... 
    INFO  01:16:48,058 VariantRecalibrator - Building QD x ReadPosRankSum plot... 
    INFO  01:16:48,061 VariantRecalibratorEngine - Evaluating full set of 15428 variants... 
    INFO  01:16:50,724 VariantRecalibratorEngine - Evaluating full set of 15428 variants... 
    INFO  01:16:53,702 VariantRecalibrator - Building DP x ReadPosRankSum plot... 
    INFO  01:16:53,707 VariantRecalibratorEngine - Evaluating full set of 48412 variants... 
    INFO  01:17:02,067 VariantRecalibratorEngine - Evaluating full set of 48412 variants... 
    INFO  01:17:11,355 VariantRecalibrator - Executing: Rscript /global/projectb/scratch/bredeson/Manihot_esculenta/WGS_HQ/UnifiedGenotyper/Manihot_UG.rdx.DP.Rplot 
    INFO  01:19:24,826 VariantRecalibrator - Executing: Rscript (resource)org/broadinstitute/sting/gatk/walkers/variantrecalibration/plot_Tranches.R /global/projectb/scratch/bredeson/Manihot_esculenta/WGS_HQ/UnifiedGenotyper/Manihot_UG.rdx.DP.tranches 2.15 
    INFO  01:19:39,354 ProgressMeter -            done        2.96e+07    2.6 h        5.3 m    100.0%         2.6 h     0.0 s 
    INFO  01:19:39,354 ProgressMeter - Total runtime 9388.22 secs, 156.47 min, 2.61 hours 
    INFO  01:19:56,862 GATKRunReport - Uploaded run statistics report to AWS S3
    
  • mikemike Posts: 103Member

    Hi,

    I used to be able to run VQSR step for my old data, however, when I started the new version 2.6-4, I got the following error message from my log file:

    ..... ..... INFO 18:38:46,252 VariantRecalibratorEngine - Finished iteration 40. Current change in mixture coefficients = 0.08907 INFO 18:38:46,307 VariantRecalibratorEngine - Finished iteration 45. Current change in mixture coefficients = 0.02334 INFO 18:38:46,363 VariantRecalibratorEngine - Finished iteration 50. Current change in mixture coefficients = 0.03583 INFO 18:38:46,407 VariantRecalibratorEngine - Convergence after 54 iterations! INFO 18:38:46,414 VariantRecalibratorEngine - Evaluating full set of 180431 variants... INFO 18:38:47,634 GATKRunReport - Uploaded run statistics report to AWS S3

    ERROR ------------------------------------------------------------------------------------------
    ERROR stack trace

    org.broadinstitute.sting.utils.exceptions.ReviewedStingException: Unable to retrieve result at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:190) at org.broadinstitute.sting.gatk.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:311) at org.broadinstitute.sting.gatk.CommandLineExecutable.execute(CommandLineExecutable.java:113) at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:245) at org.broadinstitute.sting.commandline.CommandLineProgram.start(CommandLineProgram.java:152) at org.broadinstitute.sting.gatk.CommandLineGATK.main(CommandLineGATK.java:91) Caused by: java.lang.IllegalArgumentException: log10p: Values must be non-infinite and non-NAN at org.broadinstitute.sting.utils.MathUtils.log10sumLog10(MathUtils.java:236) at org.broadinstitute.sting.utils.MathUtils.log10sumLog10(MathUtils.java:224) at org.broadinstitute.sting.utils.MathUtils.log10sumLog10(MathUtils.java:249) at org.broadinstitute.sting.gatk.walkers.variantrecalibration.GaussianMixtureModel.evaluateDatum(GaussianMixtureModel.java:238) at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorEngine.evaluateDatum(VariantRecalibratorEngine.java:167) at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibratorEngine.evaluateData(VariantRecalibratorEngine.java:100) at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:335) at org.broadinstitute.sting.gatk.walkers.variantrecalibration.VariantRecalibrator.onTraversalDone(VariantRecalibrator.java:132) at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.notifyTraversalDone(HierarchicalMicroScheduler.java:226) at org.broadinstitute.sting.gatk.executive.HierarchicalMicroScheduler.execute(HierarchicalMicroScheduler.java:183) ... 5 more

    ERROR ------------------------------------------------------------------------------------------
    ERROR A GATK RUNTIME ERROR has occurred (version 2.6-4-g3e5ff60):
    ERROR
    ERROR Please check the documentation guide to see if this is a known problem
    ERROR If not, please post the error, with stack trace, to the GATK forum
    ERROR Visit our website and forum for extensive documentation and answers to
    ERROR commonly asked questions http://www.broadinstitute.org/gatk
    ERROR
    ERROR MESSAGE: Unable to retrieve result
    ERROR ------------------------------------

    Here is my command:

    java -Xmx4g -jar /path/gatk/2.6-4/bin/GenomeAnalysisTK.jar -T VariantRecalibrator -R /Path/hg19.fa -nt 4 -input /Path/GATK_UG_SelSNP.vcf -resource:hapmap,known=false,training=true,truth=true,prior=15.0 /Path/bundle-2.5/hg19/hapmap_3.3.hg19.vcf -resource:omni,known=false,training=true,truth=true,prior=12.0 /Pathb/bundle-2.5/hg19/1000G_omni2.5.hg19.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /Path/bundle-2.5/hg19/dbsnp_137.hg19.vcf -resource:1000G,known=false,training=true,truth=false,prior=10.0 /Path/bundle-2.5/hg19/1000G_phase1.snps.high_confidence.hg19.vcf -an QD -an MQRankSum -an ReadPosRankSum -an FS -an DP -an InbreedingCoeff
    -percentBad 0.01 -minNumBad 1000
    -recalFile /Path/GATK_UG_SelSNP.VarRecal -tranchesFile /Path/GATK_UG_SelSNP.tranches -rscriptFile /Path/GATK_UG_SelSNP.R -mode SNP

    What is even odd is: when I run similar command for indels, it went through without any error, here is the indel command:

    java -Xmx4g -jar /Path/gatk/2.6-4/bin/GenomeAnalysisTK.jar -T VariantRecalibrator -R /Path//hg19.fa -input /Path/GATK_UG_SelIndel.vcf -resource:mills,known=false,training=true,truth=true,prior=12.0 /Path/bundle-2.5/hg19/Mills_and_1000G_gold_standard.indels.hg19.vcf -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 /Path/bundle-2.5/hg19/dbsnp_137.hg19.vcf -percentBad 0.01 -minNumBad 1000 -an DP -an FS -an ReadPosRankSum -an MQRankSum -an InbreedingCoeff --maxGaussians 4 -recalFile /Path/GATK_UG_SelIndel.VarRecal -tranchesFile /Path/GATK_UG_SelIndel.tranches -rscriptFile /Path/GATK_UG_SelIndel.R -mode INDEL

    Any idea why I got such error in SNP VQSR but not in Indel VQSR? In Indel, I used --maxGaussians 4. but in SNP, I used -nt 4 based on documnetation, other than the -resource files from bundle, these are only difference between the indel and SNP VQSR.

    Thanks for the help!

    Mike

  • bredesonbredeson Posts: 24Member
    edited July 2013

    I'm putting a tarball of the v2.5 and v2.6 VQSR plots, tranches, and run-log files on the FTP in the "incoming" folder called Manihot_VQSR.tar.gz

    Post edited by bredeson on
  • bredesonbredeson Posts: 24Member
    edited July 2013

    @bredeson said: I'm putting a tarball of the v2.5 and v2.6 VQSR plots, tranches, and run-log files on the FTP in the "incoming" folder called Manihot_VQSR.tar.gz

    Scratch that, it's in the main public ftp root dir as Manihot_VQSR.tar.gz

    -rw-r--r-- 1 depristo wga 27491078 Jul 4 01:55 Manihot_VQSR.tar.gz

    A little explanation of the content naming in the tar ball:

    1) files are labeled with the version of GATK used to do VQSR, 2) files are labeled with true or false indicating the state that the known attribute was set to at the --resource flag.

    Post edited by bredeson on
  • mikemike Posts: 103Member

    Hi:

    Just follow up on my last post, for SNP VQSR, I found out that the error was caused by the absence of --maxGaussians 4, as long as I added the option: --maxGaussians 4, with or without -nt 4, they all went through well. Not sure what default value for --maxGaussians. With -h , I can not see it. any way, any idea why it behaved like this?

    Thanks for the help!

    Mike

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,804Administrator, GATK Developer admin

    @bredeson, thanks for the files. Due to the holiday we may not be able to get to this for a couple of days though. Thanks for your patience.

    Geraldine Van der Auwera, PhD

  • rpoplinrpoplin Posts: 122GATK Developer mod

    HI there,

    I think one source of the confusion with TP/FPs on the tranche plot is that in the 2.6 plot all of the tranches are over the specified target ti/tv value and so there is no way to estimate TP/FPs (everything is assumed to be a TP) but in the 2.5 plot you have ti/tv that go above and below the target ti/tv value. Is this data from an exome sequencing project? If so you probably want to increases that target ti/tv and I think the tranche plots will make more sense. Take a look at the --target_titv argument in VariantRecalibrator.

    Let me know if that helps.

  • bredesonbredeson Posts: 24Member

    Hello rpoplin, It's been a while, but I'm revisiting this subject, as I do not yet fully understand how my truth=90 tranche can have cumulative [TF]Ps...

    I realigned around indels (which I thought I had done with the prior dataset) and then re-called (with UG) and re-ran VQSR with a --target_titv of 2.5 and the following plots were outputted. This result, as before, still contains tranches with cumulative [TF]Ps in the starting tranche, and it is still not clear to me how that is possible: is this an artifact of my training data (stringently filtered sites from the same dataset) or of the way I ran VariantRecalibrator (see log below)?

    Also, the Cumulative FPs bins overlap the Tranch-Specific TPs bins; this looks like a plotting error to me, but is this intended/meaningful? Again, my concern stems from the dissimilarity of my plot with the example plots here: http://www.broadinstitute.org/gatk/guide/article?id=39

    Thank you so much for your help!

    INFO  12:26:54,300 ArgumentTypeDescriptor - Dynamically determined type of /projectb/projectdirs/plant/analysis/cassava/genome/cassavaV4.1685010_168501.repeat.masked.srt.bed to be BED 
    INFO  12:26:54,384 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  12:26:54,384 HelpFormatter - The Genome Analysis Toolkit (GATK) v2.5-2-gf57256b, Compiled 2013/05/01 09:27:02 
    INFO  12:26:54,385 HelpFormatter - Copyright (c) 2010 The Broad Institute 
    INFO  12:26:54,385 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk 
    INFO  12:26:54,390 HelpFormatter - Program Args: -T VariantRecalibrator --reference_sequence /projectb/projectdirs/plant/analysis/cassava/genome/scaffolds+cpDNA.fasta --excludeIntervals /projectb/projectdirs/plant/analysis/cassava/genome/cassavaV4.1685010_168501.repeat.masked.srt.bed --mode SNP --resource:bootstrap,known=false,training=true,truth=true,prior=8.0 Manihot_UG.VF2+HW.cDP.vcf --input Manihot_UG.vcf --recal_file Manihot_UG.titv2_5.recal --tranches_file Manihot_UG.titv2_5.tranches --rscript_file Manihot_UG.titv2_5.Rplot -an BaseQRankSum -an FS -an InbreedingCoeff -an HaplotypeScore -an QD -an DP -an MQRankSum -an ReadPosRankSum --target_titv 2.5 
    INFO  12:26:54,391 HelpFormatter - Date/Time: 2013/08/06 12:26:54 
    INFO  12:26:54,391 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  12:26:54,391 HelpFormatter - -------------------------------------------------------------------------------- 
    INFO  12:26:54,456 ArgumentTypeDescriptor - Dynamically determined type of Manihot_UG.vcf to be VCF 
    INFO  12:26:54,506 ArgumentTypeDescriptor - Dynamically determined type of Manihot_UG.VF2+HW.cDP.vcf to be VCF 
    INFO  12:26:55,427 GenomeAnalysisEngine - Strictness is SILENT 
    INFO  12:26:56,636 GenomeAnalysisEngine - Downsampling Settings: Method: BY_SAMPLE, Target Coverage: 1000 
    INFO  12:26:56,678 RMDTrackBuilder - Loading Tribble index from disk for file Manihot_UG.vcf 
    INFO  12:26:57,559 RMDTrackBuilder - Loading Tribble index from disk for file Manihot_UG.VF2+HW.cDP.vcf 
    INFO  12:26:58,768 IntervalUtils - Initial include intervals span 532668733 loci; exclude intervals span 7340626 loci 
    INFO  12:26:58,772 IntervalUtils - Excluding 7340626 loci from original intervals (1.38% reduction) 
    INFO  12:26:58,773 IntervalUtils - Processing 525328107 bp from intervals 
    INFO  12:26:59,279 GenomeAnalysisEngine - Creating shard strategy for 0 BAM files 
    INFO  12:26:59,317 GenomeAnalysisEngine - Done creating shard strategy 
    INFO  12:26:59,318 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
    INFO  12:26:59,318 ProgressMeter -        Location processed.sites  runtime per.1M.sites completed total.runtime remaining 
    INFO  12:26:59,326 TrainingSet - Found bootstrap track:         Known = false   Training = true         Truth = true    Prior = Q8.0
    ...
    INFO  13:08:36,396 ProgressMeter - scaffold12914:16853        3.04e+07   41.6 m       82.0 s     99.7%        41.8 m     8.0 s 
    INFO  13:08:47,619 VariantDataManager - BaseQRankSum:    mean = 1.09     standard deviation = 1.86 
    INFO  13:08:53,561 VariantDataManager - FS:      mean = 7.53     standard deviation = 9.53 
    INFO  13:08:59,407 VariantDataManager - InbreedingCoeff:         mean = 0.17     standard deviation = 0.36 
    INFO  13:09:05,407 VariantDataManager - HaplotypeScore:          mean = 6.81     standard deviation = 6.65 
    INFO  13:09:06,399 ProgressMeter - scaffold12977:1750        3.05e+07   42.1 m       82.0 s    100.0%        42.1 m     0.0 s 
    INFO  13:09:11,277 VariantDataManager - QD:      mean = 7.96     standard deviation = 5.89 
    INFO  13:09:17,085 VariantDataManager - DP:      mean = 567.93   standard deviation = 267.48 
    INFO  13:09:22,875 VariantDataManager - MQRankSum:       mean = 0.00     standard deviation = 2.74 
    INFO  13:09:28,635 VariantDataManager - ReadPosRankSum:          mean = 0.73     standard deviation = 1.75 
    INFO  13:09:36,401 ProgressMeter - scaffold12977:1750        3.05e+07   42.6 m       83.0 s    100.0%        42.6 m     0.0 s 
    INFO  13:09:37,230 VariantDataManager - Training with 13035063 variants after standard deviation thresholding. 
    INFO  13:09:37,291 GaussianMixtureModel - Initializing model with 30 k-means iterations...
    ...
    INFO  18:07:42,225 VariantRecalibratorEngine - Evaluating full set of 17522718 variants... 
    INFO  18:08:04,181 ProgressMeter - scaffold12977:1750        3.05e+07    5.7 h       11.2 m    100.0%         5.7 h     0.0 s 
    INFO  18:08:34,183 ProgressMeter - scaffold12977:1750        3.05e+07    5.7 h       11.2 m    100.0%         5.7 h     0.0 s 
    INFO  18:09:04,185 ProgressMeter - scaffold12977:1750        3.05e+07    5.7 h       11.2 m    100.0%         5.7 h     0.0 s 
    INFO  18:09:34,187 ProgressMeter - scaffold12977:1750        3.05e+07    5.7 h       11.2 m    100.0%         5.7 h     0.0 s 
    INFO  18:10:04,190 ProgressMeter - scaffold12977:1750        3.05e+07    5.7 h       11.2 m    100.0%         5.7 h     0.0 s 
    INFO  18:10:34,192 ProgressMeter - scaffold12977:1750        3.05e+07    5.7 h       11.3 m    100.0%         5.7 h     0.0 s 
    INFO  18:10:37,827 VariantDataManager - Found 0 variants overlapping bad sites training tracks. 
    INFO  18:11:04,194 ProgressMeter - scaffold12977:1750        3.05e+07    5.7 h       11.3 m    100.0%         5.7 h     0.0 s 
    INFO  18:11:26,512 VariantDataManager - Additionally training with worst 3.000% of passing data --> 525682 variants with LOD <= -14.8842. 
    INFO  18:11:26,513 GaussianMixtureModel - Initializing model with 30 k-means iterations... 
    INFO  18:11:34,196 ProgressMeter - scaffold12977:1750        3.05e+07    5.7 h       11.3 m    100.0%         5.7 h     0.0 s 
    INFO  18:12:04,199 ProgressMeter - scaffold12977:1750        3.05e+07    5.8 h       11.3 m    100.0%         5.8 h     0.0 s 
    INFO  18:12:34,201 ProgressMeter - scaffold12977:1750        3.05e+07    5.8 h       11.3 m    100.0%         5.8 h     0.0 s 
    INFO  18:12:35,512 VariantRecalibratorEngine - Finished iteration 0. 
    INFO  18:13:04,203 ProgressMeter - scaffold12977:1750        3.05e+07    5.8 h       11.3 m    100.0%         5.8 h     0.0 s 
    INFO  18:13:34,205 ProgressMeter - scaffold12977:1750        3.05e+07    5.8 h       11.4 m    100.0%         5.8 h     0.0 s 
    ...
    INFO  18:22:10,281 VariantRecalibratorEngine - Convergence after 36 iterations!
    ...
    INFO  19:06:35,704 VariantRecalibrator - Executing: Rscript /global/projectb/scratch/bredeson/Manihot_esculenta/WGS_HQ/UnifiedGenotyper_Realigned/Manihot_UG.titv2_5.Rplot
    ...
    INFO  19:11:17,745 VariantRecalibrator - Executing: Rscript (resource)org/broadinstitute/sting/gatk/walkers/variantrecalibration/plot_Tranches.R /global/projectb/scratch/bredeson/Manihot_esculenta/WGS_HQ/UnifiedGenotyper_Realigned/Manihot_UG.titv2_5.tranches 2.5
    ...
    INFO  19:11:32,736 ProgressMeter -            done        3.05e+07    6.7 h       13.3 m    100.0%         6.7 h     0.0 s 
    INFO  19:11:32,736 ProgressMeter - Total runtime 24273.42 secs, 404.56 min, 6.74 hours 
    INFO  19:12:07,556 GATKRunReport - Uploaded run statistics report to AWS S3
    Screen Shot 2013-08-26 at 6.24.04 PM.png
    745 x 453 - 43K
    Screen Shot 2013-08-26 at 6.36.55 PM.png
    644 x 662 - 103K
  • bredesonbredeson Posts: 24Member
    edited August 2013

    @rpoplin: To answer your previous question, this is WGS data.

    Post edited by bredeson on
  • dheerajdheeraj luxembourgPosts: 12Member
    edited October 25

    Hi I have analyzed 2000 exome samples using GATK 3.2.2 and I have ended up with a huge around 700gb VCF file after VQSR. My tranch plot looks really bad (attached file) and the ti/tv ratio is really low. I did not use the interval list (-L) during my analysis . Now, can I use the interval list and exclude those variants outside the intervals after the VQSR ?? or do I have to run all the steps again ?? Please let me know. Thank you in advance.

    pdf
    pdf
    recalibrate_SNP.tranches.pdf
    18K
    Post edited by dheeraj on
  • dheerajdheeraj luxembourgPosts: 12Member

    Just checked that the vcf file has got 29513599 SNPs that passed the VQSR to let you know.

  • Geraldine_VdAuweraGeraldine_VdAuwera Posts: 6,804Administrator, GATK Developer admin

    @dheeraj You can re-run just the VQSR steps, this time using the -L intervals. This will eliminate the variants that are outside your target regions. Also, you can specify the expected titv argument to match your expectation for exome, which is different from whole genome.

    Geraldine Van der Auwera, PhD

Sign In or Register to comment.