A problem about strand_bias filter in Mutect2

ahda (China): Member

I ran the GATK 4.1.2.0 Mutect2 Best Practices pipeline and found a record that was filtered for strand_bias alone, but I don't see any strand bias in this record:

$ zcat gatk_mutect2/S018_1/S018_1.m2_oncefilt.vcf.gz | awk '$1=="chr13" && $2==32906589'
chr13 32906589 . C G . strand_bias CONTQ=93;DP=5084;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=180,165;MMQ=60,60;MPOS=51;NALOD=3.35;NLOD=1320.04;POPAF=6.00;ROQ=93;SEQQ=93;STRANDQ=83;TLOD=25.30 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:516,24:0.044:540:253,14:261,10:316,200,13,11 0/0:4409,1:2.245e-04:4410:2076,0:2257,0:2303,2106,0,1
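
As a rough sanity check of the raw counts, the tumor SB field in the record above (316,200,13,11) can be read as ref-forward, ref-reverse, alt-forward, alt-reverse. That field order is an assumption based on the usual GATK convention. The sketch below computes the forward-strand fraction for each allele and a two-sided Fisher's exact test using only the Python standard library. It is not how FilterMutectCalls derives STRANDQ or applies the strand_bias filter; it only illustrates why the counts themselves do not look strand-biased.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def strand_summary(sb_field):
    """Parse an SB string, assumed to be ref_fwd,ref_rev,alt_fwd,alt_rev."""
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in sb_field.split(","))
    return {
        "ref_fwd_fraction": ref_fwd / (ref_fwd + ref_rev),
        "alt_fwd_fraction": alt_fwd / (alt_fwd + alt_rev),
        "fisher_p": fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev),
    }


# Tumor SB value copied from the chr13:32906589 record above.
summary = strand_summary("316,200,13,11")
print(summary)
```

Here the alt reads split 13 forward to 11 reverse, and the ref reads split 316 to 200, so neither allele is confined to one strand.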

Below are the command lines recorded in the VCF header:

GATKCommandLine=<ID=FilterMutectCalls,CommandLine="FilterMutectCalls --output gatk_mutect2/S018_1/S018_1.m2_oncefilt.vcf.gz --stats gatk_mutect2/S018_1/S018_1.m2.vcf.gz.stats --filtering-stats gatk_mutect2/S018_1/S018_1.m2_oncefilt.stats --tumor-segmentation gatk_mutect2/S018_1/S018_1.segments.table --orientation-bias-artifact-priors gatk_mutect2/S018_1/S018_1.artifact-priors.tar.gz --variant gatk_mutect2/S018_1/S018_1.m2.vcf.gz --reference /workplace/public/database/Human_hg19/genome.fa --threshold-strategy OPTIMAL_F_SCORE --f-score-beta 1.0 --false-discovery-rate 0.05 --initial-threshold 0.1 --mitochondria-mode false --max-events-in-region 2 --max-alt-allele-count 1 --unique-alt-read-count 0 --min-median-mapping-quality 30 --min-median-base-quality 20 --max-median-fragment-length-difference 10000 --min-median-read-position 1 --max-n-ratio Infinity --min-reads-per-strand 0 --autosomal-coverage 0.0 --max-numt-fraction 0.85 --min-allele-fraction 0.0 --contamination-estimate 0.0 --log-snv-prior -13.815510557964275 --log-indel-prior -16.11809565095832 --log-artifact-prior -2.302585092994046 --normal-p-value-threshold 0.001 --min-slippage-length 8 --pcr-slippage-rate 0.1 --distance-on-haplotype 100 --long-indel-length 5 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays 
--disable-tool-default-read-filters false",Version="4.1.2.0",Date="July 11, 2019 2:03:38 PM CST">

GATKCommandLine=<ID=Mutect2,CommandLine="Mutect2 --f1r2-tar-gz gatk_mutect2/S018_1/chr12.f1r2.tar.gz --tumor-sample S018_1 --normal-sample S018_2 --panel-of-normals gatk_mutect2/pon.vcf.gz --germline-resource af-only-bychr/chr12.hg19.vcf.gz --bam-output gatk_mutect2/S018_1/chr12.m2.bam --output gatk_mutect2/S018_1/chr12.m2.vcf.gz --intervals call_region/chr12.region.bed --input gatk_bqsr/S018_1.bqsr.bam --input gatk_bqsr/S018_2.bqsr.bam --reference /workplace/public/database/Human_hg19/genome.fa --disable-read-filter MateOnSameContigOrNoMappedMateReadFilter --f1r2-median-mq 50 --f1r2-min-bq 20 --f1r2-max-depth 200 --genotype-pon-sites false --genotype-germline-sites false --af-of-alleles-not-in-resource -1.0 --mitochondria-mode false --tumor-lod-to-emit 3.0 --initial-tumor-lod 2.0 --pcr-snv-qual 40 --pcr-indel-qual 40 --max-population-af 0.01 --downsampling-stride 1 --callable-depth 10 --max-suspicious-reads-per-alignment-start 0 --normal-lod 2.2 --ignore-itr-artifacts false --gvcf-lod-band -2.5 --gvcf-lod-band -2.0 --gvcf-lod-band -1.5 --gvcf-lod-band -1.0 --gvcf-lod-band -0.5 --gvcf-lod-band 0.0 --gvcf-lod-band 0.5 --gvcf-lod-band 1.0 --minimum-allele-fraction 0.0 --genotype-filtered-alleles false --disable-adaptive-pruning false --dont-trim-active-regions false --max-disc-ar-extension 25 --max-gga-ar-extension 300 --padding-around-indels 150 --padding-around-snps 20 --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false --allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --min-dangling-branch-length 4 --recover-all-dangling-branches false --max-num-haplotypes-in-population 128 --min-pruning 2 --adaptive-pruning-initial-error-rate 0.001 --pruning-lod-threshold 2.302585092994046 --max-unpruned-variants 100 --debug-assembly false --debug-graph-transformations false --capture-assembly-failure-bam false --error-correct-reads false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 
--likelihood-calculation-engine PairHMM --base-quality-score-threshold 18 --pair-hmm-gap-continuation-penalty 10 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --native-pair-hmm-threads 4 --native-pair-hmm-use-double-precision false --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --min-base-quality-score 10 --smith-waterman JAVA --emit-ref-confidence NONE --max-mnp-distance 1 --min-assembly-region-size 50 --max-assembly-region-size 300 --assembly-region-padding 100 --max-reads-per-alignment-start 50 --active-probability-threshold 0.002 --max-prob-propagation-distance 50 --force-active false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --max-read-length 2147483647 --min-read-length 30 --minimum-mapping-quality 20 --disable-tool-default-annotations false --enable-all-annotations false",Version="4.1.2.0",Date="July 10, 2019 3:36:01 PM CST">

Answers

  • bhanuGandham (Cambridge MA): Member, Administrator, Broadie, Moderator
    edited August 9

    Hi @ahda

    A couple of things:
    1) Could you please post the actual command line used and not the vcf output command line? The latter is too full of optional arguments' default values to be readable.
    2) Please use a normal-sized font and a markdown code block. It's hard to read this way.
    3) Please post commands for the full Mutect2 pipeline (Mutect2, GetPileupSummaries, CalculateContamination, and FilterMutectCalls). We would like to see the .filteringStats.tsv output of FilterMutectCalls.

    I checked with the developer, and he said that STRANDQ (like the other ___Q annotations) is the Phred-scaled posterior probability that the call is an artifact. In this case, a STRANDQ of 83 means FilterMutectCalls thinks there is a 1 in 10^8.3 chance of a strand artifact, so the filtering is surprising and something might be wrong with the pipeline.
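
To make the Phred arithmetic concrete, here is a minimal sketch of the standard conversion Q = -10 * log10(P). The function name is hypothetical and only for illustration; it is not part of GATK.

```python
def phred_to_probability(q):
    """Convert a Phred-scaled quality Q to a probability P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


strandq = 83  # STRANDQ of the chr13:32906589 record
p_artifact = phred_to_probability(strandq)
print(f"STRANDQ={strandq} -> P(strand artifact) ~ {p_artifact:.2e}")

# For comparison, a STRANDQ of 1 corresponds to a high artifact probability.
print(f"STRANDQ=1 -> P(strand artifact) ~ {phred_to_probability(1):.2f}")
```

On this scale, a high STRANDQ means the artifact explanation is very unlikely, which is why a strand_bias filter on a record with STRANDQ=83 looks inconsistent.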

  • ahda (China): Member

    Thank you for the reply. I'm sorry for the unreadable format; I hope I have fixed the problem here.
    Below are my detailed GATK commands:

      1. Split target.probes.bed by chromosome.
      2. For each chr.regions.bed, run:
        Mutect2
        GetPileupSummaries

      For example, for chr13:
      gatk --java-options "-Xmx12g" Mutect2 -R hg19.fa -I S018_1.bqsr.bam -I S018_2.bqsr.bam -tumor S018_1 -normal S018_2 -pon gatk_mutect2/pon.vcf.gz --germline-resource af-only-bychr/chr13.hg19.vcf.gz --f1r2-tar-gz gatk_mutect2/S018_1/chr13.f1r2.tar.gz --disable-read-filter MateOnSameContigOrNoMappedMateReadFilter -L call_region/chr13.region.bed -O gatk_mutect2/S018_1/chr13.m2.vcf.gz -bamout gatk_mutect2/S018_1/chr13.m2.bam

      gatk --java-options "-Xmx12g" GetPileupSummaries -R hg19.fa -I S018_1.bqsr.bam --interval-set-rule INTERSECTION -L call_region/chr13.region.bed -V variants_for_contamination/chr13.hg19.vcf.gz -L variants_for_contamination/chr13.hg19.vcf.gz -O gatk_mutect2/S018_1/chr13.tumor.pileup.table

      gatk --java-options "-Xmx12g" GetPileupSummaries -R hg19.fa -I S018_2.bqsr.bam --interval-set-rule INTERSECTION -L call_region/chr13.region.bed -V variants_for_contamination/chr13.hg19.vcf.gz -L variants_for_contamination/chr13.hg19.vcf.gz -O gatk_mutect2/S018_1/chr13.normal.pileup.table

      3. Combine the Mutect2 output files and the pileup output files across all chromosomes:
        GATK MergeVcfs
        -I gatk_mutect2/S018_1/chr1.m2.vcf.gz
        -I gatk_mutect2/S018_1/chr2.m2.vcf.gz
        -I … …
        -I gatk_mutect2/S018_1/chrX.m2.vcf.gz
        -O gatk_mutect2/S018_1/S018_1.m2.vcf.gz

      GATK MergeMutectStats
      -stats gatk_mutect2/S018_1/chr1.m2.vcf.gz.stats
      -stats gatk_mutect2/S018_1/chr2.m2.vcf.gz.stats
      -stats … …
      -stats gatk_mutect2/S018_1/chrX.m2.vcf.gz.stats
      -O gatk_mutect2/S018_1/S018_1.m2.vcf.gz.stats

      GATK GatherPileupSummaries --sequence-dictionary hg19.fa.fai
      -I gatk_mutect2/S018_1/chr1.normal.pileup.table
      -I gatk_mutect2/S018_1/chr2.normal.pileup.table
      -I … …
      -O gatk_mutect2/S018_1/S018_1.normal.pileup.tsv

      GATK GatherPileupSummaries --sequence-dictionary hg19.fa.fai
      -I gatk_mutect2/S018_1/chr1.tumor.pileup.table
      -I gatk_mutect2/S018_1/chr2.tumor.pileup.table
      -I … …
      -O gatk_mutect2/S018_1/S018_1.tumor.pileup.tsv

      GATK LearnReadOrientationModel
      -I gatk_mutect2/S018_1/chr1.f1r2.tar.gz
      -I gatk_mutect2/S018_1/chr2.f1r2.tar.gz
      -I … …
      -O gatk_mutect2/S018_1/S018_1.artifact-priors.tar.gz

      4. CalculateContamination
      gatk CalculateContamination -I gatk_mutect2/S018_1/S018_1.tumor.pileup.tsv -O gatk_mutect2/S018_1/S018_1.contamination.table --tumor-segmentation gatk_mutect2/S018_1/S018_1.segments.table -matched gatk_mutect2/S018_1/S018_1.normal.pileup.tsv
      5. FilterMutectCalls
      gatk FilterMutectCalls -V gatk_mutect2/S018_1/S018_1.m2.vcf.gz -R hg19.fa --contamination-table gatk_mutect2/S018_1/S018_1.contamination.table -O gatk_mutect2/S018_1/S018_1.m2_oncefilt.vcf.gz --ob-priors gatk_mutect2/S018_1/S018_1.artifact-priors.tar.gz --stats gatk_mutect2/S018_1/S018_1.m2.vcf.gz.stats --tumor-segmentation gatk_mutect2/S018_1/S018_1.segments.table --filtering-stats gatk_mutect2/S018_1/S018_1.m2_oncefilt.stats
      6. FilterAlignmentArtifacts
      gatk FilterAlignmentArtifacts -V gatk_mutect2/S018_1/S018_1.m2_oncefilt.vcf.gz -I gatk_mutect2/S018_1/S018_1.m2.sort.bam --bwa-mem-index-image /workplace/public/database/ensembl_grch37_82/homo_sapiens_GRCh37.fa.img -O gatk_mutect2/S018_1/S018_1.m2_twicefilt.vcf.gz

    $ cat gatk_mutect2/S018_1/S018_1.m2.vcf.gz.stats
    statistic value
    callable 64657.0

    PS:
    The germline-resource and variants-for-contamination files were derived from the following links and split by chromosome:
    ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/Mutect2/af-only-gnomad.raw.sites.b37.vcf.gz
    ftp://gsapubftp-anonymous@ftp.broadinstitute.org/Liftover_Chain_Files/b37tohg19.chain

  • ahda (China): Member
    edited August 9

    I think this is the stats file you were expecting, rather than S018_1.m2.vcf.gz.stats:
    $ cat gatk_mutect2/S018_1/S018_1.m2_oncefilt.stats
    #Ln prior of deletion of length 10=-20.72326583694641
    #Ln prior of deletion of length 9=-11.076851653667346
    #Ln prior of deletion of length 8=-20.72326583694641
    #Ln prior of deletion of length 7=-20.72326583694641
    #Ln prior of deletion of length 6=-20.72326583694641
    #Ln prior of deletion of length 5=-20.72326583694641
    #Ln prior of deletion of length 4=-20.72326583694641
    #Ln prior of deletion of length 3=-20.72326583694641
    #Ln prior of deletion of length 2=-20.72326583694641
    #Ln prior of deletion of length 1=-20.72326583694641
    #Ln prior of SNV=-10.3837044731074
    #Ln prior of insertion of length 1=-20.72326583694641
    #Ln prior of insertion of length 2=-20.72326583694641
    #Ln prior of insertion of length 3=-20.72326583694641
    #Ln prior of insertion of length 4=-20.72326583694641
    #Ln prior of insertion of length 5=-20.72326583694641
    #Ln prior of insertion of length 6=-20.72326583694641
    #Ln prior of insertion of length 7=-20.72326583694641
    #Ln prior of insertion of length 8=-20.72326583694641
    #Ln prior of insertion of length 9=-20.72326583694641
    #Ln prior of insertion of length 10=-20.72326583694641
    #High-AF beta-binomial cluster=weight = 0.1667, alpha = 10.00, beta = 1.00
    #Background beta-binomial cluster=weight = 0.5000, alpha = 0.50, beta = 1.81
    #Binomial cluster 1=weight = 0.6667, mean = 0.194
    #threshold=0.0
    #fdr=0.0
    #sensitivity=0.666
    filter FP FDR FN FNR
    weak_evidence 0.0 0.0 0.0 0.0
    strand_bias 0.0 0.0 1.0 0.33
    normal_artifact 0.0 0.0 0.0 0.0
    orientation 0.0 0.0 0.0 0.0
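
The table at the end of this file can be read programmatically. The sketch below parses the `#key=value` comment lines and the per-filter rows from an inline copy of the last few lines. It then assumes that FNR is the number of expected false negatives divided by the expected number of true variants; that reading of the columns is an assumption, not something stated in this thread. Under it, the strand_bias row implies roughly three expected true variants, one of which is lost to the strand_bias filter.

```python
FILTERING_STATS = """\
#threshold=0.0
#fdr=0.0
#sensitivity=0.666
filter FP FDR FN FNR
weak_evidence 0.0 0.0 0.0 0.0
strand_bias 0.0 0.0 1.0 0.33
normal_artifact 0.0 0.0 0.0 0.0
orientation 0.0 0.0 0.0 0.0
"""


def parse_filtering_stats(text):
    """Return (comment key/value pairs, per-filter rows keyed by filter name)."""
    comments, rows, columns = {}, {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            key, _, value = line[1:].partition("=")
            comments[key] = value
        elif columns is None:
            columns = line.split()[1:]  # FP, FDR, FN, FNR
        else:
            name, *values = line.split()
            rows[name] = dict(zip(columns, map(float, values)))
    return comments, rows


comments, rows = parse_filtering_stats(FILTERING_STATS)
strand_bias = rows["strand_bias"]
# Assumption: FNR = FN / (expected number of true variants).
implied_true_variants = strand_bias["FN"] / strand_bias["FNR"]
print(f"sensitivity={comments['sensitivity']}")
print(f"implied expected true variants ~ {implied_true_variants:.1f}")
```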

  • davidben (Boston): Member, Broadie, Dev ✭✭✭

    @ahda The results:

    $ cat gatk_mutect2/S018_1/S018_1.m2.vcf.gz.stats
    statistic value
    callable 64657.0
    

    suggest that your intervals only had ~60 kb of territory with at least 10x coverage, and it also seems from your filtering stats that there are very few calls. Could you post your filtered Mutect2 VCF here?
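
One way to see how few calls there are is to combine the callable count with the log priors in the filtering stats posted earlier in the thread. The sketch below exponentiates the two non-default "Ln prior" values and multiplies by the callable site count. Whether FilterMutectCalls learns these priors exactly as a count divided by callable sites is an assumption here; the arithmetic just happens to land very close to whole numbers.

```python
from math import exp

callable_sites = 64657.0                 # "callable" in S018_1.m2.vcf.gz.stats
ln_prior_snv = -10.3837044731074         # "#Ln prior of SNV"
ln_prior_del9 = -11.076851653667346      # "#Ln prior of deletion of length 9"
ln_prior_floor = -20.72326583694641      # value shared by all other indel lengths


def implied_count(ln_prior, n_sites):
    """Expected number of variants if the prior is a per-site probability."""
    return exp(ln_prior) * n_sites


snv_count = implied_count(ln_prior_snv, callable_sites)
del9_count = implied_count(ln_prior_del9, callable_sites)
print(f"implied SNVs: {snv_count:.3f}")
print(f"implied 9-bp deletions: {del9_count:.3f}")
print(f"floor prior as a probability: {exp(ln_prior_floor):.1e}")
```

The results come out at about 2 SNVs and 1 deletion across the whole callable territory, which is very little data for the filtering model to learn from.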

  • davidben (Boston): Member, Broadie, Dev ✭✭✭

    Also, in FilterAlignmentArtifacts you are using a b37 index image homo_sapiens_GRCh37.fa.img. You should use an hg38 image, even if your reads are aligned to b37.

  • ahda (China): Member

    @davidben Thank you for the reply.
    1. It's right that my intervals cover only ~60 kb (65,298 bp, to be exact). The mean depth is about 1600x.
    2. Regarding the b37 index image homo_sapiens_GRCh37.fa.img in FilterAlignmentArtifacts: sorry, that was a mistake in my post. I actually used the image Homo_sapiens_assembly38.fasta.img.

    3. My filtered Mutect2 VCF is below.
    $ zcat gatk_mutect2/S018_1/S018_1.m2_twicefilt.vcf.gz
    ##fileformat=VCFv4.2
    ##FILTER=<ID=alignment,Description="Alignment artifact">
    ##FILTER=<ID=base_qual,Description="alt median base quality">
    ##FILTER=<ID=clustered_events,Description="Clustered events observed in the tumor">
    ##FILTER=<ID=contamination,Description="contamination">
    ##FILTER=<ID=duplicate,Description="evidence for alt allele is overrepresented by apparent duplicates">
    ##FILTER=<ID=fragment,Description="abs(ref - alt) median fragment length">
    ##FILTER=<ID=germline,Description="Evidence indicates this site is germline, not somatic">
    ##FILTER=<ID=haplotype,Description="Variant near filtered variant on same haplotype.">
    ##FILTER=<ID=low_allele_frac,Description="Allele fraction is below specified threshold">
    ##FILTER=<ID=map_qual,Description="ref - alt median mapping quality">
    ##FILTER=<ID=multiallelic,Description="Site filtered because too many alt alleles pass tumor LOD">
    ##FILTER=<ID=n_ratio,Description="Ratio of N to alt exceeds specified ratio">
    ##FILTER=<ID=normal_artifact,Description="artifact_in_normal">
    ##FILTER=<ID=numt_chimera,Description="NuMT variant with too many ALT reads originally from autosome">
    ##FILTER=<ID=numt_novel,Description="Alt depth is below expected coverage of NuMT in autosome">
    ##FILTER=<ID=orientation,Description="orientation bias detected by the orientation bias mixture model">
    ##FILTER=<ID=panel_of_normals,Description="Blacklisted site in panel of normals">
    ##FILTER=<ID=position,Description="median distance of alt variants from end of reads">
    ##FILTER=<ID=slippage,Description="Site filtered due to contraction of short tandem repeat region">
    ##FILTER=<ID=strand_bias,Description="Evidence for alt allele comes from one read direction only">
    ##FILTER=<ID=strict_strand,Description="Evidence for alt allele is not represented in both directions">
    ##FILTER=<ID=weak_evidence,Description="Mutation does not meet likelihood threshold">
    ##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
    ##FORMAT=<ID=AF,Number=A,Type=Float,Description="Allele fractions of alternate alleles in the tumor">
    ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
    ##FORMAT=<ID=F1R2,Number=R,Type=Integer,Description="Count of reads in F1R2 pair orientation supporting each allele">
    ##FORMAT=<ID=F2R1,Number=R,Type=Integer,Description="Count of reads in F2R1 pair orientation supporting each allele">
    ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
    ##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">
    ##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
    ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
    ##FORMAT=<ID=PS,Number=1,Type=Integer,Description="Phasing set (typically the position of the first variant in the set)">
    ##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
    ##GATKCommandLine=<ID=FilterAlignmentArtifacts,CommandLine="FilterAlignmentArtifacts --output gatk_mutect2/S018_1/S018_1.m2_twicefilt.vcf.gz --bwa-mem-index-image /workplace/public/database/gatk_db_hg38/Homo_sapiens_assembly38.fasta.img --variant gatk_mutect2/S018_1/S018_1.m2_oncefilt.vcf.gz --input gatk_mutect2/S018_1/S018_1.m2.sort.bam --indel-start-tolerance 5 --fragment-size 1000 --max-failed-realignments 3 --sufficient-good-realignments 2 --dont-skip-filtered-variants false --dont-use-mates false --max-reasonable-fragment-length 100000 --min-aligner-score-difference 20 --min-mismatch-ratio 2.5 --num-regular-contigs 25 --minimum-seed-length 14 --drop-ratio 0.2 --seed-split-factor 0.5 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --max-read-length 2147483647 --min-read-length 30 --minimum-mapping-quality 20",Version="4.1.2.0",Date="July 11, 2019 2:09:32 PM CST">
    ##GATKCommandLine=<ID=FilterMutectCalls,CommandLine="FilterMutectCalls --output gatk_mutect2/S018_1/S018_1.m2_oncefilt.vcf.gz --stats gatk_mutect2/S018_1/S018_1.m2.vcf.gz.stats --filtering-stats gatk_mutect2/S018_1/S018_1.m2_oncefilt.stats --tumor-segmentation gatk_mutect2/S018_1/S018_1.segments.table --orientation-bias-artifact-priors gatk_mutect2/S018_1/S018_1.artifact-priors.tar.gz --variant gatk_mutect2/S018_1/S018_1.m2.vcf.gz --reference /workplace/public/database/Human_hg19/genome.fa --threshold-strategy OPTIMAL_F_SCORE --f-score-beta 1.0 --false-discovery-rate 0.05 --initial-threshold 0.1 --mitochondria-mode false --max-events-in-region 2 --max-alt-allele-count 1 --unique-alt-read-count 0 --min-median-mapping-quality 30 --min-median-base-quality 20 --max-median-fragment-length-difference 10000 --min-median-read-position 1 --max-n-ratio Infinity --min-reads-per-strand 0 --autosomal-coverage 0.0 --max-numt-fraction 0.85 --min-allele-fraction 0.0 --contamination-estimate 0.0 --log-snv-prior -13.815510557964275 --log-indel-prior -16.11809565095832 --log-artifact-prior -2.302585092994046 --normal-p-value-threshold 0.001 --min-slippage-length 8 --pcr-slippage-rate 0.1 --distance-on-haplotype 100 --long-indel-length 5 --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays 
--disable-tool-default-read-filters false",Version="4.1.2.0",Date="July 11, 2019 2:03:38 PM CST">
    ##GATKCommandLine=<ID=Mutect2,CommandLine="Mutect2 --f1r2-tar-gz gatk_mutect2/S018_1/chr12.f1r2.tar.gz --tumor-sample S018_1 --normal-sample S018_2 --panel-of-normals gatk_mutect2/pon.vcf.gz --germline-resource af-only-bychr/chr12.hg19.vcf.gz --bam-output gatk_mutect2/S018_1/chr12.m2.bam --output gatk_mutect2/S018_1/chr12.m2.vcf.gz --intervals call_region/chr12.region.bed --input gatk_bqsr/S018_1.bqsr.bam --input gatk_bqsr/S018_2.bqsr.bam --reference /workplace/public/database/Human_hg19/genome.fa --disable-read-filter MateOnSameContigOrNoMappedMateReadFilter --f1r2-median-mq 50 --f1r2-min-bq 20 --f1r2-max-depth 200 --genotype-pon-sites false --genotype-germline-sites false --af-of-alleles-not-in-resource -1.0 --mitochondria-mode false --tumor-lod-to-emit 3.0 --initial-tumor-lod 2.0 --pcr-snv-qual 40 --pcr-indel-qual 40 --max-population-af 0.01 --downsampling-stride 1 --callable-depth 10 --max-suspicious-reads-per-alignment-start 0 --normal-lod 2.2 --ignore-itr-artifacts false --gvcf-lod-band -2.5 --gvcf-lod-band -2.0 --gvcf-lod-band -1.5 --gvcf-lod-band -1.0 --gvcf-lod-band -0.5 --gvcf-lod-band 0.0 --gvcf-lod-band 0.5 --gvcf-lod-band 1.0 --minimum-allele-fraction 0.0 --genotype-filtered-alleles false --disable-adaptive-pruning false --dont-trim-active-regions false --max-disc-ar-extension 25 --max-gga-ar-extension 300 --padding-around-indels 150 --padding-around-snps 20 --kmer-size 10 --kmer-size 25 --dont-increase-kmer-sizes-for-cycles false --allow-non-unique-kmers-in-ref false --num-pruning-samples 1 --min-dangling-branch-length 4 --recover-all-dangling-branches false --max-num-haplotypes-in-population 128 --min-pruning 2 --adaptive-pruning-initial-error-rate 0.001 --pruning-lod-threshold 2.302585092994046 --max-unpruned-variants 100 --debug-assembly false --debug-graph-transformations false --capture-assembly-failure-bam false --error-correct-reads false --kmer-length-for-read-error-correction 25 --min-observations-for-kmer-to-be-solid 20 
--likelihood-calculation-engine PairHMM --base-quality-score-threshold 18 --pair-hmm-gap-continuation-penalty 10 --pair-hmm-implementation FASTEST_AVAILABLE --pcr-indel-model CONSERVATIVE --phred-scaled-global-read-mismapping-rate 45 --native-pair-hmm-threads 4 --native-pair-hmm-use-double-precision false --bam-writer-type CALLED_HAPLOTYPES --dont-use-soft-clipped-bases false --min-base-quality-score 10 --smith-waterman JAVA --emit-ref-confidence NONE --max-mnp-distance 1 --min-assembly-region-size 50 --max-assembly-region-size 300 --assembly-region-padding 100 --max-reads-per-alignment-start 50 --active-probability-threshold 0.002 --max-prob-propagation-distance 50 --force-active false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false --max-read-length 2147483647 --min-read-length 30 --minimum-mapping-quality 20 --disable-tool-default-annotations false --enable-all-annotations false",Version="4.1.2.0",Date="July 10, 2019 3:36:01 PM CST">
    ##INFO=<ID=CONTQ,Number=1,Type=Float,Description="Phred-scaled qualities that alt allele are not due to contamination">
    ##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
    ##INFO=<ID=ECNT,Number=1,Type=Integer,Description="Number of events in this haplotype">
    ##INFO=<ID=GERMQ,Number=1,Type=Integer,Description="Phred-scaled quality that alt alleles are not germline variants">
    ##INFO=<ID=MBQ,Number=R,Type=Integer,Description="median base quality">
    ##INFO=<ID=MFRL,Number=R,Type=Integer,Description="median fragment length">
    ##INFO=<ID=MMQ,Number=R,Type=Integer,Description="median mapping quality">
    ##INFO=<ID=MPOS,Number=A,Type=Integer,Description="median distance from end of read">
    ##INFO=<ID=NALOD,Number=A,Type=Float,Description="Negative log 10 odds of artifact in normal with same allele fraction as tumor">
    ##INFO=<ID=NCount,Number=1,Type=Integer,Description="Count of N bases in the pileup">
    ##INFO=<ID=NLOD,Number=A,Type=Float,Description="Normal log 10 likelihood ratio of diploid het or hom alt genotypes">
    ##INFO=<ID=OCM,Number=1,Type=Integer,Description="Number of alt reads whose original alignment doesn't match the current contig.">
    ##INFO=<ID=PON,Number=0,Type=Flag,Description="site found in panel of normals">
    ##INFO=<ID=POPAF,Number=A,Type=Float,Description="negative log 10 population allele frequencies of alt alleles">
    ##INFO=<ID=RCNTS,Number=2,Type=Integer,Description="Number of reads passing and failing realignment.">
    ##INFO=<ID=ROQ,Number=1,Type=Float,Description="Phred-scaled qualities that alt allele are not due to read orientation artifact">
    ##INFO=<ID=RPA,Number=.,Type=Integer,Description="Number of times tandem repeat unit is repeated, for each allele (including reference)">
    ##INFO=<ID=RU,Number=1,Type=String,Description="Tandem repeat unit (bases)">
    ##INFO=<ID=SEQQ,Number=1,Type=Integer,Description="Phred-scaled quality that alt alleles are not sequencing errors">
    ##INFO=<ID=STR,Number=0,Type=Flag,Description="Variant is a short tandem repeat">
    ##INFO=<ID=STRANDQ,Number=1,Type=Integer,Description="Phred-scaled quality of strand bias artifact">
    ##INFO=<ID=STRQ,Number=1,Type=Integer,Description="Phred-scaled quality that alt alleles in STRs are not polymerase slippage errors">
    ##INFO=<ID=TLOD,Number=A,Type=Float,Description="Log 10 likelihood ratio score of variant existing versus not existing">
    ##INFO=<ID=UNIQ_ALT_READ_COUNT,Number=1,Type=Integer,Description="Number of ALT reads with unique start and mate end positions at a variant site">
    ##MutectVersion=2.2
    ##contig=<ID=chr1,length=249250621>
    ##contig=<ID=chr2,length=243199373>
    ##contig=<ID=chr3,length=198022430>
    ##contig=<ID=chr4,length=191154276>
    ##contig=<ID=chr5,length=180915260>
    ##contig=<ID=chr6,length=171115067>
    ##contig=<ID=chr7,length=159138663>
    ##contig=<ID=chr8,length=146364022>
    ##contig=<ID=chr9,length=141213431>
    ##contig=<ID=chr10,length=135534747>
    ##contig=<ID=chr11,length=135006516>
    ##contig=<ID=chr12,length=133851895>
    ##contig=<ID=chr13,length=115169878>
    ##contig=<ID=chr14,length=107349540>
    ##contig=<ID=chr15,length=102531392>
    ##contig=<ID=chr16,length=90354753>
    ##contig=<ID=chr17,length=81195210>
    ##contig=<ID=chr18,length=78077248>
    ##contig=<ID=chr19,length=59128983>
    ##contig=<ID=chr20,length=63025520>
    ##contig=<ID=chr21,length=48129895>
    ##contig=<ID=chr22,length=51304566>
    ##contig=<ID=chrX,length=155270560>
    ##contig=<ID=chrY,length=59373566>
    ##contig=<ID=chrM,length=16571>
    ##filtering_status=These calls have been filtered by FilterMutectCalls to label false positives with a list of failed filters and true positives with PASS.
    ##normal_sample=S018_2
    ##source=FilterAlignmentArtifacts
    ##source=FilterMutectCalls
    ##source=Mutect2
    ##tumor_sample=S018_1
    #CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S018_1 S018_2
    chr1 151555414 . GTT G,GT,GTTT,GTTTT . multiallelic;normal_artifact CONTQ=93;DP=3079;ECNT=1;GERMQ=93;MBQ=20,20,20,20,20;MFRL=180,173,175,182,182;MMQ=60,60,60,60,60;MPOS=45,48,39,32;NALOD=-1.103e+01,-5.996e+01,-1.028e+02,-6.334e+00;NLOD=306.72,129.39,19.65,255.58;POPAF=2.77,6.00,6.00,1.64;RCNTS=0,0;ROQ=93;RPA=13,11,12,14,15;RU=T;SEQQ=93;STR;STRANDQ=93;STRQ=93;TLOD=25.65,69.49,23.03,14.14 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2/3/4:759,66,201,90,28:0.043,0.137,0.046,0.020:1144:273,23,81,29,10:364,37,100,39,5:330,429,154,231 0/0:946,25,139,196,41:0.012,0.072,0.122,0.015:1347:293,8,55,42,9:362,14,61,80,21:346,600,128,273
    chr1 162750245 . GA G . normal_artifact;slippage;weak_evidence CONTQ=93;DP=2097;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=187,179;MMQ=60,60;MPOS=41;NALOD=-1.278e+01;NLOD=198.01;POPAF=2.12;RCNTS=0,0;ROQ=93;RPA=10,9;RU=A;SEQQ=47;STR;STRANDQ=93;STRQ=1;TLOD=13.64 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:722,75:0.045:797:181,36:322,22:290,432,35,40 0/0:948,48:0.025:996:268,17:299,21:323,625,17,31
    chr3 178952085 . A G . PASS CONTQ=93;DP=6044;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=177,160;MMQ=60,60;MPOS=38;NALOD=3.04;NLOD=1333.19;POPAF=6.00;RCNTS=2,0;ROQ=93;SEQQ=93;STRANDQ=93;TLOD=457.34 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:1169,282:0.193:1451:663,145:501,135:598,571,158,124 0/0:4471,1:4.059e-04:4472:2118,0:2222,1:2333,2138,0,1
    chr5 1295227 . A G . clustered_events;normal_artifact;orientation;strand_bias;weak_evidence CONTQ=93;DP=1529;ECNT=4;GERMQ=93;MBQ=32,23;MFRL=183,170;MMQ=60,60;MPOS=30;NALOD=2.58;NLOD=235.64;POPAF=6.00;RCNTS=0,0;ROQ=1;SEQQ=58;STRANDQ=1;TLOD=10.32 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:683,10:0.014:693:240,0:436,7:641,42,0,10 0/0:808,3:2.421e-03:811:318,0:465,3:774,34,2,1
    chr5 1295234 . T G . clustered_events;normal_artifact;orientation;strand_bias;weak_evidence CONTQ=93;DP=1563;ECNT=4;GERMQ=93;MBQ=33,25;MFRL=182,172;MMQ=60,60;MPOS=20;NALOD=-8.529e+00;NLOD=219.73;POPAF=6.00;RCNTS=0,0;ROQ=1;SEQQ=61;STRANDQ=1;TLOD=10.70 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:672,13:0.017:685:227,0:438,7:641,31,0,13 0/0:830,9:0.011:839:323,1:493,6:797,33,0,9
    chr5 1295244 . A C . clustered_events;normal_artifact;orientation;strand_bias CONTQ=93;DP=1613;ECNT=4;GERMQ=93;MBQ=32,27;MFRL=180,175;MMQ=60,60;MPOS=46;NALOD=-1.587e+01;NLOD=203.79;POPAF=6.00;RCNTS=0,0;ROQ=1;SEQQ=93;STRANDQ=1;TLOD=32.94 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:666,31:0.043:697:212,0:410,31:637,29,12,19 0/0:865,24:0.020:889:315,0:444,21:811,54,5,19
    chr5 1295247 . C G . clustered_events;normal_artifact;orientation;strand_bias;weak_evidence CONTQ=93;DP=1642;ECNT=4;GERMQ=93;MBQ=31,27;MFRL=180,174;MMQ=60,60;MPOS=17;NALOD=-6.396e+00;NLOD=239.34;POPAF=6.00;RCNTS=0,0;ROQ=1;SEQQ=34;STRANDQ=1;TLOD=7.85 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:700,9:0.013:709:229,0:454,6:656,44,0,9 0/0:890,11:9.653e-03:901:349,1:501,9:823,67,0,11
    chr7 55273591 . GAAA G,GA,GAA,GAAAA,GAAAAA . germline;multiallelic;normal_artifact CONTQ=93;DP=3891;ECNT=1;GERMQ=1;MBQ=20,20,20,20,20,20;MFRL=178,169,171,179,180,179;MMQ=60,60,60,60,60,60;MPOS=18,33,34,35,31;NALOD=1.27,-1.382e+01,-1.052e+02,-5.625e+02,-3.495e+01;NLOD=626.39,518.64,226.15,-6.580e+02,120.48;POPAF=3.40,1.74,6.00,2.43,3.91;RCNTS=0,0;ROQ=93;RPA=13,10,11,12,14,15;RU=A;SEQQ=93;STR;STRANDQ=93;STRQ=93;TLOD=8.24,43.42,158.51,173.61,12.47 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2/3/4/5:451,16,71,225,253,39:0.012,0.057,0.207,0.233,0.026:1055:170,10,40,85,98,18:137,6,25,83,82,8:211,240,283,321 0/0:1075,3,28,199,767,125:1.051e-03,9.065e-03,0.070,0.356,0.034:2197:376,0,13,76,280,56:331,3,12,69,249,45:529,546,573,549
    chr13 32906589 . C G . strand_bias CONTQ=93;DP=5084;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=180,165;MMQ=60,60;MPOS=51;NALOD=3.35;NLOD=1320.04;POPAF=6.00;RCNTS=0,0;ROQ=93;SEQQ=93;STRANDQ=83;TLOD=25.30 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:516,24:0.044:540:253,14:261,10:316,200,13,11 0/0:4409,1:2.245e-04:4410:2076,0:2257,0:2303,2106,0,1
    chr13 32973279 . GA G,GAA . multiallelic;normal_artifact;slippage CONTQ=93;DP=3408;ECNT=1;GERMQ=93;MBQ=20,20,20;MFRL=201,188,209;MMQ=60,60,60;MPOS=34,30;NALOD=-2.149e+01,-5.110e+01;NLOD=461.90,381.51;POPAF=6.00,2.61;RCNTS=0,0;ROQ=93;RPA=10,9,11;RU=A;SEQQ=93;STR;STRANDQ=93;STRQ=1;TLOD=37.95,14.45 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2:699,81,29:0.070,0.031:809:249,49,5:294,31,10:258,441,30,80 0/0:1966,97,127:0.019,0.035:2190:678,46,41:776,41,44:1040,926,124,100
    chr17 7572154 . GAAAAA G,GA,GAA,GAAA,GAAAA,GAAAAAA . germline;multiallelic;normal_artifact CONTQ=93;DP=779;ECNT=1;GERMQ=1;MBQ=20,20,20,20,20,20,20;MFRL=177,171,171,170,170,170,176;MMQ=60,60,60,60,60,60,60;MPOS=20,28,29,26,23,13;NALOD=-9.824e-01,-4.477e-01,-4.832e+00,-1.253e+01,-3.391e+01,-1.387e+00;NLOD=27.76,8.11,-2.435e+01,-5.311e+01,-6.426e+01,23.20;POPAF=2.00,0.264,6.00,6.00,6.00,6.00;RCNTS=0,0;ROQ=93;RPA=18,13,14,15,16,17,19;RU=A;SEQQ=93;STR;STRANDQ=93;STRQ=93;TLOD=5.65,4.29,9.34,55.23,52.75,4.42 GT:AD:AF:DP:F1R2:F2R1:SB 0/1/2/3/4/5/6:118,19,19,34,102,143,26:0.035,0.034,0.060,0.226,0.318,0.036:461:63,11,8,24,62,60,12:33,5,8,10,37,76,11:36,82,135,208 0/0:46,3,4,14,34,57,21:0.018,0.018,0.065,0.168,0.339,0.080:179:31,3,3,11,18,37,10:11,0,1,1,11,14,7:28,18,70,63
    chr17 7578508 . CAGGTCTTGG C . PASS CONTQ=93;DP=5420;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=175,177;MMQ=60,60;MPOS=33;NALOD=3.25;NLOD=1058.57;POPAF=6.00;RCNTS=2,0;ROQ=93;SEQQ=93;STRANDQ=93;TLOD=131.27 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:1623,53:0.032:1676:908,35:613,18:787,836,26,27 0/0:3526,0:2.790e-04:3526:1460,0:1839,0:1721,1805,0,0
    chr17 37856412 . A C . base_qual;normal_artifact;orientation;strand_bias CONTQ=93;DP=3600;ECNT=2;GERMQ=93;MBQ=20,19;MFRL=183,202;MMQ=60,60;MPOS=28;NALOD=-6.595e+01;NLOD=369.81;POPAF=6.00;RCNTS=0,0;ROQ=11;SEQQ=93;STRANDQ=1;TLOD=61.18 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:1323,93:0.045:1416:550,31:457,8:561,762,92,1 0/0:1952,115:0.035:2067:738,41:811,23:631,1321,112,3
    chr17 37856429 . G C . normal_artifact;orientation;strand_bias CONTQ=93;DP=3712;ECNT=2;GERMQ=93;MBQ=20,21;MFRL=183,185;MMQ=60,60;MPOS=17;NALOD=-1.650e+01;NLOD=530.57;POPAF=6.00;RCNTS=0,0;ROQ=1;SEQQ=93;STRANDQ=1;TLOD=14.62 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:1467,29:0.015:1496:615,18:575,0:660,807,29,0 0/0:2056,35:0.010:2091:808,19:915,4:711,1345,34,1
    chr17 37868591 . C G . normal_artifact;orientation;strand_bias CONTQ=93;DP=3932;ECNT=1;GERMQ=93;MBQ=20,22;MFRL=174,166;MMQ=60,60;MPOS=22;NALOD=1.80;NLOD=618.06;POPAF=6.00;RCNTS=0,0;ROQ=1;SEQQ=93;STRANDQ=1;TLOD=15.39 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:1612,30:0.016:1642:699,27:690,0:699,913,30,0 0/0:2154,10:2.353e-03:2164:861,8:1047,0:878,1276,7,3
    chr17 41196810 . G T . normal_artifact;strand_bias;weak_evidence CONTQ=93;DP=1825;ECNT=1;GERMQ=93;MBQ=20,20;MFRL=180,211;MMQ=60,60;MPOS=49;NALOD=0.536;NLOD=339.86;POPAF=6.00;RCNTS=0,0;ROQ=91;SEQQ=1;STRANDQ=1;TLOD=3.62 GT:AD:AF:DP:F1R2:F2R1:SB 0/1:432,9:0.019:441:174,3:149,5:228,204,0,9 0/0:1256,22:6.610e-03:1278:538,2:514,8:586,670,1,21
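For reading the SB values in records like the ones above, here is a minimal Python sketch. It assumes that SB lists, per sample, the read counts in the order ref-forward, ref-reverse, alt-forward, alt-reverse; the helper name `alt_strand_fractions` is hypothetical. It only tabulates the counts and does not reproduce the model behind STRANDQ or the strand_bias filter.

```python
def alt_strand_fractions(sb_field):
    """Split an SB value into counts and return the alt forward/reverse fractions.

    Assumed order: ref-forward, ref-reverse, alt-forward, alt-reverse.
    """
    ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in sb_field.split(","))
    alt_total = alt_fwd + alt_rev
    if alt_total == 0:
        return None  # no alt reads, so there is no strand split to report
    return alt_fwd / alt_total, alt_rev / alt_total


# Tumor SB values copied from two records in the listing above.
for site, sb in [("chr13:32906589", "316,200,13,11"),
                 ("chr17:37868591", "699,913,30,0")]:
    fwd, rev = alt_strand_fractions(sb)
    print(f"{site}\talt forward={fwd:.2f}\talt reverse={rev:.2f}")
```

Under that assumed order, the alt reads of the chr13 record split 13 forward to 11 reverse, while all 30 alt reads at chr17:37868591 fall on one strand.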

  • 2904359495 Member

    Q1:

    6) FilterAlignmentArtifacts
    gatk FilterAlignmentArtifacts -V gatk_mutect2/S018_1/S018_1.m2_oncefilt.vcf.gz -I gatk_mutect2/S018_1/S018_1.m2.sort.bam --bwa-mem-index-image /workplace/public/database/ensembl_grch37_82/homo_sapiens_GRCh37.fa.img -O gatk_mutect2/S018_1/S018_1.m2_twicefilt.vcf.gz

    Hello @ahda,
    I want to ask whether FilterAlignmentArtifacts is helpful and, if so, where you downloaded hg38.index_image.


    --germline-resource af-only-bychr/chr13.hg19.vcf.gz

    Where did you download this file? GATK only supplies b37 and hg38, and we seem to have no access to the FTP site.


    Q2: Can we adjust the FDR value? Here it is 0.05; would 0.01 give more reliable results? @davidben

    Thanks a lot.

  • ahda China Member

    @2904359495
    1) Where to download the gnomAD files?
    I got the germline-resource and variants-for-contamination files from these links and converted them to the hg19 version with Picard LiftoverVcf:
    ftp://[email protected]/bundle/Mutect2/af-only-gnomad.raw.sites.b37.vcf.gz
    ftp://[email protected]/Liftover_Chain_Files/b37tohg19.chain

    2) Where to download hg38.index_image?
    I generated this file with BwaMemIndexImageCreator, following this link:
    https://software.broadinstitute.org/gatk/documentation/tooldocs/4.0.0.0/org_broadinstitute_hellbender_tools_BwaMemIndexImageCreator.php
    3) Is FilterAlignmentArtifacts helpful?
    I don't know. To be honest, I ran into a problem with the FILTER flag "alignment" and I'm going to ask the GATK team, but I think I'd better ask that as a separate question.

  • ahda China Member

    @davidben Thank you for the answer. This question is clear to me now.

  • 2904359495 Member

    @ahda, thanks a lot.
    How did you get into the FTP site, and do you use a PC or a Mac? Thanks.

  • ahda China Member

    @davidben This is my new question about FilterAlignmentArtifacts. The GATK Mutect2 workflow is the same, so I am linking it here:
    https://gatkforums.broadinstitute.org/gatk/discussion/24365/a-problem-about-filter-flag-alignment-filtered-some-tp-variant-sanger-varified/p1?new=1

    @2904359495 I'm not sure I understand your question. Maybe this is what you need:
    https://software.broadinstitute.org/gatk/download/bundle

  • 2904359495 Member

    I know this, but the FTP site needs a password.

  • 2904359495 Member

    @ahda Have you run into this? How do you access the FTP site: through a browser or through FTP software?

  • ahda China Member

    I use FileZilla, like this:

  • 2904359495 Member

    @ahda, thanks a lot. It finally works, but sometimes this happens:

  • ahda China Member

    You're welcome. I have run into the problem you mention; just keep trying a few more times.

  • 2904359495 Member

    OK, thanks a lot.

  • davidben Boston Member, Broadie, Dev ✭✭✭

    Can we adjust the FDR value? Here it is 0.05; would 0.01 give more reliable results?

    @ahda We recommend adjusting the --f-score-beta parameter to tune sensitivity vs. precision.

  • 2904359495 Member

    @davidben, thanks a lot.
    It was me who asked the --f-score-beta question.
    So is --f-score-beta 1.0 the value that best balances sensitivity and precision? To increase sensitivity or precision, how should I change that value, and is there a valid range?
    Thanks a lot.

  • 2904359495 Member

    @davidben, can you transfer the data on the FTP site to Google Drive? The FTP site is neither stable nor fast. Thanks a lot.

  • davidben Boston Member, Broadie, Dev ✭✭✭

    Can you transfer the data on the FTP site to Google Drive? The FTP site is neither stable nor fast. Thanks a lot.

    @2904359495 Our best practices resources are best accessed from their Google bucket, for example gs://gatk-best-practices/somatic-b37/

  • davidben Boston Member, Broadie, Dev ✭✭✭

    So is --f-score-beta 1.0 the value that best balances sensitivity and precision? To increase sensitivity or precision, how should I change that value, and is there a valid range?

    @2904359495 Our advice is almost always to use the defaults, but if you want to try to do better you can consult https://en.wikipedia.org/wiki/F1_score
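As a rough illustration of the F-score linked above, here is a minimal Python sketch of the general F-beta formula, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), where P is precision and R is recall (sensitivity). With beta = 1 both are weighted equally; beta > 1 weights recall more heavily and beta < 1 weights precision more heavily. The function name `f_beta` and the example numbers are hypothetical, and the code is not taken from FilterMutectCalls; it only shows how beta shifts the balance.

```python
def f_beta(precision, recall, beta=1.0):
    """General F-beta score: weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid dividing by zero when both inputs are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical filtering outcomes: A is more precise, B is more sensitive.
a = {"precision": 0.95, "recall": 0.80}
b = {"precision": 0.80, "recall": 0.95}

for beta in (0.5, 1.0, 2.0):
    print(beta, round(f_beta(beta=beta, **a), 4), round(f_beta(beta=beta, **b), 4))
```

With these made-up numbers, a beta above 1 scores the more sensitive outcome higher and a beta below 1 scores the more precise outcome higher, while beta = 1 scores them equally.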

  • 2904359495 Member

    Thanks a lot, @davidben.
    But I always end up at this link: https://console.cloud.google.com/storage/browser/genomics-public-data/?pli=1, which only has hg38 resources. Can you check with the uploader?

  • davidben Boston Member, Broadie, Dev ✭✭✭

    gs://gatk-best-practices/ contains b37 and hg38 resources.

  • 2904359495 Member

    Would it be convenient for you or a colleague to convert the b37 af-only-gnomad.raw.sites.vcf to hg19 and make a contamination VCF? Thanks a lot.
