GATK4 VariantAnnotator, De Novo, -ped : workshop 1809

manolis Member ✭✭
edited November 2018 in Ask the GATK team

GATK 4.0.11.0, Linux server

Hi,

1)

Following the workshop 1809 materials, I ran:

/share/apps/bio/gatk-4.0.11.0/gatk --java-options "-Xmx5g" VariantAnnotator -R /shared/resources/hgRef/hg38/Homo_sapiens_assembly38.fasta \
-V  prova_CGP_GQ.vcf -O prova_CGP_GQ_DeNovo.vcf -A PossibleDeNovo

and I got this warning:

12:05:24.122 WARN PossibleDeNovo - Annotation will not be calculated, must provide a valid PED file (-ped) from the command line.

I also ran: grep -i "novo" prova_CGP_GQ_DeNovo.vcf
and found this word only in the header.

Then I added the PED file:

/share/apps/bio/gatk-4.0.11.0/gatk --java-options "-Xmx5g" VariantAnnotator -R /shared/resources/hgRef/hg38/Homo_sapiens_assembly38.fasta \
-V prova_CGP_GQ.vcf -O prova_CGP_GQ_DeNovo.vcf -A PossibleDeNovo -ped /home/manolis/GATK4/PED/12-931_fam.ped

and I got this error:

A USER ERROR has occurred: Pedigree argument "pedigree" or "founder-id" was specified without a pedigree annotation being requested, (eg: ))

My .ped file is fine, because I used it upstream with CalculateGenotypePosteriors:

/share/apps/bio/gatk-4.0.11.0/gatk --java-options "-Xmx5g" CalculateGenotypePosteriors -R /shared/resources/hgRef/hg38/Homo_sapiens_assembly38.fasta \
-V prova.vcf -O prova_CGP.vcf -ped /home/manolis/GATK4/PED/12-931_fam.ped --skip-population-priors

With GATK3.8 I do not have any problem:

java "-Xmx5g" -jar /share/apps/bio/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef/GenomeAnalysisTK.jar -T VariantAnnotator \
-R /shared/resources/hgRef/hg38/Homo_sapiens_assembly38.fasta \
-V prova_CGP_GQ.vcf -o prova_CGP_GQ_DeNovo.vcf -A PossibleDeNovo -ped /home/manolis/GATK4/PED/12-931_fam.ped

Final message:

------------------------------------------------------------------------------------------
Done. There were no warn messages.
------------------------------------------------------------------------------------------

2)

Upstream, when I create the trio VCF file (prova.vcf) for a downstream de novo analysis, should I use the "-exclude-non-variants" option in SelectVariants or not?

/share/apps/bio/gatk-4.0.11.0/gatk --java-options "-Xmx10g" SelectVariants -R /shared/resources/hgRef/hg38/Homo_sapiens_assembly38.fasta \
-V /home/manolis/GATK4/3.WES_Illumina/germSNV/4.VCF/storage/raw_filtered_by_VQSR_by_on_target_regions.vcf -O prova.vcf -exclude-non-variants -sn "12-928" -sn "12-929" -sn "12-931"

All the best


Answers

  • manolis Member ✭✭

    Full warning messages:

    Using GATK jar /share/apps/bio/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx5g -jar /share/apps/bio/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar VariantAnnotator -R /shared/resources/hgRef/hg38/Homo_sapiens_assembly38.fasta -V prova_CGP_GQ.vcf -O prova_CGP_GQ_DeNovo.vcf -A PossibleDeNovo
    12:54:50.549 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/share/apps/bio/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
    12:54:52.275 INFO  VariantAnnotator - ------------------------------------------------------------
    12:54:52.276 INFO  VariantAnnotator - The Genome Analysis Toolkit (GATK) v4.0.11.0
    12:54:52.276 INFO  VariantAnnotator - For support and documentation go to https://software.broadinstitute.org/gatk/
    12:54:52.276 INFO  VariantAnnotator - Executing as [email protected] on Linux v4.4.0-138-generic amd64
    12:54:52.277 INFO  VariantAnnotator - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_121-b15
    12:54:52.277 INFO  VariantAnnotator - Start Date/Time: November 14, 2018 12:54:50 PM CET
    12:54:52.277 INFO  VariantAnnotator - ------------------------------------------------------------
    12:54:52.277 INFO  VariantAnnotator - ------------------------------------------------------------
    12:54:52.278 INFO  VariantAnnotator - HTSJDK Version: 2.16.1
    12:54:52.278 INFO  VariantAnnotator - Picard Version: 2.18.13
    12:54:52.278 INFO  VariantAnnotator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
    12:54:52.278 INFO  VariantAnnotator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
    12:54:52.278 INFO  VariantAnnotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
    12:54:52.278 INFO  VariantAnnotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
    12:54:52.278 INFO  VariantAnnotator - Deflater: IntelDeflater
    12:54:52.278 INFO  VariantAnnotator - Inflater: IntelInflater
    12:54:52.278 INFO  VariantAnnotator - GCS max retries/reopens: 20
    12:54:52.279 INFO  VariantAnnotator - Requester pays: disabled
    12:54:52.279 WARN  VariantAnnotator - 
    
       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
       Warning: VariantAnnotator is a BETA tool and is not yet ready for use in production
    
       !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    
    
    12:54:52.279 INFO  VariantAnnotator - Initializing engine
    12:54:52.814 INFO  FeatureManager - Using codec VCFCodec to read file file:///home/manolis/prove/denovo/prova_CGP_GQ.vcf
    12:54:53.092 INFO  VariantAnnotator - Done initializing engine
    12:54:53.211 INFO  ProgressMeter - Starting traversal
    12:54:53.211 INFO  ProgressMeter -        Current Locus  Elapsed Minutes    Variants Processed  Variants/Minute
    12:54:53.268 WARN  PossibleDeNovo - Annotation will not be calculated, must provide a valid PED file (-ped) from the command line.
    12:55:03.216 INFO  ProgressMeter -       chr10:85555593              0.2                 53000         317872.9
    12:55:10.087 INFO  VariantAnnotator - No variants filtered by: AllowAllVariantsVariantFilter
    12:55:10.088 INFO  ProgressMeter -       chrX:141905856              0.3                108387         385353.2
    12:55:10.088 INFO  ProgressMeter - Traversal complete. Processed 108387 total variants in 0.3 minutes.
    12:55:10.731 INFO  VariantAnnotator - Shutting down engine
    [November 14, 2018 12:55:10 PM CET] org.broadinstitute.hellbender.tools.walkers.annotator.VariantAnnotator done. Elapsed time: 0.34 minutes.
    Runtime.totalMemory()=2138046464
    
    
    
    Running:
        java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx5g -jar /share/apps/bio/gatk-4.0.11.0/gatk-package-4.0.11.0-local.jar VariantAnnotator -R /shared/resources/hgRef/hg38/Homo_sapiens_assembly38.fasta -V prova_CGP_GQ.vcf -O prova_CGP_GQ_DeNovo.vcf -A PossibleDeNovo -ped /home/manolis/GATK4/PED/12-931_fam.ped
    
    
    **BETA FEATURE - WORK IN PROGRESS**
    
    USAGE: VariantAnnotator [arguments]
    
    Tool for adding annotations to VCF files
    Version:4.0.11.0
    
    
    Required Arguments:
    
    --output,-O:File              The file to whcih variants should be written  Required. 
    
    --variant,-V:String           A VCF file containing variants  Required. 
    ...
    ...
    ...
    --showHidden,-showHidden:Boolean
                                  display hidden arguments  Default value: false. Possible values: {true, false} 
    
    
    ***********************************************************************
    
    A USER ERROR has occurred: Pedigree argument "pedigree" or "founder-id" was specified without a pedigree annotation being requested, (eg: ))
    
    ***********************************************************************
    Set the system property GATK_STACKTRACE_ON_USER_EXCEPTION (--java-options '-DGATK_STACKTRACE_ON_USER_EXCEPTION=true') to print the stack trace.
    
  • manolis Member ✭✭

    Hi @shlee, many thanks. It would be nice to have all the tools in the same package (GATK4), but GATK3 also works very well. Thanks for the HAIL info!

  • > @manolis said:
    > Hi @shlee, many thanks. It would be nice to have all the tools in the same package (GATK4), but GATK3 also works very well. Thanks for the HAIL info!

    I tried GATK3.7. It doesn't show any error or warning message, but I can't see 'hiConfDeNovo' in output.vcf, as someone reported. Did you get the annotation correctly? If yes, can you post your command or pedigree file so that I can find where my problem is? Many thanks.
  • manolis Member ✭✭
    edited February 11

    Now you can also find "VariantAnnotator" in GATK v4.1.0.0.

    Do you have 'loConfDeNovo' in your output? If yes, I think there are simply no 'hiConfDeNovo' sites to call in your trio... try another trio VCF...

    Code

    # GATK38 holds the path to the GATK 3.8 jar (a dot is not valid in a shell variable name)
    java '-Xmx5g' -jar "${GATK38}" -T VariantAnnotator \
    -R ${hg38} \
    -V "${trios}.vcf" \
    -o "${trios}.DeNovo.vcf" \
    -A PossibleDeNovo \
    -ped ${pedigree}
    

    PED file

    famID   AAA BBB CCC 1   2
    famID   BBB 0   0   1   1
    famID   CCC 0   0   2   1
    

    column 1: family ID
    column 2: sample ID (AAA is the proband)
    column 3: father of the sample ID (BBB is the father of AAA)
    column 4: mother of the sample ID (CCC is the mother of AAA)
    column 5: sex (1= male; 2= female)
    column 6: phenotype (1= healthy; 2= affected)
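The six-column layout described above can be checked before the file is passed to -ped. The following is only a minimal sketch: the function names `parse_ped` and `find_trios` are made up for illustration, and the checks (six columns, both parents listed, father coded 1, mother coded 2) are assumptions drawn from the column description in this post, not GATK's own validation.

```python
from typing import Dict, List, Tuple

PED_FIELDS = ("family", "sample", "father", "mother", "sex", "phenotype")


def parse_ped(text: str) -> List[Dict[str, str]]:
    """Parse whitespace-delimited PED text into one dict per individual."""
    rows = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        if len(parts) != len(PED_FIELDS):
            raise ValueError(
                f"line {number}: expected 6 columns, found {len(parts)}"
            )
        rows.append(dict(zip(PED_FIELDS, parts)))
    return rows


def find_trios(rows: List[Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (child, father, mother) for each sample with both parents listed."""
    by_id = {row["sample"]: row for row in rows}
    trios = []
    for row in rows:
        father, mother = row["father"], row["mother"]
        if father == "0" or mother == "0":
            continue  # founder, or only one parent recorded
        if father not in by_id or mother not in by_id:
            raise ValueError(f"{row['sample']}: parent missing from PED file")
        if by_id[father]["sex"] != "1" or by_id[mother]["sex"] != "2":
            raise ValueError(f"{row['sample']}: parent sex codes look swapped")
        trios.append((row["sample"], father, mother))
    return trios


# The example PED file from this post (placeholder IDs AAA, BBB, CCC).
PED_TEXT = """famID AAA BBB CCC 1 2
famID BBB 0 0 1 1
famID CCC 0 0 2 1
"""

rows = parse_ped(PED_TEXT)
trios = find_trios(rows)
print(trios)
```

An empty list from `find_trios` would mean no complete trio is described, which is worth ruling out before looking for de novo annotations. The sample IDs must also match the sample names in the VCF header, which this sketch does not check.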

    As always, the GATK team will give you official feedback...

  • bhanuGandham Cambridge MA Member, Administrator, Broadie, Moderator admin

    Hi @JinboWuGlasgow

    Did @manolis's suggestion resolve your issue?

  • shlee Cambridge Member, Broadie ✭✭✭✭✭
    edited February 11

    Hi @JinboWuGlasgow,

    I am able to produce a hiConfDeNovo annotation for some test data using GATK3.7. Here is the command I used; as you can see, it is similar to @manolis's:

    java -jar $GATK -T VariantAnnotator \
    -R ref/ref.fasta \
    -V precomputed/trioGGVCF.vcf.gz \
    -o trioVA.denovo.vcf.gz \
    -A PossibleDeNovo \
    -ped trio.ped
    

    For the test data I am using, this produces the following newly annotated records:

    I talked to the developers today about when this annotation will be ported to GATK4. The hold-up is obtaining test data for this feature, so I've agreed to provide them with some data to build it from. You can follow the progress of this port in the PR that will be linked to https://github.com/broadinstitute/gatk/issues/4987.

    P.S. You can download the data I used at https://github.com/broadinstitute/gatk/issues/4987

  • @shlee @bhanuGandham
    Thanks a lot. I thought all sites would have 'hiConfDeNovo' or 'loConfDeNovo' in the final output, but I was wrong. When I used grep, I got all the hiConfDeNovo sites (which is what I wanted)...

    Thank you again for the reply.
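As discussed in this thread, a plain grep for "novo" also matches the header, and only some records carry a de novo tag. A small script can separate the two cases. This is a rough sketch, not a GATK utility: the function name `denovo_sites` is hypothetical, the sample lines are invented, and it assumes the tags appear in the INFO column as `hiConfDeNovo=<sample>` and `loConfDeNovo=<sample>`.

```python
from typing import Dict, Iterable, List, Tuple

DENOVO_KEYS = ("hiConfDeNovo", "loConfDeNovo")


def denovo_sites(vcf_lines: Iterable[str]) -> Dict[str, List[Tuple[str, int, str]]]:
    """Collect (chrom, pos, value) for each de novo INFO key, ignoring headers."""
    found: Dict[str, List[Tuple[str, int, str]]] = {key: [] for key in DENOVO_KEYS}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # header lines mention the keys but are not records
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        chrom, pos, info = cols[0], cols[1], cols[7]
        for entry in info.split(";"):
            key, _, value = entry.partition("=")
            if key in found:
                found[key].append((chrom, int(pos), value))
    return found


# Invented records for illustration only; real output will differ.
SAMPLE = [
    '##INFO=<ID=hiConfDeNovo,Number=1,Type=String,Description="made-up header">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\t.\tA\tG\t50\tPASS\tAC=1;hiConfDeNovo=AAA",
    "chr1\t2000\t.\tC\tT\t50\tPASS\tAC=2",
    "chr2\t3000\t.\tG\tA\t50\tPASS\tAC=1;loConfDeNovo=AAA",
]

sites = denovo_sites(SAMPLE)
for key in DENOVO_KEYS:
    print(key, len(sites[key]))
```

For a real, uncompressed VCF the same function can be fed an open file handle, for example `denovo_sites(open("trios.DeNovo.vcf"))`, where the file name is a placeholder. Zero counts for both keys alongside a header match would reproduce the situation in the original question, where the annotation was declared but never calculated.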
  • shlee Cambridge Member, Broadie ✭✭✭✭✭

    You're welcome!
