M2 and GDBI for PON: [E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at chr1:16949

manolis, Member ✭✭

GATK 4.1.1.0, local linux server

Hi,

I ran some WES normal samples:

${gatk} Mutect2 \
-R ${hg38} \
-I "${sample}.bam" \
-O "${sample}.vcf.gz" \
-L ${interval} \
-ip 5 \
--max-mnp-distance 0

and then GenomicsDBImport:

${gatk} GenomicsDBImport \
-R ${hg38} \
-V "${sample1}.vcf.gz" \
-V "${sample2}.vcf.gz" \
--batch-size 1 --reader-threads 1 \
--genomicsdb-workspace-path "GDBI_pon" \
-L chr1

Here the error:

13:18:45.329 INFO  GenomicsDBImport - Done initializing engine
13:18:45.517 INFO  GenomicsDBImport - Vid Map JSON file will be written to /home/manolis/prove/GDBI_pon/GDBI_pon/vidmap.json
13:18:45.517 INFO  GenomicsDBImport - Callset Map JSON file will be written to /home/manolis/prove/GDBI_pon/GDBI_pon/callset.json
13:18:45.517 INFO  GenomicsDBImport - Complete VCF Header will be written to /home/manolis/prove/GDBI_pon/GDBI_pon/vcfheader.vcf
13:18:45.517 INFO  GenomicsDBImport - Importing to array - /home/manolis/prove/GDBI_pon/GDBI_pon/genomicsdb_array
13:18:45.517 INFO  ProgressMeter - Starting traversal
13:18:45.517 INFO  ProgressMeter -        Current Locus  Elapsed Minutes     Batches Processed   Batches/Minute
13:18:45.820 INFO  GenomicsDBImport - Importing batch 1 with 1 samples
[E::vcf_parse_format] Invalid character '.' in 'AF' FORMAT field at chr1:14653
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fe721be816b, pid=12942, tid=0x00007fe7801f7700
#
# JRE version: OpenJDK Runtime Environment (8.0_152-b12) (build 1.8.0_152-release-1056-b12)
# Java VM: OpenJDK 64-Bit Server VM (25.152-b12 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libtiledbgenomicsdb8166440819035845683.so+0x35416b]  bcf_unpack+0x36b
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/manolis/prove/GDBI_pon/hs_err_pid12942.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.

Here the header of the vcf.gz and the variant:

##FORMAT=<ID=AF,Number=A,Type=Float,Description="Allele fractions of alternate alleles in the tumor">

chr1    14653   .   C   T   .   .   DP=13;ECNT=2;MBQ=20,30;MFRL=212,211;MMQ=43,33;MPOS=40;POPAF=7.30;TLOD=10.18 GT:AD:AF:DP:F1R2:F2R1:PGT:PID:PS:SB 0|1:9,4:0.333:13:6,2:3,1:0|1:14653_C_T:14653:5,4,3,1

Here the vcf validation:

${gatk} ValidateVariants \
-R ${hg38} \
-V "${sample1}.vcf.gz" \
-L ${interval} \
-ip 5

No warnings at all ...
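As a cross-check that does not depend on GATK or htslib, the per-sample AF values can be scanned directly for missing entries. The following is only a minimal sketch: `find_missing_af` is a hypothetical helper name, it assumes tab-separated VCF text on stdin, and it prints `CHROM:POS` plus the AF value for every sample whose FORMAT/AF is `.` or contains an empty or `.` element. A record whose AF is a plain number (such as 0.333) produces no output.

```shell
# Hypothetical helper (not part of GATK or bcftools): list records whose
# FORMAT/AF value is missing ('.') or has a missing element (e.g. '0.3,.').
# Assumes tab-separated VCF text on stdin.
find_missing_af() {
  awk -F'\t' '
    /^#/ { next }                        # skip header lines
    {
      n = split($9, keys, ":")           # column 9 holds the FORMAT keys
      af = 0
      for (i = 1; i <= n; i++) if (keys[i] == "AF") af = i
      if (af == 0) next                  # this record has no AF key
      for (s = 10; s <= NF; s++) {       # columns 10+ hold per-sample values
        split($s, vals, ":")
        m = split(vals[af], parts, ",")  # AF is Number=A: one value per ALT
        for (j = 1; j <= m; j++)
          if (parts[j] == "." || parts[j] == "") {
            print $1 ":" $2 "\t" vals[af]
            break                        # report each sample at most once
          }
      }
    }'
}
```

Since bgzip output is gzip-compatible, a compressed file could be checked with `gzip -dc "${sample1}.vcf.gz" | find_missing_af`. If this prints nothing while GenomicsDBImport still complains, the VCF text itself is probably not the cause.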

When I process the "${sample1}.vcf.gz" with:

bcftools annotate -x FORMAT/AF "${sample1}.vcf.gz" -O z -o "${sample1}_noAF.vcf.gz"

and then run GenomicsDBImport, I do not get any error ...

Any suggestion please?
Many thanks

Answers

  • manolis, Member ✭✭

    Hi, fixed. It seems the problem was related to one of the hosts of the cluster. Sorry for bothering you.

    Best

  • Hi, manolis!
    Please tell me, how exactly did you solve this problem?
    Many thanks

  • jpflorido, Seville, Member

    Hi manolis,

    I'm having exactly the same issue with my PoN creation. The AF field appears to be correct, and all my VCFs (only 3, for test purposes) passed the ValidateVariants check. I also use the --max-mnp-distance=0 option in Mutect2 to avoid the known bug in the GenomicsDBImport tool. But I still get the same "Invalid character '.' in 'AF' FORMAT field at ..." and "A fatal error has been detected by the Java Runtime Environment" errors.

    Would you mind letting me know what your host problem was and how you fixed it? Just in case the same is happening here...

    Thanks in advance!

  • fmortuno, Clinical Bioinformatics Area, FPS, Seville (Spain), Member

    Any suggestions about this ^ @manolis?

    Thanks!

  • manolis, Member ✭✭
    edited July 13

    Our "solution" is totally crazy, and we still cannot explain why this happens! We have a Linux cluster with 6 hosts.
    When I run GDBI for PON creation (GATK v4.1.1.0) during the day, it does not work, even if there are no jobs on any of the hosts!
    When I run it late at night, it works.

    For now we cannot explain this behavior :o :/ We are waiting for an answer from our server support.

    Best
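Because the failures appear to depend on when (and possibly on which host) the job runs, it may help to record that information for every attempt before handing the case to server support. The wrapper below is a minimal sketch for a POSIX shell; `run_logged` and the log file name are hypothetical and not part of GATK. It appends one tab-separated line per run with the start time, host name, exit status, and command.

```shell
# Hypothetical wrapper: run a command and append start time, host name,
# exit status and the command line to a tab-separated log file.
run_logged() {
  _rl_log=$1; shift                      # first argument: log file path
  _rl_start=$(date '+%Y-%m-%dT%H:%M:%S') # local start time of this run
  "$@"                                   # run the wrapped command unchanged
  _rl_status=$?
  printf '%s\t%s\t%s\t%s\n' \
    "$_rl_start" "$(uname -n)" "$_rl_status" "$*" >> "$_rl_log"
  return "$_rl_status"                   # pass the exit status through
}
```

It could be used as `run_logged gdbi_runs.tsv ${gatk} GenomicsDBImport ...` with the same arguments as before. After a few day and night runs, the log shows whether the failures cluster on a particular host or time window.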

  • fmortuno, Clinical Bioinformatics Area, FPS, Seville (Spain), Member

    @jpflorido said:
    Hi manolis,

    I'm having exactly the same issue with my PoN creation. The AF field appears to be correct, and all my VCFs (only 3, for test purposes) passed the ValidateVariants check. I also use the --max-mnp-distance=0 option in Mutect2 to avoid the known bug in the GenomicsDBImport tool. But I still get the same "Invalid character '.' in 'AF' FORMAT field at ..." and "A fatal error has been detected by the Java Runtime Environment" errors.

    Would you mind letting me know what your host problem was and how you fixed it? Just in case the same is happening here...

    Thanks in advance!

    Thank you for your answer manolis!

    Is there someone from the GATK team who can advise on this? I am quite sure my Mutect2 outputs were generated correctly, and the AF field looks right to me, but maybe I am wrong.

    Just to recap, we (@jpflorido and I) are trying to create a panel with 3 exome samples, but it fails when combining the VCFs with the GenomicsDBImport tool.

    I have also tried building the latest version of GATK directly from the repository, in case this has been fixed recently, but the same error occurs. I can share whatever you need.

    Thanks in advance,
    Francisco
