GenotypeGVCFs error when joint-genotyping X chromosome GVCFs

Hi.
I produced X chromosome GVCFs using HaplotypeCaller, with ploidy 2 for females and ploidy 1 for males.
There are 26 females and 36 males.
I then joined the chrX GVCFs of these 62 individuals using GenotypeGVCFs, but I encountered an error.

INFO  17:15:50,430 ProgressMeter -      X:31075401         0.0     2.3 h   13670.7 w       20.0%    11.5 h       9.2 h
##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 3.4-46-gbc02625):
##### ERROR
##### ERROR This means that one or more arguments or inputs in your command are incorrect.
##### ERROR The error message below tells you what is the problem.
##### ERROR
##### ERROR If the problem is an invalid argument, please check the online documentation guide
INFO  17:23:45,258 ProgressMeter -      X:31075401         0.0     2.4 h   14455.8 w       20.0%    12.1 h       9.7 h
##### ERROR (or rerun your command with --help) to view allowable command-line arguments for this tool.
##### ERROR
##### ERROR Visit our website and forum for extensive documentation and answers to
##### ERROR commonly asked questions http://www.broadinstitute.org/gatk
##### ERROR
##### ERROR Please do NOT post this error to the GATK forum unless you have really tried to fix it yourself.
##### ERROR
##### ERROR MESSAGE: There was a failure because you did not provide enough memory to run this program.  See the -Xmx JVM argument to adjust the maximum heap size provided to Java

So I tried increasing -Xmx up to 120G, but the error kept occurring.
What is the problem?

I also joined the chrY GVCFs of the 36 males using GenotypeGVCFs and then ran VQSR,
but this error occurred:

##### ERROR MESSAGE: Bad input: Values for InbreedingCoeff annotation not detected for ANY training variant in the input callset.
VariantAnnotator may be used to add these annotations

My script is:

java -Xmx32g -jar GenomeAnalysisTK.jar -T GenotypeGVCFs \
-R /BiO/Project/bundle/human_g1k_v37.fasta -o project.raw.joint.chrY.vcf \
--dbsnp /BiO/Project/bundle/dbsnp_138.b37.vcf --variant project.GVCF.chrY.list \
-nt 32 -hets 0.001 -stand_call_conf 30.0 -stand_emit_conf 30.0 -indelHeterozygosity 1.25E-4 -nda -L Y \
--log_to_file chrY.GenotypeGvcfs.log --disable_auto_index_creation_and_locking_when_reading_rods

Although my number of male samples is above the minimum required for InbreedingCoeff (>= 10), why does this error occur?

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator admin)

    @Sunhye
    Hi,

    I think your GenotypeGVCFs issue may be related to the issue described in this thread: http://gatkforums.broadinstitute.org/discussion/5862/problem-running-genotypegvcfs-for-large-all-male-cohort-in-chromosome-x
    We are working on a fix for this, but I am not sure when it will be ready. I will post to the thread above when the fix is in.

    As for your VQSR error, can you confirm that the InbreedingCoeff annotation is actually present in your VCF records? Can you post a few records from your VCF?

    Thanks,
    Sheila

  • Sunhye, Korea (Member)
    edited October 2015

    Hi @Sheila.

    Here is an excerpt from my chrY VCF:

    ##INFO=<ID=FS,Number=1,Type=Float,Description="Phred-scaled p-value using Fisher's exact test to detect strand bias">
    ##INFO=<ID=HaplotypeScore,Number=1,Type=Float,Description="Consistency of the site with at most two segregating haplotypes">
    ##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
    ##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
    ##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
    ##INFO=<ID=NDA,Number=1,Type=Integer,Description="Number of alternate alleles discovered (but not necessarily genotyped) at this site">
    ##INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth">
    ##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
    ##INFO=<ID=SOR,Number=1,Type=Float,Description="Symmetric Odds Ratio of 2x2 contingency table to detect strand bias">
    #CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
    Y       2651883 .       C       CT      32.04   . AC=1;AF=0.029;AN=34;DP=144;FS=0.000;MLEAC=1;MLEAF=0.029;MQ=57.21;NDA=1;QD=8.01;SOR=1.609
    Y       2653141 .       T       C       2794.46 .       AC=5;AF=0.139;AN=36;DP=207;FS=0.000;MLEAC=5;MLEAF=0.139;MQ=60.00;NDA=1;QD=33.27;SOR=1.423
    Y       2654333 rs201588461     C       T       3739.49 .       AC=9;AF=0.250;AN=36;DB;DP=221;FS=0.000;MLEAC=9;MLEAF=0.250;MQ=60.00;NDA=1;QD=33.39;SOR=0.927
    Y       2655180 rs11575897      G       A       5915.49 .       AC=9;AF=0.250;AN=36;DB;DP=282;FS=0.000;MLEAC=9;MLEAF=0.250;MQ=60.00;NDA=1;QD=34.00;SOR=0.839
    Y       2657214 .       G       C       707.23  .       AC=2;AF=0.056;AN=36;DP=162;FS=0.000;MLEAC=2;MLEAF=0.056;MQ=60.00;NDA=1;QD=32.15;SOR=0.693
    Y       2658789 .       T       TA      165.66  .       AC=1;AF=0.028;AN=36;DP=153;FS=0.000;MLEAC=1;MLEAF=0.028;MQ=62.65;NDA=1;QD=20.71;SOR=1.863
    Y       2660085 .       T       A       820.66  .       AC=1;AF=0.028;AN=36;DP=159;FS=0.000;MLEAC=1;MLEAF=0.028;MQ=60.00;NDA=1;QD=32.83;SOR=1.358
    

    InbreedingCoeff is declared in the header, but none of the variant records are annotated with it.
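One quick way to confirm this over a whole file, rather than by eye, is to count how many records carry a given INFO key. The following is only a minimal sketch: the function name `count_info_key` is hypothetical, and it splits on any whitespace (real VCFs are tab-delimited) so that it also copes with records pasted into a forum post.

```python
def count_info_key(vcf_lines, key):
    """Return (records_with_key, total_records) for one INFO key.

    Hypothetical helper for illustration; not part of GATK.
    """
    with_key = 0
    total = 0
    for line in vcf_lines:
        # Skip blank lines and all header lines (## meta lines and #CHROM).
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 8:
            continue  # not a complete VCF record
        total += 1
        # INFO is the 8th column: semicolon-separated KEY=VALUE pairs or bare flags.
        info_keys = {entry.split("=", 1)[0] for entry in fields[7].split(";")}
        if key in info_keys:
            with_key += 1
    return with_key, total


# Two of the records posted above.
records = [
    "Y\t2651883\t.\tC\tCT\t32.04\t.\tAC=1;AF=0.029;AN=34;DP=144;FS=0.000;MLEAC=1;MLEAF=0.029;MQ=57.21;NDA=1;QD=8.01;SOR=1.609",
    "Y\t2654333\trs201588461\tC\tT\t3739.49\t.\tAC=9;AF=0.250;AN=36;DB;DP=221;FS=0.000;MLEAC=9;MLEAF=0.250;MQ=60.00;NDA=1;QD=33.39;SOR=0.927",
]
print(count_info_key(records, "InbreedingCoeff"))  # (0, 2)
```

To run it on a full file, pass an open file handle instead of the list, for example `count_info_key(open("project.raw.joint.chrY.vcf"), "InbreedingCoeff")`.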

  • Sheila, Broad Institute (Member, Broadie, Moderator admin)

    @Sunhye
    Hi,

    Your comment did not post properly.

    -Sheila

  • Sunhye, Korea (Member)

    Also, @Sheila, the X chromosome problem is probably caused by the different ploidy values (ploidy 1 for males, ploidy 2 for females).
    Should I run GenotypeGVCFs on the males and the females separately, and then join the male and female chrX files using CombineGVCFs?

  • Sunhye, Korea (Member)

    Sorry, @Sheila
    Could you check the posts above again?
    Thanks !

  • Sunhye, Korea (Member)
    edited October 2015

    Hi @Sheila !

    I am always grateful for your answers.
    For the X chromosome, I reran with "--max_alternate_alleles 3", and the joint run over chrX samples with different ploidies finished successfully.

    Thanks,
    Sunhye

  • Sunhye, Korea (Member)

    Hi @Sheila.
    Following your advice, I ran VariantAnnotator.
    My command:

    java -Xmx64g -jar ${GATK} -T VariantAnnotator -R ${REF} -V project.raw.joint.chrY.vcf -L Y \
    -A InbreedingCoeff -o project.raw.joint.chrY.IC_anno.vcf
    

    However, this tool also does not add the InbreedingCoeff annotation to the chrY VCF.
    Even though there are more than 10 samples, does GATK not calculate InbreedingCoeff for the Y chromosome?
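For context, the header line quoted earlier describes InbreedingCoeff as a comparison against the Hardy-Weinberg expectation. The textbook form of that comparison is shown below; GATK estimates the observed term from genotype likelihoods, so its exact estimator may differ in detail.

```latex
F = 1 - \frac{H_{\mathrm{obs}}}{H_{\mathrm{exp}}},
\qquad
H_{\mathrm{exp}} = 2\,p\,q\,N
```

Here $H_{\mathrm{obs}}$ is the number of heterozygous genotypes observed at a biallelic site, $p$ and $q$ are the two allele frequencies, and $N$ is the number of diploid samples. The chrY records above show AN=36 for 36 males, i.e. one allele per sample, and a haploid genotype cannot be heterozygous. It is therefore plausible (an assumption, not something confirmed in this thread) that the annotation is simply skipped for haploid calls, regardless of the number of samples.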

  • Sheila, Broad Institute (Member, Broadie, Moderator admin)
  • Sunhye, Korea (Member)

    Hi @Sheila
    I'm so sorry for the late reply.
    All chromosomes except Y were successfully annotated with InbreedingCoeff without a ped file.
    My samples are unrelated individuals from the same population (i.e., they are not trios).
    Even so, can I use a ped file?
    If so, should I create the ped file manually?

  • Sunhye, Korea (Member)

    Hi @Sheila. Thanks for your reply!
    Hi @tommycarstensen! I agree with your comment, so I will remove the variants that were called as heterozygous in males.
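A filter of this kind could be sketched as follows. This is only an illustration under stated assumptions: it presumes a tab-delimited multi-sample VCF with a FORMAT column containing GT, where some male genotypes were written in diploid form (e.g. 0/1); the function names and the sample names M1 and F1 are hypothetical, and the example records are synthetic.

```python
import re


def is_het(gt):
    """True if a GT string such as '0/1' or '1|2' contains two different called alleles."""
    alleles = [a for a in re.split(r"[/|]", gt) if a != "."]
    return len(set(alleles)) > 1


def drop_male_het_sites(vcf_lines, male_names):
    """Return the VCF lines with every record removed in which any male sample is heterozygous."""
    kept = []
    male_idx = []
    for line in vcf_lines:
        if line.startswith("##"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            # Sample columns start at index 9, after the FORMAT column.
            male_idx = [i for i, name in enumerate(fields)
                        if i >= 9 and name in male_names]
            kept.append(line)
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt:
            kept.append(line)  # nothing to check at this record
            continue
        gt_pos = fmt.index("GT")
        if any(is_het(fields[i].split(":")[gt_pos]) for i in male_idx):
            continue  # a male sample is heterozygous: drop the site
        kept.append(line)
    return kept


# Synthetic example: M1 is male, F1 is female.
lines = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tM1\tF1",
    "X\t100\t.\tA\tG\t50\t.\t.\tGT\t0/1\t0/1",
    "X\t200\t.\tC\tT\t50\t.\t.\tGT\t1\t0/1",
]
for kept_line in drop_male_het_sites(lines, {"M1"}):
    print(kept_line)
```

Only the header and the record at position 200 are printed: the female heterozygote is left alone, while the site where the male sample is heterozygous is dropped.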
