
How to identify de novo mutations in the child compared with the parents?

deepue (Helsinki, Member)
edited May 2015 in Ask the GATK team


I am new to NGS analysis and have been following the pipeline recommended in many posts on the online forums. I have three samples (one child and two parents) and have completed the analysis up to generating VCF files with HaplotypeCaller. I would like to find de novo mutations in the child. Is it better to identify de novo mutations before or after annotation? Please advise me on how to proceed.

Note: This question has been asked in another forum, but I didn't get any suggestions there about GATK functions.



Best Answer


  • deepue (Helsinki, Member)

    Hi @Sheila

    Thank you for your quick suggestion.

    I have a few questions after going through the pipeline you mentioned.
    1. It was mentioned that this workflow works well for cohort analysis. Can we also use it for a single family (trio data)?
    2. How do I create a .ped file? I have gone through the posts and understand its format. Since only the first 6 columns are required, can we create the file by typing it manually? Could you please attach a sample ped file for one family?
    3. I get a runtime error when using GenotypeGVCFs:

    The list of input alleles must contain <NON_REF> as an allele but that is not the case at position 13273; please use the Haplotype Caller with gVCF output to generate appropriate records

    I ran HaplotypeCaller with the parameters mentioned in other posts on this forum:
    --emitRefConfidence GVCF \
    --variant_index_type LINEAR \
    --variant_index_parameter 128000 \
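
The error above says GenotypeGVCFs expects a `<NON_REF>` allele in its input records. A minimal sketch for checking whether a file contains any such records, assuming an uncompressed, tab-delimited VCF; the function name `count_non_ref` is hypothetical and not part of GATK:

```shell
# count_non_ref FILE
# Prints the number of non-header records whose ALT column (column 5)
# contains the symbolic <NON_REF> allele. Hypothetical helper, not a GATK tool.
count_non_ref() {
    awk -F'\t' '!/^#/ && $5 ~ /<NON_REF>/ { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_non_ref $TRIPATH/child.output.raw.snps.indels.vcf` prints a count. A count of 0 would suggest the file is a regular VCF and not a gVCF, which may be worth ruling out before looking further.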


  • Sheila (Broad Institute; Member, Broadie ✭✭✭✭✭)


    1) Yes, you can use the genotype refinement workflow on a trio.
    2) This should help with formatting/examples: http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#ped You can create your own ped file manually or by using vcftools.
    3) Can you post the exact command you are running with GenotypeGVCFs? Also, please post an example record at that position for one of the samples.
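
Since the ped file can be typed by hand, a trio needs only three lines. The sketch below writes one with the six columns of the PLINK PED format (family ID, individual ID, paternal ID, maternal ID, sex, phenotype). The family ID `FAM1`, the sample names `child`, `dad`, and `mom`, and the function name are placeholders. In practice, the individual IDs should match the sample names in the VCF headers. The child's sex and all phenotypes are set to 0 (unknown/missing) here because the thread does not state them.

```shell
# write_trio_ped FAMILY CHILD DAD MOM CHILD_SEX OUTFILE
# Writes a three-line, tab-separated PED file for one trio.
# Columns: family ID, individual ID, paternal ID, maternal ID,
#          sex (1 = male, 2 = female, 0 = unknown), phenotype (0 = missing).
# Parents are written as founders (paternal and maternal ID set to 0).
write_trio_ped() {
    fam=$1; child=$2; dad=$3; mom=$4; child_sex=$5; out=$6
    {
        printf '%s\t%s\t0\t0\t1\t0\n' "$fam" "$dad"
        printf '%s\t%s\t0\t0\t2\t0\n' "$fam" "$mom"
        printf '%s\t%s\t%s\t%s\t%s\t0\n' "$fam" "$child" "$dad" "$mom" "$child_sex"
    } > "$out"
}

# Placeholder names; replace them with the real sample names.
write_trio_ped FAM1 child dad mom 0 trio.ped
```

The resulting `trio.ped` can be opened in any text editor to fill in the child's sex and the phenotype column once they are known.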


  • deepue (Helsinki, Member)



    The vcftools option takes only one input file. Can you please help me pass the trio data and the relationships between the samples to create a ped file?

    This is the command I ran:

    java -Xmx2g -jar $GTKPATH/GenomeAnalysisTK.jar \
    -T GenotypeGVCFs \
    -R $REFPATH/hg19.fa \
    --variant $TRIPATH/child.output.raw.snps.indels.vcf \
    --variant $TRIPATH/dad.output.raw.snps.indels.vcf \
    --variant $TRIPATH/mom.output.raw.snps.indels.vcf \
    -o $TRIPATH/tri.genotyped.vcf

    Sorry, I couldn't get the information for that position. I am new to the Unix environment; can you please suggest an easier way to get it?
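
One possible way to pull out the record at a given position is a short awk filter. This is a sketch that assumes an uncompressed, tab-delimited VCF; the function name `records_at_pos` is hypothetical. Because the error message gives only a position and no contig, the filter prints matching records from every contig.

```shell
# records_at_pos FILE POSITION
# Prints every non-header record whose POS column (column 2) equals POSITION.
# Hypothetical helper; it matches on position only, across all contigs.
records_at_pos() {
    awk -F'\t' -v pos="$2" '!/^#/ && $2 == pos' "$1"
}
```

For the error above, this would be run as `records_at_pos $TRIPATH/child.output.raw.snps.indels.vcf 13273`, and likewise for the other two samples.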


  • Sheila (Broad Institute; Member, Broadie ✭✭✭✭✭)


    The 1000 Genomes Project has an online VCF to PED converter. I think this may be easier to use. http://www.1000genomes.org/vcf-ped-converter

