Does ReadBackedPhasing rely on a VCF's GT field?

I have a VCF file that is missing the GT field. Can I just add 0/1 for each variant, and let GATK's ReadBackedPhasing take care of resolving the actual phased genotypes?

Answers

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @genomeuser
    Hi,

    I don't think so. Why does your VCF not have genotypes? Where/how did you produce it?

    Thanks,
    Sheila

  • genomeuser (Member)
    edited May 2018

    The VCF was produced from a BED-like file of variants that I had. The file had these five columns: chr, start, stop, ref, and alt.

    It would take some time and effort to go back to the original output of the variant callers, so if you know of any other way that GATK can resolve this issue, let me know.

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @genomeuser
    Hi,

    I have no idea if your way will work, and we have not tested it. But you can try it out and let us know how it goes :smile: If it does not work, you will need to generate a proper VCF (preferably with GATK tools) to use as input to ReadBackedPhasing.

    -Sheila
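For anyone who wants to try the untested approach discussed above, here is a minimal sketch of building a VCF with a placeholder `0/1` genotype from the five-column file. It assumes the input rows are already split into chr, start, stop, ref, alt. The function name `bed_like_to_vcf`, the sample name `SAMPLE`, and the `start_is_zero_based` flag are all hypothetical choices for illustration. Whether ReadBackedPhasing accepts or correctly resolves such placeholder genotypes has not been confirmed in this thread.

```python
def bed_like_to_vcf(rows, sample="SAMPLE", placeholder_gt="0/1",
                    start_is_zero_based=False):
    """Return minimal VCF lines for (chr, start, stop, ref, alt) rows.

    Assumption: every variant gets the same placeholder genotype, which
    is only a guess and not a real call.
    """
    lines = [
        "##fileformat=VCFv4.2",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t" + sample,
    ]
    for chrom, start, _stop, ref, alt in rows:
        # VCF POS is 1-based; a true BED start is 0-based, so shift it
        # only when the caller says the input follows BED coordinates.
        pos = int(start) + (1 if start_is_zero_based else 0)
        lines.append("\t".join(
            [chrom, str(pos), ".", ref, alt, ".", ".", ".", "GT", placeholder_gt]
        ))
    return lines
```

Writing the result with `"\n".join(bed_like_to_vcf(rows))` gives a file that still needs contig header lines and coordinate sorting matching the reference before GATK is likely to accept it.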

  • wchen (Member)

    I ran this:
    java -Xmx50g -jar .../GenomeAnalysisTK-3.3.0/GenomeAnalysisTK.jar -R ucsc.hg19.fasta -L regions.bed -T HaplotypeCaller -nct 8 -I recal.bam -o g.vcf --genotyping_mode DISCOVERY -stand_emit_conf 30 -stand_call_conf 30 -ERC BP_RESOLUTION -variant_index_type LINEAR -variant_index_parameter 128000 >& hp.log

    Using -ERC GVCF gave the same result.

    All lines look like this:
    chr9 133730278 . A <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:374,6:380:99:0,120,1800

    Why don't I see 0|1 or 1|1? Thanks!

  • wchen (Member)

    chr9 133730274 . A <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:349,10:359:99:0,120,1800

  • Sheila, Broad Institute (Member, Broadie, Moderator, admin)

    @wchen
    Hi,

    The tool will only output | in the genotype when the site is phased with another site. Can you confirm the sites you posted are in phase with other sites? Please post some IGV screenshots of the sites that are in phase. Also, I hope this blog will help.

    -Sheila
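To check quickly whether any records in a VCF carry phased genotypes, one option is to look at the allele separator in the GT field: `|` marks a phased genotype and `/` an unphased one. The sketch below assumes tab-separated records with the standard ten-column single-sample layout. The helper names `genotype`, `is_phased`, and `count_phased` are hypothetical.

```python
def genotype(vcf_line, sample_index=0):
    """Extract the GT string for one sample from a VCF data line."""
    cols = vcf_line.rstrip("\n").split("\t")
    keys = cols[8].split(":")                   # FORMAT column
    values = cols[9 + sample_index].split(":")  # sample column
    return values[keys.index("GT")]


def is_phased(gt):
    """True when the genotype uses the phased separator '|'."""
    return "|" in gt


def count_phased(lines):
    """Return (phased, total) over all non-header lines."""
    phased = total = 0
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        total += 1
        if is_phased(genotype(line)):
            phased += 1
    return phased, total
```

Running `count_phased(open("g.vcf"))` on the output above would show how many records, if any, are phased; homozygous-reference `0/0` records like the ones posted will count as unphased.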
