Haplotype Score and Phasing

stechen, University of Pennsylvania, Member, Posts: 23

Hello! I was wondering whether the HaplotypeScore annotation was restored for HaplotypeCaller in GATK 2.6. Does it have to be requested explicitly? (It's not included in my VCF file.) Moreover, all of the GT fields use "/" instead of "|", which according to the following would mean that the results are still unphased:

"GT genotype, encoded as alleles values separated by either of ”/” or “|”, e.g. The allele values are 0 for the reference allele (what is in the reference sequence), 1 for the first allele listed in ALT, 2 for the second allele list in ALT and so on. For diploid calls examples could be 0/1 or 1|0 etc. For haploid calls, e.g. on Y, male X, mitochondrion, only one allele value should be given. All samples must have GT call information; if a call cannot be made for a sample at a given locus, ”.” must be specified for each missing allele in the GT field (for example ./. for a diploid). The meanings of the separators are:
/ : genotype unphased
| : genotype phased"

http://www.1000genomes.org/wiki/Analysis/Variant%20Call%20Format/vcf-variant-call-format-version-40
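
To make the separator rule quoted above concrete, here is a minimal Python sketch of how a GT string could be read. The function name `parse_gt` is hypothetical (it is not part of GATK or of any VCF library), and it assumes a single GT value such as `0/1`, `1|0` or `./.` has already been extracted from the sample column.

```python
import re


def parse_gt(gt):
    """Split a VCF GT string into allele indices and a phased flag.

    Hypothetical helper for illustration only.
    Returns (alleles, phased), where a missing allele (".") becomes None.
    """
    # "|" marks a phased genotype, "/" an unphased one.
    phased = "|" in gt
    # Split on either separator; a haploid call has no separator at all.
    alleles = [None if a == "." else int(a) for a in re.split(r"[/|]", gt)]
    return alleles, phased


print(parse_gt("0/1"))  # unphased heterozygous call
print(parse_gt("1|0"))  # phased heterozygous call
print(parse_gt("./."))  # missing diploid call
```

With this reading, a file in which every GT uses "/" would report `phased` as False for every call, which matches what I am seeing.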

Also, is there a more detailed explanation than what's on the HaplotypeScore documentation page? How is the score determined in UnifiedGenotyper? Does the score have anything to do with phasing? And how is phasing achieved if only the 10 bp surrounding the SNP are examined, a window that likely does not include other SNPs?

Thank you!

Answers

  • jzook, Member, Posts: 17

    Hi Geraldine,

    Since complex variants (i.e., nearby SNPs and indels) represent a significant fraction of variants, and unphased complex variants are not very useful, it would be really great if HaplotypeCaller could output phasing for complex variants. Given how the algorithm works, HaplotypeCaller should inherently know the phasing of complex variants, so it seems like this should be a pretty straightforward thing to do. I thought that, a while ago, an older version of HaplotypeCaller actually could output phased haplotypes. Do you have plans to add this ability in the near future?

    Thanks!
    Justin

  • Geraldine_VdAuwera, Administrator, Dev, Posts: 11,183

    Hi Justin,

    Unfortunately, phasing complex variants is more complicated than it might seem, and it makes the evaluation of the callsets even more difficult, so we have no immediate plans to implement this. But it's interesting to hear that this is something people would want...

    For now, you can add -mergeVariantsViaLD to your HC command line to go back to the old behavior of merging nearby events when there is enough evidence in the population to support it. But please be aware that, in general, we don't support that option.

    Geraldine Van der Auwera, PhD
