The current GATK version is 3.7-0

Calling SNPs in a hemiclone experiment

Blue Member Posts: 48

Hi Team,

I'm sequencing the genome of an organism which is a cross between the reference line (with no SNPs) and an individual from an outbred population (with many SNPs). Therefore all of the SNPs in my target organism will be heterozygous. So far I have sequenced three individuals which are crosses and one individual from our reference line.

I understand that the UnifiedGenotyper uses population genetic principles to assign genotypes, but I can't find more information about how this is performed. I am primarily worried that heterozygotes with strongly asymmetric allele counts in the reads will be called as homozygotes in order to fit, say, Hardy-Weinberg equilibrium.

Is there any chance you could enlighten me on this (or direct me to more detailed information on the UG mechanism and settings)?

Just to let you know the background, my study organism is Drosophila melanogaster. The whole genome of 164 Mb is paired-end sequenced on an Illumina instrument. I have so far sequenced one individual from our in-house reference line, and three individuals which are crosses of the reference line with a diverse, outbred population. Average coverage is 30X. The 'crosses' are hemiclones in which recombination between the parental chromosomes is suppressed. I plan on sequencing 200 hemiclone individuals in which one haplotype will be shared between them (the reference gene) and the other haplotype will be diverse and unique to each line. As expected, I have identified a limited number of mutations in our in-house laboratory reference line compared to the assembly.

Any advice on how to best call genotypes in this unorthodox sample would be most appreciated.

Best Answer

  • Geraldine_VdAuwera Administrator, Dev Posts: 11,163
    Accepted Answer

    Hi there,

    Actually the UG does NOT use population genetics principles to ascertain genotypes. Genotypes are strictly greedily assigned from the genotype likelihoods (which are just based on the data). No Hardy-Weinberg in the UG at all. So you can just go ahead with the Best Practices we describe in the documentation. Good luck!

    Geraldine Van der Auwera, PhD
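
To make the point about likelihood-based assignment concrete, here is a minimal sketch in Python. It is not GATK code and does not reproduce the UnifiedGenotyper's actual model; it assumes a simple diploid, biallelic site, independent reads, and a single per-base error rate. The function names and the `error_rate` default are hypothetical, chosen only for illustration. It shows that when genotypes are picked purely from the data likelihoods, a site with clearly asymmetric allele counts can still come out heterozygous.

```python
import math

# Genotype label -> number of alternate alleles (diploid, biallelic site).
GENOTYPES = {"0/0": 0, "0/1": 1, "1/1": 2}


def genotype_log_likelihoods(ref_count, alt_count, error_rate=0.01):
    """Log-likelihood of the read counts under each genotype.

    Simplified model (an assumption, not GATK's implementation):
    each read independently shows the alt allele with probability
    f*(1-e) + (1-f)*e, where f is the alt allele fraction implied by
    the genotype and e is the error rate (must be strictly between 0 and 1).
    """
    result = {}
    for label, n_alt in GENOTYPES.items():
        alt_fraction = n_alt / 2
        p_alt = alt_fraction * (1 - error_rate) + (1 - alt_fraction) * error_rate
        result[label] = (alt_count * math.log(p_alt)
                         + ref_count * math.log(1 - p_alt))
    return result


def call_genotype(ref_count, alt_count, error_rate=0.01):
    """Pick the genotype with the highest likelihood; no population prior."""
    lls = genotype_log_likelihoods(ref_count, alt_count, error_rate)
    return max(lls, key=lls.get)


# 30 reads with a 24:6 ref:alt split, i.e. a strongly asymmetric site.
print(call_genotype(24, 6))
```

Under these assumptions, six alternate reads out of thirty are much harder to explain as sequencing errors than as a skewed heterozygote, so the heterozygous genotype has the highest likelihood. With very few alternate reads the homozygous-reference genotype would win instead, which is a property of the data and not of any Hardy-Weinberg expectation.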
