Identify haploid vs. diploid samples

beryljones | University of Illinois at Urbana-Champaign | Member | Posts: 2

Hello! I have RNAseq data from 40 insects with a haplodiploid sex determination system. While for pupae and adults I know the sex of the individuals, I would like to determine the sex of the younger life stages. I believe there should be a way to use SNP variants to determine whether a particular library is from a haploid (male) or diploid (female) individual, but I don't know how the GATK tools work and what would be most appropriate. I would appreciate any advice!

Answers

  • Sheila | Broad Institute | Member, Broadie, Moderator, Dev | Posts: 4,456 | admin

    @beryljones

    Hi,

    You can call variants in all samples under the assumption that they are all diploid. In haploid individuals, high-quality variant calls should all be homozygous; in diploid individuals, there will also be high-quality heterozygous calls.

    Hope this helps.

    -Sheila
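
The idea above (haploids should show essentially no confident heterozygous calls) can be sketched in Python. This is only a minimal illustration, not a GATK feature: the function name `het_fraction_per_sample`, the `min_gq=30` cutoff, and the sample names are all assumptions made for the example. It reads plain-text VCF lines from a diploid-mode callset and reports, per sample, the fraction of confident non-reference genotypes that are heterozygous.

```python
from collections import defaultdict


def het_fraction_per_sample(vcf_lines, min_gq=30):
    """Return {sample: fraction of confident variant genotypes that are heterozygous}.

    Assumes a multi-sample VCF called in diploid mode, with GT and GQ in FORMAT.
    The min_gq cutoff is an illustrative assumption, not a recommended value.
    """
    samples = []
    het = defaultdict(int)
    total = defaultdict(int)
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        fmt = fields[8].split(":")
        if "GT" not in fmt or "GQ" not in fmt:
            continue
        gt_i, gq_i = fmt.index("GT"), fmt.index("GQ")
        for sample, entry in zip(samples, fields[9:]):
            parts = entry.split(":")
            if len(parts) <= max(gt_i, gq_i):
                continue  # trailing fields were dropped for this sample
            gq = parts[gq_i]
            alleles = parts[gt_i].replace("|", "/").split("/")
            if "." in alleles or not gq.isdigit() or int(gq) < min_gq:
                continue  # missing or low-confidence genotype
            if len(alleles) != 2 or alleles == ["0", "0"]:
                continue  # only count diploid calls carrying a variant allele
            total[sample] += 1
            if alleles[0] != alleles[1]:
                het[sample] += 1
    return {s: het[s] / total[s] for s in samples if total[s] > 0}


if __name__ == "__main__":
    # Tiny made-up callset: "maleA" and "femaleB" are hypothetical sample names.
    example = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tmaleA\tfemaleB",
        "chr1\t100\t.\tA\tG\t500\tPASS\t.\tGT:GQ\t1/1:99\t0/1:99",
        "chr1\t200\t.\tC\tT\t500\tPASS\t.\tGT:GQ\t1/1:99\t1/1:99",
    ]
    for name, frac in het_fraction_per_sample(example).items():
        print(name, round(frac, 3))
```

The thread does not give a cutoff for separating the two groups. Since the sex of the pupae and adults is already known, one option would be to compute this fraction for those libraries first and see where males and females fall before classifying the younger stages.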

  • beryljones | University of Illinois at Urbana-Champaign | Member | Posts: 2
    edited April 2014

    Hi Sheila,

    Thanks for the response! Which variant calling tool would be best? Are there any designed for transcriptome rather than genome variant discovery?

    Thanks!
    Beryl

  • KStamm | Member | Posts: 31

    You're asking the GATK support forum, so they are going to recommend HaplotypeCaller as the strongest variant caller available for genomic samples. The older tool, UnifiedGenotyper, should also work for this purpose, but neither is designed for RNA-Seq. When I use files aligned by TopHat, I see lots of false-positive variants hanging off exon boundaries, so you should restrict this analysis to carefully selected exonic locations.
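
Restricting variants to exonic positions away from exon boundaries, as suggested above, could look roughly like the following Python sketch. Everything here is hypothetical: the function names, the `margin=5` value, and the assumption that exon coordinates are 1-based and inclusive on both ends. The thread itself does not say how far from a boundary a call should be.

```python
def inside_exon(pos, exons, margin=5):
    """True if pos lies within some exon, at least `margin` bases from both ends.

    `exons` is a list of (start, end) pairs, assumed 1-based and inclusive.
    """
    return any(start + margin <= pos <= end - margin for start, end in exons)


def keep_exonic_variants(variants, exons_by_chrom, margin=5):
    """Keep only (chrom, pos) variants that fall well inside an annotated exon."""
    kept = []
    for chrom, pos in variants:
        if inside_exon(pos, exons_by_chrom.get(chrom, []), margin):
            kept.append((chrom, pos))
    return kept


if __name__ == "__main__":
    # Made-up annotation and calls, for illustration only.
    exons_by_chrom = {"chr1": [(100, 200)]}
    calls = [("chr1", 101), ("chr1", 150), ("chr1", 198), ("chr2", 150)]
    print(keep_exonic_variants(calls, exons_by_chrom))
```

A linear scan over exons is fine for a small annotation; for a whole transcriptome, an interval index or a dedicated interval tool would be the more practical choice.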

  • Sheila | Broad Institute | Member, Broadie, Moderator, Dev | Posts: 4,456 | admin

    @beryljones

    Hi,

    We do have newly updated Best Practices documentation that explains how HaplotypeCaller handles RNA-seq data. Please refer to it here: http://www.broadinstitute.org/gatk/guide/best-practices?bpm=RNAseq

    -Sheila

  • charlesbaudo | Missouri | Member | Posts: 5

    @KStamm
    Could you elaborate on how you assess those false-positive variants? The frequency of SNPs we saw in our RNA-seq SNP analysis was much higher than expected, and I suspect that many are false positives. Thank you.
