Genotyping sex chromosomes

Hi GATK team,

I have exome samples, some from males and some from females.
I mapped the females to a reference genome without the Y chromosome, and then followed the Best Practices steps for each sample.
The reason for mapping them this way is that we don't want to lose some of the female reads to the homologous regions of chromosome Y.
Will I be able to run GenotypeGVCFs on those samples?
Is this a good way to do the genotyping on the sex chromosomes?

Thanks in advance,
Maya

Answers

  • tommycarstensen United Kingdom Member
    edited February 2015

    Hi @mayaab. I'm a user and I'm very interested in your question. I don't think there are currently any best practices for the sex chromosomes. See this thread:

    http://gatkforums.broadinstitute.org/discussion/2895/vqsr-and-sex-chromosomes
    

    That being said, the pseudoautosomal regions on the Y chromosome may be masked in your reference sequence (for example, if you are using hs37d5), in which case no reads would have been mapped to those regions. Have I understood your question correctly? The masking is explained here, for example:

    ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/README_human_reference_20110707
    
    "The Y PARs have been masked out by "N" in the reference, so that the X PARs may be treated as diploid for male samples."
    

    I was able to run GenotypeGVCFs (version 3.2) on the Y chromosome of female samples. I never checked whether I got strictly REF/REF calls, but I should have. That being said, you could just run it on the male samples only, with a ploidy of 1. This thread might be of interest to you:

    http://gatkforums.broadinstitute.org/discussion/1214/can-i-use-gatk-on-non-diploid-organisms
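
    To make the ploidy point concrete, here is a minimal sketch of the ploidy one would expect at a given position, assuming a GRCh37-based reference whose Y PARs are N-masked (as in hs37d5). The function name `expected_ploidy` is hypothetical, and the PAR coordinates are the commonly cited GRCh37 values; they are an assumption here and should be checked against your own reference build.

```python
# Pseudoautosomal regions (1-based, inclusive).
# ASSUMPTION: commonly cited GRCh37 coordinates; verify for your reference build.
X_PARS_GRCH37 = [(60001, 2699520), (154931044, 155260560)]
Y_PARS_GRCH37 = [(10001, 2649520), (59034050, 59373566)]


def _in_any(pos, regions):
    """True if pos falls inside any (start, end) interval."""
    return any(start <= pos <= end for start, end in regions)


def expected_ploidy(sex, contig, pos):
    """Expected ploidy at contig:pos for a 'male' or 'female' sample.

    Assumes the Y PARs are masked with N in the reference, so PAR reads
    from males align to X and the X PARs behave as diploid.
    Every contig other than X and Y is treated as a diploid autosome
    (the mitochondrial genome is not handled).
    """
    if sex not in ("male", "female"):
        raise ValueError("sex must be 'male' or 'female'")
    if contig == "Y":
        if sex == "female" or _in_any(pos, Y_PARS_GRCH37):
            return 0  # no Y in females; masked PAR carries no calls
        return 1
    if contig == "X":
        if sex == "female" or _in_any(pos, X_PARS_GRCH37):
            return 2
        return 1
    return 2


print(expected_ploidy("male", "X", 1000000))   # inside X PAR1
print(expected_ploidy("male", "X", 50000000))  # outside the PARs
```

    A lookup like this could be used to split calling intervals by expected ploidy before genotyping; it does not replace checking the actual calls.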
    

    I hope some of what I have written is useful.

  • Sheila Broad Institute Member, Broadie, Moderator

    @mayaab @tommycarstensen
    Hi Maya,

    It is true that we do not currently have Best Practices for sex chromosomes. We may work on them in the near future.

    Usually it does not make a difference whether you map females to a reference with or without the Y chromosome. In your case, you can simply run GenotypeGVCFs on your samples without a problem, as long as you use the male reference with the Y chromosome. GATK does not throw an error if the BAM file does not contain some of the reference chromosomes. However, it will throw an error if a chromosome is found in the BAM file that is not in the reference.

    -Sheila
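
    The contig rule described above can be checked ahead of time. The following is only a rough sketch, not GATK's own validation: it compares contig names from a BAM header (the text printed by `samtools view -H`) with the names in a reference `.fai` index, and ignores contig lengths and ordering. All function names are hypothetical.

```python
def contigs_from_sam_header(header_text):
    """Collect contig names from the SN: field of each @SQ header line."""
    names = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            for field in line.split("\t")[1:]:
                if field.startswith("SN:"):
                    names.append(field[3:])
    return names


def contigs_from_fai(fai_text):
    """Contig names are the first tab-separated column of a .fai index."""
    return [line.split("\t")[0] for line in fai_text.splitlines() if line.strip()]


def contigs_missing_from_reference(bam_contigs, ref_contigs):
    """Contigs present in the BAM header but absent from the reference."""
    ref = set(ref_contigs)
    return [name for name in bam_contigs if name not in ref]


# Toy example: a female BAM mapped without Y, checked against a reference with Y.
female_header = "@HD\tVN:1.4\n@SQ\tSN:1\tLN:1000\n@SQ\tSN:X\tLN:800\n"
ref_with_y = "1\t1000\t3\t60\t61\nX\t800\t1100\t60\t61\nY\t500\t2000\t60\t61\n"

missing = contigs_missing_from_reference(
    contigs_from_sam_header(female_header), contigs_from_fai(ref_with_y)
)
print(missing)  # an empty list means no BAM contig is absent from the reference
```

    The reverse case (a BAM header that lists Y, checked against a reference without Y) would report `Y` as missing, which is the situation that leads to an error.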

  • mayaab Israel Member

    Thanks for the answers! Tommycarstensen and Sheila, you helped me a lot.
    For now, I ran GenotypeGVCFs on the samples.
    I will check the masking of the Y chromosome for future analysis.
