How to determine sex from BAM files

paumarc Zagreb Member

Hello

Is there an easy way to determine the sex of a sequenced genome? I saw this thread:

http://gatkforums.broadinstitute.org/discussion/5903/is-there-a-walker-that-determines-sex-from-a-bam-or-vcf-file

but I am not sure whether it covers the same case (I am new to GATK). Is there a hands-on manual or guide that I could use to get started?

thanks

Comments

  • Sheila Broad Institute Member, Broadie, Moderator admin

    @paumarc
    Hi,

    Unfortunately, that thread is probably your best help. We do not have any specific recommendations for determining the sex of a sample. Hopefully, some other users will jump in here with some helpful tips.

    -Sheila

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    I would recommend looking up phenotypic inference in the literature to find how other researchers do this. Good luck!

  • tommycarstensen United Kingdom Member ✭✭✭

    @paumarc Perhaps you could check the chrY coverage. If there is very little of it, you are probably dealing with a female. Likewise, if your chrX depth is about half that of your autosomes, your BAM file probably originates from a male. I would probably not rely on this myself. It depends on how certain you need to be.

    I just found an answer from @lindenb on biostars.org, where he suggests something similar:

    In our lab, we run GATK DepthOfCoverage with 3 BED files (autosomes, chrX, chrY) to get 3 mean coverages. Females should have cov(X)>>cov(Y).
    

    Don't forget to remove the pseudoautosomal regions (PARs) for a more accurate result...
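The coverage comparison described in this comment can be sketched in Python. This is only a minimal sketch, assuming a human-like XY system and that the three mean coverages (autosomes, chrX, chrY, with the PARs excluded) have already been computed, for example with DepthOfCoverage. The function name `infer_sex_from_coverage` and both cut-offs (`x_male_max`, `y_female_max`) are hypothetical values chosen for illustration, not GATK recommendations; they would need tuning against samples of known sex.

```python
def infer_sex_from_coverage(autosome_cov, x_cov, y_cov,
                            x_male_max=0.75, y_female_max=0.1):
    """Guess sex from three mean coverages.

    The cut-offs are illustrative assumptions: in an XY sample chrX depth
    should sit near half the autosomal depth, while in an XX sample it
    should sit near the autosomal depth with almost no chrY coverage.
    """
    if autosome_cov <= 0:
        raise ValueError("autosomal coverage must be positive")

    # Normalise both sex chromosomes by the autosomal depth
    x_ratio = x_cov / autosome_cov
    y_ratio = y_cov / autosome_cov

    if x_ratio < x_male_max and y_ratio > y_female_max:
        return "male"
    if x_ratio >= x_male_max and y_ratio <= y_female_max:
        return "female"
    # Conflicting signals (e.g. contamination, aneuploidy, poor mapping)
    return "ambiguous"
```

Returning "ambiguous" when the two signals disagree reflects the caution in the comment: the heuristic is only as reliable as the coverage estimates behind it.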

  • paumarc Zagreb Member
  • Will_Gilks University of Sussex, UK Member ✭✭
    edited June 2016

    Hi @paumarc

    To calculate the number of reads per chromosome, which will indirectly tell you the sex of your subjects, try samtools idxstats (http://samtools.sourceforge.net/) with a loop that runs through the BAM files.

    # Best run on indexed BAMs (samtools index); idxstats then reads its counts from the index
    for i in *.bam; do
        samtools idxstats "${i}" > "${i}.chromdepths.txt"
    done
    

    This gives you output like the following (columns: sequence name, sequence length, mapped reads, unmapped reads):

    chr2L   23513712        8695825 341
    chr2R   25286936        8759063 401
    chr3L   28110227        9983508 385
    chr3R   32079331        11345968        491
    chrUn_DS485919v1        1021    0       0
    chrUn_DS483755v1        6936    829     2
    chrUn_DS485425v1        1143    211     0
    chrUn_DS484861v1        1395    97      0
    

    You may need to adjust the path and file-name pattern to match your BAM files.

    In R, for example, you can then calculate the ratio of X- and Y-linked reads to autosomal reads, after adjusting for the length of the chromosomes.
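The length-adjusted ratio described above could equally be computed in Python instead of R. The following is a minimal sketch, assuming the standard four-column idxstats layout shown earlier. The function names `parse_idxstats` and `sex_chromosome_ratios` are hypothetical, and the default sequence names `chrX` and `chrY` are assumptions that must match the names in your reference.

```python
def parse_idxstats(text):
    """Parse samtools idxstats output into {name: (length, mapped_reads)}."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the final "*" row that counts unplaced reads
        if len(fields) < 3 or fields[0] == "*":
            continue
        stats[fields[0]] = (int(fields[1]), int(fields[2]))
    return stats


def sex_chromosome_ratios(stats, autosomes, x_name="chrX", y_name="chrY"):
    """Return (X ratio, Y ratio): reads per base relative to the autosomes."""
    auto_len = sum(stats[name][0] for name in autosomes)
    auto_reads = sum(stats[name][1] for name in autosomes)
    if auto_len == 0 or auto_reads == 0:
        raise ValueError("no autosomal reads to normalise against")
    auto_density = auto_reads / auto_len

    def density(name):
        # A missing or zero-length sequence contributes no reads per base
        length, reads = stats.get(name, (0, 0))
        return reads / length if length else 0.0

    return density(x_name) / auto_density, density(y_name) / auto_density
```

To use it, read one of the `.chromdepths.txt` files written by the loop into a string, pass it to `parse_idxstats`, and call `sex_chromosome_ratios` with the list of autosome names from your reference. Pooling reads and lengths across all autosomes, rather than averaging per-chromosome ratios, keeps small contigs from dominating the baseline.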
