SNP distribution across chromosome

Hi,
I would like to ask: is there any tool in GATK to calculate the SNP/indel distribution along each chromosome using a 100 kb or 1 Mb window size? Thanks.

Answers

  • shleeshlee CambridgeMember, Broadie, Moderator admin

    Hi @shis,

    What do you mean by distribution?

    Take a look at the Picard metrics page to see if any of the metrics would be helpful towards your calculations. Also, this thread highlights the importance of factoring in coverage.

  • shisshis USAMember

    Hi Shlee, Thanks for the reply.
    "SNP distribution" - I meant how many SNPs are present in each 100 kb or 1 Mb region of a chromosome (e.g., rice chromosome 1). In other words, I want to count the number of SNPs per 100 kb or 1 Mb window.

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @shis
    Hi,

    I am not aware of any GATK tools that do that. You will need to find some other tools to do what you want.

    -Sheila

  • shisshis USAMember

    I found a solution: the vcftools --SNPdensity option calculates the SNP distribution in 1 Mb regions of a chromosome. I used the following command to calculate SNP density with a 1 Mb window size:
    vcftools --vcf SNP.vcf --SNPdensity 1000000 --out SNP_snpdensity
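For readers who want to see what a per-window count involves, here is a minimal Python sketch of the same idea. It is not the vcftools implementation: the function name `snp_counts_per_window` is hypothetical, the rule for what counts as a SNP (single-base REF and single-base ALT alleles) is an assumption, and the window boundaries may not match vcftools exactly.

```python
from collections import Counter

def snp_counts_per_window(vcf_lines, window=1_000_000):
    """Count SNP records per (chromosome, window start) bin.

    Hypothetical helper for illustration; expects uncompressed,
    tab-separated VCF text lines.
    """
    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and header lines ("##..." and "#CHROM...").
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # Assumption: a record is a SNP only if REF and every ALT allele
        # are single bases (indels, "." and "*" alleles are skipped).
        alleles = [ref] + alt.split(",")
        if any(len(a) != 1 or a in {".", "*"} for a in alleles):
            continue
        # Assumption: bins start at 0 and a position falls in pos // window.
        bin_start = (pos // window) * window
        counts[(chrom, bin_start)] += 1
    return counts

# Small in-memory example; a real file could be passed as open("SNP.vcf").
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t.\t.\t.\n",
    "chr1\t1500000\t.\tC\tT,G\t.\t.\t.\n",
    "chr1\t200\t.\tAT\tA\t.\t.\t.\n",  # indel, not counted
    "chr2\t5\t.\tG\tC\t.\t.\t.\n",
]
for (chrom, start), n in sorted(snp_counts_per_window(example).items()):
    print(chrom, start, n)
```

Changing `window` to 100000 gives 100 kb bins. Windows with no SNPs do not appear in the result, so they need to be filled in separately if a complete table is required.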

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @shis
    Hi,

    Thanks for sharing!

    -Sheila

  • BegaliBegali GermanyMember

    @shlee
    @Sheila

    I would like your advice on plotting. I want to plot the distributions of the annotation values so that I can determine thresholds and then remove the low-scoring variants by hard filtering. Can I obtain these distribution plots with GATK tools, or do I need to make them in R? I have limited experience with programming languages. Is there any method in GATK that I can run to obtain results like those shown here: https://gatkforums.broadinstitute.org/gatk/discussion/6925/understanding-and-adapting-the-generic-hard-filtering-recommendations? Also, could you kindly tell me which statistical analyses after the filtering step would be useful to support my final results? I am new to sequencing analysis (my project is RADseq in a plant) and to bioinformatics tools, but I am trying my best, and I appreciate people like you who kindly help people like me. Thanks in advance.

  • SheilaSheila Broad InstituteMember, Broadie, Moderator admin

    @Begali
    Hi,

    Have a look at the presentations section, where we have some hands-on tutorials for hard filtering. Those have R commands for plotting that should help get you started.

    -Sheila
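As a rough illustration of the kind of distribution discussed above, the following Python sketch pulls one INFO annotation out of VCF text lines and bins the values into a simple histogram table. It is not taken from the GATK tutorials, which use R. The function names `info_values` and `text_histogram` are hypothetical, `QD` is used only as an example annotation key, and the bin width is an arbitrary assumption.

```python
from collections import Counter

def info_values(vcf_lines, key="QD"):
    """Collect numeric values of one INFO annotation (hypothetical helper)."""
    values = []
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # no INFO column on this line
        for entry in fields[7].split(";"):
            name, _, value = entry.partition("=")
            if name == key and value:
                try:
                    values.append(float(value))
                except ValueError:
                    # Assumption: non-scalar values (e.g. "0.5,0.3") are skipped.
                    pass
    return values

def text_histogram(values, bin_width=5.0):
    """Return sorted (bin start, count) pairs for the given values."""
    bins = Counter(int(v // bin_width) * bin_width for v in values)
    return sorted(bins.items())

# Small in-memory example; a real file could be passed as open("SNP.vcf").
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=10;QD=12.3\n",
    "chr1\t200\t.\tC\tT\t20\t.\tQD=2.0;DP=8\n",
    "chr1\t300\t.\tG\tA\t60\t.\tDP=12;QD=14.9\n",
    "chr1\t400\t.\tT\tC\t30\t.\tDP=5\n",  # no QD, ignored
]
for start, count in text_histogram(info_values(example)):
    print(f"{start:>6.1f}  {'#' * count}")
```

The resulting table can be plotted with any tool. Where to place a hard-filtering threshold still has to be judged from the shape of the distribution in your own data.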
