GenerateHaploidCNVGenotypes, missing R script

Will_Gilks, University of Sussex, UK, Member ✭✭

Hi @bhandsaker

Running GenerateHaploidCNVGenotypes, I'm getting:

Exception in thread "main" java.lang.RuntimeException: Cannot locate R script: genotyping/estimate_cnv_allele_frequencies.R (SV_DIR = /cm/shared/apps/svtoolkit/2.0.1602/)

I've looked for the R script in /cm/shared/apps/svtoolkit/2.0.1602/ but can't locate it, although there are several other R scripts. I guess I'm either missing the script altogether, or not specifying the path correctly.
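A recursive search is one quick way to confirm whether the file exists anywhere under the install directory. The following is only a minimal sketch: the helper name `find_r_script` is hypothetical, and the path and file name are copied from the error message above, so adjust them for your own installation.

```shell
# Search DIR recursively for a file named NAME; print any matches, or nothing.
# (find_r_script is a hypothetical helper, not part of SVToolkit.)
find_r_script() {
    find "$1" -type f -name "$2" 2>/dev/null || true
}

# Path and file name are taken from the error message; adjust for your install.
sv_dir="${SV_DIR:-/cm/shared/apps/svtoolkit/2.0.1602/}"
hits=$(find_r_script "$sv_dir" "estimate_cnv_allele_frequencies.R")

if [ -n "$hits" ]; then
    echo "Found:"
    echo "$hits"
else
    echo "No copy of the script under $sv_dir"
fi
```

If nothing is found, the file is not present as a plain file under that directory, which narrows the problem to the installation rather than the path.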

Also, I was wondering how to use the variants generated by HaplotypeCaller to phase the CNV calls. I have maternal but not paternal genotypes... which sounds like writing a new program altogether.

Anyway, my (non-working) code is:

    SV_DIR=/cm/shared/apps/svtoolkit/2.0.1602/

    which java > /dev/null || exit 1
    which Rscript > /dev/null || exit 1
    which samtools > /dev/null || exit 1

    mx="-Xmx4g"
    classpath="${SV_DIR}/lib/SVToolkit.jar:${SV_DIR}/lib/gatk/GenomeAnalysisTK.jar"

    java -Xmx4g -cp ${classpath} org.broadinstitute.sv.apps.GenerateHaploidCNVGenotypes \
        -R ${ref_seq} \
        -vcf ${vcf_path}/${vcf} \
        -O lhm_rg_cnvGenos_raw.vcf \
        -ploidyMapFile ${ploidy} \
        -genderMapFile ${gender_map} \
        -estimateAlleleFrequencies true \
        -genotypeLikelihoodThreshold 0.001 \
        -debug true \
        --verbose true || exit

Cheers,

Will
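One thing worth checking in the script above: SV_DIR is assigned as a plain shell variable and never exported. A plain shell variable is expanded by the shell itself (so `${SV_DIR}` in the classpath works), but child processes such as `java` do not inherit it. The sketch below illustrates only that general shell behaviour. It uses `sh -c` as a stand-in child process and the path from the post, and whether this is the cause of the error here is an assumption, not something the thread establishes.

```shell
# Start from a clean state so the demonstration is not affected by the caller.
unset SV_DIR

# Plain assignment: visible to this shell only.
SV_DIR=/cm/shared/apps/svtoolkit/2.0.1602/

# A child process (sh here, standing in for java) does not see it.
unexported=$(sh -c 'printf "%s" "${SV_DIR:-unset}"')

# After export, child processes inherit the variable.
export SV_DIR
exported=$(sh -c 'printf "%s" "${SV_DIR:-unset}"')

echo "before export: ${unexported}"
echo "after export:  ${exported}"
```

In a job script, the usual pattern is to write `export SV_DIR=...` once near the top, before any command that depends on it.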

Answers

  • bhandsaker, Member, Broadie, Moderator admin

    Try export SV_DIR=.... It's an environment variable.

  • Will_Gilks, University of Sussex, UK, Member ✭✭

    @bhandsaker I'm assuming you meant export SV_DIR=/cm/shared/apps/svtoolkit/2.0.1602/.

    Anyway, it returns the "can't find R script" error under every variation I try.

    Should I be able to locate the script in /cm/shared/apps/svtoolkit/2.0.1602/? It's not in R/ or genotyping/.

  • Will_Gilks, University of Sussex, UK, Member ✭✭

    Hi @bhandsaker Many thanks, and happy travels.

  • Will_Gilks, University of Sussex, UK, Member ✭✭

    Hi @bhandsaker

    GenerateHaploidCNVGenotypes worked. I've attached screenshots from the Integrative Genomics Viewer (IGV) of one whole chromosome arm of the resulting VCF.

    As before, homozygous variants are expected to be very rare due to our breeding design.

    1. There are three samples a bit before the middle that look dodgy, because their genotypes often differ from those of the other samples.

    2. There are a lot of loci that are called as heterozygous for all individuals, including the maternal lineage on the bottom row, which is pretty weird and probably not biological. I was thinking it might be to do with the sequencing chemistry, and how different library preparation methods deal with GC-rich and repetitive template. All of these samples were prepared using the Illumina Nextera method.

    3. Any idea how I can phase the 'haploid' CNV genotypes with SNP genotypes from HaplotypeCaller?
