The current GATK version is 3.7-0

Is UGCalcLikelihoods available?

marcus10 Member Posts: 2

Hi,

We're interested in implementing Pasaniuc et al.'s (Nature Genetics 2012) method to use off-target next-generation sequencing data to 'genotype' some of our samples. In this paper, GATK’s UGCalcLikelihoods mode was used to compute the genotype likelihoods prior to imputation. Unfortunately, however, it doesn’t appear that UGCalcLikelihoods has been made available for public access. Is this true and can we anticipate that this might be available soon? Is there any way for members outside of Broad to access this?

If it is indeed unavailable, that is a bit of a shame, both because the method was published in such a high-profile journal and, more importantly, because I'm certain that other researchers would like to use it.

Can you suggest other tools that one might be able to use for this purpose?

Thanks!

Best Answer

  • ebanks Broad Institute Member, Broadie, Dev Posts: 692 admin
    Accepted Answer

    Hi there,

    UGCalcLikelihoods is a private tool used for QCing the calling and is not intended for general use, so no, it will not be made available. That being said, had my colleagues who wrote that paper consulted me, I would have suggested an alternative approach that would be easier (while still achieving the same results).

    You should use the Unified Genotyper, with '--genotyping_mode GENOTYPE_GIVEN_ALLELES' (since you want likelihoods for just the 1000 Genomes alleles), '--alleles 1000Genomes.vcf -L 1000Genomes.vcf' (to tell it which alleles from 1000 Genomes to genotype and just process those locations), and '--output_mode EMIT_ALL_SITES' (to produce likelihoods even at low confidence sites). This produces the same file that was described in the paper.
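
Put together, the flags in the answer above might be combined into a single GATK 3.x command roughly as follows. This is a minimal sketch, not a command taken from the thread: the jar name `GenomeAnalysisTK.jar`, the file names `reference.fasta`, `sample.bam` and `likelihoods.vcf`, and the helper `build_ug_cmd` are placeholders assumed for illustration, and `-T`, `-R`, `-I` and `-o` are the usual GATK 3 engine arguments rather than anything stated in the answer. The script only prints the command so it can be inspected before running.

```shell
#!/bin/sh
# Sketch only: prints a GATK 3.x UnifiedGenotyper command instead of running it.
# build_ug_cmd is a hypothetical helper; all file names below are placeholders.

build_ug_cmd() {
    # $1 = reference FASTA, $2 = input BAM,
    # $3 = VCF of known alleles (e.g. 1000 Genomes), $4 = output VCF
    printf '%s\n' "java -jar GenomeAnalysisTK.jar -T UnifiedGenotyper -R $1 -I $2 --genotyping_mode GENOTYPE_GIVEN_ALLELES --alleles $3 -L $3 --output_mode EMIT_ALL_SITES -o $4"
}

# The same VCF is passed to --alleles (which alleles to genotype)
# and to -L (restrict processing to those sites).
build_ug_cmd reference.fasta sample.bam 1000Genomes.vcf likelihoods.vcf
```

Copy the printed line into a terminal once the paths match your own data; the genotype likelihoods should then appear in the per-sample fields of the output VCF.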

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

Answers

  • marcus10 Member Posts: 2

    Thank you very much for both a prompt and extremely useful reply!

    Best regards!