
Stars in GT, incompatible with SnpSift

BurgundiaPR, Lyon (France), Member
edited November 2015 in Ask the GATK team

Hi,

I have questions about using GenotypeGVCFs together with SelectVariants.

  • I have some loci with the genotype 0/2 for some samples and 0/1 for others. There are thus two different alleles, but the corresponding ALT is, for example, «G,*_». If the nucleotide G corresponds to 0/1, what exactly does 0/2 correspond to? In other words, what does this star mean?

  • How do I correctly extract the variants for each sample from the combined VCF? For a given sample, my command line is:

java -jar GenomeAnalysisTK.jar -T SelectVariants -R hg19_min_oldM.fa -V Combined.vcf -o Sample.vcf --excludeNonVariants -sn Sample

Nevertheless, it produces a VCF that SnpSift extractFields cannot use because of these stars.

  • Last question: I read that SelectVariants performs downsampling, but I do not understand what that means. As you can guess, I really want to keep all the variants that have been called by HaplotypeCaller, so I don't want SelectVariants to do anything other than extract all the variants for each sample.
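
On the first question, the digits in a GT field are indices into the allele list: 0 is REF, 1 is the first ALT entry, 2 is the second, and so on. A minimal Python sketch of that lookup follows. The record, sample names and genotypes are invented for illustration and are not taken from the poster's file, and `alleles_for_genotype` is a hypothetical helper, not part of GATK or SnpSift.

```python
def alleles_for_genotype(ref, alt_field, gt):
    """Translate a VCF GT string such as '0/2' into the allele strings it points to."""
    alleles = [ref] + alt_field.split(",")  # index 0 = REF, 1.. = ALT entries in order
    called = []
    for idx in gt.replace("|", "/").split("/"):
        # '.' marks a missing call, so there is no allele to look up
        called.append(None if idx == "." else alleles[int(idx)])
    return called


# Hypothetical record: REF=A, ALT=G,* (values invented for illustration)
ref, alt = "A", "G,*"
for sample, gt in [("S1", "0/1"), ("S2", "0/2"), ("S3", "./.")]:
    print(sample, gt, alleles_for_genotype(ref, alt, gt))
```

In this made-up record, 0/1 resolves to the REF base plus G, while 0/2 resolves to the REF base plus the `*` allele, which the VCF 4.2 specification reserves for an allele that is missing because of an upstream deletion.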

Sorry for all these questions, and thanks in advance to whoever can help me!

Cheers,
BPR

Best Answer

Answers

  • tommycarstensen, United Kingdom, Member

    Is this version 3.4-0 or 3.4-46?

  • BurgundiaPR, Lyon (France), Member

    It is version 3.4-0.

  • BurgundiaPR, Lyon (France), Member
    edited November 2015

    Thank you very much for your answer. I'll write to the SnpEff/SnpSift team to explain the issue to them.

    But until they take these GATK changes into account, I have to handle these stars. After SelectVariants, in my per-sample VCF, can I replace the star with nothing to make the file compatible with these tools? Or is there another way to represent this deletion? Normally, deletions are written in the form REF="GA", ALT="G", that is, with a multi-base REF. Here, I have only one base for REF.
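
One cautious stopgap, sketched here as an assumption rather than a recommended GATK or SnpSift workflow, is to leave the alleles untouched and instead set aside every record whose ALT column lists `*`, so that the remaining records can go through a tool that does not understand the star. The function names below are hypothetical, and the sketch assumes a tab-delimited VCF with ALT in the fifth column.

```python
def alt_lists_star(vcf_line):
    """Return True if the ALT column (5th field) of a VCF data line contains '*'."""
    fields = vcf_line.rstrip("\n").split("\t")
    return "*" in fields[4].split(",")


def split_star_records(lines):
    """Split VCF lines into (kept, set_aside); header lines always stay in kept."""
    kept, set_aside = [], []
    for line in lines:
        if line.startswith("#") or not alt_lists_star(line):
            kept.append(line)
        else:
            set_aside.append(line)  # records listing '*' are held back, not edited
    return kept, set_aside


# Invented example lines, not taken from the poster's file
example = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tA\tG,*\t50\t.\t.\tGT:DP\t0/2:12",
    "chr1\t200\t.\tC\tT\t50\t.\t.\tGT:DP\t0/1:20",
]
kept, set_aside = split_star_records(example)
print(len(kept), "kept,", len(set_aside), "set aside")
```

Holding the records back rather than rewriting ALT keeps the genotype indices and any per-allele annotations consistent, at the cost of losing those sites in the downstream tool until it supports the `*` allele.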

    Concerning downsampling, here is what is written on the SelectVariants manual page:

    **Downsampling settings
    This tool applies the following downsampling settings by default.

    Mode: BY_SAMPLE
    To coverage: 1,000**

    Cheers,
    BPR

  • BurgundiaPR, Lyon (France), Member

    Concerning the star/SnpSift issue, it has been fixed in recent versions of the tool.

    Mea culpa.

  • BurgundiaPR, Lyon (France), Member

    Nevertheless, I still have some doubts about the "downsampling settings" used by SelectVariants. If someone has an idea... ;)

    BPR

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie
    edited November 2015

    Hi @BurgundiaPR, the SelectVariants tool does not actually apply any downsampling. The settings that appear on the tool page are an unintended side effect of the automated documentation system. We will try to fix this soon to avoid confusing people. Sorry about that!

  • BurgundiaPR, Lyon (France), Member

    OK, thank you very much for your answer!

    BPR

  • Geraldine_VdAuwera, Cambridge, MA, Member, Administrator, Broadie

    The documentation issue raised in this thread has been fixed, FYI.
