SelectVariants According to AD Values

How can I select variants that have specific AD values?

Since the AD field has two values (reference depth, alternate allele depth), I couldn't figure out how to write the corresponding expression in my script. If I want to select variants whose alternate allele depth is greater than 100, which expression should I use (e.g. "AD[1] > 100")?

Also, I have come across inconsistencies between AF and AD values several times. In the example below, AD shows a reference depth of 69 and an alternate allele depth of 78, which should indicate an AF of about 0.53. However, UnifiedGenotyper reports AF as 1.00. What could be the reason for this?

CM001522.1 69914 . T C 2657 . AC=1;AF=1.00;AN=1;BaseQRankSum=3.287;DP=147;Dels=0.00;FS=12.559;HaplotypeScore=94.3377;MLEAC=1;MLEAF=1.00;MQ=41.70;MQ0=41;MQRankSum=4.393;QD=18.07;ReadPosRankSum=1.197;SOR=4.017 GT:AD:DP:GQ:PL 1:69,78:147:99:2687,0

Thank You.

Best Answer

  • shleeshlee Cambridge admin
    Accepted Answer

    Hi @CanH,

    Check out our documentation on JEXL expressions here. Discussion of AD is under Using JEXL to evaluate arrays.
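To make the array indexing concrete outside of JEXL, here is a minimal Python sketch (not GATK code) that applies the "alternate allele depth greater than 100" test to the example record from the question. The function names `alt_depths` and `passes_alt_depth` are hypothetical, the INFO column is shortened for readability, and the sketch assumes a single-sample record whose FORMAT column includes AD. AD index 0 is the reference depth; index 1 onward are the ALT depths.

```python
def alt_depths(vcf_line):
    """Return the ALT-allele depths (AD entries after the first) of the first sample."""
    fields = vcf_line.split()  # VCF is tab-delimited; split() also accepts spaces
    format_keys = fields[8].split(":")
    sample_values = fields[9].split(":")
    sample = dict(zip(format_keys, sample_values))
    ad = sample.get("AD", "")
    # Skip missing entries, which VCF writes as "."
    depths = [int(x) for x in ad.split(",") if x not in ("", ".")]
    return depths[1:]  # drop the reference depth at index 0


def passes_alt_depth(vcf_line, min_depth=100):
    """True if any ALT allele has a depth strictly greater than min_depth."""
    return any(depth > min_depth for depth in alt_depths(vcf_line))


# Example record from the question (assumption: INFO trimmed to a few keys).
record = ("CM001522.1 69914 . T C 2657 . "
          "AC=1;AF=1.00;AN=1;DP=147 GT:AD:DP:GQ:PL 1:69,78:147:99:2687,0")

print(alt_depths(record))        # [78]
print(passes_alt_depth(record))  # False, because 78 is not greater than 100
```

For SelectVariants itself, the array examples in the linked documentation take a form like `vc.getGenotype("NA12878").getAD().0 > 10`, if I recall correctly; please confirm the exact syntax there before relying on it.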

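    The INFO field annotations describe the cohort as a whole, and here I believe AF refers to allele frequency, i.e. the fraction of called alleles that are ALT (AC/AN), not the fraction of reads that support ALT. In your record AC=1 and AN=1, so AF=1.00 regardless of the AD values. Your VCF header should describe the annotation, e.g.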

    ##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
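As a rough check of the two quantities being compared, the Python sketch below recomputes both from the example record: the allele frequency from AC and AN in the INFO column, and the fraction of reads supporting ALT from the sample's AD. The helper `info_dict` and the variable names are hypothetical, the INFO column is shortened, and a single ALT allele and a single sample are assumed.

```python
def info_dict(info):
    """Parse a VCF INFO string such as 'AC=1;AF=1.00' into a dict of strings."""
    parsed = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        parsed[key] = value
    return parsed


# Example record from the question (assumption: INFO trimmed to a few keys).
record = ("CM001522.1 69914 . T C 2657 . "
          "AC=1;AF=1.00;AN=1;DP=147 GT:AD:DP:GQ:PL 1:69,78:147:99:2687,0")

fields = record.split()
info = info_dict(fields[7])
sample = dict(zip(fields[8].split(":"), fields[9].split(":")))

# Allele frequency over called genotypes: ALT allele count / total called alleles.
af_from_counts = int(info["AC"]) / int(info["AN"])

# Fraction of reads supporting ALT, taken from the per-sample AD field.
ref_depth, alt_depth = (int(x) for x in sample["AD"].split(","))
alt_read_fraction = alt_depth / (ref_depth + alt_depth)

print(af_from_counts)               # 1.0
print(round(alt_read_fraction, 2))  # 0.53
```

The genotype in this record is a single called allele (GT is 1), so AC/AN can only be 0 or 1 here, while the read fraction can take any value in between.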
    

    You can find more information on VCF formats here.

    If you subset your callset, e.g. using SelectVariants, then depending on your settings, the tool may have adjusted the INFO field values to reflect the samples that remain in your final callset. You can avoid this with the options listed in each tool's documentation, e.g. --keepOriginalAC.
