
BAM with high depth + adaptive-pruning: variant called as HET but there are almost no ALT reads

lindenb (France, Member ✭✭)
edited September 16 in Ask the GATK team

Hi GATK team,

This is a follow-up to my previous question, https://gatkforums.broadinstitute.org/gatk/discussion/24252/, which was solved by using --adaptive-pruning.

I'm working with a BAM that has a high depth (HAPLOPLEX technology). Now I've got a variant called as HET with an AD of 18,14:

7   154461108   .   C   A   1218.60 .   AC=1;AF=0.500;AN=2;BaseQRankSum=-2.756;DP=36;ExcessHet=3.0103;FS=12.053;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=25.36;ReadPosRankSum=-1.963;SOR=0.061   GT:AD:DP:GQ:PL  0/1:18,14:32:99:1226,0,140
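For reference, the allele fraction implied by the AD field in the record above can be worked out with a small sketch like the following. This is only an illustration: the helper name `allele_fractions` is hypothetical, and it assumes a single-sample record whose FORMAT column contains AD.

```python
from typing import List


def allele_fractions(format_col: str, sample_col: str) -> List[float]:
    """Return the fraction of AD-counted reads per allele (REF first, then ALTs)."""
    # FORMAT keys and sample values are colon-separated and positionally paired.
    fields = dict(zip(format_col.split(":"), sample_col.split(":")))
    depths = [int(x) for x in fields["AD"].split(",")]
    total = sum(depths)
    if total == 0:
        return [0.0] * len(depths)
    return [d / total for d in depths]


# FORMAT and sample columns copied from the VCF record in this post.
fracs = allele_fractions("GT:AD:DP:GQ:PL", "0/1:18,14:32:99:1226,0,140")
print(fracs)  # [0.5625, 0.4375]
```

So HaplotypeCaller attributes about 44% of the 32 reads it counted as informative to the ALT allele, which is consistent with a 0/1 genotype given those counts.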

whereas samtools mpileup shows that the large majority (>90%) of the bases at this position are REF:

7   154461108   C   637 ....................................................................................................................................................................................................................................................................................................A...................................................................A....................................................................................................................................................................................A......A..................................A,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,   alnfnkmgnlhnllXcffmd^gijljlmnSomHlfknjnmclnnRcmXdhdoZmnnonlonOlonooolGfnlonnknmoncononJolenmnnjncn^Tokklooo^o<lfh<mggZoRkfmlgamgkgnkdmlcomkgboihgkmhonbkmmnoiokconommnlonnkn]kkkgohfnfakfoohnmnmmb<mnjodblnnll^lioloonoMoha`OcfoBf\amgakda_ngkknmWgklfkZmnhnRVnnnoogmcbnafnognojojoonWmmBhkQ_njhmkdnjkomokl^jindfbjl9gggnkdcknllkRgoik\omlkkm^lnhlkloTo^onndIonmnm]nnnjngmcgnnfoojnnnoonjenmnhlonoekoonooofoo_kZT^f<mkhkkZlknBgffmkmimhfonflfmmdkkl5mi]okmojmoRinnoojconknmokn^][moononnnjoGooocnolnoXX<n\gok<[lnndknnhfifm_akfkmmmlmllhgmnnmnomnmmnknnmmIoomjdomnokjdjonmbooonnjnhonooXmdno\oflnnbn<fXDFEHEEGE?HFHHHFFHHGFFH;=HHBHFHDH;EHHDFHFFH9HHHH>FGF<H\
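To put a number on the REF/ALT balance in a pileup line like the one above, the bases column can be tallied with a short parser. This is a minimal sketch, assuming the standard samtools pileup encoding (`.`/`,` for REF matches, letters for mismatches, `^` followed by a mapping-quality character at read starts, `$` at read ends, `+N`/`-N` followed by N bases for indels). The function name `count_pileup_bases` and the short test string are made up for illustration.

```python
import re


def count_pileup_bases(bases: str) -> dict:
    """Tally REF matches, mismatches and deletion placeholders in a pileup bases column."""
    counts = {"ref": 0, "alt": 0, "del": 0}
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":
            # Read start marker: the next character encodes mapping quality, skip both.
            i += 2
            continue
        if c in "+-":
            # Indel: sign, length digits, then that many inserted/deleted bases.
            m = re.match(r"\d+", bases[i + 1:])
            if m:
                i += 1 + len(m.group()) + int(m.group())
                continue
        if c in ".,":
            counts["ref"] += 1
        elif c in "*#":
            counts["del"] += 1
        elif c.upper() in "ACGTN":
            counts["alt"] += 1
        # '$' (read end) and any other symbol are ignored.
        i += 1
    return counts


# Short made-up example, not the real data from this post.
example = count_pileup_bases("..A,,^F.$+2AG,a")
print(example)  # {'ref': 6, 'alt': 2, 'del': 0}
```

Pasting the full bases column from the mpileup line into `count_pileup_bases` gives the raw ALT fraction as `alt / (ref + alt)`, which can then be compared with the AD values in the VCF.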

The VCF was called with:

module load gatk/4.1.2.0 && gatk --java-options " -Djava.io.tmpdir=." HaplotypeCaller --minimum-mapping-quality "20" -L "7:154460108-154462108" \
-R human_g1k_v37.fasta -I jeter.bam -O jeter.vcf \
--dont-use-soft-clipped-bases  --max-reads-per-alignment-start 2000 \
--graph-output jeter.dot --debug-graph-transformations \
--force-active --disable-optimizations  -bamout  jeter.bamout.bam \
--adaptive-pruning

Can you explain this HET genotype, please?

I put the files in the following shared folder: https://nextcloud-bird.univ-nantes.fr/index.php/s/HYsx26oRmrJNwAd

Thank you in advance,

Pierre

Answers

  • SkyWarrior (Turkey, Member ✭✭✭)

    Hi @lindenb

    Sorry if I'm butting in here. I tried this sample with the parameters below: the heterozygous call is now gone, but the homozygous call still remains.

    gatk --java-options " -Djava.io.tmpdir=." HaplotypeCaller --minimum-mapping-quality "20" -L "7:154460108-154462108" \
    -R human_g1k_v37.fasta -I jeter.bam -O jeter.vcf \
    --dont-use-soft-clipped-bases  -bamout  jeter.bamout.bam \
    --adaptive-pruning
    

    I removed --disable-optimizations, --force-active and --max-reads-per-alignment-start. Do these settings work for the other heterozygous variants that you were worried about in the previous post?

    I used GATK 4.1.3.0 btw.

  • lindenb (France, Member ✭✭)

    @SkyWarrior thanks, I get the same result too. But with many reads starting at the very same position, I think I need to set --max-reads-per-alignment-start, otherwise other variants might be missed (I have not tested this, though).

  • SkyWarrior (Turkey, Member ✭✭✭)

    Good to see that the result is reproducible. Another thing that worries me is that the homozygous variant turns into a heterozygous call once all your parameters are set, whereas removing three of them takes care of this issue as well. I guess a full trial (covering all the variants that you are concerned about) with only these two parameters might be a good start.
