
GATK ClipReads option

JCGrenier (Montreal, QC), Member

Hi folks,

I have a question about the ClipReads tool and the functionality it offers. The --clipRepresentation option and its possible values don't seem to do what they're supposed to. I tried the REVERT_SOFTCLIPPED_BASES and HARDCLIP_BASES values, and both seem ineffective when the input BAM files come from bwa.

HARDCLIP_BASES seems to do what REVERT_SOFTCLIPPED_BASES is supposed to do. I ended up using it, since it does what I want. However, would it be possible to add an option that changes the base qualities of the reverted bases? We do not necessarily want to include them, and lower qualities would let us exclude them with a base quality (BQ) threshold.

Thanks a lot!


  • Geraldine_VdAuwera (Cambridge, MA), Member, Administrator, Broadie admin

    Hi there,

    Can you tell me what version you're using, and also post an example of what you're seeing with each option? I'd like to make sure we're distinguishing a potential problem with the documentation of the options from a bug in the tool's behavior.

  • JCGrenier (Montreal, QC), Member
    edited September 2013


    I'm using GATK version 2.4-9-g532efad. I also checked with the latest version, and it behaves the same way.

    Here's an example of the two commands applied to a single read:

    Here's what the read looks like originally:

    HWI-ST0860:271:H0K9KADXX:1:1114:4088:42468 137 101510.NC_008268 9948 2 67S31M1S = 9948 0 TGCGGCACCGCTACCACTCCACACCGTTGGCGTATCACCGTCGCAATTGGTACCGTCACCACCGTAGCCGCCACCGTGACCGCCACCGCCACCGCCGAT [email protected];[email protected];@>A>59=?CDDDB<[email protected]@DD>B98??@C<[email protected]@D><[email protected]@ MD:Z:9C0C20 RG:Z:AFR XG:i:0 NM:i:2 XM:i:2 XN:i:0 XO:i:0 AS:i:49 XS:i:47 YT:Z:UP

    Base command:

    java -Xmx1g -jar /home/apps/Logiciels/GATK/GenomeAnalysisTK-2.4-9-g532efad/GenomeAnalysisTK.jar -T ClipReads -l INFO -I test.bam -o test.hardclip.bam -R $PATH_TO_REF/RepGenomes.fa -CR $OPTION

    So I guess the last two commands can't process files whose reads are already soft-clipped?

    Thanks for your help.

  • Carneiro (Charlestown, MA), Member admin

    Hi JC,

    This tool was written a long time ago, and since it didn't get much use, it hasn't been updated in a while. That's not to say it doesn't do what it was written to do, but there are some caveats, which I'll try to elucidate here.

    • You are not asking to clip any bases from the sequence, so the behavior is correct. There are many ways to tell the tool how you want to clip bases, the simplest of all being the -CT option. That being said, all the outputs look right to me.
    • The option to explicitly revert soft clipped bases was never implemented in this tool (it is implemented internally in the code as an API for other tools that perform that action such as ReduceReads, thus it became visible through the documentation). This is very easy to add though, so I'll do this today.
    • The HARDCLIP_BASES action's first (necessary) step is to revert the softclipped bases so it is able to hard clip whatever tail you request without bumping into soft-clips. That's how it works. Since you are not requesting any base to be hard clipped, it's only reverting soft-clips.
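To illustrate the last point, here is a minimal Python sketch of what reverting soft clips does to a read's alignment record. It is not GATK's code: the function names are hypothetical, and the sketch assumes the shifted start position stays on the reference (at least 1). How GATK itself handles that edge case is not covered in this thread.

```python
import re

# One CIGAR element: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, op) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def revert_soft_clips(pos, cigar):
    """Turn soft-clipped bases (S) into aligned bases (M).

    pos is the 1-based leftmost aligned position (SAM POS field).
    Returns the new (pos, cigar). Simplified: raises if the shifted
    start would fall before the first reference base.
    """
    ops = parse_cigar(cigar)

    # A leading soft clip moves the alignment start to the left.
    for n, op in ops:
        if op == "H":      # hard clips carry no bases, skip them
            continue
        if op == "S":
            pos -= n
        break
    if pos < 1:
        raise ValueError("reverted read would start before the reference")

    # Relabel every soft clip as a match, then merge adjacent equal ops.
    merged = []
    for n, op in ops:
        op = "M" if op == "S" else op
        if merged and merged[-1][1] == op:
            merged[-1] = (merged[-1][0] + n, op)
        else:
            merged.append((n, op))
    return pos, "".join(f"{n}{op}" for n, op in merged)


if __name__ == "__main__":
    # Position and CIGAR taken from the read posted above.
    print(revert_soft_clips(9948, "67S31M1S"))
```

Under these assumptions, the read from the question (position 9948, CIGAR 67S31M1S) becomes a 99M alignment starting at 9881. With no bases requested for clipping, that reverted alignment is the only visible change.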
  • JCGrenier (Montreal, QC), Member

    Thanks for your answer! That explains why I'm getting those results. Would it be possible to add the option I mentioned earlier, to recode the qualities of the reverted bases? It would be really helpful.

    Thanks a lot!
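The request above could be sketched as follows. This is a hypothetical illustration of the idea in Python, not an existing ClipReads option: the function names and the default replacement quality of 2 are assumptions made for the example. It would have to run before the soft clips are reverted, while the CIGAR still records which bases were clipped.

```python
import re


def soft_clip_lengths(cigar):
    """Return (leading, trailing) soft-clip lengths of a CIGAR string."""
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    ops = [x for x in ops if x[1] != "H"]  # hard clips carry no bases
    lead = ops[0][0] if ops and ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return lead, trail


def recode_clipped_quals(quals, cigar, new_qual=2):
    """Set the quality of every soft-clipped base to new_qual.

    quals is a list of integer Phred scores, one per base in the read.
    new_qual=2 is an arbitrary low value chosen for this sketch.
    """
    lead, trail = soft_clip_lengths(cigar)
    out = list(quals)
    for i in range(lead):                         # leading clip
        out[i] = new_qual
    for i in range(len(out) - trail, len(out)):   # trailing clip
        out[i] = new_qual
    return out


if __name__ == "__main__":
    # Toy read: 2 clipped bases, 2 aligned bases, 1 clipped base.
    print(recode_clipped_quals([30, 31, 32, 33, 34], "2S2M1S"))
```

Downstream tools that filter on a minimum base quality would then skip the recoded bases, provided the threshold is set above the replacement value.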
