The current GATK version is 3.7-0

DenmarkMember

Hi,

Is there a "Best Practices" recommendation for how to use ReadAdaptorTrimmer? It seems to me that there is a Catch-22 if one wants to use GATK and Picard together.

According to the ReadAdaptorTrimmer documentation: "Read data MUST be in query name ordering as produced, for example with Picard's FastqToBam". Therefore, I would start by doing

java -jar picard.jar FastqToSam FASTQ={r1_file} FASTQ2={r2_file} OUTPUT={bam_file} SM={sample} SORT_ORDER=queryname

to convert my FASTQ files into a sorted uBAM file. However, ReadAdaptorTrimmer requires the BAM file to be indexed, but if I then try

java -jar picard.jar BuildBamIndex INPUT={bam_file}

it fails because BuildBamIndex requires that the BAM file is sorted by coordinate (which does not make sense since the reads are not yet aligned).

Thanks,
Michael


@micknudsen
Hi Michael,

I have asked one of our team members to help with this one. She will get back to me sometime next week.

-Sheila

edited November 2015

Hi Michael,

I recommend two different Picard tools that, combined, achieve what I assume you need: MarkIlluminaAdapters and SamToFastq.

MarkIlluminaAdapters marks adapter sequences in the reads with the XT tag. To clip the marked sequences, use Picard's SamToFastq with CLIPPING_ATTRIBUTE=XT and a CLIPPING_ACTION of either (1) X to hard-clip, (2) N to change the bases to Ns, or (3) a number, e.g. 2, to set the base qualities at those positions to that value.
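A minimal sketch of the two steps, assuming a queryname-sorted unmapped BAM such as the one FastqToSam produces. The file names, the `PICARD_JAR` variable and the `run` helper are placeholders made up for illustration; by default the script only prints the commands so they can be checked first.

```shell
#!/usr/bin/env bash
# Sketch only: file names and the PICARD_JAR variable are placeholders, not from this thread.
PICARD_JAR="${PICARD_JAR:-picard.jar}"

# Print a command; execute it only when DRY_RUN=0 (the default is a dry run).
run() {
  echo "$*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Step 1: mark adapter sequence in each read with the XT tag.
# $1 = input unmapped BAM, $2 = output BAM with XT tags, $3 = metrics file.
mark_adapters() {
  run java -jar "$PICARD_JAR" MarkIlluminaAdapters \
    INPUT="$1" OUTPUT="$2" METRICS="$3"
}

# Step 2: write FASTQ, applying the clipping action to the bases marked by XT.
# $1 = XT-tagged BAM, $2 = output FASTQ,
# $3 = clipping action: X (hard-clip), N (change bases to Ns) or a number (base quality).
clip_to_fastq() {
  run java -jar "$PICARD_JAR" SamToFastq \
    INPUT="$1" FASTQ="$2" \
    CLIPPING_ATTRIBUTE=XT CLIPPING_ACTION="$3" \
    INTERLEAVE=true
}

mark_adapters sample_unmapped.bam sample_marked.bam sample_adapter_metrics.txt
clip_to_fastq sample_marked.bam sample_clipped_interleaved.fastq 2
```

Set `DRY_RUN=0` to actually execute the commands once the printed lines look right. `INTERLEAVE=true` writes both reads of a pair to one FASTQ; drop it and add `SECOND_END_FASTQ` if your aligner expects two files.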

Remember that you can restore original read sequences and base qualities, amongst other attributes, after alignment using Picard's MergeBamAlignment.
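A rough sketch of that merge step, assuming the clipped reads have already been aligned with an aligner of your choice and that the reference FASTA has a sequence dictionary next to it. All file names and the `merge_cmd` helper are placeholders for illustration; the function only prints the command line.

```shell
#!/usr/bin/env bash
# Sketch only: all file names are placeholders, not from this thread.
PICARD_JAR="${PICARD_JAR:-picard.jar}"

# Print a MergeBamAlignment command line.
# $1 = aligned BAM/SAM from the aligner
# $2 = queryname-sorted unmapped BAM holding the original reads
# $3 = merged output BAM
# $4 = reference FASTA
merge_cmd() {
  echo java -jar "$PICARD_JAR" MergeBamAlignment \
    ALIGNED_BAM="$1" UNMAPPED_BAM="$2" \
    OUTPUT="$3" REFERENCE_SEQUENCE="$4"
}

merge_cmd sample_aligned.bam sample_unmapped.bam sample_merged.bam reference.fasta
```

Because the original sequences and qualities come from the unmapped BAM, it is worth keeping that file around until the merge has finished.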

These recommendations aside, I was able to reproduce your errors using my own file. The errors persist even when the commands are run in unsafe mode, designated with `-U`, which allows GATK commands to process files without indexes. Since I am new to the GATK team, I had to ask around to find out that ReadAdaptorTrimmer isn't on the team's radar; that is, we don't use it. Its presence is a vestige of earlier development. The tool blindly strips what it assumes are adaptor sequences, but these are technically just the sequences 3' of overlapping sequences of a certain length. If you are processing sequencing samples with typical aims, I would strongly discourage using any tool that doesn't specifically take the adapter sequences into account when trimming.

I hope I've been helpful. Let me know if I can clarify any points.

DenmarkMember

Thanks, @shlee! I will go ahead and try your approach. I will let you know if I run into something unexpected.