(howto) Compress read data with ReduceReads - DEPRECATED
Please note that this article refers to a method that is no longer recommended as part of the Best Practices!
Compress the read data in order to minimize file sizes, which facilitates massively multisample processing.
- Compress your sequence data
1. Compress your sequence data
Run the following GATK command:
    java -jar GenomeAnalysisTK.jar \
        -T ReduceReads \
        -R reference.fa \
        -I recal_reads.bam \
        -L 20 \
        -o reduced_reads.bam
This creates a file called reduced_reads.bam containing only the sequence information that is essential for calling variants.
Note that ReduceReads is not meant to be run on multiple samples at once. If you plan to merge your per-sample BAM files, run ReduceReads on each sample individually before merging.
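As a rough illustration of the per-sample rule above, the following sketch wraps the command from step 1 in a loop that calls ReduceReads once per input BAM. Everything beyond the original command is an assumption made for the example: the helper names (run, reduced_name, reduce_one, reduce_all), the DRY_RUN switch, the sample file names, and the .reduced.bam output naming are all hypothetical. The -L 20 interval argument is left out here; add it back if you want the same restriction as in step 1.

```shell
#!/bin/sh
# Sketch only: run ReduceReads once per sample, before any merging.
# All helper names, DRY_RUN, and the output naming scheme are hypothetical.

GATK_JAR="${GATK_JAR:-GenomeAnalysisTK.jar}"
REFERENCE="${REFERENCE:-reference.fa}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print each command; 0 = execute it

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Derive the output name from the input name, e.g. a.bam -> a.reduced.bam
reduced_name() {
    printf '%s.reduced.bam\n' "${1%.bam}"
}

# Reduce exactly one sample BAM (same arguments as step 1, minus -L).
reduce_one() {
    run java -jar "$GATK_JAR" \
        -T ReduceReads \
        -R "$REFERENCE" \
        -I "$1" \
        -o "$(reduced_name "$1")"
}

# One ReduceReads invocation per sample; stop at the first failure.
reduce_all() {
    for sample_bam in "$@"; do
        reduce_one "$sample_bam" || return 1
    done
}
```

With the defaults, calling reduce_all sampleA.bam sampleB.bam only prints the two commands so they can be checked first; setting DRY_RUN=0 executes them. Any merging would then be done on the resulting reduced files.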