ClipReads error on mixed-length reads

DKOHSU, OHSU, Member

I have several exome files where I need to clip reads down to 50 bp. Unfortunately, some of my BAMs contain mixed-length reads. When I apply ClipReads to those files, it completes traversal but errors out near the end, possibly just before writing the new clipped BAM to disk. By contrast, it runs just fine on samples with consistent read lengths that have gone through the same upstream analysis pipeline. Is this perhaps a bug that could be fixed?

Thanks,
Deidre

Command line and error trace as follows:

java -Xmx5g -Djava.io.tmpdir=/data/kruppd \
-jar /home/groups/oroaklab/src/GATK/GATK-3.2.2/GenomeAnalysisTK.jar -T ClipReads \
-CT 51-93 -CR HARDCLIP_BASES \
-R /home/groups/oroaklab/refs/GRCh37/Homo_sapiens_assembly19.fasta \
-I /home/groups/oroakdata/SSC/reprocessed/bam/13818.fa.realigned.recal.bam \
-o /home/groups/oroakdata/SSC/reprocessed/bam/clipped/13818.fa.realigned.recal.clipped.bam

INFO 13:20:38,538 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:20:38,540 HelpFormatter - The Genome Analysis Toolkit (GATK) v3.2-2-gec30cee, Compiled 2014/07/17 15:22:03
INFO 13:20:38,540 HelpFormatter - Copyright (c) 2010 The Broad Institute
INFO 13:20:38,541 HelpFormatter - For support and documentation go to http://www.broadinstitute.org/gatk
INFO 13:20:38,544 HelpFormatter - Program Args: -T ClipReads -CT 51-93 -CR HARDCLIP_BASES -R /home/groups/oroaklab/refs/GRCh37/Homo_sapiens_assembly19.fasta -I /home/groups/oroakdata/SSC/reprocessed/bam/13818.fa.realigned.recal.bam -o /home/groups/oroakdata/SSC/reprocessed/bam/clipped/13818.fa.realigned.recal.clipped.bam
INFO 13:20:38,555 HelpFormatter - Executing as [email protected] on Linux 2.6.32-504.16.2.el6.x86_64 amd64; OpenJDK 64-Bit Server VM 1.7.0_85-mockbuild_2015_07_15_12_57-b00.
INFO 13:20:38,555 HelpFormatter - Date/Time: 2015/07/23 13:20:38
INFO 13:20:38,555 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:20:38,555 HelpFormatter - --------------------------------------------------------------------------------
INFO 13:20:38,651 GenomeAnalysisEngine - Strictness is SILENT
INFO 13:20:38,754 GenomeAnalysisEngine - Downsampling Settings: No downsampling
INFO 13:20:38,766 SAMDataSource$SAMReaders - Initializing SAMRecords in serial
WARNING: BAM index file /home/groups/oroakdata/SSC/reprocessed/bam/13818.fa.realigned.recal.bai is older than BAM /home/groups/oroakdata/SSC/reprocessed/bam/13818.fa.realigned.recal.bam
INFO 13:20:38,797 SAMDataSource$SAMReaders - Done initializing BAM readers: total time 0.03
INFO 13:20:38,903 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
INFO 13:20:38,908 GenomeAnalysisEngine - Done preparing for traversal
INFO 13:20:38,908 ProgressMeter - [INITIALIZATION COMPLETE; STARTING PROCESSING]
INFO 13:20:38,908 ProgressMeter - | processed | time | per 1M | | total | remaining
INFO 13:20:38,909 ProgressMeter - Location | reads | elapsed | reads | completed | runtime | runtime
INFO 13:20:38,909 ClipReads - Creating cycle clipper 50-92
INFO 13:20:38,911 ReadShardBalancer$1 - Loading BAM index data
INFO 13:20:38,912 ReadShardBalancer$1 - Done loading BAM index data
INFO 13:21:08,911 ProgressMeter - 1:40861936 1400017.0 30.0 s 21.0 s 1.3% 38.0 m 37.5 m
INFO 13:21:39,230 ProgressMeter - 1:118574196 3400042.0 60.0 s 17.0 s 3.8% 26.2 m 25.2 m
..
INFO 13:42:23,954 ProgressMeter - X:112926783 7.4461491E7 21.8 m 17.0 s 96.5% 22.5 m 47.0 s
INFO 13:42:54,685 ProgressMeter - GL000192.1:545144 7.6429948E7 22.3 m 17.0 s 100.0% 22.3 m 0.0 s
INFO 13:42:56,178 GATKRunReport - Uploaded run statistics report to AWS S3

ERROR ------------------------------------------------------------------------------------------
ERROR stack trace

java.lang.NullPointerException
at htsjdk.samtools.SAMRecordCoordinateComparator.compare(SAMRecordCoordinateComparator.java:51)
at htsjdk.samtools.SAMRecordCoordinateComparator.compare(SAMRecordCoordinateComparator.java:41)
at java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
at java.util.TimSort.sort(TimSort.java:203)
at java.util.Arrays.sort(Arrays.java:727)
at htsjdk.samtools.util.SortingCollection.spillToDisk(SortingCollection.java:218)
at htsjdk.samtools.util.SortingCollection.add(SortingCollection.java:165)
at htsjdk.samtools.SAMFileWriterImpl.addAlignment(SAMFileWriterImpl.java:179)
at org.broadinstitute.gatk.engine.io.storage.SAMFileWriterStorage.addAlignment(SAMFileWriterStorage.java:95)
at org.broadinstitute.gatk.engine.io.stubs.SAMFileWriterStub.addAlignment(SAMFileWriterStub.java:308)
at org.broadinstitute.gatk.tools.walkers.readutils.ClipReads.reduce(ClipReads.java:473)
at org.broadinstitute.gatk.tools.walkers.readutils.ClipReads.reduce(ClipReads.java:162)
at org.broadinstitute.gatk.engine.traversals.TraverseReadsNano$TraverseReadsReduce.apply(TraverseReadsNano.java:251)
at org.broadinstitute.gatk.engine.traversals.TraverseReadsNano$TraverseReadsReduce.apply(TraverseReadsNano.java:240)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.executeSingleThreaded(NanoScheduler.java:279)
at org.broadinstitute.gatk.utils.nanoScheduler.NanoScheduler.execute(NanoScheduler.java:245)
at org.broadinstitute.gatk.engine.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:102)
at org.broadinstitute.gatk.engine.traversals.TraverseReadsNano.traverse(TraverseReadsNano.java:56)
at org.broadinstitute.gatk.engine.executive.LinearMicroScheduler.execute(LinearMicroScheduler.java:108)
at org.broadinstitute.gatk.engine.GenomeAnalysisEngine.execute(GenomeAnalysisEngine.java:314)
at org.broadinstitute.gatk.engine.CommandLineExecutable.execute(CommandLineExecutable.java:121)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:248)
at org.broadinstitute.gatk.utils.commandline.CommandLineProgram.start(CommandLineProgram.java:155)
at org.broadinstitute.gatk.engine.CommandLineGATK.main(CommandLineGATK.java:107)

ERROR ------------------------------------------------------------------------------------------
ERROR A GATK RUNTIME ERROR has occurred (version 3.2-2-gec30cee):
ERROR
ERROR This might be a bug. Please check the documentation guide to see if this is a known problem.
ERROR If not, please post the error message, with stack trace, to the GATK forum.
ERROR Visit our website and forum for extensive documentation and answers to
ERROR commonly asked questions http://www.broadinstitute.org/gatk
ERROR
ERROR MESSAGE: Code exception (see stack trace for error itself)
ERROR ------------------------------------------------------------------------------------------
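For readers trying to picture what the requested clipping does to a mixed-length sample, here is a minimal Python sketch of removing cycles 51-93 (the `-CT 51-93` argument, which the log reports as the 0-based range 50-92). It is only an illustration, not GATK's implementation: the function name `hard_clip_cycles` and the toy reads are hypothetical, and the sketch ignores strand, base qualities, and CIGAR updates. It makes no claim about what causes the NullPointerException above.

```python
from collections import Counter


def hard_clip_cycles(seq, first_cycle, last_cycle):
    """Remove the 1-based, inclusive cycle range from a read sequence.

    Hypothetical helper for illustration only; it does not model
    strand orientation, qualities, or CIGAR strings.
    """
    start = first_cycle - 1            # 0-based start, e.g. 51 -> 50
    if start >= len(seq):
        return seq                     # read is too short to be clipped
    stop = min(last_cycle, len(seq))   # exclusive end, clamped to read length
    return seq[:start] + seq[stop:]


# Toy reads standing in for a mixed-length BAM (lengths 93, 76, 50, 36).
reads = ["A" * 93, "C" * 76, "G" * 50, "T" * 36]
clipped = [hard_clip_cycles(r, 51, 93) for r in reads]

# Tabulate the read lengths that remain after clipping.
lengths = Counter(len(r) for r in clipped)
print(sorted(lengths.items()))
```

Under these assumptions, reads longer than 50 bp are trimmed to 50 bp, while reads of 50 bp or fewer pass through unchanged, so a mixed-length input still yields mixed-length output.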
