
Error running Picard Tools v2.10.9

Hello, I'm trying to run MarkDuplicates in Picard Tools v2.10.9. The Java version is 1.8.0_144. The input is a .bam file that has been sorted and indexed with samtools (exome capture sequences from a HiSeq 4000).

I used the following command line call to try the run:
java -Xmx5g -jar /Users/paa9/Desktop/picard.jar MarkDuplicates I=CAALB_20170711_K00134_IL100090633_EC-12_L004_R1.fastq.gz.srt.bam O=CAALB_20170711_K00134_IL100090633_EC-12_L004_R1.fastq.gz.srt.rmDP.bam METRICS_FILE=CAALB_20170711_K00134_IL100090633_EC-12_L004_R1.fastq.gz.srt.rmDP.mtrc REMOVE_DUPLICATES=true

At first it looks like it is running successfully, with output like this:
INFO 2017-08-14 16:03:59 MarkDuplicates Start of doWork freeMemory: 1015051672; totalMemory: 1029177344; maxMemory: 4772593664
INFO 2017-08-14 16:03:59 MarkDuplicates Reading input file and constructing read end information.
INFO 2017-08-14 16:03:59 MarkDuplicates Will retain up to 17292006 data points before spilling to disk.
INFO 2017-08-14 16:04:08 MarkDuplicates Read 1,000,000 records. Elapsed time: 00:00:07s. Time for last 1,000,000: 7s. Last read position: JXUM01S000636:264,164
INFO 2017-08-14 16:04:08 MarkDuplicates Tracking 30491 as yet unmatched pairs. 4430 records in RAM.
INFO 2017-08-14 16:04:21 MarkDuplicates Read 2,000,000 records. Elapsed time: 00:00:20s. Time for last 1,000,000: 12s. Last read position: JXUM01S001819:138,692
INFO 2017-08-14 16:04:21 MarkDuplicates Tracking 41344 as yet unmatched pairs. 5 records in RAM.
INFO 2017-08-14 16:04:36 MarkDuplicates Read 3,000,000 records. Elapsed time: 00:00:36s. Time for last 1,000,000: 15s. Last read position: JXUM01S004018:43,724
INFO 2017-08-14 16:04:36 MarkDuplicates Tracking 48378 as yet unmatched pairs. 9 records in RAM.

Then here is the error message I get:

To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: /var/folders/t4/1tm2l_xd5r9c9lwhrjsfzwb80000gn/T/CSPI.4085908425341095212.tmp/4388.tmpnot found
at htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:64)
at htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:49)
at htsjdk.samtools.util.ResourceLimitedMap.get(ResourceLimitedMap.java:76)
at htsjdk.samtools.CoordinateSortedPairInfoMap.getOutputStreamForSequence(CoordinateSortedPairInfoMap.java:180)
at htsjdk.samtools.CoordinateSortedPairInfoMap.ensureSequenceLoaded(CoordinateSortedPairInfoMap.java:102)
at htsjdk.samtools.CoordinateSortedPairInfoMap.remove(CoordinateSortedPairInfoMap.java:86)
at picard.sam.markduplicates.util.DiskBasedReadEndsForMarkDuplicatesMap.remove(DiskBasedReadEndsForMarkDuplicatesMap.java:61)
at picard.sam.markduplicates.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:518)
at picard.sam.markduplicates.MarkDuplicates.doWork(MarkDuplicates.java:228)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:268)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:96)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:106)
Caused by: java.io.FileNotFoundException: /var/folders/t4/1tm2l_xd5r9c9lwhrjsfzwb80000gn/T/CSPI.4085908425341095212.tmp/4388.tmp (Too many open files in system)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at htsjdk.samtools.util.FileAppendStreamLRUCache$Functor.makeValue(FileAppendStreamLRUCache.java:61)
... 11 more

To check whether something is wrong with the input file, I ran ValidateSamFile in Picard Tools using the following call:

java -jar /Users/paa9/Desktop/picard.jar ValidateSamFile I=CAALB_20170711_K00134_IL100090633_EC-12_L004_R1.fastq.gz.srt.bam MODE=SUMMARY

And here is the terminal output I get:

HISTOGRAM java.lang.String

Error Type Count
ERROR:MATE_NOT_FOUND 56371
ERROR:MISSING_READ_GROUP 1
WARNING:RECORD_MISSING_READ_GROUP 4475203

Thank you for any suggestions!!!
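
One thing that may be worth checking, given the "Too many open files in system" cause in the trace: MarkDuplicates has a MAX_FILE_HANDLES_FOR_READ_ENDS_MAP option (documented default 8000), and the Picard documentation suggests setting it a little below the limit reported by `ulimit -n`. The sketch below is only an illustration of that idea. The helper name `suggest_handles` and the headroom of 50 are made up for this example, and since the message mentions the system-wide limit, the per-process limit may not be the whole story on this machine.

```shell
# Hypothetical helper: suggest a MAX_FILE_HANDLES_FOR_READ_ENDS_MAP value
# a little below a given open-file limit. The headroom of 50 is an
# arbitrary illustrative choice, not a Picard recommendation.
suggest_handles() {
    limit="$1"
    headroom=50
    case "$limit" in
        # "unlimited" or anything non-numeric: fall back to Picard's documented default
        ''|*[!0-9]*) echo 8000; return ;;
    esac
    if [ "$limit" -le "$headroom" ]; then
        echo 1
    else
        echo $(( limit - headroom ))
    fi
}

# Per-process soft limit on open files for the current shell
current_limit=$(ulimit -n)
handles=$(suggest_handles "$current_limit")
echo "ulimit -n = $current_limit; suggested MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=$handles"

# The original command could then be rerun with the extra argument appended:
# java -Xmx5g -jar /Users/paa9/Desktop/picard.jar MarkDuplicates \
#   I=CAALB_20170711_K00134_IL100090633_EC-12_L004_R1.fastq.gz.srt.bam \
#   O=CAALB_20170711_K00134_IL100090633_EC-12_L004_R1.fastq.gz.srt.rmDP.bam \
#   METRICS_FILE=CAALB_20170711_K00134_IL100090633_EC-12_L004_R1.fastq.gz.srt.rmDP.mtrc \
#   REMOVE_DUPLICATES=true MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=$handles
```

This does not address the MATE_NOT_FOUND and read group errors reported by ValidateSamFile, which would need to be looked at separately.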
