UmiAwareMarkDuplicatesWithMateCigar RX problem

Hello, I'm using Picard's duplicate marking (UmiAwareMarkDuplicatesWithMateCigar) on a sample prepared with the Qiagen myeloid panel (amplicon-based, single-end primer extension, paired-end reads with UMIs). However, I get the following error. Do I need any information about the proprietary UMI to be able to use the mark-duplicates function, or is the error caused by some other problem? Thank you very much!

INFO 2019-03-30 10:55:14 UmiAwareMarkDuplicatesWithMateCigar

********** NOTE: Picard's command line syntax is changing.
**********
********** For more information, please see:
**********
********** The command line looks like this in the new syntax:
**********
********** UmiAwareMarkDuplicatesWithMateCigar -I 09H787BM.aligned.sorted.bam -O 09H787BM.aligned.md.bam -M output_duplicate_metrics.txt -UMI_METRICS output_umi_metrics.txt
**********


10:55:14.729 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/operation/RedCellNGS/tools/picard/picard/picard.jar!/com/intel/gkl/native/libgkl_compression.so
[Sat Mar 30 10:55:14 HKT 2019] UmiAwareMarkDuplicatesWithMateCigar UMI_METRICS_FILE=output_umi_metrics.txt INPUT=[09H787BM.aligned.sorted.bam] OUTPUT=09H787BM.aligned.md.bam METRICS_FILE=output_duplicate_metrics.txt MAX_EDIT_DISTANCE_TO_JOIN=1 UMI_TAG_NAME=RX ALLOW_MISSING_UMIS=false MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 TAG_DUPLICATE_SET_MEMBERS=false REMOVE_SEQUENCING_DUPLICATES=false TAGGING_POLICY=DontTag CLEAR_DT=true DUPLEX_UMI=false ADD_PG_TAG_TO_READS=true REMOVE_DUPLICATES=false ASSUME_SORTED=false DUPLICATE_SCORING_STRATEGY=SUM_OF_BASE_QUALITIES PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=UmiAwareMarkDuplicatesWithMateCigar READ_NAME_REGEX=<optimized capture of last three ':' separated fields as numeric values> OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 MAX_OPTICAL_DUPLICATE_SET_SIZE=300000 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json USE_JDK_DEFLATER=false USE_JDK_INFLATER=false
[Sat Mar 30 10:55:14 HKT 2019] Executing as [email protected] on Linux 4.15.0-46-generic amd64; OpenJDK 64-Bit Server VM 1.8.0_192-b01; Deflater: Intel; Inflater: Intel; Provider GCS is not available; Picard version: 2.18.26-SNAPSHOT
INFO 2019-03-30 10:55:33 SortingCollection Creating merging iterator from 4 files
[Sat Mar 30 10:55:33 HKT 2019] picard.sam.markduplicates.UmiAwareMarkDuplicatesWithMateCigar done. Elapsed time: 0.32 minutes.
Runtime.totalMemory()=5966921728
Exception in thread "main" picard.PicardException: Read M01772:224:000000000-CC9BK:1:1108:22328:6034 does not contain a UMI with the RX attribute.
at picard.sam.markduplicates.UmiGraph.<init>(UmiGraph.java:85)
at picard.sam.markduplicates.UmiAwareDuplicateSetIterator.process(UmiAwareDuplicateSetIterator.java:137)
at picard.sam.markduplicates.UmiAwareDuplicateSetIterator.next(UmiAwareDuplicateSetIterator.java:119)
at picard.sam.markduplicates.UmiAwareDuplicateSetIterator.next(UmiAwareDuplicateSetIterator.java:53)
at picard.sam.markduplicates.SimpleMarkDuplicatesWithMateCigar.doWork(SimpleMarkDuplicatesWithMateCigar.java:133)
at picard.sam.markduplicates.UmiAwareMarkDuplicatesWithMateCigar.doWork(UmiAwareMarkDuplicatesWithMateCigar.java:138)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:295)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
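
The log above shows the tool running with UMI_TAG_NAME=RX and ALLOW_MISSING_UMIS=false, and the exception reports a read that has no RX attribute. One way to see how many reads are affected is to scan the alignments as SAM text and list the records without that tag. The following is only a minimal sketch: it assumes the BAM has been converted to tab-separated SAM lines (for example with `samtools view`), and the function name `reads_missing_tag` and the two sample records are hypothetical, made up for illustration.

```python
def reads_missing_tag(sam_lines, tag="RX"):
    """Yield the names of SAM records that carry no optional field named `tag`."""
    prefix = tag + ":"
    for line in sam_lines:
        # Skip blank lines and header lines (SAM header lines start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        # The first 11 columns are mandatory; optional TAG:TYPE:VALUE fields follow.
        optional_fields = fields[11:]
        if not any(field.startswith(prefix) for field in optional_fields):
            yield fields[0]


# Hypothetical records for illustration: the first has an RX tag, the second does not.
example_records = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tRX:Z:ACGTAC",
    "read2\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII\tNM:i:0",
]
missing = list(reads_missing_tag(example_records))
print(len(missing), "record(s) without RX:", missing)
```

If every read is reported as missing, the UMIs were most likely never written into an RX tag during preprocessing (they may still be in the read name or in a separate FASTQ). If only a few reads are missing the tag, the input may be a mix of tagged and untagged reads.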