
MESSAGE: BUG: requested unknown contig=ERCC-00002 index=-1


I'm currently running variant calling on RNA-seq data from the ENCODE Project. To streamline the process, I downloaded their previously aligned RNA-seq data (they used the STAR aligner). I then planned to add read groups, sort and mark duplicates, reassign mapping qualities, and recalibrate before variant calling. However, at the Split'N'Trim step that reassigns mapping qualities, I hit the following error:

MESSAGE: BUG: requested unknown contig=ERCC-00002 index=-1

I saw a previous thread where somebody had the same issue and was advised to use -fixNDN, but I was wondering if anybody could weigh in on what causes this error and whether previously aligned data is okay to use with the Best Practices workflow.

By the way, for adding read groups and marking duplicates, I used the basic parameters outlined in this article: https://software.broadinstitute.org/gatk/guide/article?id=3891, making sure to edit the read group information for my data.

For the split and trim step that caused the error, this is what was used:

java -jar GenomeAnalysisTK.jar -T SplitNCigarReads -R hg38.fasta -I dedupped.bam -o split.bam -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 -U ALLOW_N_CIGAR_READS


  • toledo32325 (Toledo, Member)
    edited January 2017

    Sorry to double-post, but just to update: even adding the recommended -fixNDN didn't work, and the same error resulted: ERROR MESSAGE: BUG: requested unknown contig=ERCC-00002 index=-1. The error comes up very close to the end of the run (in fact, there were only 2 minutes left when it appeared).

    I should also mention that I'm using the same reference genome that ENCODE used for alignment (GRCh38). I also indexed it using CreateSequenceDictionary and Samtools.

    Is there something I'm missing here?

  • Sheila (Broad Institute; Member, Broadie admin)


    Can you please post the BAM header (specifically the SQ lines) and the FASTA dict file?
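    Once both files are in hand, the comparison could be sketched roughly as below. This is a minimal illustration, not a GATK tool: the function names `sq_contigs` and `compare_headers` are hypothetical, and the header text is toy data rather than the poster's actual files. It relies only on the fact that a BAM header and a sequence dictionary both list contigs on `@SQ` lines with `SN:` (name) and `LN:` (length) fields.

```python
def sq_contigs(header_text):
    """Parse @SQ lines of SAM-style header text into {contig name: length}."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Fields are tab-separated TAG:VALUE pairs; split on the first colon
        # only, since values such as UR:file:/path contain colons themselves.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if "SN" in fields:
            contigs[fields["SN"]] = int(fields["LN"]) if "LN" in fields else None
    return contigs


def compare_headers(bam_header_text, dict_text):
    """Return (contigs in the BAM but not the dict, contigs whose lengths differ)."""
    bam = sq_contigs(bam_header_text)
    ref = sq_contigs(dict_text)
    missing = sorted(name for name in bam if name not in ref)
    mismatched = sorted(
        name for name in bam if name in ref and bam[name] != ref[name]
    )
    return missing, mismatched


# Toy data for illustration only (the ERCC length is a placeholder value).
BAM_HEADER = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:ERCC-00002\tLN:1000\n"
)
REF_DICT = (
    "@HD\tVN:1.4\n"
    "@SQ\tSN:chr1\tLN:248956422\tUR:file:/refs/hg38.fasta\n"
)

missing, mismatched = compare_headers(BAM_HEADER, REF_DICT)
print("In BAM header but not in reference dict:", missing)
print("Present in both but with different lengths:", mismatched)
```

    For real data, the header text can be obtained with `samtools view -H dedupped.bam`, and the dictionary text is the contents of the .dict file that CreateSequenceDictionary wrote for the reference. Any contig reported as missing is one the BAM was aligned against but the supplied reference does not contain.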

