-rf BadCigar

newbie16 Member Posts: 38
edited August 2012 in Ask the GATK team

Hello,

I have a BAM file in which a few reads have CIGAR strings that start with deletions. For example: 440H1D33M1I1D33M.
I am trying to run BaseRecalibrator (2.0 beta) on this file. However, I get the error below:

"##### ERROR MESSAGE: SAM/BAM file SAMFileReader{/home/datarig/CGP/GatkAnalysis/NG_2012_05_10_v2.6/WithGATK2.0/with_SamV2/NG_R1/test.ordered.sorted.realigned.bam} is malformed: Read starting with deletion. Cigar: 440H1D33M1I1D33M. This is an indication of a malformed file, but the SAM spec allows reads starting in deletion. If you are sure you want to use this read, re-run your analysis with the extra option: -rf BadCigar"

However, if I use the -rf BadCigar filter, I still get the same error. The command I used is pasted below.

"java -Xmx4g -jar GenomeAnalysisTK.jar -T BaseRecalibrator -I test.bam -R ucsc.hg19.fasta -knownSites dbsnp_135.hg19.vcf -o recal_data.grp -rf BadCigar"

Could you please let me know what I am doing wrong?

Thanks
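
To make the CIGAR in the error message concrete, here is a minimal Python sketch that splits a CIGAR string into its elements and checks whether the first operator after any leading clipping (H or S) is a deletion. The function names `parse_cigar` and `starts_with_deletion` are made up for illustration, and the check only approximates the condition named in the error message; it is not GATK's own code.

```python
import re

# One CIGAR element: a length followed by an operator from the SAM spec.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string into (length, operator) pairs."""
    elements = CIGAR_ELEMENT.findall(cigar)
    # Reject strings that the pattern does not cover completely.
    if "".join(length + op for length, op in elements) != cigar:
        raise ValueError(f"unparseable CIGAR: {cigar!r}")
    return [(int(length), op) for length, op in elements]

def starts_with_deletion(cigar):
    """True if the first operator after leading clipping (H or S) is D."""
    for _, op in parse_cigar(cigar):
        if op in "HS":
            continue  # skip hard and soft clips at the start of the read
        return op == "D"
    return False

print(parse_cigar("440H1D33M1I1D33M"))
print(starts_with_deletion("440H1D33M1I1D33M"))
```

For the read quoted above, the only thing ahead of the 1D element is the 440H hard clip, which is why the read is reported as starting with a deletion.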

Answers

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,295 admin
    Answer ✓

    Hi there,

    You're not doing anything wrong; this is an issue that the tool is not handling properly. We'll fix it ASAP and post a notice in this thread when it's done.

    Geraldine Van der Auwera, PhD

  • newbie16 Member Posts: 38

    Thanks!
    Also, if I may make a suggestion, the error message says:
    "If you are sure you want to use this read, re-run your analysis with the extra option: -rf BadCigar", which suggests that the filter will somehow "use" the bad reads and INCLUDE them in the analysis. However, the documentation for -rf BadCigar says that this option will DISCARD bad reads. This is conflicting information. Could this please be corrected as well?

  • ebanks Broad Institute Member, Administrator, Broadie, Moderator, Dev Posts: 698 admin
    Answer ✓

    Thanks for the feedback. I'll update the error message now.

    Eric Banks, PhD -- Director, Data Sciences and Data Engineering, Broad Institute of Harvard and MIT

  • maquezaihou Member Posts: 10

    I had the same error! Adding "-rf BadCigar" doesn't help. Any suggestions?

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,295 admin

    Hi there,

    What version of GATK are you using?

    Geraldine Van der Auwera, PhD

  • maquezaihou Member Posts: 10

    Hi Geraldine,
    I am using the latest version. I am working with RNA-seq data. I followed the best practices, and everything went well except UnifiedGenotyper. I always get the "bad cigar" error message, even when I add '-rf BadCigar'. I don't know what I should do. Do you have any suggestions?
    Thank you in advance!

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,295 admin

    You should validate your BAM file first. You can also check a set of reads to see whether they all have bad CIGAR problems or whether it's just a subset.

    Geraldine Van der Auwera, PhD
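
One way to act on the "check a set of reads" suggestion is to tally CIGAR problems over SAM-format text records, for example the lines printed by `samtools view`. The sketch below is a rough illustration under stated assumptions: `classify_cigar` and `summarize_sam` are hypothetical helper names, the only problems it looks for are deletions at the start or end of a read (ignoring clipping), and it is not a substitute for a real file validator.

```python
import re
from collections import Counter

# A deletion counts as leading/trailing even if clipping (H or S) surrounds it.
LEADING_DELETION = re.compile(r"^(\d+[HS])*\d+D")
TRAILING_DELETION = re.compile(r"\d+D(\d+[HS])*$")

def classify_cigar(cigar):
    """Label one CIGAR string with the first problem found, or 'ok'."""
    if cigar == "*":
        return "no_cigar"
    if LEADING_DELETION.search(cigar):
        return "starts_with_deletion"
    if TRAILING_DELETION.search(cigar):
        return "ends_with_deletion"
    return "ok"

def summarize_sam(lines):
    """Count CIGAR labels over SAM text lines and collect the flagged reads."""
    counts = Counter()
    flagged = []
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            counts["truncated_record"] += 1
            continue
        read_name, cigar = fields[0], fields[5]  # QNAME and CIGAR columns
        label = classify_cigar(cigar)
        counts[label] += 1
        if label.endswith("deletion"):
            flagged.append((read_name, cigar))
    return counts, flagged
```

If only a handful of reads are flagged, the problem is likely confined to a subset of alignments; if most reads are flagged, the file or the aligner settings deserve a closer look.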

  • blueskypy Member Posts: 261 ✭✭

    I'm working on exome seq and using GATK version 2.7-2-g6bda569 and still get the same error in HC!

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,295 admin

    @blueskypy, have you validated your files?

    Geraldine Van der Auwera, PhD

  • alejandra spain Member Posts: 52

    So does -rf BadCigar somehow "use" the bad reads and INCLUDE them in the analysis, or does it discard them?
    Which one is correct?

  • Geraldine_VdAuwera Administrator, Dev Posts: 10,295 admin

    No, if you use that flag the engine will filter out those reads so they will not be used in analysis.

    Geraldine Van der Auwera, PhD
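
The discard semantics described in this reply can be pictured as a predicate applied to each read: a read for which the filter returns true is dropped, and everything else passes through. The Python sketch below is only a toy model of that idea. The names `is_bad_cigar` and `apply_read_filter` are hypothetical, the predicate checks just for a leading deletion, and none of this is GATK's implementation.

```python
import re

# Hypothetical predicate: flags CIGARs whose first operator after clipping is D.
LEADING_DELETION = re.compile(r"^(\d+[HS])*\d+D")

def is_bad_cigar(cigar):
    """True if the read should be filtered out (toy criterion only)."""
    return LEADING_DELETION.match(cigar) is not None

def apply_read_filter(reads, filter_predicate):
    """Keep only the (name, cigar) reads for which the predicate is False."""
    return [read for read in reads if not filter_predicate(read[1])]

reads = [("read_a", "76M"), ("read_b", "440H1D33M1I1D33M"), ("read_c", "10S66M")]
kept = apply_read_filter(reads, is_bad_cigar)
print(kept)  # [('read_a', '76M'), ('read_c', '10S66M')]
```

In this model read_b never reaches the downstream analysis, which matches the "filter out" wording of the reply rather than the "use this read" wording of the original error message.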
