Print Reads in GATK

Hi,
I am unable to execute 'Print Reads' as I get the following error:

Unsupported CIGAR operator N in read read704 at chr21:33029208. Perhaps you are trying to use RNA-Seq data? While we are currently actively working to support this data type unfortunately the GATK cannot be used with this data in its current form. You have the option of either filtering out all reads with operator N in their CIGAR string (please add --filter_reads_with_N_cigar to your command line) or assume the risk of processing those reads as they are including the pertinent unsafe flag (please add -U ALLOW_N_CIGAR_READS to your command line). Notice however that if you were to choose the latter, an unspecified subset of the analytical outputs of an unspecified subset of the tools will become unpredictable. Consequently the GATK team might well not be able to provide you with the usual support with any issue regarding any output.
I tried both options, adding --filter_reads_with_N_cigar and adding -U ALLOW_N_CIGAR_READS, but I still get the same error. Can you please help me resolve this as soon as you can?

Thanks and Regards

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)
    edited October 2014

    @anniekatam

    Hi,

    Please tell us which version of GATK you are using and the exact command line you used.

    Is your data actually RNA-seq data? If so, have you looked at the RNA-seq documentation?

    Thanks,
    Sheila

  • anniekatam (US; Member)
    edited October 2014

    Hi Sheila,

    I am using GATK version 2.8-1, and the command line used is:
    gatk2_wrapper.py
    --stdout "${output_log}"
    -d "-I" "${reference_source.input_bam}" "${reference_source.input_bam.ext}" "gatk_input"
    #if str( $reference_source.input_bam.metadata.bam_index ) != "None":
    -d "" "${reference_source.input_bam.metadata.bam_index}" "bam_index" "gatk_input" ##hardcode galaxy ext type as bam_index
    #end if
    -p '
    @[email protected]
    -T "PrintReads"
    -o "${output_bam}"
    \$GATK2_SITE_OPTIONS

    ## according to http://www.broadinstitute.org/gatk/guide/article?id=1975
    --num_cpu_threads_per_data_thread \${GALAXY_SLOTS:-6}
    
    
    #if $reference_source.reference_source_selector != "history":
        -R "${reference_source.ref_file.fields.path}"
    #end if
    #if str($input_recal) != 'None':
        --BQSR "${input_recal}"
    #end if
    --disable_bam_indexing
    

    I am not sure whether my data is RNA-seq. How can I find that out? Can you please help me?

    Thanks and Regards
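
One rough way to check whether a file contains the kind of reads this error complains about is to look at the CIGAR column of its SAM records. The sketch below is an illustration only and is not part of GATK: the function names and the sample records are hypothetical (the CIGAR value shown for read704 is made up), and it assumes tab-separated SAM text, such as the output of `samtools view`. An N in a CIGAR string marks a skipped reference region, which spliced RNA-seq aligners typically emit, so many N-containing reads suggest, but do not prove, RNA-seq data.

```python
import re

# One CIGAR element: a length followed by an operator letter from the SAM format.
CIGAR_ELEMENT = re.compile(r"(\d+)([MIDNSHP=X])")


def has_n_operator(cigar):
    """Return True if the CIGAR string contains an N (skipped region) operator."""
    return any(op == "N" for _length, op in CIGAR_ELEMENT.findall(cigar))


def count_n_cigar_reads(sam_lines):
    """Return (total_records, records_with_N) for an iterable of SAM text lines."""
    total = 0
    with_n = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete alignment record
        total += 1
        if has_n_operator(fields[5]):  # CIGAR is the 6th SAM column
            with_n += 1
    return total, with_n


# Hypothetical records for illustration; only the read name and position
# of the first record come from the error message, the rest is invented.
SAMPLE = [
    "@HD\tVN:1.4",
    "read704\t0\tchr21\t33029208\t255\t30M500N46M\t*\t0\t0\t*\t*",
    "read1\t0\tchr21\t100\t60\t76M\t*\t0\t0\t*\t*",
]

total, with_n = count_n_cigar_reads(SAMPLE)
print(f"{with_n} of {total} records contain an N operator")
```

On a real file, the same function could be fed the lines produced by `samtools view` instead of the sample list.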

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @anniekatam

    Hi,

    You should find out what type of data you are working with from the person who gave it to you.

    I see you are using GATK version 2.8. RNA analysis capabilities were not yet available in that version of GATK.

    If you are just testing GATK out, you can find some DNA data and try with that. Otherwise, you should try running GATK directly rather than through Galaxy because it will allow you to use the latest version with better features, fewer bugs, and less time spent troubleshooting.

    I hope this helps.

    -Sheila
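
Running GATK directly, as suggested in the reply above, could look roughly like the command assembled below. This is a sketch under assumptions, not an official recipe: `build_print_reads_command` is a hypothetical helper, `GenomeAnalysisTK.jar`, `ref.fasta`, `input.bam` and `output.bam` are placeholder names, and the two N-CIGAR options are copied verbatim from the error message at the top of this thread. Whether a given GATK version accepts them should be checked against that version's documentation.

```python
import shlex


def build_print_reads_command(jar, reference, input_bam, output_bam, n_cigar="filter"):
    """Build an argument list for a direct PrintReads run.

    n_cigar selects how reads with an N CIGAR operator are handled:
      "filter" adds --filter_reads_with_N_cigar (drop those reads)
      "allow"  adds -U ALLOW_N_CIGAR_READS (process them at your own risk)
      None     adds neither option
    """
    cmd = [
        "java", "-jar", jar,
        "-T", "PrintReads",
        "-R", reference,
        "-I", input_bam,
        "-o", output_bam,
    ]
    if n_cigar == "filter":
        cmd.append("--filter_reads_with_N_cigar")
    elif n_cigar == "allow":
        cmd.extend(["-U", "ALLOW_N_CIGAR_READS"])
    elif n_cigar is not None:
        raise ValueError("n_cigar must be 'filter', 'allow', or None")
    return cmd


# Placeholder file names; replace them with real paths before running.
command = build_print_reads_command(
    "GenomeAnalysisTK.jar", "ref.fasta", "input.bam", "output.bam"
)
print(" ".join(shlex.quote(arg) for arg in command))
```

Building the command as a list keeps each option visible, which makes it easier to confirm that the chosen N-CIGAR option actually reaches GATK rather than being dropped by a wrapper.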

  • anniekatam (US; Member)

    Hi Sheila,
    Thanks for that. I downloaded a BAM file from a website and am trying with that.
    Can you check my BAM file and see what in it relates to the error I got? If so, can you please tell me to whom I should send the file? I am unable to upload it here, as it says the BAM file type is not allowed.

    Thanks and Regards

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    @anniekatam We don't have time to check for issues in some random dataset of unknown provenance. Quite a lot of people depend on us to help them get their analyses to work. If you want to try the tools with some test data, you can download some test files from our FTP server (see the FAQs). If you have a problem with those, we can help you, because we know they are well formatted and will not cause any weird errors.

  • anniekatam (US; Member)

    Hi Geraldine, Thanks for your suggestion. I tried to connect to the FTP site to download them. I am now looking for test files for all the tools in GATK. Can you please guide me as to where I should look for them?

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @anniekatam

    Hi,

    You can click the link in this FAQ article: http://gatkforums.broadinstitute.org/discussion/1215/how-can-i-access-the-gsa-public-ftp-server#latest, then select either 2.5 or 2.8 and choose the ExampleFASTA/ folder.

    This will take you to a download page for all the files you will need to practice with GATK.
    -Sheila

  • anniekatam (US; Member)
    edited October 2014

    Hi Sheila, Thanks for your steps. However, as soon as I connect to the FTP server to download, I see many different folders but not 2.5 or 2.8. Is there a specific folder I need to go into to get the 2.8 one? Can you please help? I think I got it: it is under the folder named bundle. I will try downloading it. Thanks once again.

  • Geraldine_VdAuwera (Cambridge, MA; Member, Administrator, Broadie, admin)

    Look into "bundle" first, then 2.8, then the name of the reference build you are using.
