
-L option with PrintReads

nahmed, TU Delft, Member
edited March 5 in Ask the GATK team

I am using GATK-3.6 to analyze exome sequencing data. To speed up the analysis, I split the BED file provided in the GATK resource bundle (Broad.human.exome.b37.bed) into several files, one for each chromosome. I created the BQSR tables separately (in parallel) for each chromosome using the -L option. Under these circumstances, can I use the -L option with PrintReads to run on each BQSR table separately as well? It is not recommended in The input file to PrintReads is a single deduplicated BAM file. Indel realignment is not performed.
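
For readers who want to reproduce the per-chromosome split described above, here is a minimal Python sketch. It is not the poster's actual script: the function name `split_bed_by_chromosome`, the output naming scheme (`<chrom>.bed`), and the header-skipping rule are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import Path


def split_bed_by_chromosome(bed_path, out_dir):
    """Write one BED file per chromosome and return {chromosome: output path}.

    Hypothetical helper: the output file names (<chrom>.bed) are an assumption.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group interval lines by the first column (the chromosome name).
    lines_by_chrom = defaultdict(list)
    with open(bed_path) as bed:
        for line in bed:
            # Skip blank lines and common BED header/comment lines.
            if not line.strip() or line.startswith(("#", "track", "browser")):
                continue
            chrom = line.split()[0]
            lines_by_chrom[chrom].append(line.rstrip("\n") + "\n")

    # Write each group to its own file, preserving the original line order.
    outputs = {}
    for chrom, lines in lines_by_chrom.items():
        out_path = out_dir / f"{chrom}.bed"
        with open(out_path, "w") as out:
            out.writelines(lines)
        outputs[chrom] = out_path
    return outputs


# Example call (paths are placeholders):
# split_bed_by_chromosome("Broad.human.exome.b37.bed", "per_chrom_beds")
```

Each resulting file can then be passed to a tool through -L, one run per chromosome.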

Best Answer


  • nahmed, TU Delft, Member
    edited March 6

    Hi @Geraldine_VdAuwera,
    Thank you for the answer. Is my approach OK for GATK3? As described above, the input is a single deduplicated BAM file. The output is multiple BAMs, one per table.

  • Sheila, Broad Institute, Member, Broadie, Moderator
    edited March 9


    Assuming you have enough data per chromosome to run BQSR, you can use -L with BaseRecalibrator and then use PrintReads without -L. The reason is that you don't want to lose any data because of -L when you output the final BAM file. BaseRecalibrator will only run on the -L intervals, but PrintReads with -bqsr and no -L will recalibrate all reads/bases and will also output the reads that fall outside the -L intervals.

    I hope this makes sense.


    EDIT: Ideally, you want to run the whole process without -L, so the tools have enough data to produce proper models.
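
The two-step approach from the answer above can be sketched as follows. This is only an illustration under stated assumptions. The helper names, the jar name `GenomeAnalysisTK.jar`, and all file paths are placeholders. The options shown (-T, -R, -I, -knownSites, -L, -o, -BQSR) follow GATK 3 command-line conventions, where the recalibration table argument is spelled -BQSR. The code only builds the command lines and does not run GATK.

```python
def base_recalibrator_cmd(reference, bam, known_sites, intervals, table,
                          jar="GenomeAnalysisTK.jar"):
    """Build a GATK 3 BaseRecalibrator command restricted to -L intervals."""
    return [
        "java", "-jar", jar,
        "-T", "BaseRecalibrator",
        "-R", reference,
        "-I", bam,
        "-knownSites", known_sites,
        "-L", intervals,   # the model is built only from these intervals
        "-o", table,
    ]


def print_reads_cmd(reference, bam, table, out_bam,
                    jar="GenomeAnalysisTK.jar"):
    """Build a GATK 3 PrintReads command that applies the table to all reads."""
    return [
        "java", "-jar", jar,
        "-T", "PrintReads",
        "-R", reference,
        "-I", bam,
        "-BQSR", table,    # no -L here, so no reads are dropped from the output
        "-o", out_bam,
    ]


# Placeholder file names, for illustration only.
recal = base_recalibrator_cmd("ref.fasta", "dedup.bam", "dbsnp.vcf",
                              "exome.bed", "recal.table")
apply_ = print_reads_cmd("ref.fasta", "dedup.bam", "recal.table",
                         "recal.bam")

print(" ".join(recal))
print(" ".join(apply_))
```

To actually run a command, the list could be passed to `subprocess.run(cmd, check=True)`. Following the EDIT above, dropping the -L pair from the first command gives the preferred variant where the model is built from all available data.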
