The current GATK version is 3.7-0
Examples: Monday, today, last week, Mar 26, 3/26/04

Howdy, Stranger!

It looks like you're new here. If you want to get involved, click one of these buttons!

Get notifications!


You can opt in to receive email notifications, for example when your questions get answered or when there are new announcements, by following the instructions given here.

Did you remember to?


1. Search using the upper-right search box, e.g. using the error message.
2. Try the latest version of tools.
3. Include tool and Java versions.
4. Tell us whether you are following GATK Best Practices.
5. Include relevant details, e.g. platform, DNA- or RNA-Seq, WES (+capture kit) or WGS (PCR-free or PCR+), paired- or single-end, read length, expected average coverage, somatic data, etc.
6. For tool errors, include the error stacktrace as well as the exact command.
7. For format issues, include the result of running ValidateSamFile for BAMs or ValidateVariants for VCFs.
8. For weird results, include an illustrative example, e.g. attach IGV screenshots according to Article#5484.
9. For a seeming variant that is uncalled, include results of following Article#1235.

Did we ask for a bug report?


Then follow instructions in Article#1894.

Formatting tip!


Wrap blocks of code, error messages and BAM/VCF snippets--especially content with hashes (#)--with lines with three backticks ( ``` ) each to make a code block as demonstrated here.

Jump to another community
Picard 2.9.4 is now available. Download and read release notes here.
GATK 3.7 is here! Be sure to read the Version Highlights and optionally the full Release Notes.

After BQSR, Why don't create a header ?

HyunminHyunmin Seoul, KoreaMember

When I run the BQSR with GATK 2.8.1, I can't check the PG information on output bam file.

BWA->Picard dedup->GATK LocalRealign->GATK BQSR

Below is the output BAM file after BQSR step.

@PG ID:GATK IndelRealigner VN:2.3-9-gdcdccbb CL:knownAlleles=[(RodBinding name=knownAlleles source=/nG/Reference/hgdownload.cse.ucsc.edu/goldenPath/hg19/KNOWNINDEL/Mills_and_1000G_gold_standard.indels.hg19.vcf)] targetIntervals=/nG/Data/2077/vcf1/node1/chrY/Databind/chrY.bam.intervals LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15 maxReadsInMemory=150000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPGTags=false indelsFileForDebugging=null statisticsFileForDebugging=null SNPsFileForDebugging=null
@PG ID:MarkDuplicates PN:MarkDuplicates VN:1.92(1464) CL:net.sf.picard.sam.MarkDuplicates INPUT=[/nG/Data/2077/step2_makebam/node1-1.bam, /nG/Data/2077/step2_makebam/node1-2.bam, /nG/Data/2077/step2_makebam/node1-3.bam, /nG/Data/2077/step2_makebam/node1-4.bam, /nG/Data/2077/step2_makebam/node1-5.bam, /nG/Data/2077/step2_makebam/node1-6.bam] OUTPUT=/dev/stdout METRICS_FILE=/nG/Data/2077/temp/picard_info.txt REMOVE_DUPLICATES=true ASSUME_SORTED=true MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000000 TMP_DIR=[/nG/Data/2077/temp] QUIET=true VALIDATION_STRINGENCY=LENIENT COMPRESSION_LEVEL=0 MAX_RECORDS_IN_RAM=2000000 PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=MarkDuplicates MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 SORTING_COLLECTION_SIZE_RATIO=0.25 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 VERBOSITY=INFO CREATE_INDEX=false CREATE_MD5_FILE=false

Related thread (guess)
http://gatkforums.broadinstitute.org/discussion/2118/baserecalibration-prinread-don-t-create-a-header-and-don-t-obtain-oq-orignal-base-quality-in-bam

Tagged:

Best Answer

Answers

  • KurtKurt Member

    Yeah, version 2.3-9 didn't create a @PG tag for the BQSR process. Since BQSR doesn't explicitly create a bam file, you wouldn't get a header from that, but you should get it from PrintReads (but not in version 2.3-9).

    I had a bam file from 2.7-4 lying around and saw the @PG tag for PrintReads that used BQSR, but it isn't verbose. I don't know if newer versions are more verbose or not.

    `@PG ID:GATK IndelRealigner VN:2.7-4-g6f46d11 CL:knownAlleles=[(RodBinding name=knownAlleles source=/usr/local/sandbox/ddl_pipeline_files//Mills_and_1000G_gold_standard.indels.b37.vcf), (RodBinding name=knownAlleles2 source=/usr/local/sandbox/ddl_pipe
    line_files//1000G_phase1.indels.b37.vcf)] targetIntervals=/isilon/ddl/SS_0500181//DDL_021214_A74KD_44872/CFMP1949-1/Reports/Local_Realignment_Intervals/CFMP1949-1.intervals LODThresholdForCleaning=5.0 consensusDeterminationModel=USE_READS entropyThreshold=0.15
    maxReadsInMemory=150000 maxIsizeForMovement=3000 maxPositionalMoveAllowed=200 maxConsensuses=30 maxReadsForConsensuses=120 maxReadsForRealignment=20000 noOriginalAlignmentTags=false nWayOut=null generate_nWayOut_md5s=false check_early=false noPGTag=false keepPG
    Tags=false indelsFileForDebugging=null statisticsFileForDebugging=null SNPsFileForDebugging=null

    @PG ID:MarkDuplicates PN:MarkDuplicates VN:1.103(1598) CL:net.sf.picard.sam.MarkDuplicates INPUT=[/isilon/ddl/SS_0500181/DDL_021214_A74KD_44872/CFMP1949-1/temp/CFMP1949-1.original.bam] OUTPUT=/isilon/ddl/SS_0500181/DDL_021214_A74KD_44872/CFMP19
    49-1/temp/CFMP1949-1.dup.bam METRICS_FILE=/isilon/ddl/SS_0500181/DDL_021214_A74KD_44872/CFMP1949-1/Reports/Picard_Duplicates/CFMP1949-1.picard.duplicates.txt VALIDATION_STRINGENCY=SILENT CREATE_INDEX=true PROGRAM_RECORD_ID=MarkDuplicates PROGRAM_GROUP_NAME=M
    arkDuplicates REMOVE_DUPLICATES=false ASSUME_SORTED=false MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=8000 SORTING_COLLECTION_SIZE_RATIO=0.25 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIX
    EL_DISTANCE=100 VERBOSITY=INFO QUIET=false COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_MD5_FILE=false

    @PG ID:GATK PrintReads VN:2.7-4-g6f46d11 CL:readGroup=null platform=null number=-1 sample_file=[] sample_name=[] simplify=false no_pg_tag=false
    `

  • HyunminHyunmin Seoul, KoreaMember

    Yes, Both version 2.3.9 and 2.8.1 didn't creat a @PG tag for the BQSR process.
    I think this is the bug.

  • ebanksebanks Broad InstituteMember, Broadie, Dev

    Hi Hyunmin, you should have a PG tag in there for PrintReads if you are using the latest version. It definitely does that now (as @Kurt has posted).

  • HyunminHyunmin Seoul, KoreaMember

    Hi Hyunmin, you should have a PG tag in there for PrintReads if you are using the latest version. It definitely does that now (as @Kurt has posted).

    How can I handle the PG tag in PrintReads command?

    My Command..

    java -jar /BiOfs/hmkim87/BioTools/GATK/2.8.1/GenomeAnalysisTK.jar -T PrintReads -R ref.fa -I T1304D2110.dedup.realigned.bam -o T1304D2110.dedup.realigned.recal.bam

  • ebanksebanks Broad InstituteMember, Broadie, Dev

    You don't need to handle it - it should be put there automatically.

  • HyunminHyunmin Seoul, KoreaMember

    @Kurt said:
    Looking at your command, it makes it look like you want to write a bam file with recalibrated base call quality scores, but you are not supplying it with table created in the -T BaseRecalibrator step that you should have done previously.

    Your command should've of been;

    java -jar GenomeAnalysisTK.jar -T PrintReads -R ref.fa -I dedup.realigned.bam -BQSR bqsr.table -o dedup.realigned.recal.bam

    with the bqsr.table being created in the -T BaseRecalibrator step

    Thanks!! I forgot it with behave naturally.

Sign In or Register to comment.