RealignerTargetCreator / IndelRealigner

Hi,

I am using GATK 3.1-1 on Ion Torrent data, with the human reference provided by Ion Torrent. I used Picard to remove the duplicates. The RealignerTargetCreator / IndelRealigner tools (with indel.vcf files from 1000G_phase1 and Mills) run without any error on the TMAP-produced .bam files, but I am not sure whether the tools are doing what they are supposed to do. The informative lines from stdout are below.

Can anyone help me interpret these two lines?
219901 reads (97.07% of total) failing DuplicateReadFilter (Does this mean that none of the reads were duplicates?)
1539 reads (0.68% of total) failing MappingQualityZeroFilter (Does this mean that 1539 reads map to multiple locations or have a low quality score?)

INFO 17:57:10,596 ProgressMeter - chr2:108035321 3.29e+08 30.0 s 0.0 s 11.5% 4.3 m 3.8 m
INFO 17:57:40,597 ProgressMeter - chr4:55195129 7.26e+08 60.0 s 0.0 s 24.1% 4.2 m 3.2 m
INFO 17:58:10,599 ProgressMeter - chr6:75886121 1.12e+09 90.0 s 0.0 s 36.8% 4.1 m 2.6 m
INFO 17:58:40,600 ProgressMeter - chr8:136959289 1.52e+09 120.0 s 0.0 s 49.4% 4.0 m 2.0 m
INFO 17:59:10,601 ProgressMeter - chr11:127358689 1.92e+09 2.5 m 0.0 s 62.8% 4.0 m 88.0 s
INFO 17:59:40,602 ProgressMeter - chr15:60257869 2.31e+09 3.0 m 0.0 s 76.5% 3.9 m 55.0 s
INFO 18:00:10,604 ProgressMeter - chr20:41377801 2.74e+09 3.5 m 0.0 s 89.2% 3.9 m 25.0 s
INFO 18:00:40,605 ProgressMeter - chrM:16485 3.06e+09 4.0 m 0.0 s 100.0% 4.0 m 0.0 s
INFO 18:01:00,391 ProgressMeter - done 3.10e+09 4.3 m 0.0 s 100.0% 4.3 m 0.0 s
INFO 18:01:00,392 ProgressMeter - Total runtime 259.80 secs, 4.33 min, 0.07 hours
INFO 18:01:00,451 MicroScheduler - 221440 reads were filtered out during the traversal out of approximately 226527 total reads (97.75%)
INFO 18:01:00,451 MicroScheduler - -> 0 reads (0.00% of total) failing BadCigarFilter
INFO 18:01:00,452 MicroScheduler - -> 0 reads (0.00% of total) failing BadMateFilter
INFO 18:01:00,452 MicroScheduler - -> 219901 reads (97.07% of total) failing DuplicateReadFilter
INFO 18:01:00,452 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
INFO 18:01:00,452 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
INFO 18:01:00,452 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
INFO 18:01:00,453 MicroScheduler - -> 1539 reads (0.68% of total) failing MappingQualityZeroFilter
INFO 18:01:00,453 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
INFO 18:01:00,453 MicroScheduler - -> 0 reads (0.00% of total) failing Platform454Filter
INFO 18:01:00,466 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
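
To cross-check the arithmetic in the summary above, here is a minimal Python sketch. It assumes the MicroScheduler lines have exactly the format shown in this log; the regular expressions and the `summarize` helper are illustrative and not part of GATK. A read listed as "failing" a filter is one that the filter excluded from the traversal, so the per-filter counts should add up to the "filtered out" total.

```python
import re

# Three lines copied from the MicroScheduler summary in this post.
LOG = """\
INFO 18:01:00,451 MicroScheduler - 221440 reads were filtered out during the traversal out of approximately 226527 total reads (97.75%)
INFO 18:01:00,452 MicroScheduler - -> 219901 reads (97.07% of total) failing DuplicateReadFilter
INFO 18:01:00,453 MicroScheduler - -> 1539 reads (0.68% of total) failing MappingQualityZeroFilter
"""

# Hypothetical patterns written for this log layout only.
TOTAL_RE = re.compile(r"(\d+) reads were filtered out .* approximately (\d+) total reads")
FILTER_RE = re.compile(r"-> (\d+) reads \(([\d.]+)% of total\) failing (\w+)")


def summarize(log_text):
    """Return (filtered, total, {filter_name: count}) parsed from the log text."""
    filtered = total = None
    per_filter = {}
    for line in log_text.splitlines():
        m = TOTAL_RE.search(line)
        if m:
            filtered, total = int(m.group(1)), int(m.group(2))
            continue
        m = FILTER_RE.search(line)
        if m:
            per_filter[m.group(3)] = int(m.group(1))
    return filtered, total, per_filter


filtered, total, per_filter = summarize(LOG)
for name, count in per_filter.items():
    # Recompute each percentage from the raw counts.
    print(f"{name}: {count} reads excluded ({100.0 * count / total:.2f}% of total)")
print(f"reads left after filtering: {total - filtered}")
```

With the numbers in this log, the two non-zero filters account for all of the filtered reads, and the remainder is what the tool actually traversed.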

Thank you for the help,
SK

Answers

  • surendrk (Member)

    Thanks, Sheila. Yes, it's not a regular whole-exome sequencing analysis. The data are from cancer panel chip version 2, and I can see why there are so many duplicates.
    SK
