I am running Pileup with the verbose option, and I have two questions about its output. (1) Why are all the values in the mapping quality column 0? (2) There is another column, not mentioned in the description of Pileup, whose fields are separated by '@'. What does this column mean?
11 86988 A A D 0 C37@931@1036@0
11 86989 G G D 0 C37@932@1036@0
11 86990 T T B 0 C37@933@1036@0
11 86991 G G D 0 C37@934@1036@0
11 86992 A A B 0 C37@935@1036@0
11 86993 C C C 0 C37@936@1036@0
11 86994 C CCC D=A 0 C37@937@1036@0,38@0@100@0,39@0@100@0
Thanks, Arshi
ebanks
Posts: 481 mod
Mapping qualities aren't emitted by default, so you can't be seeing them at all with that command line. I think perhaps you are seeing 0s because there are no RODs (e.g. VCFs) being input.
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
Answers
Thanks for reporting this. I'm just about to add documentation for the verbose output. Here's what it will say: In addition to the standard pileup output, adds 'verbose' output too. The verbose output contains the number of spanning deletions, and for each read in the pileup it has the read name, offset in the base string, read length, and read mapping quality. These per-read items are delimited with an '@' character.
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
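For anyone who wants to split these fields programmatically, here is a minimal Python sketch based only on the description above and the sample lines in the question. It assumes whitespace-separated columns in the order contig, position, reference base, read bases, base qualities, spanning deletion count, and then one comma-separated entry per read of the form name@offset@length@mapq. The column order and the comma separator are inferred from the sample output, not from official documentation, and the names `ReadInfo`, `PileupRecord`, and `parse_verbose_line` are made up for illustration. If that reading is right, the standalone 0 before the '@' items would be the deletion count, and the mapping quality would be the last '@' field of each read entry.

```python
from typing import List, NamedTuple


class ReadInfo(NamedTuple):
    """One per-read entry from the verbose column (hypothetical name)."""
    name: str
    offset: int
    length: int
    mapq: int


class PileupRecord(NamedTuple):
    """One verbose pileup line (hypothetical name)."""
    contig: str
    position: int
    ref_base: str
    bases: str
    quals: str
    spanning_deletions: int
    reads: List[ReadInfo]


def parse_verbose_line(line: str) -> PileupRecord:
    # Assumption: exactly seven whitespace-separated columns, as in the sample.
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, got {len(fields)}: {line!r}")
    contig, pos, ref, bases, quals, n_del, per_read = fields

    reads = []
    # Assumption: reads are comma-separated, as in the last sample line.
    for item in per_read.split(","):
        # Split from the right so a read name containing '@' stays intact.
        name, offset, length, mapq = item.rsplit("@", 3)
        reads.append(ReadInfo(name, int(offset), int(length), int(mapq)))

    return PileupRecord(contig, int(pos), ref, bases, quals, int(n_del), reads)


if __name__ == "__main__":
    sample = "11 86994 C CCC D=A 0 C37@937@1036@0,38@0@100@0,39@0@100@0"
    record = parse_verbose_line(sample)
    for read in record.reads:
        print(read.name, read.offset, read.length, read.mapq)
```

Splitting each entry from the right is a deliberate choice: the three numeric fields are always last, so a read name that itself contains '@' would still parse.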
Thanks! Could you also help me with the issue where I am getting all 0 mapping qualities? I have checked my data in IGV, and very few of my reads have a mapping quality of 0. I get the correct Phred quality scores, though. This is how I am running Pileup:
java -Xmx8g -path/toGATK.jar/ \
  -T Pileup \
  -R path/toGATK/resources/hg19.fa \
  -I a.bam \
  -o a.pileup
Thanks
Thanks a lot, Eric! The INDEL is a great option in Pileup. I am also trying to get all the indels and SNPs through UnifiedGenotyper (-glm BOTH). Is there a way for GATK to output the number of indels at each position, similar to a pileup format? I am interested in both known and predicted indels and their counts. Perhaps I can use the .vcf file from UnifiedGenotyper?
Thanks, Arshi
Just to be a little clearer: I tried the --metadata option in Pileup and used the 1000G_indel.vcf file as the ROD.
Hmm, no, I don't think you can do what you want with the GATK right now.
Eric Banks, PhD -- Group Leader, Methods Development, MPG, Broad Institute of Harvard and MIT
OK. Thanks for your quick reply!