How do I split a pileup by tag (rather than sample id)?

mara Posts: 10 Member

Hello,
I am writing my own walker and currently use rawContext.getPileup().getPileupForSample("sample_name") to generate pileups for a given sample. Can I split the pileup by BAM tag instead?
Thank you.

Answers

  • Carneiro Posts: 274 Administrator, GATK Dev admin

    Do you mean a generic tag, or the read group ID tag?

    There is some generic functionality in the pileup implementation, such as getPileupForLane and the mapping and base quality filters (I encourage you to take a look at the class).

    The read group ID would be tricky because the GATK can (and will) modify IDs internally, if necessary, so that aggregated BAMs have unique IDs.

  • mara Posts: 10 Member

    Thank you for the advice.
    The tag I was referring to is the one attached to each input BAM file (e.g. I:tumor_bam and I:normal_bam), as returned by getToolkit().getReadsDataSource().getReaderIDs().getTags().getPositionalTags(). I am inputting two BAMs that may have the same sample name and, ideally, I would like a separate pileup for each BAM.

  • Carneiro Posts: 274 Administrator, GATK Dev admin

    That's very interesting. Since we never had this use case, we never implemented this particular stratification.

    I don't think the positional tags get passed on to the reads (I might be wrong, though!). If they do, it's very easy (check the Picard API to see if they are accessible from the SAMRecord). If not, there are ways around it, but none of them are going to look pretty.

  • mara Posts: 10 Member

    Thanks. Would you be able to elaborate on how to check the picard API or point me to an example where this would be implemented?
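
For anyone landing on this thread: the stratification discussed above comes down to mapping each read back to its source BAM and grouping pileup elements by that BAM's tag. Below is a minimal, self-contained Java sketch of the grouping step only. The Element class, the reader ID strings and splitByTag are hypothetical stand-ins, not GATK or Picard classes. Whether a read in your GATK version actually exposes its source reader is the open question from the posts above and has to be checked against the real API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only: these are NOT GATK or Picard classes.
final class PileupByTag {

    /** Minimal stand-in for a pileup element: a base plus the ID of the reader (BAM) it came from. */
    static final class Element {
        final char base;
        final String readerId;

        Element(char base, String readerId) {
            this.base = base;
            this.readerId = readerId;
        }
    }

    /**
     * Groups pileup elements by the tag attached to their source BAM.
     * Elements whose reader has no known tag are collected under "untagged".
     */
    static Map<String, List<Element>> splitByTag(List<Element> pileup,
                                                 Map<String, String> tagByReaderId) {
        Map<String, List<Element>> byTag = new LinkedHashMap<>();
        for (Element e : pileup) {
            String tag = tagByReaderId.getOrDefault(e.readerId, "untagged");
            byTag.computeIfAbsent(tag, k -> new ArrayList<>()).add(e);
        }
        return byTag;
    }

    public static void main(String[] args) {
        // Assumed to be built once up front: reader ID -> positional tag of that BAM.
        Map<String, String> tagByReaderId = new HashMap<>();
        tagByReaderId.put("reader0", "tumor_bam");
        tagByReaderId.put("reader1", "normal_bam");

        List<Element> pileup = Arrays.asList(
                new Element('A', "reader0"),
                new Element('C', "reader1"),
                new Element('A', "reader0"));

        for (Map.Entry<String, List<Element>> entry : splitByTag(pileup, tagByReaderId).entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().size() + " element(s)");
        }
    }
}
```

In a real walker, the reader-ID-to-tag map would presumably be filled once from the positional tags mentioned earlier in the thread, and the per-read lookup would replace the readerId field used here. Keying by reader rather than by sample name is what keeps two BAMs with the same sample name apart.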
