What would be the correct approach to simulate reads from two parents?

I have simulated paired-end reads from two parental lines. These reads were combined to simulate an F1 cross. I then aligned the reads against one of the parents and generated the BAM files, which were then read by Genome STRiP.
I'm getting these two different errors.

1.
java.lang.RuntimeException: Mismatched read pair records found:

Not sure how to interpret and fix this.

2.
java.lang.IllegalArgumentException: Read pair records have different read groups: scf7180000037249-id.whte-28-21004: ID_3_4,ID_12_4

When I generated the RG tags I ran this command:

java -jar /share/apps/picard-tools/AddOrReplaceReadGroups.jar I=F1_indiv${i}4x_bwa_mem_sorted.bam O=F1_indiv${i}_4x_bwa_mem_sorted_rg.bam SORT_ORDER=coordinate CREATE_INDEX=true RGPL=illumina RGID=ID${i}4 RGSM=indiv${i} RGPU=ART${i}_4 RGLB=ART_popv3_whte

But this does not take into account that half of the reads in the FASTQ file came from one parent and the other half from the second parent. Is there a way to correctly generate the RG tags in this scenario?

Any help or comments will be appreciated!

Best,

ARW

Comments

  • bhandsaker (Member, Broadie, Moderator)

    Regarding the first problem, you would have to provide more information about the error (i.e. the rest of the output).

    I suspect the problem in both cases, however, is that the BAM files you generated don't follow the SAM file conventions.
    The second error is pretty self-explanatory: if you have a read pair, we expect the RG tag to be the same on both mates.
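
One way to look for the condition behind the second error is to scan the alignments as SAM text and list every read name that carries more than one RG value. The following is only a minimal sketch, not part of Genome STRiP or Picard: the function names `read_groups_by_name` and `mixed_read_groups` are hypothetical, and it assumes tab-separated SAM records with the RG stored as an `RG:Z:` optional tag.

```python
from collections import defaultdict


def read_groups_by_name(sam_lines):
    """Map each read name (QNAME) to the set of RG IDs seen on its records.

    Hypothetical helper: expects an iterable of SAM text lines.
    A record without an RG:Z: tag contributes None to the set.
    """
    groups = defaultdict(set)
    for line in sam_lines:
        # Skip blank lines and SAM header lines (which start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name = fields[0]
        rg = None
        # Optional tags start after the 11 mandatory SAM columns.
        for tag in fields[11:]:
            if tag.startswith("RG:Z:"):
                rg = tag[len("RG:Z:"):]
                break
        groups[name].add(rg)
    return groups


def mixed_read_groups(sam_lines):
    """Return only the read names whose records carry more than one RG value."""
    return {
        name: rgs
        for name, rgs in read_groups_by_name(sam_lines).items()
        if len(rgs) > 1
    }
```

As a usage sketch, the records could be supplied by piping `samtools view` output into a small script that calls `mixed_read_groups(sys.stdin)`. Any read name it reports has records tagged with different read groups, which is the situation the second error message describes.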
