

What's the best way to process multiple samples in "Data pre-processing for variant discovery"?

In the Best Practices workflow "Data pre-processing for variant discovery," the JSON inputs file includes the parameters PreProcessingForVariantDiscovery_GATK4.sample_name and PreProcessingForVariantDiscovery_GATK4.flowcell_unmapped_bams_list. The flowcell_unmapped_bams_list is meant to list multiple BAM files from the same sample, and sample_name is the actual sample name. So, as far as I can tell, this pipeline can only process one sample per run.

My question: is there a way to parameterize this pipeline so that it processes multiple samples in one call? If not, how can I integrate it into a script that processes multiple samples? For example, is it possible to set sample_name to something like $NAME and have the actual sample name filled in from a shell variable? And can the same be done for flowcell_unmapped_bams_list?

Thank you very much for the help!
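
One possible way to script this, offered as a rough sketch rather than a confirmed GATK recommendation: generate one inputs JSON per sample from a shared template, then launch the workflow once per generated file. Only the two parameter names come from the post above. The function names, the output file naming, and the idea of a shared template dictionary are assumptions made for illustration.

```python
import json
from pathlib import Path

# Workflow name prefix, taken from the parameter names quoted in the question.
PREFIX = "PreProcessingForVariantDiscovery_GATK4"


def make_sample_inputs(template, sample_name, bams_list_path):
    """Return a copy of the template inputs with the two per-sample fields set.

    `template` is assumed to hold every other workflow input (references,
    runtime settings, and so on) that is shared by all samples.
    """
    inputs = dict(template)  # shallow copy so the template is not modified
    inputs[f"{PREFIX}.sample_name"] = sample_name
    inputs[f"{PREFIX}.flowcell_unmapped_bams_list"] = str(bams_list_path)
    return inputs


def write_inputs_per_sample(template, samples, out_dir):
    """Write one inputs JSON per sample and return the paths that were written.

    `samples` maps each sample name to the path of that sample's list of
    unmapped BAM files. The "<sample>.inputs.json" naming is hypothetical.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for sample_name, bams_list_path in samples.items():
        out_path = out_dir / f"{sample_name}.inputs.json"
        inputs = make_sample_inputs(template, sample_name, bams_list_path)
        out_path.write_text(json.dumps(inputs, indent=2))
        written.append(out_path)
    return written
```

To use it, load the existing inputs JSON as the template with json.load, call write_inputs_per_sample with a dictionary of sample names and list-file paths, and run the workflow once for each file it returns. This keeps each run single-sample, which matches how the pipeline is described above, and moves the looping into the script.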


