How to get sample names based on genotype from a multi-sample VCF file

meharmehar Member
edited November 2015 in Ask the GATK team

Hi,

I have a multi-sample VCF file. An example variant is shown below:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 03-071 04-051 04-071 06-044 07-085 10-009
chr1 6526093 . T C 197.77 . AC1=1;AC=1;AF1=0.5 GT:GQ:DP:PL:AD 0/1 1/1 0/0 1/1 1/1 0/1

For each variant, I need to retrieve the sample names based on genotype. For example, is it possible to select only the samples whose genotype is "0/1", and likewise only those whose genotype is "1/1"?

SelectVariants filters variants based on the different quality values provided. What I need instead is to subset the VCF by sample, for each variant, based on the genotype. Are there any tools that can do this?
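
To make the request concrete, here is a minimal sketch in plain Python of the grouping described above. It is not a GATK tool. The function name `samples_by_genotype` is hypothetical. The sketch assumes the fields are whitespace-separated, that sample columns start after FORMAT, and that GT is one of the FORMAT keys. Genotype strings are compared as written, so "0/1" and "1/0" (or phased "0|1") would land in separate groups.

```python
from collections import defaultdict


def samples_by_genotype(header_line, record_line):
    """Group sample names by their GT string for a single VCF record.

    Hypothetical helper for illustration; assumes whitespace-separated
    columns and that the FORMAT column lists a GT key.
    """
    header = header_line.lstrip("#").split()
    fields = record_line.split()

    # Columns 0-8 are CHROM..FORMAT; sample columns start at index 9.
    samples = header[9:]
    gt_index = fields[8].split(":").index("GT")

    groups = defaultdict(list)
    for name, call in zip(samples, fields[9:]):
        genotype = call.split(":")[gt_index]
        groups[genotype].append(name)
    return dict(groups)


# The header and variant lines from the example above.
header = ("#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT "
          "03-071 04-051 04-071 06-044 07-085 10-009")
record = ("chr1 6526093 . T C 197.77 . AC1=1;AC=1;AF1=0.5 "
          "GT:GQ:DP:PL:AD 0/1 1/1 0/0 1/1 1/1 0/1")

groups = samples_by_genotype(header, record)
print(groups.get("0/1", []))  # ['03-071', '10-009']
print(groups.get("1/1", []))  # ['04-051', '06-044', '07-085']
```

For a whole file, the same function could be applied to every non-header line, with the `#CHROM` line read once at the top.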

