
How do I specify a list of samples for GenotypeGVCFs?

This is the recommended command for GenotypeGVCFs:

java -jar GenomeAnalysisTK.jar \
   -T GenotypeGVCFs \
   -R reference.fasta \
   --variant sample1.g.vcf \
   --variant sample2.g.vcf \
   -o output.vcf

Is there some way to specify input g.vcfs from a variable or a text file with sample names?

echo "$files"
s1.g.vcf
s2.g.vcf

or

cat files.txt
s1.g.vcf
s2.g.vcf

I tried --variant $files and --variant <(echo $files), but neither works.

Best Answer

  • rmf Member
    Accepted Answer

    This approach worked for me:

    # find all files ending in .g.vcf (including subdirectories) and prefix each with --variant
    samples=$(find . -name '*.g.vcf' | sed 's|^\./|--variant |')
    

    Then pass the variable to GATK:

    java -jar GenomeAnalysisTK.jar \
       -T GenotypeGVCFs \
       -R reference.fasta \
       -o output.vcf \
       $samples   # left unquoted on purpose so it splits into separate arguments
    
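
The string-splitting approach above breaks if a path contains spaces. A minimal sketch of the same idea using a bash array is shown here. The scratch directory and the empty placeholder files are assumptions made only so the example runs anywhere, and the GATK command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Sketch: collect --variant arguments in a bash array (dry run, prints the command).

# Demo setup (hypothetical): two empty placeholder gVCFs in a scratch directory.
workdir=$(mktemp -d)
touch "$workdir/s1.g.vcf" "$workdir/s2.g.vcf"

# Each file contributes two array elements: the flag and the path.
variant_args=()
for f in "$workdir"/*.g.vcf; do
    variant_args+=(--variant "$f")
done

# Assemble the full command as an array so every path stays a single argument.
cmd=(java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta
     -o output.vcf "${variant_args[@]}")

# Print the command for inspection; to run it for real, use: "${cmd[@]}"
printf '%s\n' "${cmd[*]}"
```

In a real run, replace "$workdir" with the directory that holds your gVCFs and drop the demo setup lines.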

Answers

  • Greg_Owens Member
    edited December 2017

    For bash scripts, I use a loop to build up a single variable that has all my samples. For example:

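    # $gvcf is the directory holding the gVCFs; gvcf.list.txt has one sample prefix per line
    tmp=""
    while read -r prefix
    do
            tmp="$tmp --variant $gvcf/$prefix.g.vcf"
    done < "$path/gvcf.list.txt"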
    

    Then I just add $tmp, unquoted so that it splits into separate arguments, to the GATK command line.

  • SkyWarrior Turkey Member

    Put all the file names, one per line, in a single file with a .list extension (files.list, for example). Pass that file as the --variant argument and you are set. You don't need to fiddle with loops or anything else.
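
The list-file approach could be sketched as follows. The name gvcfs.list, the scratch directory, and the placeholder files are assumptions for illustration, and the sketch relies on GATK 3 treating a --variant argument ending in .list as a file of input paths. The command is echoed rather than run.

```shell
#!/usr/bin/env bash
# Sketch: write one gVCF path per line to a .list file and hand that file to GATK.

# Demo setup (hypothetical): two empty placeholder gVCFs in a scratch directory.
workdir=$(mktemp -d)
touch "$workdir/s1.g.vcf" "$workdir/s2.g.vcf"

# Build the list file; sort keeps the sample order reproducible between runs.
list_file="$workdir/gvcfs.list"
find "$workdir" -name '*.g.vcf' | sort > "$list_file"

# Show what would be passed to GATK.
cat "$list_file"
echo "java -jar GenomeAnalysisTK.jar -T GenotypeGVCFs -R reference.fasta --variant $list_file -o output.vcf"
```

If you already have a files.txt with one path per line, as in the question, renaming or copying it to a name ending in .list should be enough.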
