
SelectVariants by sample names file

I need to subset a list of samples from a large vcf.gz file. The sample names are saved in a plain-text file, one name per line. I used

-RF -sf my.sample.names.txt

but kept getting an error.
Any suggestions? Thanks!
Answers

  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)

    Hi @ysgz7

    Please post the version of GATK you are using, the exact command, and the entire error log.

  • ysgz7 (Member)
    GATK version: 4.1.2.0
    my command:

    ./gatk SelectVariants -V myfile.vcf.gz -O myoutput.vcf -RF -sf my.sample.names.txt

    ***********************************************************************

    A USER ERROR has occurred: Invalid argument 'my.sample.names.txt'.

    ***********************************************************************
    Thank you!
  • bhanuGandham, Cambridge MA (Member, Administrator, Broadie, Moderator)
    edited June 2019

    @ysgz7

    You need to use -sn, not -sf, as the argument for sample names.

  • ysgz7 (Member)
    That doesn't work either. As I understand it, -sn should be followed by the sample name(s) themselves, but I'm using a txt file with about 100 names. Am I right?

    I was reading the table here: software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_variantutils_SelectVariants.php
    --sample_file (-sf), default NA: File containing a list of samples to include

    Thank you!
  • cnorman, United States (Member, Broadie, Dev)

    @ysgz7 You're running GATK v4.1.2.0, but referencing the GATK v3.8.0 documentation. If you want to use a filename with SelectVariants -sn in v4.1.2.0, the filename needs to end with a .args suffix. See this for more detail.

  • ysgz7 (Member)
    Thank you so much for your reply!

    I renamed my .txt file to my.sample.names.args, put it and myfile.vcf.gz in the gatk folder, and ran the following command:
    ./gatk SelectVariants -V myfile.vcf.gz -O myoutput.vcf -RF -sn my.sample.names

    But I still get the error message:
    ***********************************************************************

    A USER ERROR has occurred: Invalid argument 'pan.sample.names'.

    ***********************************************************************

    What looks incorrect?

    Thank you!
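
A minimal sketch of the .args approach cnorman describes above, not a confirmed fix for this thread. It assumes the sample list holds one name per line, and it passes the full file name, including the .args suffix, to -sn (the last command in the thread passes my.sample.names without the suffix). It also leaves out -RF, which the thread's commands give with no filter name; dropping it is an assumption. The helper name make_args_file is hypothetical, and the file names are the ones used in the thread.

```shell
#!/usr/bin/env bash
# Sketch: turn a plain-text sample list into a .args file for GATK4 -sn.
set -euo pipefail

# make_args_file SRC DEST (hypothetical helper):
# copy a one-name-per-line list from SRC to DEST, dropping
# Windows carriage returns and blank lines along the way.
make_args_file() {
  local src="$1" dest="$2"
  tr -d '\r' < "$src" | sed '/^[[:space:]]*$/d' > "$dest"
}

# File names as used in the thread.
names_txt="my.sample.names.txt"
names_args="my.sample.names.args"

if [[ -f "$names_txt" ]]; then
  make_args_file "$names_txt" "$names_args"
  # Print the command instead of running it, so it can be checked first.
  # Assumption: -RF is omitted because no read filter name was given.
  echo "./gatk SelectVariants -V myfile.vcf.gz -O myoutput.vcf -sn $names_args"
fi
```

The script only prints the SelectVariants command; copy it to the shell once the .args file looks right.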