No common samples in VCF and BAM headers, so nothing could possibly be phased!

alisonewr Member
edited November 2015 in Ask the GATK team

I want to phase some DNA-seq data.

java -jar GenomeAnalysisTK.jar -T ReadBackedPhasing -R ref.fasta -I readnames.bam --variant test.vcf -L Chr.list -o phased_SNPs.vcf --phaseQualityThresh 20.0

My VCF file looks like this and only contains information for one sample:

##fileformat=VCFv4.0
##source=pileup_to_vcf.pyV1.2
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=SAF,Number=.,Type=Float,Description="Specific Allele Frequency">
##FILTER=<ID=DP,Description="Minimum depth of 10">
##FILTER=<ID=SAF,Description="Allele frequency of at least 0.3 with base quality minimum 0">
#CHROM POS ID REF ALT QUAL FILTER INFO
NC_024331.1 131 . G GA . PASS SAF=0.655738;DP=61
NC_024331.1 147 . C G . PASS SAF=0.320000;DP=25
NC_024331.1 422 . C A . PASS SAF=0.414545;DP=275

I previously had an error message saying my BAM file did not have read groups, so I ran:
java -jar AddOrReplaceReadGroups.jar I=sorted.bam O=readnames.bam RGLB=LaneX RGPU=NONE RGSM=AnySampleName RGPL=illumina

Now I am getting this error:

##### ERROR
##### ERROR MESSAGE: No common samples in VCF and BAM headers, so nothing could possibly be phased!

Is there somewhere in the header of the vcf I can add AnySampleName?

Thanks!!

Answers

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Hi there,

    The problem is deeper than that -- you need to have genotype calls for your samples in the VCF. Otherwise there's nothing to phase. Have you read the documentation about phasing?
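
To make that concrete: a VCF only carries genotype calls when its `#CHROM` header line has a `FORMAT` column followed by one column per sample. The header in the question stops at `INFO`, so it names no samples at all, which is why no sample can match the BAM. Below is a minimal Python sketch of that check. The function name `vcf_sample_names` and the two example header lines are made up for illustration; it assumes the header line is tab-delimited and falls back to splitting on whitespace when no tab is present.

```python
def vcf_sample_names(chrom_line):
    """Return the sample names listed on a VCF '#CHROM' header line.

    Sample columns follow the 8 fixed columns (CHROM..INFO) and FORMAT.
    A sites-only VCF (no FORMAT column) yields an empty list.
    """
    line = chrom_line.rstrip("\r\n").lstrip("#")
    # VCF is tab-delimited; tolerate space-separated text pasted from a forum.
    fields = line.split("\t") if "\t" in line else line.split()
    # Index 8 must be FORMAT, and at least one sample column must follow it.
    if len(fields) < 10 or fields[8] != "FORMAT":
        return []
    return fields[9:]


# Header like the one in the question: no FORMAT, no sample columns.
sites_only = "#CHROM POS ID REF ALT QUAL FILTER INFO"
# Hypothetical header of a VCF that does contain genotype calls.
with_genotypes = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tAnySampleName"

print(vcf_sample_names(sites_only))
print(vcf_sample_names(with_genotypes))
```

An empty result means there are no genotype columns, so adding a sample name to the header by hand would not help; the VCF has to be regenerated by a caller that emits per-sample genotypes.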

  • alisonewr Member
    edited November 2015

    Thanks! So, as I only have one sample, does that mean I have to specify
    -v, --VCF ("Compute genotype likelihoods and output them in the variant call format (VCF)") when I run samtools mpileup?

    And as my BAM file contains only one sample, it has no sample information in the header. How do I fix this?

    Thanks very much!!
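
On the BAM side, the sample name lives in the `SM` tag of each `@RG` header line, and the `RGSM=AnySampleName` argument passed to AddOrReplaceReadGroups earlier in this thread is what sets it. The following is a rough Python sketch, not a GATK tool, for checking which sample names a BAM header and a VCF share. It assumes the header has already been dumped to text (for example with `samtools view -H`); the function name `bam_sample_names`, the example header string, and the `vcf_samples` set are illustrative assumptions rather than output from the files discussed here.

```python
def bam_sample_names(sam_header_text):
    """Collect the SM (sample) values from the @RG lines of a SAM text header."""
    samples = set()
    for line in sam_header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs after the record type.
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                samples.add(field[3:])
    return samples


# Illustrative header text, shaped like what AddOrReplaceReadGroups would add.
header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@RG\tID:1\tLB:LaneX\tPL:illumina\tPU:NONE\tSM:AnySampleName\n"
)

# Sample columns of a hypothetical VCF that has genotype calls.
vcf_samples = {"AnySampleName"}

# Phasing needs at least one name present in both files.
common = bam_sample_names(header) & vcf_samples
print(sorted(common))
```

If the intersection is empty, either the BAM has no `@RG` line with an `SM` tag or the VCF's sample column uses a different name, and one of the two needs to be changed so they agree.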
