output the 0|0 for PGT

frankfeng Member
edited January 2016 in Ask the GATK team

Hi,

In the output VCF of GenotypeGVCFs, PGT takes the values 0|1, 1|0, 1|1 and ., but never 0|0. When I see a . for PGT, I cannot tell whether it is actually a 0|0 or missing data (e.g. due to zero useful coverage at the position). It would be great if GenotypeGVCFs could also output 0|0.

Tested GATK version: V3.5

Thanks!

Frank

Answers

  • Sheila Broad Institute Member, Broadie, Moderator admin

    @frankfeng
    Hi Frank,

    Are you using -allSites in your GenotypeGVCFs command?

    Can you post a few example records of the . in the PGT field?

    Thanks,
    Sheila

  • frankfeng Member

    @Sheila
    Hi Sheila,

    No, I am not using -allSites in my GenotypeGVCFs command, and my understanding is that the -allSites option has nothing to do with PGT (please correct me if I am wrong).

    Below is an example VCF output of GenotypeGVCFs (only the FORMAT and Genotype sections are shown; see the . for PGT and PID of SAMPLE-2):

    FORMAT     SAMPLE-1     SAMPLE-2     SAMPLE-3
    GT:AD:DP:GQ:PGT:PID:PL     1/1:0,96:96:99:1|1:1019175_C_G:3803,287,0     1/1:0,111:111:99:.:.:3099,326,0     0/1:60,49:109:99:0|1:1019175_C_G:1795,0,2287
    GT:AD:DP:GQ:PGT:PID:PL     0/1:38,68:106:99:0|1:1019175_C_G:1540,0,776     0/0:121,0:121:99:.:.:0,120,1800     0/1:63,56:119:99:0|1:1019175_C_G:1923,0,2304

    What I expected is below (see what replaces the . for PGT and PID of SAMPLE-2):

    FORMAT     SAMPLE-1     SAMPLE-2     SAMPLE-3
    GT:AD:DP:GQ:PGT:PID:PL     1/1:0,96:96:99:1|1:1019175_C_G:3803,287,0     1/1:0,111:111:99:1|1:1019175_C_G:3099,326,0     0/1:60,49:109:99:0|1:1019175_C_G:1795,0,2287
    GT:AD:DP:GQ:PGT:PID:PL     0/1:38,68:106:99:0|1:1019175_C_G:1540,0,776     0/0:121,0:121:99:0|0:1019175_C_G:0,120,1800     0/1:63,56:119:99:0|1:1019175_C_G:1923,0,2304

    As I mentioned in my previous post, there is no 0|0 PGT output at all. If I want 0|0 output, do I need to turn on a specific option for HaplotypeCaller or GenotypeGVCFs? Or is that not yet implemented in HaplotypeCaller and GenotypeGVCFs?

    Thanks!

    Frank

  • Sheila Broad Institute Member, Broadie, Moderator admin

    @frankfeng
    Hi Frank,

    I think this is a representation issue. Let me check with the team and get back to you. Basically, sample 1 and sample 3 are phased at those two sites, but since sample 2 is homozygous reference at the second site, it is not phased. Can you please post the original bam file and bamout file of the region for sample 2?

    Thanks,
    Sheila

    Issue · Github, by Sheila: Issue Number 546, State closed, Closed By vdauwera

  • @Sheila

    Hi Sheila,

    The original bam and "bamout file" of Sample-2 are attached: The HOM-ALT site is the first site, and the vertical center lines are at the second site.

    Thanks!

    Frank

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Can you post the full records (not just genotype fields) including the record that comes before the two that you are showing?

  • @Geraldine_VdAuwera said:
    Can you post the full records (not just genotype fields) including the record that comes before the two that you are showing?

    @Geraldine_VdAuwera

    Hi Geraldine, please see the attached vcf.gz file.

    Thanks.

    Frank

  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Please post the records as a text comment. It takes less time for us to look at it that way. Also I can't access the file easily from a smartphone (which is what I use a lot of the time for answering support questions).

  • frankfeng Member
    edited February 2016

    @Geraldine_VdAuwera said:
    Please post the records as a text comment. It takes less time for us to look at it that way. Also I can't access the file easily from a smartphone (which is what I use a lot of the time for answering support questions).

    Hi @Geraldine_VdAuwera please see below:

    ##fileformat=VCFv4.2
    ##ALT=<ID=NON_REF,Description="Represents any possible alternative allele at this location">
    ##FILTER=<ID=LowQual,Description="Low quality">
    ##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
    ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
    ##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
    ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
    ##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block">
    ##FORMAT=<ID=PGT,Number=1,Type=String,Description="Physical phasing haplotype information, describing how the alternate alleles are phased in relation to one another">
    ##FORMAT=<ID=PID,Number=1,Type=String,Description="Physical phasing ID information, where each unique ID within a given sample (but not across samples) connects records within a phasing group">
    ##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Normalized, Phred-scaled likelihoods for genotypes as defined in the VCF specification">
    ##FORMAT=<ID=RGQ,Number=1,Type=Integer,Description="Unconditional reference genotype confidence, encoded as a phred quality -10*log10 p(genotype call is wrong)">
    ##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
    ##GATKCommandLine.GenotypeGVCFs=<ID=GenotypeGVCFs,Version=3.5-0-g36282e4,Date="Sun Feb 07 18:05:59 CST 2016",Epoch=1454830559511,CommandLineOptions="analysis_type=GenotypeGVCFs input_file=[] showFullBamList=false read_buffer_size=null phone_home=NO_ET gatk_key=/data/sacgf/reference/map_n_call_pipeline/gatk/paul.wang_adelaide.edu.au.key tag=NA read_filter=[] disable_read_filter=[] intervals=[1:1009175-1029175] excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=/scratch/gatk_bundle/2.8/b37/human_g1k_v37_decoy.fasta nonDeterministicRandomSeed=false disableDithering=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=BY_SAMPLE downsample_to_fraction=null downsample_to_coverage=1000 baq=OFF baqGapOpenPenalty=40.0 refactor_NDN_cigar_string=false fix_misencoded_quality_scores=false allow_potentially_misencoded_quality_scores=false useOriginalQualities=false defaultBaseQualities=-1 performanceLog=null BQSR=null quantize_quals=0 static_quantized_quals=null round_down_quantized=false disable_indel_quals=false emit_original_quals=false preserve_qscores_less_than=6 globalQScorePrior=-1.0 validation_strictness=SILENT remove_program_records=false keep_program_records=false sample_rename_mapping_file=null unsafe=null disable_auto_index_creation_and_locking_when_reading_rods=false no_cmdline_in_header=false sites_only=false never_trim_vcf_format_field=false bcf=false bam_compression=null simplifyBAM=false disable_bam_indexing=false generate_md5=false num_threads=2 num_cpu_threads_per_data_thread=1 num_io_threads=0 monitorThreadEfficiency=false num_bam_file_handles=null read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false generateShadowBCF=false variant_index_type=DYNAMIC_SEEK variant_index_parameter=-1 reference_window_stop=0 logging_level=INFO log_to_file=null help=false version=false variant=[(RodBindingCollection [(RodBinding 
name=variant source=SAMPLE-1.g.vcf.gz)]), (RodBindingCollection [(RodBinding name=variant2 source=SAMPLE-2.g.vcf.gz)]), (RodBindingCollection [(RodBinding name=variant3 source=SAMPLE-3.g.vcf.gz)])] out=/home/users/jfeng/test/test_PGT_2.vcf.gz includeNonVariantSites=false uniquifySamples=false annotateNDA=false heterozygosity=0.001 indel_heterozygosity=1.25E-4 standard_min_confidence_threshold_for_calling=30.0 standard_min_confidence_threshold_for_emitting=20.0 max_alternate_alleles=6 input_prior=[] sample_ploidy=2 annotation=[] group=[Standard] dbsnp=(RodBinding name= source=UNBOUND) filter_reads_with_N_cigar=false filter_mismatching_base_and_quals=false filter_bases_not_stored=false">
    ##GATKCommandLine.HaplotypeCaller=<ID=HaplotypeCaller,Version=3.5-0-g36282e4,Date="Sun Jan 10 19:26:27 CST 2016",Epoch=1452416187963,CommandLineOptions="analysis_type=HaplotypeCaller input_file=[bqsr/SAMPLE-3.bam] showFullBamList=false read_buffer_size=null phone_home=NO_ET gatk_key=/data/sacgf/reference/map_n_call_pipeline/gatk/paul.wang_adelaide.edu.au.key tag=NA read_filter=[] disable_read_filter=[] intervals=[/data/sacgf/reference/hg19/ExomeCaptureRegions/Nimblegen/Nimblegen_SeqCap_EQ_Exome_v3/SeqCap_EZ_Exome_v3_capture.1000bp_padded_each_side.merged.Ensembl_style.cut_columns.bed] excludeIntervals=null interval_set_rule=UNION interval_merging=ALL interval_padding=0 reference_sequence=/scratch/gatk_bundle/2.8/b37/human_g1k_v37_decoy.fasta nonDeterministicRandomSeed=false disableDithering=false maxRuntime=-1 maxRuntimeUnits=MINUTES downsampling_type=BY_SAMPLE downsample_to_fraction=null downsample_to_coverage=500 baq=OFF baqGapOpenPenalty=40.0 refactor_NDN_cigar_string=false fix_misencoded_quality_scores=false allow_potentially_misencoded_quality_scores=false useOriginalQualities=false defaultBaseQualities=-1 performanceLog=null BQSR=null quantize_quals=0 static_quantized_quals=null round_down_quantized=false disable_indel_quals=false emit_original_quals=false preserve_qscores_less_than=6 globalQScorePrior=-1.0 validation_strictness=SILENT remove_program_records=false keep_program_records=false sample_rename_mapping_file=null unsafe=null disable_auto_index_creation_and_locking_when_reading_rods=false no_cmdline_in_header=false sites_only=false never_trim_vcf_format_field=false bcf=false bam_compression=null simplifyBAM=false disable_bam_indexing=false generate_md5=false num_threads=1 num_cpu_threads_per_data_thread=2 num_io_threads=0 monitorThreadEfficiency=false num_bam_file_handles=null read_group_black_list=null pedigree=[] pedigreeString=[] pedigreeValidationType=STRICT allow_intervals_with_unindexed_bam=false generateShadowBCF=false 
variant_index_type=DYNAMIC_SEEK variant_index_parameter=-1 reference_window_stop=0 logging_level=INFO log_to_file=null help=false version=false likelihoodCalculationEngine=PairHMM heterogeneousKmerSizeResolution=COMBO_MIN dbsnp=(RodBinding name= source=UNBOUND) dontTrimActiveRegions=false maxDiscARExtension=25 maxGGAARExtension=300 paddingAroundIndels=150 paddingAroundSNPs=20 comp=[] annotation=[StrandBiasBySample] excludeAnnotation=[ChromosomeCounts, FisherStrand, StrandOddsRatio, QualByDepth] group=[Standard, StandardHCAnnotation] debug=false useFilteredReadsForAnnotations=false emitRefConfidence=GVCF bamOutput=null bamWriterType=CALLED_HAPLOTYPES disableOptimizations=false annotateNDA=false heterozygosity=0.001 indel_heterozygosity=1.25E-4 standard_min_confidence_threshold_for_calling=-0.0 standard_min_confidence_threshold_for_emitting=-0.0 max_alternate_alleles=6 input_prior=[] sample_ploidy=2 genotyping_mode=DISCOVERY alleles=(RodBinding name= source=UNBOUND) contamination_fraction_to_filter=0.0 contamination_fraction_per_sample_file=null p_nonref_model=null exactcallslog=null output_mode=EMIT_VARIANTS_ONLY allSitePLs=true gcpHMM=10 pair_hmm_implementation=VECTOR_LOGLESS_CACHING pair_hmm_sub_implementation=ENABLE_ALL always_load_vector_logless_PairHMM_lib=false phredScaledGlobalReadMismappingRate=45 noFpga=false sample_name=null kmerSize=[10, 25] dontIncreaseKmerSizesForCycles=false allowNonUniqueKmersInRef=false numPruningSamples=1 recoverDanglingHeads=false doNotRecoverDanglingBranches=false minDanglingBranchLength=4 consensus=false maxNumHaplotypesInPopulation=128 errorCorrectKmers=false minPruning=2 debugGraphTransformations=false allowCyclesInKmerGraphToGeneratePaths=false graphOutput=null kmerLengthForReadErrorCorrection=25 minObservationsForKmerToBeSolid=20 GVCFGQBands=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 70, 80, 90, 99] indelSizeToEliminateInRefModel=10 min_base_quality_score=10 includeUmappedReads=false useAllelesTrigger=false doNotRunPhysicalPhasing=false keepRG=null justDetermineActiveRegions=false dontGenotype=false dontUseSoftClippedBases=false captureAssemblyFailureBAM=false errorCorrectReads=false pcr_indel_model=CONSERVATIVE maxReadsInRegionPerSample=10000 minReadsPerAlignmentStart=10 mergeVariantsViaLD=false activityProfileOut=null activeRegionOut=null activeRegionIn=null activeRegionExtension=null forceActive=false activeRegionMaxSize=null bandPassSigma=null maxProbPropagationDistance=50 activeProbabilityThreshold=0.002 min_mapping_quality_score=20 filter_reads_with_N_cigar=false filter_mismatching_base_and_quals=false filter_bases_not_stored=false">
    ##GVCFBlock0-1=minGQ=0(inclusive),maxGQ=1(exclusive)
    ##GVCFBlock1-2=minGQ=1(inclusive),maxGQ=2(exclusive)
    ##GVCFBlock10-11=minGQ=10(inclusive),maxGQ=11(exclusive)
    ##GVCFBlock11-12=minGQ=11(inclusive),maxGQ=12(exclusive)
    ##GVCFBlock12-13=minGQ=12(inclusive),maxGQ=13(exclusive)
    ##GVCFBlock13-14=minGQ=13(inclusive),maxGQ=14(exclusive)
    ##GVCFBlock14-15=minGQ=14(inclusive),maxGQ=15(exclusive)
    ##GVCFBlock15-16=minGQ=15(inclusive),maxGQ=16(exclusive)
    ##GVCFBlock16-17=minGQ=16(inclusive),maxGQ=17(exclusive)
    ##GVCFBlock17-18=minGQ=17(inclusive),maxGQ=18(exclusive)
    ##GVCFBlock18-19=minGQ=18(inclusive),maxGQ=19(exclusive)
    ##GVCFBlock19-20=minGQ=19(inclusive),maxGQ=20(exclusive)
    ##GVCFBlock2-3=minGQ=2(inclusive),maxGQ=3(exclusive)
    ##GVCFBlock20-21=minGQ=20(inclusive),maxGQ=21(exclusive)
    ##GVCFBlock21-22=minGQ=21(inclusive),maxGQ=22(exclusive)
    ##GVCFBlock22-23=minGQ=22(inclusive),maxGQ=23(exclusive)
    ##GVCFBlock23-24=minGQ=23(inclusive),maxGQ=24(exclusive)
    ##GVCFBlock24-25=minGQ=24(inclusive),maxGQ=25(exclusive)
    ##GVCFBlock25-26=minGQ=25(inclusive),maxGQ=26(exclusive)
    ##GVCFBlock26-27=minGQ=26(inclusive),maxGQ=27(exclusive)
    ##GVCFBlock27-28=minGQ=27(inclusive),maxGQ=28(exclusive)
    ##GVCFBlock28-29=minGQ=28(inclusive),maxGQ=29(exclusive)
    ##GVCFBlock29-30=minGQ=29(inclusive),maxGQ=30(exclusive)
    ##GVCFBlock3-4=minGQ=3(inclusive),maxGQ=4(exclusive)
    ##GVCFBlock30-31=minGQ=30(inclusive),maxGQ=31(exclusive)
    ##GVCFBlock31-32=minGQ=31(inclusive),maxGQ=32(exclusive)
    ##GVCFBlock32-33=minGQ=32(inclusive),maxGQ=33(exclusive)
    ##GVCFBlock33-34=minGQ=33(inclusive),maxGQ=34(exclusive)
    ##GVCFBlock34-35=minGQ=34(inclusive),maxGQ=35(exclusive)
    ##GVCFBlock35-36=minGQ=35(inclusive),maxGQ=36(exclusive)
    ##GVCFBlock36-37=minGQ=36(inclusive),maxGQ=37(exclusive)
    ##GVCFBlock37-38=minGQ=37(inclusive),maxGQ=38(exclusive)
    ##GVCFBlock38-39=minGQ=38(inclusive),maxGQ=39(exclusive)
    ##GVCFBlock39-40=minGQ=39(inclusive),maxGQ=40(exclusive)
    ##GVCFBlock4-5=minGQ=4(inclusive),maxGQ=5(exclusive)
    ##GVCFBlock40-41=minGQ=40(inclusive),maxGQ=41(exclusive)
    ##GVCFBlock41-42=minGQ=41(inclusive),maxGQ=42(exclusive)
    ##GVCFBlock42-43=minGQ=42(inclusive),maxGQ=43(exclusive)
    ##GVCFBlock43-44=minGQ=43(inclusive),maxGQ=44(exclusive)
    ##GVCFBlock44-45=minGQ=44(inclusive),maxGQ=45(exclusive)
    ##GVCFBlock45-46=minGQ=45(inclusive),maxGQ=46(exclusive)
    ##GVCFBlock46-47=minGQ=46(inclusive),maxGQ=47(exclusive)
    ##GVCFBlock47-48=minGQ=47(inclusive),maxGQ=48(exclusive)
    ##GVCFBlock48-49=minGQ=48(inclusive),maxGQ=49(exclusive)
    ##GVCFBlock49-50=minGQ=49(inclusive),maxGQ=50(exclusive)
    ##GVCFBlock5-6=minGQ=5(inclusive),maxGQ=6(exclusive)
    ##GVCFBlock50-51=minGQ=50(inclusive),maxGQ=51(exclusive)
    ##GVCFBlock51-52=minGQ=51(inclusive),maxGQ=52(exclusive)
    ##GVCFBlock52-53=minGQ=52(inclusive),maxGQ=53(exclusive)
    ##GVCFBlock53-54=minGQ=53(inclusive),maxGQ=54(exclusive)
    ##GVCFBlock54-55=minGQ=54(inclusive),maxGQ=55(exclusive)
    ##GVCFBlock55-56=minGQ=55(inclusive),maxGQ=56(exclusive)
    ##GVCFBlock56-57=minGQ=56(inclusive),maxGQ=57(exclusive)
    ##GVCFBlock57-58=minGQ=57(inclusive),maxGQ=58(exclusive)
    ##GVCFBlock58-59=minGQ=58(inclusive),maxGQ=59(exclusive)
    ##GVCFBlock59-60=minGQ=59(inclusive),maxGQ=60(exclusive)
    ##GVCFBlock6-7=minGQ=6(inclusive),maxGQ=7(exclusive)
    ##GVCFBlock60-70=minGQ=60(inclusive),maxGQ=70(exclusive)
    ##GVCFBlock7-8=minGQ=7(inclusive),maxGQ=8(exclusive)
    ##GVCFBlock70-80=minGQ=70(inclusive),maxGQ=80(exclusive)
    ##GVCFBlock8-9=minGQ=8(inclusive),maxGQ=9(exclusive)
    ##GVCFBlock80-90=minGQ=80(inclusive),maxGQ=90(exclusive)
    ##GVCFBlock9-10=minGQ=9(inclusive),maxGQ=10(exclusive)
    ##GVCFBlock90-99=minGQ=90(inclusive),maxGQ=99(exclusive)
    ##GVCFBlock99-2147483647=minGQ=99(inclusive),maxGQ=2147483647(exclusive)
    ##INFO=<ID=AC,Number=A,Type=Integer,Description="Allele count in genotypes, for each ALT allele, in the same order as listed">
    ##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, for each ALT allele, in the same order as listed">
    ##INFO=<ID=AN,Number=1,Type=Integer,Description="Total number of alleles in called genotypes">
    ##INFO=<ID=BaseQRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt Vs. Ref base qualities">
    ##INFO=<ID=ClippingRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref number of hard clipped bases">
    ##INFO=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth; some reads may have been filtered">
    ##INFO=<ID=DS,Number=0,Type=Flag,Description="Were any of the samples downsampled?">
    ##INFO=<ID=END,Number=1,Type=Integer,Description="Stop position of the interval">
    ##INFO=<ID=ExcessHet,Number=1,Type=Float,Description="Phred-scaled p-value for exact test of excess heterozygosity">
    ##INFO=<ID=FS,Number=1,Type=Float,Description="Phred-scaled p-value using Fisher's exact test to detect strand bias">
    ##INFO=<ID=HaplotypeScore,Number=1,Type=Float,Description="Consistency of the site with at most two segregating haplotypes">
    ##INFO=<ID=InbreedingCoeff,Number=1,Type=Float,Description="Inbreeding coefficient as estimated from the genotype likelihoods per-sample when compared against the Hardy-Weinberg expectation">
    ##INFO=<ID=MLEAC,Number=A,Type=Integer,Description="Maximum likelihood expectation (MLE) for the allele counts (not necessarily the same as the AC), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MLEAF,Number=A,Type=Float,Description="Maximum likelihood expectation (MLE) for the allele frequency (not necessarily the same as the AF), for each ALT allele, in the same order as listed">
    ##INFO=<ID=MQ,Number=1,Type=Float,Description="RMS Mapping Quality">
    ##INFO=<ID=MQRankSum,Number=1,Type=Float,Description="Z-score From Wilcoxon rank sum test of Alt vs. Ref read mapping qualities">
    ##INFO=<ID=QD,Number=1,Type=Float,Description="Variant Confidence/Quality by Depth">
    ##INFO=<ID=RAW_MQ,Number=1,Type=Float,Description="Raw data for RMS Mapping Quality">
    ##INFO=<ID=ReadPosRankSum,Number=1,Type=Float,Description="Z-score from Wilcoxon rank sum test of Alt vs. Ref read position bias">
    ##INFO=<ID=SOR,Number=1,Type=Float,Description="Symmetric Odds Ratio of 2x2 contingency table to detect strand bias">
    ##contig=<ID=1,length=249250621,assembly=b37>
    ##contig=<ID=2,length=243199373,assembly=b37>
    ##contig=<ID=3,length=198022430,assembly=b37>
    ##contig=<ID=4,length=191154276,assembly=b37>
    ##contig=<ID=5,length=180915260,assembly=b37>
    ##contig=<ID=6,length=171115067,assembly=b37>
    ##contig=<ID=7,length=159138663,assembly=b37>
    ##contig=<ID=8,length=146364022,assembly=b37>
    ##contig=<ID=9,length=141213431,assembly=b37>
    ##contig=<ID=10,length=135534747,assembly=b37>
    ##contig=<ID=11,length=135006516,assembly=b37>
    ##contig=<ID=12,length=133851895,assembly=b37>
    ##contig=<ID=13,length=115169878,assembly=b37>
    ##contig=<ID=14,length=107349540,assembly=b37>
    ##contig=<ID=15,length=102531392,assembly=b37>
    ##contig=<ID=16,length=90354753,assembly=b37>
    ##contig=<ID=17,length=81195210,assembly=b37>
    ##contig=<ID=18,length=78077248,assembly=b37>
    ##contig=<ID=19,length=59128983,assembly=b37>
    ##contig=<ID=20,length=63025520,assembly=b37>
    ##contig=<ID=21,length=48129895,assembly=b37>
    ##contig=<ID=22,length=51304566,assembly=b37>
    ##contig=<ID=X,length=155270560,assembly=b37>
    ##contig=<ID=Y,length=59373566,assembly=b37>
    ##contig=<ID=MT,length=16569,assembly=b37>
    ##contig=<ID=GL000207.1,length=4262,assembly=b37>
    ##contig=<ID=GL000226.1,length=15008,assembly=b37>
    ##contig=<ID=GL000229.1,length=19913,assembly=b37>
    ##contig=<ID=GL000231.1,length=27386,assembly=b37>
    ##contig=<ID=GL000210.1,length=27682,assembly=b37>
    ##contig=<ID=GL000239.1,length=33824,assembly=b37>
    ##contig=<ID=GL000235.1,length=34474,assembly=b37>
    ##contig=<ID=GL000201.1,length=36148,assembly=b37>
    ##contig=<ID=GL000247.1,length=36422,assembly=b37>
    ##contig=<ID=GL000245.1,length=36651,assembly=b37>
    ##contig=<ID=GL000197.1,length=37175,assembly=b37>
    ##contig=<ID=GL000203.1,length=37498,assembly=b37>
    ##contig=<ID=GL000246.1,length=38154,assembly=b37>
    ##contig=<ID=GL000249.1,length=38502,assembly=b37>
    ##contig=<ID=GL000196.1,length=38914,assembly=b37>
    ##contig=<ID=GL000248.1,length=39786,assembly=b37>
    ##contig=<ID=GL000244.1,length=39929,assembly=b37>
    ##contig=<ID=GL000238.1,length=39939,assembly=b37>
    ##contig=<ID=GL000202.1,length=40103,assembly=b37>
    ##contig=<ID=GL000234.1,length=40531,assembly=b37>
    ##contig=<ID=GL000232.1,length=40652,assembly=b37>
    ##contig=<ID=GL000206.1,length=41001,assembly=b37>
    ##contig=<ID=GL000240.1,length=41933,assembly=b37>
    ##contig=<ID=GL000236.1,length=41934,assembly=b37>
    ##contig=<ID=GL000241.1,length=42152,assembly=b37>
    ##contig=<ID=GL000243.1,length=43341,assembly=b37>
    ##contig=<ID=GL000242.1,length=43523,assembly=b37>
    ##contig=<ID=GL000230.1,length=43691,assembly=b37>
    ##contig=<ID=GL000237.1,length=45867,assembly=b37>
    ##contig=<ID=GL000233.1,length=45941,assembly=b37>
    ##contig=<ID=GL000204.1,length=81310,assembly=b37>
    ##contig=<ID=GL000198.1,length=90085,assembly=b37>
    ##contig=<ID=GL000208.1,length=92689,assembly=b37>
    ##contig=<ID=GL000191.1,length=106433,assembly=b37>
    ##contig=<ID=GL000227.1,length=128374,assembly=b37>
    ##contig=<ID=GL000228.1,length=129120,assembly=b37>
    ##contig=<ID=GL000214.1,length=137718,assembly=b37>
    ##contig=<ID=GL000221.1,length=155397,assembly=b37>
    ##contig=<ID=GL000209.1,length=159169,assembly=b37>
    ##contig=<ID=GL000218.1,length=161147,assembly=b37>
    ##contig=<ID=GL000220.1,length=161802,assembly=b37>
    ##contig=<ID=GL000213.1,length=164239,assembly=b37>
    ##contig=<ID=GL000211.1,length=166566,assembly=b37>
    ##contig=<ID=GL000199.1,length=169874,assembly=b37>
    ##contig=<ID=GL000217.1,length=172149,assembly=b37>
    ##contig=<ID=GL000216.1,length=172294,assembly=b37>
    ##contig=<ID=GL000215.1,length=172545,assembly=b37>
    ##contig=<ID=GL000205.1,length=174588,assembly=b37>
    ##contig=<ID=GL000219.1,length=179198,assembly=b37>
    ##contig=<ID=GL000224.1,length=179693,assembly=b37>
    ##contig=<ID=GL000223.1,length=180455,assembly=b37>
    ##contig=<ID=GL000195.1,length=182896,assembly=b37>
    ##contig=<ID=GL000212.1,length=186858,assembly=b37>
    ##contig=<ID=GL000222.1,length=186861,assembly=b37>
    ##contig=<ID=GL000200.1,length=187035,assembly=b37>
    ##contig=<ID=GL000193.1,length=189789,assembly=b37>
    ##contig=<ID=GL000194.1,length=191469,assembly=b37>
    ##contig=<ID=GL000225.1,length=211173,assembly=b37>
    ##contig=<ID=GL000192.1,length=547496,assembly=b37>
    ##contig=<ID=NC_007605,length=171823,assembly=b37>
    ##contig=<ID=hs37d5,length=35477943,assembly=b37>
    ##reference=file:///scratch/gatk_bundle/2.8/b37/human_g1k_v37_decoy.fasta
    #CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  SAMPLE-1    SAMPLE-2    SAMPLE-3
    1   1017273 .   C   A   23.89   LowQual AC=2;AF=1.00;AN=2;DP=2;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=11.95;SOR=0.693 GT:AD:DP:GQ:PL  1/1:0,2:2:6:49,6,0  ./.:0,0:0   ./.:0,0:0
    1   1017341 .   G   T   142.60  .   AC=3;AF=0.750;AN=4;BaseQRankSum=0.736;ClippingRankSum=-7.360e-01;DP=7;ExcessHet=3.0103;FS=0.000;MLEAC=3;MLEAF=0.750;MQ=55.55;MQRankSum=-7.360e-01;QD=23.77;ReadPosRankSum=-7.360e-01;SOR=2.303  GT:AD:DP:GQ:PL  ./.:1,0:1   1/1:0,3:3:9:110,9,0 0/1:1,2:3:24:61,0,24
    1   1017587 .   C   T   6109.90 .   AC=3;AF=0.500;AN=6;BaseQRankSum=0.629;ClippingRankSum=0.585;DP=332;ExcessHet=1.5490;FS=3.980;MLEAC=3;MLEAF=0.500;MQ=60.00;MQRankSum=0.053;QD=20.71;ReadPosRankSum=-5.900e-01;SOR=0.970  GT:AD:DP:GQ:PL  0/1:69,48:117:99:1013,0,1482    1/1:0,178:178:99:5133,527,0 0/0:36,0:36:99:0,102,1018
    1   1018144 .   T   C   2759.16 .   AC=2;AF=0.333;AN=6;BaseQRankSum=-3.470e-01;ClippingRankSum=0.371;DP=301;ExcessHet=3.9794;FS=0.000;MLEAC=2;MLEAF=0.333;MQ=60.00;MQRankSum=-2.930e-01;QD=10.99;ReadPosRankSum=1.15;SOR=0.706  GT:AD:DP:GQ:PL  0/1:45,45:90:99:960,0,996   0/0:45,0:45:99:0,102,1530   0/1:83,78:161:99:1831,0,1860
    1   1018562 .   C   T   414.16  .   AC=2;AF=0.333;AN=6;BaseQRankSum=0.163;ClippingRankSum=1.29;DP=70;ExcessHet=3.9794;FS=0.000;MLEAC=2;MLEAF=0.333;MQ=60.00;MQRankSum=0.922;QD=8.81;ReadPosRankSum=1.96;SOR=0.489   GT:AD:DP:GQ:PL  0/1:6,10:16:99:181,0,114    0/0:23,0:23:57:0,57,855 0/1:15,16:31:99:265,0,368
    1   1018704 .   A   G   20.92   LowQual AC=2;AF=0.500;AN=4;DP=4;ExcessHet=0.7918;FS=0.000;MLEAC=3;MLEAF=0.750;MQ=48.99;QD=10.46;SOR=0.693   GT:AD:DP:GQ:PL  1/1:0,2:2:6:49,6,0  0/0:1,0:1:3:0,3,30  ./.:1,0:1
    1   1019175 .   C   G   8666.13 .   AC=5;AF=0.833;AN=6;BaseQRankSum=1.31;ClippingRankSum=-1.855e+00;DP=319;ExcessHet=3.0103;FS=5.413;MLEAC=5;MLEAF=0.833;MQ=60.00;MQRankSum=-1.447e+00;QD=27.42;ReadPosRankSum=-1.014e+00;SOR=1.108 GT:AD:DP:GQ:PGT:PID:PL  1/1:0,96:96:99:1|1:1019175_C_G:3803,287,0   1/1:0,111:111:99:.:.:3099,326,0 0/1:60,49:109:99:0|1:1019175_C_G:1795,0,2287
    1   1019180 .   T   C   3431.16 .   AC=2;AF=0.333;AN=6;BaseQRankSum=1.13;ClippingRankSum=1.21;DP=348;ExcessHet=3.9794;FS=4.997;MLEAC=2;MLEAF=0.333;MQ=60.00;MQRankSum=0.326;QD=15.25;ReadPosRankSum=0.399;SOR=1.036 GT:AD:DP:GQ:PGT:PID:PL  0/1:38,68:106:99:0|1:1019175_C_G:1540,0,776 0/0:121,0:121:99:.:.:0,120,1800 0/1:63,56:119:99:0|1:1019175_C_G:1923,0,2304
    1   1020406 .   T   C   53.42   .   AC=2;AF=1.00;AN=2;DP=4;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=51.96;QD=17.81;SOR=1.179 GT:AD:DP:GQ:PL  1/1:0,3:3:9:79,9,0  ./.:1,0:1   ./.:0,0:0
    1   1021346 .   A   G   6060.90 .   AC=3;AF=0.500;AN=6;BaseQRankSum=0.638;ClippingRankSum=-4.400e-01;DP=380;ExcessHet=1.5490;FS=6.494;MLEAC=3;MLEAF=0.500;MQ=60.00;MQRankSum=1.33;QD=18.48;ReadPosRankSum=3.00;SOR=0.402    GT:AD:DP:GQ:PL  0/1:79,81:160:99:1764,0,1624    1/1:0,168:168:99:4333,494,0 0/0:50,0:50:99:0,120,1800
    1   1021415 .   A   G   5465.13 .   AC=5;AF=0.833;AN=6;BaseQRankSum=-2.020e-01;ClippingRankSum=1.62;DP=259;ExcessHet=3.0103;FS=11.453;MLEAC=5;MLEAF=0.833;MQ=60.00;MQRankSum=-1.665e+00;QD=21.43;ReadPosRankSum=0.499;SOR=1.451 GT:AD:DP:GQ:PL  1/1:0,79:79:99:2236,234,0   1/1:0,74:74:99:2227,219,0   0/1:57,45:102:99:1033,0,1340
    1   1021583 .   A   C   356.25  .   AC=3;AF=0.500;AN=6;BaseQRankSum=2.36;ClippingRankSum=0.387;DP=33;ExcessHet=1.5490;FS=0.000;MLEAC=3;MLEAF=0.500;MQ=60.00;MQRankSum=0.466;QD=22.27;ReadPosRankSum=0.717;SOR=1.609 GT:AD:DP:GQ:PL  0/1:5,6:11:99:201,0,112 1/1:0,5:5:15:191,15,0   0/0:16,0:16:42:0,42,630
    1   1023145 .   G   A   130.19  .   AC=3;AF=0.500;AN=6;BaseQRankSum=1.23;ClippingRankSum=-1.231e+00;DP=10;ExcessHet=1.5490;FS=0.000;MLEAC=3;MLEAF=0.500;MQ=60.00;MQRankSum=0.358;QD=16.27;ReadPosRankSum=0.358;SOR=0.269    GT:AD:DP:GQ:PL  0/1:2,3:5:41:82,0,41    1/1:0,3:3:9:81,9,0  0/0:2,0:2:6:0,6,67
    1   1023444 .   C   G   51.42   .   AC=2;AF=1.00;AN=2;DP=3;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=17.14;SOR=1.179 GT:AD:DP:GQ:PL  ./.:0,0:0   ./.:0,0:0   1/1:0,3:3:9:77,9,0
    1   1025301 .   T   C   23.89   LowQual AC=2;AF=1.00;AN=2;DP=3;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=48.99;QD=11.95;SOR=0.693 GT:AD:DP:GQ:PL  ./.:0,0:0   ./.:1,0:1   1/1:0,2:2:6:49,6,0
    1   1026707 .   C   A   1965.92 .   AC=3;AF=0.500;AN=6;BaseQRankSum=-3.083e+00;ClippingRankSum=0.990;DP=241;ExcessHet=6.9897;FS=2.898;MLEAC=3;MLEAF=0.500;MQ=60.00;MQRankSum=-8.640e-01;QD=8.16;ReadPosRankSum=0.325;SOR=0.485  GT:AD:DP:GQ:PL  0/1:37,37:74:99:669,0,854   0/1:41,27:68:99:394,0,1021  0/1:49,50:99:99:933,0,1156
    1   1026801 .   T   A   8572.16 .   AC=4;AF=0.667;AN=6;BaseQRankSum=-3.266e+00;ClippingRankSum=1.40;DP=694;ExcessHet=3.9794;FS=4.378;MLEAC=4;MLEAF=0.667;MQ=60.00;MQRankSum=0.665;QD=12.57;ReadPosRankSum=1.92;SOR=0.447    GT:AD:DP:GQ:PL  1/1:1,210:211:99:4675,594,0 0/1:111,91:202:99:1478,0,2359   0/1:118,151:269:99:2451,0,2446
    1   1027008 .   G   A   916.14  .   AC=2;AF=0.500;AN=4;BaseQRankSum=1.70;ClippingRankSum=-2.910e-01;DP=97;ExcessHet=1.5490;FS=5.971;MLEAC=3;MLEAF=0.750;MQ=41.08;MQRankSum=0.873;QD=30.54;ReadPosRankSum=1.04;SOR=3.767 GT:AD:DP:GQ:PL  ./.:34,0:34 0/0:33,0:33:0:0,0,581   1/1:2,28:30:35:944,35,0
    
  • Geraldine_VdAuwera Cambridge, MA Member, Administrator, Broadie admin

    Sorry for the late response. These records look ok; the second sample does not have phasing information because it is hom-ref at the second site. We don't output phasing information at hom-ref sites. To have phasing info for the second sample you would have to have another site where it has a variant allele.
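
Since PGT is only emitted for phased variant genotypes, one way to interpret a . in PGT is to read it together with GT from the same sample column. The following is a minimal Python sketch of that idea, not part of GATK; the function name classify_pgt and the four labels it returns are hypothetical, chosen only for illustration.

```python
def classify_pgt(format_keys: str, sample_field: str) -> str:
    """Classify one sample column of a VCF record using GT and PGT together."""
    keys = format_keys.split(":")
    values = sample_field.split(":")
    # VCF allows trailing fields to be dropped (e.g. "./.:0,0:0"), so pad with "."
    values += ["."] * (len(keys) - len(values))
    fields = dict(zip(keys, values))

    gt = fields.get("GT", ".")
    pgt = fields.get("PGT", ".")
    alleles = gt.replace("|", "/").split("/")

    if pgt != ".":
        return "phased"            # PGT carries physical phasing information
    if "." in alleles:
        return "no-call"           # genuinely missing genotype
    if all(a == "0" for a in alleles):
        return "hom-ref"           # called 0/0, so no phasing is reported
    return "unphased-variant"      # variant call without phasing information


if __name__ == "__main__":
    fmt = "GT:AD:DP:GQ:PGT:PID:PL"
    # SAMPLE-2 at 1:1019180 in the records posted above
    print(classify_pgt(fmt, "0/0:121,0:121:99:.:.:0,120,1800"))
    # SAMPLE-2 at 1:1019175
    print(classify_pgt(fmt, "1/1:0,111:111:99:.:.:3099,326,0"))
```

With this reading, a . in PGT next to GT 0/0 is a confident hom-ref call rather than missing data, while a . next to GT ./. is a true no-call.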

  • frankfeng Member
    edited February 2016

    @Geraldine_VdAuwera said:
    Sorry for the late response. These records look ok; the second sample does not have phasing information because it is hom-ref at the second site. We don't output phasing information at hom-ref sites. To have phasing info for the second sample you would have to have another site where it has a variant allele.

    Hi @Geraldine_VdAuwera

    Thanks for your response. This is just a feature request that I think would improve the output. In the output of GenotypeGVCFs, PGT can be . but never 0|0, and . can also stand for missing data (e.g. due to zero useful coverage at the position).

    For example, after I used GATK's VariantsToTable to extract PGT and save it in TSV format (see below), I saw many NA values for PGT, and I cannot tell whether they are 0|0 or missing data. So it would be better if 0|0 could be output for PGT too. Thanks!

    CHROM   POS     TYPE    SAMPLE-1.PID    SAMPLE-1.PGT    SAMPLE-2.PID    SAMPLE-2.PGT    SAMPLE-3.PID    SAMPLE-3.PGT
    1       1017273 SNP     NA      NA      NA      NA      NA      NA
    1       1017341 SNP     NA      NA      NA      NA      NA      NA
    1       1017587 SNP     NA      NA      NA      NA      NA      NA
    1       1018144 SNP     NA      NA      NA      NA      NA      NA
    1       1018562 SNP     NA      NA      NA      NA      NA      NA
    1       1018704 SNP     NA      NA      NA      NA      NA      NA
    1       1019175 SNP     1019175_C_G     1|1     NA      NA      1019175_C_G     0|1
    1       1019180 SNP     1019175_C_G     0|1     NA      NA      1019175_C_G     0|1
    1       1020406 SNP     NA      NA      NA      NA      NA      NA
    1       1021346 SNP     NA      NA      NA      NA      NA      NA
    1       1021415 SNP     NA      NA      NA      NA      NA      NA
    1       1021583 SNP     NA      NA      NA      NA      NA      NA
    1       1023145 SNP     NA      NA      NA      NA      NA      NA
    1       1023444 SNP     NA      NA      NA      NA      NA      NA
    1       1025301 SNP     NA      NA      NA      NA      NA      NA
    1       1026707 SNP     NA      NA      NA      NA      NA      NA
    1       1026801 SNP     NA      NA      NA      NA      NA      NA
    1       1027008 SNP     NA      NA      NA      NA      NA      NA
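
One possible workaround for the ambiguous NA values is to export REF and each sample's GT next to PGT and resolve the NA values afterwards. The Python sketch below assumes a table that also has REF and SAMPLE-2.GT columns (for example, requested from VariantsToTable with -F REF and -GF GT), and it assumes GT may be rendered either as allele bases (e.g. T/T) or as indices (e.g. 0/0), so it accepts both. The example rows are derived from the SAMPLE-2 records posted earlier in the thread; the function names and labels are hypothetical.

```python
import csv
import io


def resolve_pgt(ref: str, gt: str, pgt: str) -> str:
    """Return PGT if present; otherwise explain the NA using GT and REF."""
    if pgt != "NA":
        return pgt
    alleles = gt.replace("|", "/").split("/")
    if gt == "NA" or "." in alleles:
        return "missing"       # no genotype call at all
    if all(a in ("0", ref) for a in alleles):
        return "hom-ref"       # called reference, phasing not reported
    return "unphased"          # variant call without phasing information


def annotate_table(tsv_text: str, sample: str):
    """Yield (POS, resolved PGT) pairs for one sample of a tab-separated table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        (row["POS"], resolve_pgt(row["REF"], row[f"{sample}.GT"], row[f"{sample}.PGT"]))
        for row in reader
    ]


# Hypothetical table layout; the GT rendering as bases is an assumption.
EXAMPLE = "\n".join(
    "\t".join(cols)
    for cols in [
        ["CHROM", "POS", "REF", "SAMPLE-2.GT", "SAMPLE-2.PGT"],
        ["1", "1019175", "C", "G/G", "NA"],
        ["1", "1019180", "T", "T/T", "NA"],
        ["1", "1020406", "T", "./.", "NA"],
    ]
)

if __name__ == "__main__":
    for pos, status in annotate_table(EXAMPLE, "SAMPLE-2"):
        print(pos, status)
```

This keeps the original table untouched and only adds an interpretation, so the distinction between a hom-ref call and missing data comes from GT rather than from PGT itself.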
    