# Probable serious bug in VariantsToBinaryPed causing incorrect mapping of sample to genotype

Member

VariantsToBinaryPed seems to expect the .fam file (the first six columns of a PED file) to list the samples in the same order as the input VCF file. If the orders differ, it appears to map sample IDs to the wrong genotypes in the output binary PED.

I found this issue because I converted trio VCF files to binary PED and then computed kinship coefficients from the binary PED file, which showed that the relationships were wrong. When I fixed the .fam file so that the sample IDs were in the same order as in the .vcf file and re-ran the conversion to binary PED, the kinship coefficients were as expected given the pedigree.
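For anyone who wants to check their own files before converting, here is a minimal sketch of an order check. It assumes the VCF sample names correspond to the individual ID in the second column of the .fam file; the function names and the toy trio are made up for illustration and are not part of GATK.

```python
def vcf_sample_ids(vcf_lines):
    """Return sample IDs from the #CHROM header line, in VCF column order."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            # Nine fixed columns (CHROM .. FORMAT) come first, then one column per sample.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def fam_sample_ids(fam_lines):
    """Return individual IDs (assumed to be column 2) from .fam lines, in file order."""
    return [line.split()[1] for line in fam_lines if line.strip()]


def same_order(vcf_lines, fam_lines):
    """True only if both files list exactly the same samples in the same order."""
    return vcf_sample_ids(vcf_lines) == fam_sample_ids(fam_lines)


# Toy input (hypothetical trio): the .fam file is valid but ordered differently.
vcf = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tchild\tfather\tmother\n",
]
fam = [
    "FAM1 father 0 0 1 1\n",
    "FAM1 mother 0 0 2 1\n",
    "FAM1 child father mother 1 2\n",
]
print(same_order(vcf, fam))
```

With real files, the same functions can be given open file handles (for example `open("trio.vcf")`), since they only iterate over lines.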

This also made me wonder whether PhaseByTransmission has the same problem, but initial tests suggest that PhaseByTransmission may correctly handle the case where the sample order differs between the .fam file and the .vcf file.

## Answers

• Cambridge, MA · Member, Administrator, Broadie

Hi Tim, can you confirm that you're using the latest version of GATK?

• Member

Hi Geraldine,

I am using version 2.5 (I haven't upgraded to 2.6 because I don't have Java 1.7).

I have worked around it by ensuring the .fam file follows the VCF sample order.
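A minimal sketch of that workaround, assuming the VCF sample names match the individual ID in column 2 of the .fam file. The function name `reorder_fam` and the sample IDs are hypothetical, not anything provided by GATK.

```python
def reorder_fam(fam_lines, vcf_samples):
    """Return the .fam lines rearranged to follow the VCF sample order."""
    by_id = {}
    for line in fam_lines:
        if line.strip():
            # Column 2 of a .fam line is the individual ID.
            by_id[line.split()[1]] = line.rstrip("\n")
    missing = [s for s in vcf_samples if s not in by_id]
    if missing:
        # Fail loudly rather than silently writing an incomplete .fam file.
        raise KeyError(f"samples missing from .fam: {missing}")
    return [by_id[s] for s in vcf_samples]


# Toy input (hypothetical trio); vcf_order is the sample order in the VCF header.
fam = [
    "FAM1 father 0 0 1 1\n",
    "FAM1 mother 0 0 2 1\n",
    "FAM1 child father mother 1 2\n",
]
vcf_order = ["child", "father", "mother"]
for line in reorder_fam(fam, vcf_order):
    print(line)
```

Samples that appear only in the .fam file are dropped by this sketch, so the output has exactly one line per VCF sample.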

It should be pretty quick for the developer of VariantsToBinaryPed to check whether the tool code has a routine for matching sample IDs between the .fam file and the VCF.

• Member

A quick test can easily be done by feeding a trio VCF file to VariantsToBinaryPed with a .fam file that is correct but describes the samples in a different order from the VCF file.

Then feed the produced .bed file to KING http://people.virginia.edu/~wc9c/KING/manual.html

`king -b myFile.bed --kinship`

The last column of the output file king.kin tells of any discrepancies between the declared and the empirical kinships.
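To scan that last column without reading the file by eye, something like the sketch below could be used. It assumes king.kin is whitespace-delimited with one header row and a numeric last column in which zero means no discrepancy; the column names and values in the toy input are made up for illustration and should be checked against your own output.

```python
def flagged_pairs(kin_lines):
    """Return rows (as dicts keyed by header name) whose last column is non-zero."""
    rows = [line.split() for line in kin_lines if line.strip()]
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    flagged = []
    for fields in data:
        # Assumption: the last column is numeric and 0 means "no discrepancy".
        if float(fields[-1]) != 0.0:
            flagged.append(dict(zip(header, fields)))
    return flagged


# Toy input with invented values; real files come from the king command above.
kin = [
    "FID ID1 ID2 N_SNP Z0 Phi HetHet IBS0 Kinship Error\n",
    "FAM1 child father 1000 0.0 0.25 0.15 0.000 0.2490 0\n",
    "FAM1 child mother 1000 0.0 0.25 0.08 0.050 0.0010 1\n",
]
for pair in flagged_pairs(kin):
    print(pair["ID1"], pair["ID2"], pair["Error"])
```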

• Cambridge, MA · Member, Administrator, Broadie

That's fine, just making sure you're on an at least somewhat recent version. Even short checks add up when there are a lot of them, so it's worth filtering by version.

Will check and let you know.

• Member

You are welcome! Pleased to be able to make my own tiny "contribution" to this great software that I use so much.
