# Wrong number of fields in PED files in PhaseByTransmission

Member Posts: 5
edited January 2013

Hello all,

While using the PhaseByTransmission walker, I always get this error:

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR A USER ERROR has occurred (version 2.1-12-ga99c19d):
##### ERROR The invalid arguments or inputs must be corrected before the GATK can proceed
##### ERROR Please do not post this error to the GATK forum
##### ERROR
##### ERROR See the documentation (rerun with -h) for this tool to view allowable command-line arguments.
##### ERROR
##### ERROR MESSAGE: File associated with name java.io.FileReader@5cf7c5b5 is malformed: Bad PED line 1: wrong number of fields
##### ERROR ------------------------------------------------------------------------------------------


My command is:

java -jar GenomeAnalysisTK-2.1-12-ga99c19d/GenomeAnalysisTK.jar -T  PhaseByTransmission -R GRCh37.fasta -V trios_457.chr22.vcf -ped trios_457.chr22.ped -pedValidationType SILENT -o o1.vcf


And my PED file looks like this:

fam1    s_4     0       0       1       1       C       C       C       C       G       G
fam1    s_5     0       0       2       2       T       T       T       T       G       G
fam1    s_7     s_4     s_5     2       2       C       T       C       T       G       G


I counted the fields and lines in my VCF, PED and MAP files, and the results are:

-bash-4.1$ head -1 trios_457.chr22.ped | wc -w
1892    #( 6 columns for info + 943*2 columns for alleles )
-bash-4.1$ wc -l trios_457.chr22.map
943
-bash-4.1$ grep -v "#" trios_457.chr22.vcf | wc -l
943

My question is: what's wrong with my PED line?

## Answers

• Member Posts: 5
edited January 2013

Besides, I created my PED file using vcftools:

vcftools --vcf trios_457.chr22.vcf --plink --out trios_457.chr22.ped

I changed some columns so that the PED file looks like this:

fam1 s_4 0 0 1 1 C C C C G G ……
fam1 s_5 0 0 2 2 T T T T G G .......
fam1 s_7 s_4 s_5 2 2 C T C T G G ......

• Administrator, Dev Posts: 11,029 admin

Hi there, could you please try again with the latest version of GATK? This may be a bug that was fixed since 2.1.

Geraldine Van der Auwera, PhD

• Member Posts: 5

Oh, I tried again with 2.3-5, and the error message is the same. Is there anything wrong with my PED file, and can I generate it using vcftools?

• Member Posts: 5

In addition, Geraldine, I ran into another problem using ReadBackedPhasing. Does it mean that there are problems in my VCF file?

java -jar GenomeAnalysisTK-2.3-5-g49ed93c/GenomeAnalysisTK.jar -T ReadBackedPhasing -R GRCh37.fasta -I 457.sort.bam --variant trios_457.all.vcf -L trios_457.all.vcf -o RBPphased_all.vcf --phaseQualityThresh 20.0

##### ERROR ------------------------------------------------------------------------------------------
##### ERROR stack trace
java.lang.NoClassDefFoundError: com/sun/javadoc/ProgramElementDoc
at org.broadinstitute.sting.utils.exceptions.UserException$ReadMissingReadGroup.(UserException.java:281)
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
... 16 more

##### ERROR ------------------------------------------------------------------------------------------

Thanks!

• Member Posts: 5

Thank you, Geraldine! I've solved the problem.
For the first question, I checked the source code on GitHub and found the cause: the PED file that GATK expects is not the same as a PLINK PED file. It contains only the first 6 columns of a PLINK-format PED file and no alleles, like a PLINK FAM file. So I suggest that the team make this clear in the workflows.
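
A minimal sketch of that fix, not an official GATK utility: it keeps only the first six whitespace-separated fields (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) of each PLINK PED line and drops the allele columns. The function names, the tab delimiter in the output, and the output file name in the usage note are assumptions made for illustration.

```python
def pedigree_fields(plink_ped_line: str) -> str:
    """Return the first six whitespace-separated fields of a PED line, tab-joined."""
    fields = plink_ped_line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 fields, found {len(fields)}")
    # Everything after the sixth field is genotype data in a PLINK PED file.
    return "\t".join(fields[:6])


def strip_genotypes(src_path: str, dst_path: str) -> int:
    """Write a six-column copy of a PLINK PED file; return the number of lines written."""
    written = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            dst.write(pedigree_fields(line) + "\n")
            written += 1
    return written
```

For the file in this thread, a call might look like `strip_genotypes("trios_457.chr22.ped", "trios_457.chr22.6col.ped")`, where the second file name is hypothetical; the six-column file would then be the one passed to `-ped`.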

For my second question, the cause was a mismatch between the @RG lines in my BAM header and the RG tags in the records. Thanks for your help!
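
One way to spot this kind of mismatch, sketched under assumptions: the function below scans SAM-format text (for example, the output of `samtools view -h` on the BAM) and compares the `ID` values of the `@RG` header lines with the `RG:Z:` tags on the records. The function name and return shape are illustrative and not part of GATK or samtools.

```python
def read_group_mismatches(sam_lines):
    """Compare @RG header IDs with RG:Z: tags on alignment records.

    Returns (header_ids, unknown_ids, untagged_count):
      header_ids     - read group IDs declared in @RG header lines
      unknown_ids    - RG tag values on records that no @RG line declares
      untagged_count - number of records carrying no RG tag at all
    """
    header_ids = set()
    unknown_ids = set()
    untagged_count = 0
    for raw in sam_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):
            # Header section: collect ID fields from @RG lines only.
            if line.startswith("@RG\t"):
                for field in line.split("\t")[1:]:
                    if field.startswith("ID:"):
                        header_ids.add(field[3:])
            continue
        # Alignment record: optional tags start after the 11 mandatory columns.
        tags = line.split("\t")[11:]
        rg = next((t[5:] for t in tags if t.startswith("RG:Z:")), None)
        if rg is None:
            untagged_count += 1
        elif rg not in header_ids:
            unknown_ids.add(rg)
    return header_ids, unknown_ids, untagged_count
```

A non-empty `unknown_ids` or a non-zero `untagged_count` would point to the kind of header/record disagreement described above. The sketch relies on the header lines coming before the records, as they do in SAM output.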

Thanks for reporting your solution; I'll update the documentation accordingly.

Geraldine Van der Auwera, PhD

• South Korea Member Posts: 2

Hi all,

I got the same error message. I tried the solutions above, but I am still getting the error.

As in the post above, I created the PED file with VCFtools and used that file.

The error is still the same.

Are there any alternative solutions?

Hi Murthi,

Geraldine Van der Auwera, PhD

• South Korea Member Posts: 2
edited September 2013

M41 G6 0 0 1 1 Korean
M41 G5 0 0 2 1 Korean
M41 G4 G6 G5 1 2 Korean

The file created by VCFtools:

G4.variant G4.variant 0 0 0 0 0 0 G A C T A G A G 0 0......
G5.variant2 G5.variant2 0 0 0 0 0 0 0 0 C T 0 0 0 0 AC A ......
G6.variant3 G6.variant3 0 0 0 0 C C 0 0 C T 0 0 A G 0 0........
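
Assuming the six-field expectation reported earlier in this thread, a quick check is to list every line of a pedigree file whose field count is not six. This is only a sketch: the function name and the default of `expected=6` are illustrative, and the thread does not establish whether an extra column is the cause of the error in this case.

```python
def lines_with_wrong_field_count(lines, expected=6):
    """Return (line_number, field_count) for each non-blank line whose
    whitespace-separated field count differs from `expected`."""
    bad = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if fields and len(fields) != expected:
            bad.append((number, len(fields)))
    return bad


if __name__ == "__main__":
    pedigree = [
        "M41 G6 0 0 1 1 Korean",
        "M41 G5 0 0 2 1 Korean",
        "M41 G4 G6 G5 1 2 Korean",
    ]
    for number, count in lines_with_wrong_field_count(pedigree):
        print(f"line {number}: {count} fields")
```

Run on the three hand-written lines above, this reports seven fields on each line, because of the trailing `Korean` column.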
