
# Basic question about CombineVariants and the ESP 6500 exomes

Member

Hi!
Sorry for my ignorance, but the ESP download contains 24 VCF files in total. I tried to pass them all to CombineVariants, but I get an error.
I have also tried to merge the 24 files into one, with both vcftools and Picard, and both failed.

Does anybody know how to do this?

Thank you!

• Member

ERROR MESSAGE: Input files ESP6500SI-V2-SSA137.dbSNP138-rsIDs.snps_indels.vcf/ESP6500SI-V2-SSA137.updatedRsIds.chr1.snps_indels.vcf and reference have incompatible contigs: No overlapping contigs found.

• Member

I figured it out!

Use this:

perl -pe 's/^([^#])/chr\1/' file.vcf > out.vcf

This adds the "chr" prefix to every data line. My reference uses chr1 etc., while the ESP files use plain 1 etc.
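For anyone who would rather not use Perl, here is a minimal Python sketch of the same renaming. It is an illustration, not a tested replacement for the one-liner above: the function name `add_chr_prefix` is made up for this example, and it additionally assumes that any `##contig=<ID=...>` header lines should be renamed as well, which the one-liner leaves untouched. It does not handle special cases such as a mitochondrial contig named MT in one file and chrM in the other.

```python
import re

# Matches the ID value inside a ##contig header line, e.g. ##contig=<ID=1,length=...>
CONTIG_ID = re.compile(r"^(##contig=<.*?\bID=)([^,>]+)")


def add_chr_prefix(lines, prefix="chr"):
    """Yield VCF lines with `prefix` added to contig names that lack it."""
    for line in lines:
        if line.startswith("##contig="):
            # Header declaration: rename the ID only if it is not prefixed yet.
            line = CONTIG_ID.sub(
                lambda m: m.group(1)
                + (m.group(2) if m.group(2).startswith(prefix) else prefix + m.group(2)),
                line,
            )
        elif line.strip() and not line.startswith("#") and not line.startswith(prefix):
            # Data line: CHROM is the first column, so prefix the whole line.
            line = prefix + line
        yield line


def convert(src_path, dst_path, prefix="chr"):
    """Read src_path, write the renamed VCF to dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(add_chr_prefix(src, prefix))
```

Calling `convert("file.vcf", "out.vcf")` would mirror the Perl command. As the reply further down points out, renaming is only safe if both files really come from the same reference build.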

• Member

Sorry for all the comments, but I get weird errors on some of the files:

##### ERROR MESSAGE: Sequence name contains invalid character: chr13

The same happens for chr14, chr16 and a lot more.

And this one:

##### ERROR reference contigs = [chrM, chr1, chr2, chr3, chr4, chr5, chr6, chr7, chr8, chr9, chr10,

And this one:

ERROR MESSAGE: Line 32: there aren't enough columns for line chr15 (we expected 9 tokens, and saw 1 )

The same happens for chr10.
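A "not enough columns" error with only 1 token seen usually means the line is not tab-separated where the parser expects tabs. The sketch below is one way to locate such lines before running GATK. It assumes a plain-text VCF, and the function name `find_short_lines` is made up for this example. It compares each data line against the number of columns declared on the `#CHROM` header line.

```python
def find_short_lines(lines):
    """Return (line_number, field_count) for data lines that have fewer
    tab-separated fields than the #CHROM header line declares."""
    expected = None
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        if not text:
            continue  # ignore blank lines
        if text.startswith("#CHROM"):
            expected = len(text.split("\t"))
            continue
        if text.startswith("#"):
            continue  # other header lines
        fields = len(text.split("\t"))
        # Fall back to the 8 fixed VCF columns if no #CHROM line was seen.
        if fields < (expected or 8):
            problems.append((number, fields))
    return problems
```

Running it over `open("file.vcf")` gives a list of line numbers to inspect by hand. A field count of 1 suggests the tabs on that line were replaced by spaces or the line was broken up.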

• Member

@Geraldine_VdAuwera said:
myoglu, when the contig names are different it's a strong indication that the variants and alignments were generated using different reference builds. There can be important differences between reference builds that will affect your results. You can only use the solution you posted if you are absolutely sure that the references are equivalent.

I have googled for hours now, and it seems many people are struggling with the same thing.

The reference builds are the same, so it should be OK.

I had to manually recheck all chromosome names, replacing e.g. "chr11" with the EXACT SAME THING (chr11).
Sometimes I also had to manually fix weird spacing errors. I also had to replace every "A" with "A" in the GATK output file (from VariantAnnotator). Very weird: they look the same, but the "find" function in TextEdit still does not find them.
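Text that looks identical but does not match in a search often contains characters that merely look like plain ASCII, for example a no-break space, a carriage return, or a look-alike letter from another alphabet. Whether that was the cause here is an assumption, since the thread does not say what the characters actually were. The sketch below lists every character that is neither a tab nor printable ASCII. The function name `find_suspect_characters` is made up for this example.

```python
import unicodedata


def find_suspect_characters(text):
    """Return (line_number, column, description) for every character that is
    neither a tab nor printable ASCII."""
    hits = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        for column, char in enumerate(line, start=1):
            if char == "\t" or " " <= char <= "~":
                continue  # tab or printable ASCII: expected in a VCF
            # Control characters have no Unicode name, so fall back to the code point.
            name = unicodedata.name(char, "U+%04X" % ord(char))
            hits.append((line_number, column, name))
    return hits
```

To keep carriage returns visible, read the file with `open(path, encoding="utf-8", newline="").read()` before passing the text in. The UTF-8 encoding is also an assumption about the file.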

Finally I ran the VCFtools validation and got these messages for each of the 24 ESP files:

e.g.

The header tag 'reference' not present. (Not required but highly recommended.)
The header tag 'contig' not present for CHROM=chrY. (Not required but highly recommended.)

So, it looks ok, but still I get the famous error when running CombineVariants:

##### ERROR MESSAGE: Invalid command line: No tribble type was provided on the command line and the type of the file could not be determined dynamically. Please add an explicit type tag :NAME listing the correct type from among the supported types:

So I checked the VCF version of the ESP files. They are VCF4.1, which should be fine. But I'm stuck.

What is wrong with these ESP files?

Sorry to bother the GATK team, but do any of you other users know the answer? I've seen more people struggling with this.

This sounds like a file encoding error -- are you using a Windows PC, or did you get the files from someone using Windows? We've seen encoding issues pop up when files are written or edited on a Windows machine.

• Member

No, only OS X Mountain Lion all the way.

I finally figured it out:

Something was off with the VariantAnnotator VCF from GATK. I re-ran it and used the new file, and I also deleted the old index file.
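GATK 3 keeps an index next to each VCF as `<name>.vcf.idx`. The thread does not say what was wrong with the old index, so the following is only a sketch under the assumption that an index older than its VCF is a candidate for deletion. The function name `stale_indexes` is made up for this example.

```python
from pathlib import Path


def stale_indexes(directory):
    """Return .vcf.idx files in `directory` that are older than their VCF."""
    stale = []
    for vcf in sorted(Path(directory).glob("*.vcf")):
        index = vcf.with_name(vcf.name + ".idx")
        # An index written before the VCF was last modified may no longer match it.
        if index.exists() and index.stat().st_mtime < vcf.stat().st_mtime:
            stale.append(index)
    return stale
```

The function only reports files. After checking the list, the files can be removed by hand or with `path.unlink()`.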

The manual correction of the ESP VCFs works! It's a bit boring, but it does the job.

Hope this helps anyone else getting this error!