I have been given ~2000 gVCFs, one sample per file. They are in standard gVCF format, but they were generated by an Illumina pipeline (https://support.basespace.illumina.com/knowledgebase/articles/147078-gvcf-file if you're really curious) rather than by HaplotypeCaller. As a result (I think), GATK won't process them: I have tried both CombineGVCFs and GenotypeGVCFs to no avail. Is there a GATK walker or some other tool that will make my gVCFs GATK-friendly? I need to merge this data into a multi-sample callset to make it analyzable, because as single-sample gVCFs it's pretty useless at the moment.
My only other thought has been to expand all the reference blocks into one record per position and then merge everything together, but that seems likely to create a massive amount of data.
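To make the expansion idea concrete, here is a rough Python sketch of what I mean. It assumes each reference block is a normal VCF line whose INFO field carries an `END=` key marking the last position of the block; the function name and the example line are made up for illustration. Since the true reference base at each expanded position isn't known without the FASTA, the sketch writes `N` as a placeholder.

```python
import re

# Matches an END=<int> key anywhere in a semicolon-separated INFO field.
END_RE = re.compile(r"(?:^|;)END=(\d+)(?:;|$)")


def expand_ref_block(line):
    """Yield one tab-separated record per position covered by a gVCF line.

    Lines without END= in INFO (e.g. variant records) are yielded unchanged.
    """
    fields = line.rstrip("\n").split("\t")
    pos = int(fields[1])
    info = fields[7]

    match = END_RE.search(info)
    if match is None:
        yield "\t".join(fields)
        return

    end = int(match.group(1))
    # Drop END= from INFO, since each output record covers a single position.
    kept = [kv for kv in info.split(";") if not kv.startswith("END=")]
    new_info = ";".join(kept) or "."

    for p in range(pos, end + 1):
        out = list(fields)
        out[1] = str(p)
        # Placeholder: only the first base of the block is given in the gVCF.
        out[3] = fields[3] if p == pos else "N"
        out[7] = new_info
        yield "\t".join(out)


if __name__ == "__main__":
    # Hypothetical reference block spanning positions 100-104.
    block = "chr1\t100\t.\tA\t.\t.\tPASS\tEND=104;BLOCKAVG_min30p3a\tGT:GQX\t0/0:30"
    for record in expand_ref_block(block):
        print(record)
```

Every block of length n turns into n lines, so doing this genome-wide for every sample is exactly the blow-up I'm worried about.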
Any suggestions you may have are greatly appreciated!