Ambiguous error in GenomicsDBImport
In the HaplotypeCaller step I use 404 intervals with 100 bp padding.
Then, when I create the database with GenomicsDBImport using the same set of intervals, I get the following error for some intervals, but not for all of them.
Interval 001 without error:
14:03:59.867 INFO ProgressMeter - Traversal complete. Processed 6 total batches in 0.2 minutes.
14:03:59.867 INFO GenomicsDBImport - Import of all batches to GenomicsDB completed!
14:03:59.956 INFO GenomicsDBImport - Shutting down engine
[March 27, 2018 2:03:59 PM CEST] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 0.24 minutes.
Interval 007 with error:
terminate called after throwing an instance of 'VCF2TileDBException'
what(): VCF2TileDBException : Incorrect cell order found - cells must be in column major order. Previous cell: [ 1, 117999971 ] current cell: [ 1, 11799997
The most likely cause is unexpected data in the input file:
(a) A VCF file has two lines with the same genomic position
(b) An unsorted CSV file
(c) Malformed VCF file (or malformed index)
See point 2 at: https://github.com/Intel-HLS/GenomicsDB/wiki/Importing-VCF-data-into-GenomicsDB#organizing-your-data
I don't understand why some intervals are imported without errors while others fail during the same analysis, considering that I start from the same gVCF. Is it related to the padding regions used during HaplotypeCaller?
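For reference, causes (a) and (c) from the error message could be checked directly on the input gVCF. Below is a minimal sketch in Python, assuming a tab-separated gVCF that is either plain text or gzip/bgzip-compressed. The function names `find_order_problems` and `open_vcf` are hypothetical helpers written for this check, not part of GATK or GenomicsDB. It only reports records whose position repeats or goes backwards within a contig, or whose contig reappears after another one; it does not validate the index.

```python
import gzip
import sys


def find_order_problems(lines):
    """Return (kind, chrom, pos, previous_pos) tuples for suspicious records.

    kind is "duplicate" (same position as the previous record),
    "unsorted" (position smaller than the previous record), or
    "contig_reappears" (contig seen earlier, then interrupted by another).
    """
    problems = []
    prev_chrom, prev_pos = None, None
    seen = set()
    for line in lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        if chrom == prev_chrom:
            if pos == prev_pos:
                problems.append(("duplicate", chrom, pos, prev_pos))
            elif pos < prev_pos:
                problems.append(("unsorted", chrom, pos, prev_pos))
        elif chrom in seen:
            problems.append(("contig_reappears", chrom, pos, None))
        seen.add(chrom)
        prev_chrom, prev_pos = chrom, pos
    return problems


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "r")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python check_gvcf.py sample.g.vcf.gz
    with open_vcf(sys.argv[1]) as handle:
        for problem in find_order_problems(handle):
            print(*problem, sep="\t")
```

If this prints nothing for every gVCF that goes into a failing interval, the input records are at least in order, and the problem is more likely in the index or in how the intervals are defined.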