What input files does the GATK accept or require?
Analyses done with the GATK typically involve several (though not necessarily all) of the following inputs:
- Reference genome sequence in FASTA format
- Unmapped sequencing data in uBAM format (alternative to FASTQ)
- Mapped sequencing data in SAM, BAM or CRAM format
- List of genomic intervals to process (e.g. a Picard-style .interval_list or a BED file)
- Variant calls in VCF format or GVCF format (can be gzipped)
- Supplementary resources (e.g. known variants) as documented by the relevant tools
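
As a rough illustration of how the categories above map onto files on disk, the following sketch sorts file names into those categories by extension. It is not part of the GATK: the function name `classify_input` and the suffix table are hypothetical, and the suffixes are common conventions (including `.g.vcf` for GVCFs), not an exhaustive or official list. Note that an unmapped uBAM and a mapped BAM share the `.bam` extension, so they cannot be told apart this way.

```python
from pathlib import Path

# Hypothetical suffix table for illustration only; not an official GATK list.
INPUT_KINDS = {
    ".fasta": "reference",
    ".fa": "reference",
    ".sam": "reads",
    ".bam": "reads",   # may be mapped (BAM) or unmapped (uBAM)
    ".cram": "reads",
    ".vcf": "variants",
    ".interval_list": "intervals",
    ".bed": "intervals",
}


def classify_input(path):
    """Return the input category suggested by a file name's extension."""
    name = Path(path).name.lower()
    # Strip a trailing .gz so compressed VCFs are classified like plain ones.
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    # By convention, GVCF files often carry a .g.vcf double extension.
    if name.endswith(".g.vcf"):
        return "variants (GVCF)"
    return INPUT_KINDS.get(Path(name).suffix, "unknown")


if __name__ == "__main__":
    for example in ["ref.fasta", "sample.bam", "calls.vcf.gz", "sample.g.vcf.gz"]:
        print(example, "->", classify_input(example))
```

A check like this only inspects names. It does not validate file contents or the companion index and dictionary files that individual tools may expect, so the documentation of the relevant tool remains the authority.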