Keep "species" info from BAM to VCF
I am using HaplotypeCaller (GATK v3.5) with an input BAM file which has a header line like this (just a fake example):
@SQ SN:chr1 LN:100000 SP:Arabis thal AS:2 M5:8668a646eada2f4 UR:file:refgenome_Atha_v2.fa
But the output VCF only has a subset of this information:
Is there a way to obtain something like this instead? (i.e. also indicate species, assembly and MD5 sum)
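To make the goal concrete, here is a minimal sketch in plain Python of the mapping I have in mind. It is not a GATK or Picard feature, and the function name `sq_to_contig` is made up for illustration. The keys `assembly`, `md5` and `species` are the ones the VCF specification lists for `##contig` lines, and the input is assumed to be tab-separated, as in a real SAM/BAM header.

```python
def sq_to_contig(sq_line):
    """Turn one SAM @SQ header line into a VCF ##contig header line.

    Hypothetical helper: assumes tab-separated TAG:VALUE fields.
    """
    fields = sq_line.rstrip("\n").split("\t")
    if fields[0] != "@SQ":
        raise ValueError("not an @SQ line: %r" % sq_line)

    # Split each field on the first colon only, so values such as
    # "file:refgenome_Atha_v2.fa" keep their own colons.
    tags = dict(field.split(":", 1) for field in fields[1:])

    # SN (name) and LN (length) are mandatory in an @SQ line.
    parts = ["ID=" + tags["SN"], "length=" + tags["LN"]]

    # Optional tags are copied only when present.
    if "AS" in tags:
        parts.append("assembly=" + tags["AS"])
    if "M5" in tags:
        parts.append("md5=" + tags["M5"])
    if "SP" in tags:
        # Quoted because species names usually contain a space.
        parts.append('species="' + tags["SP"] + '"')

    return "##contig=<" + ",".join(parts) + ">"


# The fake header line from above, written with explicit tabs.
example = ("@SQ\tSN:chr1\tLN:100000\tSP:Arabis thal\tAS:2"
           "\tM5:8668a646eada2f4\tUR:file:refgenome_Atha_v2.fa")
print(sq_to_contig(example))
```

Run on the fake header line, this builds a single `##contig` line carrying the ID, length, assembly, MD5 sum and species, which is the kind of line I would like to see in the VCF header.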
The information in the BAM file initially comes from a "dict" file generated by Picard CreateSequenceDictionary. So I tried feeding this "dict" file, together with the VCF file, to Picard UpdateVcfSequenceDictionary, but the result contained neither the species nor the MD5 sum:
Thank you in advance,