
GVCF generated from lane-wise bam and merged bam

fazulur, hyderabad, Member

Dear GATK Team,

I ran GATK4 variant calling, following the Best Practices, on one WGS sample sequenced across two lanes.

Steps followed to get the merged BAM: aligned the lane-wise FASTQ files separately, removed duplicates, merged the lane BAMs, and ran MarkDuplicates again on the merged BAM. Variant calling was then run on the merged BAM. I followed the reference below.


I also ran variant calling on each lane-wise BAM separately, in order to compare the two lane g.vcf files with the merged-BAM g.vcf.
When I compare the GVCFs generated from the individual lane BAMs with the one from the merged BAM, there is a huge difference in size.

Sample    # of lines    GVCF size (GB)
lane-1    658655987     7.6
lane-2    442845977     5.6
Merged    83563153      1.3

I see a much smaller difference when I convert the GVCFs to VCFs using gvcftools extract_variants. But at the g.vcf level, I am not sure why I am getting such a large difference in file sizes.
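
To see where the extra lines come from, one option is to count reference-block records separately from variant records in each g.vcf. The following is only a minimal sketch, not part of my original pipeline. It assumes GATK-style GVCFs, in which a reference block has `<NON_REF>` as its only ALT allele and carries an `END=` key in INFO. The function name `summarize_gvcf` and the file name in the usage note are hypothetical.

```python
import gzip
from collections import Counter


def open_text(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def summarize_gvcf(path):
    """Count header lines, reference-block records, and variant records.

    Assumption: a record whose ALT column is exactly <NON_REF> is a
    reference block, and its span ends at the END= value in INFO
    (or at POS if END is absent).
    """
    counts = Counter()
    with open_text(path) as handle:
        for line in handle:
            if line.startswith("#"):
                counts["header"] += 1
                continue
            fields = line.rstrip("\n").split("\t")
            pos, alt, info = int(fields[1]), fields[4], fields[7]
            if alt == "<NON_REF>":
                counts["ref_block_records"] += 1
                end = pos
                for item in info.split(";"):
                    if item.startswith("END="):
                        end = int(item[4:])
                # Number of reference bases summarized by this block
                counts["ref_block_bases"] += end - pos + 1
            else:
                counts["variant_records"] += 1
    return counts
```

Running `summarize_gvcf("lane-1.g.vcf.gz")` on each of the three files and comparing `ref_block_records` with `variant_records` should show whether the size difference sits mostly in the reference blocks, which would be consistent with the smaller difference I see after extracting variants only.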

Could you please help me?

Thanks in advance,
Fazulur Rehaman
