Fastest way of getting the total number of variants in a VCF via Picard?

biogreen, London, Member

What is the fastest way of getting the total number of variants in a VCF file? (using picard-tools-1.119, via SplitVcfs.jar.)

So far, the fastest way I have found is this:

private static int getNumVariants(VCFFileReader reader) {
    int totalVariants = 0;
    // Walk every record once; close the iterator even if iteration fails.
    final CloseableIterator<VariantContext> iterator = reader.iterator();
    try {
        while (iterator.hasNext()) {
            iterator.next();
            totalVariants++;
        }
    } finally {
        iterator.close();
    }
    return totalVariants;
}

However, this appears to iterate through the entire VCF file, which seems very inefficient for large files.

I think there must be a faster way. After all, the number of variants is simply:
total number of lines in the file - number of lines in the header?

Is there any way to get this?
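To make the "total lines minus header lines" idea concrete, here is a minimal sketch in plain Java that does not go through Picard at all. It rests on assumptions: the VCF is uncompressed text (not gzip/bgzip), every header line starts with '#', and every variant record sits on exactly one line. The class and method names (VcfLineCounter, countVariantLines) are made up for illustration. It still reads the whole file, but it skips decoding each record into a variant object, which is likely where much of the time goes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Picard or htsjdk.
final class VcfLineCounter {

    /**
     * Counts lines that are neither empty nor header lines (starting with '#').
     * Assumes an uncompressed, plain-text VCF with one record per line.
     */
    static long countVariantLines(Path vcf) throws IOException {
        long count = 0;
        // ISO-8859-1 maps every byte to a character, so decoding never fails;
        // only the first character of each line is inspected.
        try (BufferedReader reader =
                 Files.newBufferedReader(vcf, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty() && line.charAt(0) != '#') {
                    count++;
                }
            }
        }
        return count;
    }
}
```

It could be called as VcfLineCounter.countVariantLines(Paths.get("my.vcf")), where the file name is only a placeholder. Note that the result is a count of data lines, so it matches the iterator-based count only if the file contains no malformed or non-record lines.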

Thanks
Martin
