
Fastest way of getting the total number of variants in a VCF via Picard?

biogreen (London, Member)

What is the fastest way of getting the total number of variants in a VCF file? (using picard-tools-1.119, via SplitVcfs.jar.)

So far, the fastest way I have found is this:

private static int getNumVariants(VCFFileReader reader) {
    int totalVariants = 0;
    // Walk every record once; close the iterator even if reading fails.
    final CloseableIterator<VariantContext> iterator = reader.iterator();
    try {
        while (iterator.hasNext()) {
            iterator.next();
            totalVariants++;
        }
    } finally {
        iterator.close();
    }
    return totalVariants;
}

However, this iterates through every record in the VCF file, which seems very inefficient for large files.

I am thinking that there must be a faster way. After all, the number of variants is simply:
total number of lines in file - number of lines in header?
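
To make that idea concrete, here is a rough sketch of the line-counting approach in plain Java, without Picard. The class name VcfLineCounter and the method countRecords are made up for illustration and are not part of Picard or htsjdk. It assumes an uncompressed, plain-text VCF in which every non-header line is exactly one record; a compressed file would have to be decompressed first. Note that it still reads the whole file; it only avoids parsing each line into a variant object.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Picard/htsjdk.
final class VcfLineCounter {

    /** Counts the lines of a plain-text VCF that are not header lines. */
    static long countRecords(Path vcf) throws IOException {
        long records = 0;
        // ISO-8859-1 maps every byte to a character, so decoding cannot fail;
        // only the first character of each line is inspected anyway.
        try (BufferedReader in = Files.newBufferedReader(vcf, StandardCharsets.ISO_8859_1)) {
            String line;
            while ((line = in.readLine()) != null) {
                // Header lines start with '#'; blank lines are not records.
                if (!line.isEmpty() && line.charAt(0) != '#') {
                    records++;
                }
            }
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java VcfLineCounter <file.vcf>");
            return;
        }
        System.out.println(countRecords(Paths.get(args[0])));
    }
}
```

Checking only the first character keeps the per-line work small, but the cost is still proportional to the file size, so this is a cheaper full scan rather than a constant-time lookup.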

Is there any way to get this count directly through Picard, without iterating over every record?

Thanks
Martin
