Clover coverage analysis with ant [RETIRED]

Mark_DePristo (Administrator, GATK Developer)
edited April 4 in Developer Zone

Introduction

This document describes the workflow we use within GSA to do coverage analysis of the GATK codebase. It is primarily meant as an internal reference for team members, but we are making it public to provide an example of how we work. There are a few mentions of internal server names etc.; please just disregard those as they will not be applicable to you.

Build the GATK, and run tests with clover

ant clean with.clover unittest

Note that you have to explicitly disable scala (due to a limitation in how it's currently integrated in build.xml). You can also restrict the run to particular tests with an argument such as -Dsingle="ReducerUnitTest".

Clover seems to require a lot of memory, so raise the memory limit ant passes to the JVM before running:

setenv ANT_OPTS "-Xmx8g"

There's plenty of memory on gsa4, so requiring this much is not a problem.
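The setenv line above is csh/tcsh syntax. If your shell is bash, zsh, or another sh-style shell, the equivalent would look roughly like the sketch below. The 8g value is simply the one used above, not a tuned recommendation; ANT_OPTS is the environment variable ant reads for JVM options.

```shell
# csh/tcsh form used in this document:
#   setenv ANT_OPTS "-Xmx8g"
# sh/bash/zsh equivalent: give the JVM that runs ant up to 8 GB of heap.
export ANT_OPTS="-Xmx8g"
```

The setting only lasts for the current shell session, so add it to your shell startup file if you run clover regularly.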

Getting more detailed reports

You can add the argument -Dclover.instrument.level=statement if you want line-level resolution on the report, but note this is astronomically expensive for the entire unit test suite. It's fine, though, if you only want to run specific tests.

Generate the report

> ant clover.report
Buildfile: /Users/depristo/Desktop/broadLocal/GATK/unstable/build.xml

clover.report:
[clover-html-report] Clover Version 3.1.8, built on November 13 2012 (build-876)
[clover-html-report] Loaded from: /Users/depristo/Desktop/broadLocal/GATK/unstable/private/resources/clover/lib/clover.jar
[clover-html-report] Clover: Community License registered to Broad Institute.
[clover-html-report] Loading coverage database from: '/Users/depristo/Desktop/broadLocal/GATK/unstable/.clover/clover3_1_8.db'
[clover-html-report] Writing HTML report to '/Users/depristo/Desktop/broadLocal/GATK/unstable/clover_html'
[clover-html-report] Done. Processed 132 packages in 20943ms (158ms per package).
    [mkdir] Created dir: /Users/depristo/private_html/report/clover
     [copy] Copying 4545 files to /Users/depristo/private_html/report/clover

BUILD SUCCESSFUL

The clover files are written to the clover_html subdirectory and are also copied to your private_html/report/clover directory. Note that generating the report can be very expensive given our large number of tests. For example, I've been waiting nearly an hour for the report to generate on gsa4.

Doing it all at once

ant clean with.clover unittest clover.report

will clean the source, rebuild with clover engaged, run the unit tests, and generate the clover report. Note that unit tests may currently fail due to ClassCastExceptions and other exceptions in the clover run. We're looking into it. You can still run clover.report after the failed run, as the db contains all of the run information even though the run failed (though failed methods won't be counted).
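Because a failing unittest target can stop the combined command before clover.report runs, one option is to split it into two ant invocations and always run the second. The sketch below is a hypothetical wrapper, not part of the GATK build: the function name run_clover and the ANT override variable are assumptions for illustration, while the ant targets are the ones used in this document.

```shell
# Hypothetical helper: run the instrumented unit tests, then build the
# clover report even if the tests failed.
# ANT may be set to another launcher; it defaults to plain "ant".
run_clover() {
    test_status=0
    # Extra arguments (e.g. -Dsingle=...) are passed through to the test run.
    ${ANT:-ant} clean with.clover unittest "$@" || test_status=$?
    # The coverage db survives a failed test run, so report regardless.
    ${ANT:-ant} clover.report || return $?
    # Propagate the test result so callers still see the failure.
    return "$test_status"
}

# Example (not executed here):
#   run_clover -Dsingle="GenomeLocUnitTest" -Dclover.instrument.level=statement
```

The function returns the exit status of the test run, so a script calling it can still tell that tests failed even though a report was produced.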

Here's a real-life example of assessing coverage in all BQSR utilities at once:

ant clean with.clover unittest -Dclover.instrument.level=statement -Dsingle="recalibration/*UnitTest" clover.report

Current annoyance

Clover can make the tests very slow. Currently we run in method-count-only mode, so we don't have line-number resolution (we're looking into fixing this). Also note that running with clover over the entire unittest set requires 32G of RAM (set automatically by ant).

This produces an HTML report that looks like the following screenshots:

image image

Using clover to make better unittests

This workflow is appropriate for developing unit tests for a single package or class. The turn-around time for clover on a single package is very fast, even with statement-level coverage. The overall workflow looks like:

  1. run unittests with clover enabled for your package or class.
  2. explore clover HTML report, noting places where test coverage is lacking
  3. expand unit tests
  4. repeat until satisfied
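Since step 1 is repeated many times with only the test name changing, it can help to generate the command from a small function. This is a minimal sketch: clover_cmd is a hypothetical name, and the function only prints the ant command line used in this document so you can review it before running it.

```shell
# Hypothetical helper: print the ant command for a statement-level clover
# run restricted to one test class or pattern.
clover_cmd() {
    pattern="$1"   # e.g. GenomeLocUnitTest or 'recalibration/*UnitTest'
    printf 'ant clean with.clover unittest -Dsingle="%s" -Dclover.instrument.level=statement clover.report\n' "$pattern"
}

# Print the command for the GenomeLoc tests.
clover_cmd GenomeLocUnitTest
```

To actually run the printed command from your GATK directory, pipe it to a shell, for example clover_cmd GenomeLocUnitTest | sh.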

Here's a concrete example. Right now I'm looking at the unit test coverage for GenomeLoc, one of the earliest and most important classes in the GATK. I really want good unit test coverage here. So I start by running GenomeLoc unit tests specifically:

ant clean with.clover unittest -Dsingle="GenomeLocUnitTest" -Dclover.instrument.level=statement clover.report

Next, I open up the clover coverage report in clover_html/index.html in my GATK directory, and land on the Dashboard. Everything looks pretty bad, but that's because I only ran the GenomeLoc tests, and it displays the entire project coverage. I click on the "Coverage" link in the upper-left frame, and scroll down to the package where GenomeLoc lives (org.broadinstitute.sting.utils). At the bottom of this page I find my two classes, GenomeLoc and GenomeLocParser.CachingSequenceDictionary:

image

These have ~50% statement-level coverage each. Not ideal, really.

Let's dive into GenomeLoc itself a bit more. Clicking on the GenomeLoc link brings up the code coverage page. Here you can see a few things very quickly.

image image

  • Some of the methods are greyed out. This is because they are considered by our clover report as trivial getter/setter methods, and shouldn't be counted.
  • Some methods have reasonably good test coverage, such as disjointP with thousands of tests.
  • Some methods have some tests, but a very limited number, such as contiguousP which only has 2 tests. Now maybe that's enough, but it's worth thinking about whether 2 tests would really cover all of the test cases for this method.
  • Some methods (such as intersect) have good coverage on some branches but no coverage on what looks like an important branch (the unmapped handling code).
  • Some methods just don't have any tests at all (subtract), which is very dangerous if this method is an important one used throughout the GATK.

For methods with poor test coverage (branches or overall) I'd look into their uses, and try to answer a few questions:

  • How widely used is this function? Is this method used at all? Perhaps it's just unused code that can be deleted. Perhaps it's only used in one specific class, and it's not worth my time testing it (a dangerous statement, as basically any untested code can be assumed to be broken now, or at some point in the future). If it's widely used, I should design some unit tests for it.
  • Are the uses simpler than the full code itself? Perhaps a simpler function can be extracted and tested.

If the code needs tests, I would design specific unit tests (or data providers that cover all possible cases) for these functions. Once that newly written code is in place, I would rerun the ant tasks above to get updated coverage information, and continue until I'm satisfied.


-- Mark A. DePristo, Ph.D. Co-Director, Medical and Population Genetics Broad Institute of MIT and Harvard