Reducing GATK/Picard tools Docker image size
I'm optimizing calls in a WDL workflow (run on Google Cloud), and while looking through the logs I realized that it takes a full 3 minutes to pull the broadinstitute/gatk image, which is now 3 GB in size. I'm using this image solely for Picard tools right now, as the official broadinstitute/picard image does not play well with Cromwell due to its use of ENTRYPOINT.
Could something be done to reduce the time it takes to pull the image? Ideally, we'd like this to take <1 min, because our computational tasks will be short (~3-5 min), so another 3 min spent pulling the image adds significantly to the overall task time. I'm leaning towards using an (unofficial?) image at https://quay.io/repository/biocontainers/picard, because those are only ~120 MB in size, so pulling them takes seconds. We could also build our own images, but that would add maintenance overhead. Another workaround is to run an openjdk image and then wget the Picard JAR from GitHub at run time (that still feels hacky, however).
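For reference, here is a minimal sketch of what that last workaround might look like inside the task's command section. The function names (picard_jar_url, fetch_picard) are hypothetical helpers, the version number is only a placeholder for whichever release you actually pin, and I'm assuming the JAR is published under the usual GitHub releases download path and that wget is available in the openjdk image being used.

```shell
#!/bin/sh
set -eu

# Build the download URL for a given Picard release tag.
# Assumption: release JARs live under the standard GitHub releases path.
picard_jar_url() {
  printf 'https://github.com/broadinstitute/picard/releases/download/%s/picard.jar\n' "$1"
}

# Download the JAR for release $1 to local path $2.
# Assumption: wget is installed in the container image.
fetch_picard() {
  wget -q -O "$2" "$(picard_jar_url "$1")"
}

# Example use inside the task (the version is a placeholder):
#   fetch_picard 2.27.5 picard.jar
#   java -jar picard.jar <ToolName> <tool arguments>
```

Pinning the version in the URL at least keeps runs reproducible, but every task still depends on GitHub being reachable at run time, which is part of why this feels hacky.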