Picard in Docker

Chris H (Hull; Member)
edited August 2015 in Ask the GATK team

Dear GATK/Picard team,
I am new to Picard and was just about to build it, when I noticed that there is a Picard Docker image available - great idea! Is this functional yet? I couldn't find any instructions on how to use it. I gave it a try and did the following:
First I pulled the Picard image:
docker pull broadinstitute/picard
Then I tried to run it like so:
docker run -i -t broadinstitute/picard FastqToSam -h

This gives me the usage for FastqToSam so I thought I was on the right track.

However, when I run a full-fledged command like:
sudo docker run -i -t broadinstitute/picard FastqToSam F1=009_S1_L001_R1_001.fastq.gz F2=009_S1_L001_R2_001.fastq.gz O=test.bam SM=female

I get the following error:

[Tue Aug 25 20:09:49 UTC 2015] picard.sam.FastqToSam FASTQ=009_S1_L001_R1_001.fastq.gz FASTQ2=F1=009_S1_L001_R2_001.fastq.gz OUTPUT=test.bam SAMPLE_NAME=female USE_SEQUENTIAL_FASTQS=false READ_GROUP_NAME=A SORT_ORDER=queryname MIN_Q=0 MAX_Q=93 STRIP_UNPAIRED_MATE_NUMBER=false ALLOW_AND_IGNORE_EMPTY_LINES=false VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false GA4GH_CLIENT_SECRETS=client_secrets.json
[Tue Aug 25 20:09:49 UTC 2015] Executing as [email protected] on Linux 3.13.0-54-generic amd64; Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14; Picard version: 1.138() JdkDeflater
[Tue Aug 25 20:09:49 UTC 2015] picard.sam.FastqToSam done. Elapsed time: 0.00 minutes.
Runtime.totalMemory()=504889344
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" htsjdk.samtools.SAMException: Cannot read non-existent file: /usr/picard/009_S1_L001_R1_001.fastq.gz
at htsjdk.samtools.util.IOUtil.assertFileIsReadable(IOUtil.java:326)
at picard.sam.FastqToSam.doWork(FastqToSam.java:226)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:206)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:95)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:105)

It seems as if Picard in the container can't find my input files. How would I pass my files correctly to the container?
Any help would be much appreciated!!

Thanks in advance for your time!

cheers,
Christoph

GitHub issue #140, filed by Sheila. State: closed. Closed by: chandrans.

Answers

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @Chris H
    Hi Christoph,

    I am not sure if we support Picard in Docker. I will have to check with the team. Can you try running Picard without Docker, and let me know if that works?

    Thanks,
    Sheila

  • Chris H (Hull; Member)

    Hi Sheila,

    Thanks for your response!
    I forked the Picard GitHub repository and made a few small changes to the Dockerfile and the docker_helper.sh script (see here). Then I built the image from the new Dockerfile (docker build -t custompicard .). Now I can use it like an executable.

    E.g., assuming my BAM files are in the current working directory, I run:
    sudo docker run -i -t -v $(pwd):/usr/working custompicard MergeSamFiles I=first.bam I=second.bam O=merged.bam

    Don't know if that was the initial intention for the Docker image, but it works - thanks for making it available!!

    cheers,
    Christoph
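
The fix above works because `-v host_dir:container_dir` bind-mounts a host directory into the container, so Picard can see the input files and write its output back to the host. A minimal sketch of a reusable wrapper follows. The image name `custompicard` and the mount point `/usr/working` are taken from the post above; the added `-w` flag (which sets the container's working directory so relative paths resolve against the mount) and the `DRY_RUN` switch are assumptions for illustration, not part of the original command.

```shell
#!/bin/sh
# Sketch: run a Picard tool in a container with the current directory bind-mounted.
# IMAGE and MOUNT_POINT default to the values from the post; adjust for your image.
IMAGE="${IMAGE:-custompicard}"
MOUNT_POINT="${MOUNT_POINT:-/usr/working}"

picard_docker() {
  # Build the full command line; "$@" carries the Picard tool name and its arguments.
  set -- docker run --rm -v "$(pwd):${MOUNT_POINT}" -w "${MOUNT_POINT}" "${IMAGE}" "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # hypothetical dry-run mode: print the command instead of running it
  else
    "$@"
  fi
}
```

Usage would look like `picard_docker MergeSamFiles I=first.bam I=second.bam O=merged.bam`; setting `DRY_RUN=1` first prints the composed `docker run` line so it can be checked before anything is executed.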

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @Chris H
    Hi Christoph,

    Wonderful news! Thank you for reporting your solution :smile:

    -Sheila

  • Chris H (Hull; Member)

    Hi Sheila,

    No problem! Do you want me to send a pull request against the master branch of the Picard GitHub repository?

    cheers,
    Christoph

  • Sheila (Broad Institute; Member, Broadie, Moderator, admin)

    @Chris H
    Hi Christoph,

    Sure. That would be perfect.

    Thanks,
    Sheila

  • bradt (Member, Broadie, Dev)

    @Chris H @Sheila

    Thanks for the fixups to the Docker image! I had a question or two on the GitHub PR, after which I will merge it. Chris, we appreciate your contribution.

    A couple things I want to capture here for posterity:

    Looking at the example docker run command Chris supplied, I don't believe the -i and -t flags are necessary. My understanding is that they are used for interactive sessions (-i keeps STDIN open and -t allocates a pseudo-terminal), as when running /bin/bash. For example, I was able to successfully execute the following command:
    docker run -v $(pwd):/usr/working custompicard SamToFastq I=ubams_testing/newnewnew.bam F=test.fastq F2=test2.fastq

    Also, unfortunately, it turns out that Docker's ENTRYPOINT instruction does not play nicely with some systems we are building internally, so we have an agenda item to remove it. At that point, you would specify the full java invocation as an argument to the container run. In place of the ENTRYPOINT there will probably be a CMD that simply echoes a usage message with instructions on composing an appropriate run command. This will likely happen soon. In the interim, Chris's command is the appropriate way to run the container as an executable.

    I'll update this thread with a new example call as soon as any changes are made to the access pattern.
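
As a rough illustration of the two points above, the sketch below bypasses an image's ENTRYPOINT with `docker run --entrypoint` and passes the full java invocation, and separately shows the interactive case where `-i -t` do matter. The jar location `/usr/picard/picard.jar`, the availability of `/bin/bash` in the image, the `-w` flag, and the `DRY_RUN` switch are all assumptions for illustration; check them against the image you actually use. This is not the access pattern the Picard team eventually shipped, only one way such a call could be composed.

```shell
#!/bin/sh
# Sketch only. ASSUMPTIONS: jar path, /bin/bash in the image, and mount point.
IMAGE="${IMAGE:-broadinstitute/picard}"
PICARD_JAR="${PICARD_JAR:-/usr/picard/picard.jar}"   # assumed location inside the image
MOUNT_POINT="${MOUNT_POINT:-/usr/working}"

# Non-interactive run: no -i/-t, because nothing reads from a terminal.
picard_java() {
  set -- docker run --rm -v "$(pwd):${MOUNT_POINT}" -w "${MOUNT_POINT}" \
    --entrypoint java "${IMAGE}" -jar "${PICARD_JAR}" "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Interactive shell in the container: this is the case -i and -t are meant for.
picard_shell() {
  set -- docker run --rm -i -t -v "$(pwd):${MOUNT_POINT}" -w "${MOUNT_POINT}" \
    --entrypoint /bin/bash "${IMAGE}"
  if [ "${DRY_RUN:-0}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}
```

For example, `picard_java SamToFastq I=in.bam F=test.fastq F2=test2.fastq` composes a call in which everything after the image name is handed to `java`, which is what "specify the full java invocation" amounts to once the ENTRYPOINT is out of the way.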

  • dinvlad (Member, Broadie, Dev)

    Hi Brad,

    Do you know if there are still plans to remove ENTRYPOINT from broadinstitute/picard?

    Thanks

  • bradt (Member, Broadie, Dev)

    Hi @dinvlad,

    Sorry for not seeing your note. I've actually moved from the Picard team over to a new team. I'm not sure if anyone is actively maintaining this image. @jcarey may have information here. I'll ask him if there are any plans to remove ENTRYPOINT on broadinstitute/picard. We tend to use a different image in our operations: https://hub.docker.com/r/broadinstitute/genomes-in-the-cloud/

    That's a big image that includes Picard, GATK, samtools, BWA, and some other things.

    Sorry I can't be more helpful.

    Thanks
    Brad

  • dinvlad (Member, Broadie, Dev)

    Thanks @bradt,

    Yes, we've also used genomes-in-the-cloud. However, it's rarely updated, and the latest version of Picard it ships causes exceptions in our workflows (this is not the case with the most recent version from an unofficial repo). I'll check with the green team whether it could be updated.

  • shlee (Cambridge; Member, Broadie ✭✭✭✭✭)

    Hi @dinvlad,

    Please note that GATK4 now wraps Picard tools. Each GATK4 release updates the version of Picard to the latest available and is made available as a Docker image at https://hub.docker.com/r/broadinstitute/gatk/.
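
A hedged sketch of what running a Picard tool through the GATK4 image might look like. It assumes the `gatk` launcher is on the PATH inside `broadinstitute/gatk`, that `/data` is an acceptable (arbitrary) mount point, and that the tool is called with GATK4-style `-I`/`-O` arguments; the `DRY_RUN` switch is a hypothetical convenience. Pin a specific image tag in real workflows rather than relying on the default.

```shell
#!/bin/sh
# Sketch only. ASSUMPTIONS: `gatk` is on PATH in the image; /data is an arbitrary mount point.
GATK_IMAGE="${GATK_IMAGE:-broadinstitute/gatk}"
DATA_DIR="${DATA_DIR:-/data}"

gatk_docker() {
  # Mount the current directory and run the requested tool via the gatk launcher.
  set -- docker run --rm -v "$(pwd):${DATA_DIR}" -w "${DATA_DIR}" "${GATK_IMAGE}" gatk "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Example call (not executed here):
#   gatk_docker MergeSamFiles -I first.bam -I second.bam -O merged.bam
```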

  • dinvlad (Member, Broadie, Dev)

    Thanks @shlee, we're aware of that but are cautious about the size of the GATK4 image, as discussed here.

  • shlee (Cambridge; Member, Broadie ✭✭✭✭✭)

    @dinvlad, yes we're aware of the large size. You are free to make your own Docker images for our open-source programs Picard and GATK.
