Errors related to connection issues internet proxy

I have the aggregate_bismark_output_grch38 method configuration in the aryee-merkin/dna-methylation-pipeline-paper workspace. It uses some Bioconductor packages that access annotation information from the Bioconductor server. For the past few weeks, runs with a large number of samples have been failing with the following error message: URL 'http://bioconductor.org/BiocInstaller.dcf': status was 'Couldn't resolve host name'. Based on a Bioconductor support forum thread (https://support.bioconductor.org/p/104731/), this looks like a network connection problem. The method was working fine before, and I only see the error for large sample sets. I am not sure whether this is related to FireCloud or to something else, so I want to check whether it could be caused by network connection issues in FireCloud.

Answers

  • Tiffany_at_Broad, Cambridge, MA (Member, Broadie, Moderator)

    Hi @DivyKangeyan - to confirm, when you run this method config on small sample sets, it gets the packages just fine and this error does not appear? If so, how many samples are in the large sample set and the small?

  • DivyKangeyan (Member, Broadie)

    Hi @Tiffany_at_Broad, the method config works fine for 100 samples, but when I try 1000 samples I get this error.

  • Tiffany_at_Broad, Cambridge, MA (Member, Broadie, Moderator)

    @DivyKangeyan - ok thanks for this info & sharing your workspace. We will look more into this on Monday.

  • DivyKangeyan (Member, Broadie)

    Hi @Tiffany_at_Broad, are there any updates on this issue?

  • Tiffany_at_Broad, Cambridge, MA (Member, Broadie, Moderator)

    Hi @DivyKangeyan, we believe that because your workflows are hitting an external web server rapidly, you're probably getting rate limited by that server. Do you know if they released any infrastructure changes that are impacting your ability to call large sample sets?

  • DivyKangeyan (Member, Broadie)

    I will look into whether there have been any infrastructure changes in the Bioconductor ecosystem. Can you explain a bit what you meant by workflows hitting an external web server rapidly? Is there a way I can prevent that in my workflow?

  • abaumann, Broad DSDE (Member, Broadie)

    So depending on the site you are hitting, it may not be able to handle the load you are throwing at it with 1000 workflows (like it could with 100), or it might actually throttle you or stop responding to you to limit how many requests you can give it. When you launch workflows in FireCloud, we try to launch as many in parallel as we can (up to certain points).

    Unfortunately, it would be difficult to limit how much activity you are throwing at their servers across 1000 workflows running in parallel, but you could break up your 1000 workflows into batches that are small enough that you don't hit these errors. If you made, for instance, 10 sample sets and ran them one at a time, you could get through all 1000 without error.
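As a rough illustration of the batching idea above, here is a minimal Python sketch that splits a list of sample IDs into fixed-size groups, each of which could then be loaded as its own sample set and submitted one at a time. The `make_batches` helper and the `sample_NNNN` IDs are hypothetical and only for illustration; this does not call any FireCloud API.

```python
from typing import List, Sequence


def make_batches(sample_ids: Sequence[str], batch_size: int) -> List[List[str]]:
    """Split sample IDs into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        list(sample_ids[start:start + batch_size])
        for start in range(0, len(sample_ids), batch_size)
    ]


# Hypothetical sample IDs standing in for the 1000 real samples.
sample_ids = [f"sample_{i:04d}" for i in range(1000)]

# 10 batches of 100, matching the "10 sample sets" suggestion above.
batches = make_batches(sample_ids, 100)
for index, batch in enumerate(batches, start=1):
    print(f"sample_set_{index:02d}: {len(batch)} samples, first = {batch[0]}")
```

The batch size is a guess; the right value is whatever is small enough that the external server stops rejecting requests.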

  • DivyKangeyan (Member, Broadie)

    Hi @abaumann, the method configuration uses a participant set as the submission entity. So when I say 100 vs 1000 samples, it aggregates across that many samples, but there is only 1 job running. From what I can see, the job only accesses the server once, after aggregating all the samples. So I am not sure it has anything to do with the job load.

  • abaumann, Broad DSDE (Member, Broadie)

    It seems really odd that this is affected by a larger single job workflow vs a smaller single job workflow - is this network call sending or returning a larger amount of information when it's a bigger workflow? Is this still consistently failing like it was? If it's still failing on larger workflows we would need to dig deeper into what call it is making.
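One way to start digging into a 'Couldn't resolve host name' failure is to test host name resolution on its own, with retries, from inside the job. The Python sketch below is only an assumption about how that could be done: `resolve_with_retry` is a hypothetical helper, and the attempt count and delays are arbitrary. It retries a DNS lookup with exponential backoff and reports every failed attempt, which would help show whether the failure is transient or consistent.

```python
import socket
import time
from typing import Callable, List


def resolve_with_retry(
    host: str,
    attempts: int = 5,
    base_delay: float = 1.0,
    resolver: Callable[[str], str] = socket.gethostbyname,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Resolve host to an IPv4 address, retrying with exponential backoff.

    resolver and sleep are injectable so the retry logic can be exercised
    without real network access or real waiting.
    """
    errors: List[str] = []
    for attempt in range(attempts):
        try:
            return resolver(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            errors.append(f"attempt {attempt + 1}: {exc}")
            if attempt < attempts - 1:
                # Wait 1x, 2x, 4x, ... base_delay between attempts.
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(
        f"could not resolve {host!r} after {attempts} attempts: " + "; ".join(errors)
    )
```

Calling `resolve_with_retry("bioconductor.org")` at the start of the task, before any package tries to reach the server, would show whether the name resolves at all and after how many attempts.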
