miked

About

Username
miked
Joined
Visits
23
Last Active
Roles
Member
Points
33
Badges
7

Comments

  • Hello, I'm able to run Oncotator v1.5.1.0 successfully on many samples. I'm using database source oncotator_v1_ds_Jan262015.tar.gz. The input file is a VCF and I'm using SIMPLE_TSV format as output. I'm not able to find the COSMIC ID for the varian…
  • @Geraldine_VdAuwera, I double checked the gVCFs from CombineGVCFs and the SB tag is present (my mistake earlier). I'm not able to see the SOR annotation in my final GenotypeGVCFs multi-sample pVCF. I looked at the documentation for GenotypeGVCFs a…
    in gvcf fields Comment by miked July 2014
  • Thanks. I looked at VariantAnnotator; however, I don't believe it fits well with the HC gVCF multi-sample pipeline. It's great to be able to generate a gVCF and remove/archive the BAM files as a way to overcome their significant storage requirements.…
    in gvcf fields Comment by miked July 2014
  • Thanks for the response. I broke up the human genome into 25 MB chunks and I'm running them all in parallel. I'm not sure it's working as it should. For example, one of the last chunks on chromosome 7 ends up being 9,138,656 MB. Here is what the output …
  • Hello, How can I include the information in the SB tag in the multi-sample VCF generated by GenotypeGVCFs? I'm running version v3.2-2-gec30cee and it's not included in each sample column. This would be essential for our variant QC processes. Any he…
    in gvcf fields Comment by miked July 2014
  • Also, which tools do I use to merge back into whole-chromosome multi-sample VCFs? CatVariants? Something else? Thanks.
  • I have access to more compute nodes. I'm processing WGS data, so parallelizing by exome intervals wouldn't really work. Can I split the GenotypeGVCFs step into, say, ~25 MB chunks? This will require around 120 nodes to process in parallel. Do I need t…
  • Hello, Can somebody from the GATK support team confirm whether the BI/BD tags are being used with HC? I recall running HC on some BAMs without the tags (we removed the tags to reduce the footprint) and it didn't complain. Any response is appreciated.
  • Can I process a gVCF generated by HC v3.1 downstream with CombineGVCFs and GenotypeGVCFs v3.2? Does this cause backwards incompatibility: "Reads are now realigned to the most likely haplotype before being used by the annotations, so AD and DP…
  • Thanks for the response. In the HC gVCF the genotype call was 0/4 and after GenotypeGVCFs the call became 0/1. Any reason why that is? I understand the genotypes are collapsed after CombineGVCFs and the joint-calling process kicks in during Genotyp…
  • Hello, I would like to understand how HC is storing genotype information in the gVCF. For a single-sample gVCF: 3 128672437 rs146586501 T C,TCC,TCCCTCCCCCTCC, 0 . DB;DP=4;MLEAC=0,0,0,1;MLEAF=0.00,0.00,0.00,0.…
  • Thanks for the response. It's very helpful to know that DP and AD are depth values coming out of the original BAM and not after reassembly. For this discussion, I'm focusing on the VCF that is coming out of GenotypeGVCFs. * Does this mean every va…
  • @Geraldine_VdAuwera, I'm running the gVCF reference model pipeline as follows: HC using v3.1; CombineGVCFs nightly build; GenotypeGVCFs nightly build. I'm trying to understand the difference between 0/0 and ./. for the GT tag when looking at the fina…
  • @Geraldine_VdAuwera, I have ~850 30X WGS gVCFs that were generated individually using HC version 3.1-1-g07a4bf8. I'm now running CombineGVCFs in batches of 200. I'm getting long estimated runtimes: INFO 11:32:36,984 ProgressMeter - Loca…
  • @Geraldine_VdAuwera, I made a post to this thread yesterday and I got the message that it was 'awaiting moderator approval'. It still does not appear on this thread. Should I repost with the questions that I have? It could have been thrown into th…
  • Geraldine, I'm experiencing the same issue. When working with a WGS BAM, PrintReads writes approximately 900 GB of temp files after I point it to a different temp location using -Djava.io.tmpdir=/path/to/tmpdir. This is fine for a handful of sampl…
  • Hello, I have been running GATK v1.6.2 on several samples. It seems the order in which I initially ran the GATK indel-realignment and quality-recalibration steps is reversed. For example, in order of processing, I ran: * MarkDuplicates * Count Cova…
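
Two of the comments above describe splitting the genome into ~25 MB chunks so that GenotypeGVCFs can run in parallel. The following is only a rough sketch of how such chunk intervals could be computed; it is not taken from the original posts. The function name `chunk_contig`, the contig name `chrExample`, and its length are all hypothetical, chosen purely for illustration. The strings use the `contig:start-end` form that GATK accepts for its `-L` argument, with 1-based inclusive coordinates.

```python
def chunk_contig(name, length, chunk_size=25_000_000):
    """Yield 1-based, inclusive intervals 'name:start-end' that tile the contig."""
    start = 1
    while start <= length:
        # The final chunk is clipped to the contig length, so it is usually shorter.
        end = min(start + chunk_size - 1, length)
        yield f"{name}:{start}-{end}"
        start = end + 1


# Hypothetical contig name and length, for illustration only.
intervals = list(chunk_contig("chrExample", 60_000_000))
for interval in intervals:
    print(interval)
```

With these assumed inputs the last interval covers only the remaining 10,000,000 bases, which is why the final chunk on a chromosome normally comes out smaller than the nominal chunk size.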