I'm trying to downsample in an orderly fashion for an experiment, and I'd like to restrict the experiment to a single chromosome, so I picked chromosome 17 with -L and a coverage of 30x with -dcov 30. This came up:
I was hoping to poke through results from running the HaplotypeCaller at many different simulated depths of coverage for several samples. I read that one can use -dfrac instead, and that it might even be more appropriate. However, I was hoping to find out what level of coverage led to what level of results, and -dfrac feels much less specific: it appears to toss a fraction of however many reads were at a given position, rather than tossing reads over a certain coverage. Thus with -dfrac, I could say that my sample had an average of 30x for this chromosome and I tossed half, so theoretically I've simulated 15x depth of coverage...
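To make the distinction concrete, here is a rough toy sketch of how I picture the two behaviours. It is not GATK's actual implementation. The per-position depths are made up, and the function names `downsample_fraction` and `downsample_cap` are hypothetical helpers that stand in for my understanding of -dfrac and -dcov.

```python
import random


def downsample_fraction(depths, fraction, rng):
    """Toy -dfrac model: keep each read independently with probability `fraction`."""
    return [sum(1 for _ in range(d) if rng.random() < fraction) for d in depths]


def downsample_cap(depths, cap):
    """Toy -dcov model: discard reads above `cap` at each position."""
    return [min(d, cap) for d in depths]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical per-position depths for an unevenly covered sample
depths = [5, 20, 30, 45, 60]

by_fraction = downsample_fraction(depths, 0.5, rng)
by_cap = downsample_cap(depths, 15)

print("original:   ", depths)
print("fraction 0.5:", by_fraction)  # roughly half of each depth, unevenness preserved
print("cap 15:     ", by_cap)        # flattened to at most 15 everywhere
```

In this toy model, the fractional approach keeps the shape of the coverage profile (deep positions stay deeper than shallow ones), while the cap flattens everything above the threshold and leaves low-coverage positions untouched.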
Which approach would be more representative of reality: using -dfrac to simulate a certain depth of coverage, or using -dcov, assuming I didn't have the 200 restriction?
Thanks for any help/discussion! -Tristan